Cristiano Capone, Luca Falorsi, Andrea Ciardiello
et al.
Rapid adaptation in complex control systems remains a central challenge in reinforcement learning. We introduce a framework in which policy and value functions share a low-dimensional coefficient vector - a goal embedding - that captures task identity and enables immediate adaptation to novel tasks without retraining representations. During pretraining, we jointly learn structured value bases and compatible policy bases through a bilinear actor-critic decomposition. The critic factorizes as Q = sum_k G_k(g) y_k(s,a), where G_k(g) is a goal-conditioned coefficient vector and y_k(s,a) are learned value basis functions. This multiplicative gating - where a context signal scales a set of state-dependent bases - is reminiscent of gain modulation observed in Layer 5 pyramidal neurons, where top-down inputs modulate the gain of sensory-driven responses without altering their tuning. Building on Successor Features, we extend the decomposition to the actor, which composes a set of primitive policies weighted by the same coefficients G_k(g). At test time the bases are frozen and G_k(g) is estimated zero-shot via a single forward pass, enabling immediate adaptation to novel tasks without any gradient update. We train a Soft Actor-Critic agent on the MuJoCo Ant environment under a multi-directional locomotion objective, requiring the agent to walk in eight directions specified as continuous goal vectors. The bilinear structure allows each policy head to specialize to a subset of directions, while the shared coefficient layer generalizes across them, accommodating novel directions by interpolating in goal embedding space. Our results suggest that shared low-dimensional goal embeddings offer a general mechanism for rapid, structured adaptation in high-dimensional control, and highlight a potentially biologically plausible principle for efficient transfer in complex reinforcement learning systems.
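As a minimal sketch of the bilinear critic described above, the snippet below evaluates $Q = \sum_k G_k(g)\, y_k(s,a)$ for a single state-action pair. The function name `bilinear_q` and all numbers are hypothetical. In the paper both the coefficients and the bases are produced by learned networks, which are not reproduced here.

```python
from typing import Sequence


def bilinear_q(coeffs: Sequence[float], basis_values: Sequence[float]) -> float:
    """Return Q = sum_k G_k(g) * y_k(s, a) for one state-action pair.

    coeffs       -- goal-conditioned coefficients G_k(g), one per basis
    basis_values -- value-basis outputs y_k(s, a) at a fixed (s, a)
    """
    if len(coeffs) != len(basis_values):
        raise ValueError("need exactly one coefficient per basis function")
    return sum(g_k * y_k for g_k, y_k in zip(coeffs, basis_values))


# Hypothetical frozen basis outputs for one (state, action) pair.
y = [1.0, 3.0, 0.25]

# Two hypothetical goal embeddings: only the coefficients change,
# the bases are reused as-is (no gradient update involved).
q_goal_a = bilinear_q([0.5, 0.0, 2.0], y)
q_goal_b = bilinear_q([0.0, 1.0, 0.0], y)
print(q_goal_a, q_goal_b)
```

Swapping the coefficient vector while holding `y` fixed mirrors the test-time setting in which the bases are frozen and only the goal embedding is re-estimated.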
Christian A. Kothe, Sean Mullen, Michael V. Bronstein
et al.
Objective. We establish a principled method for inferring mental health related psychometric variables from neural and behavioral data using the Implicit Association Test (IAT) as the data generation engine, aiming to overcome the limited predictive performance (typically under 0.7 AUC) of the gold-standard D-score method, which relies solely on reaction times. Approach. We propose a sparse hierarchical Bayesian model that leverages multi-modal data to predict experiences related to mental illness symptoms in new participants. The model is a multivariate generalization of the D-score with trainable parameters, engineered for parameter efficiency in the small-cohort regime typical of IAT studies. Data from two IAT variants were analyzed: a suicidality-related E-IAT ($n=39$) and a psychosis-related PSY-IAT ($n=34$). Main Results. Our approach overcomes a high inter-individual variability and low within-session effect size in the dataset, reaching AUCs of 0.73 (E-IAT) and 0.76 (PSY-IAT) in the best modality configurations, though corrected 95% confidence intervals are wide ($\pm 0.18$) and results are marginally significant after FDR correction ($q=0.10$). Restricting the E-IAT to MDD participants improves AUC to 0.79 $[0.62, 0.97]$ (significant at $q=0.05$). Performance is on par with the best reference methods (shrinkage LDA and EEGNet) for each task, even when the latter were adapted to the task, while the proposed method was not. Accuracy was substantially above near-chance D-scores (0.50-0.53 AUC) in both tasks, with more consistent cross-task performance than any single reference method. Significance. Our framework shows promise for enhancing IAT-based assessment of experiences related to entrapment and psychosis, and potentially other mental health conditions, though further validation on larger and independent cohorts will be needed to establish clinical utility.
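For reference, the conventional D-score used as the baseline here can be sketched as the difference in mean reaction time between the incompatible and compatible blocks, divided by the standard deviation of all trials pooled across both blocks. This is a simplified sketch: it omits the error penalties and latency trimming of the full scoring procedure, and the function name and reaction times are illustrative, not data from the study.

```python
from statistics import mean, stdev
from typing import Sequence


def d_score(compatible_ms: Sequence[float], incompatible_ms: Sequence[float]) -> float:
    """Simplified IAT D-score from reaction times given in milliseconds.

    D = (mean(incompatible) - mean(compatible)) / SD(all trials pooled)
    """
    pooled = list(compatible_ms) + list(incompatible_ms)
    return (mean(incompatible_ms) - mean(compatible_ms)) / stdev(pooled)


# Illustrative reaction times only (not from the paper).
compatible = [600.0, 650.0, 700.0]
incompatible = [800.0, 850.0, 900.0]
print(round(d_score(compatible, incompatible), 3))
```

A positive value indicates slower responses in the incompatible block. Because the score is a single scalar per participant, it cannot draw on the neural or other behavioral modalities that the proposed model uses.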
The DAMPE satellite has directly measured the cosmic ray proton spectrum from 40 GeV to 100 TeV and revealed a new feature at about 13.6 TeV. The precise measurement of the spectrum of protons, the most abundant component of the cosmic radiation, is necessary to understand the source and acceleration of cosmic rays in the Milky Way. This work reports the measurement of the cosmic ray proton fluxes with kinetic energies from 40 GeV to 100 TeV, using 2.5 years of data recorded by the DArk Matter Particle Explorer (DAMPE). This is the first time that an experiment has directly measured cosmic ray protons up to ~100 TeV with high statistics. The measured spectrum confirms the spectral hardening at ~300 GeV found by previous experiments and reveals a softening at ~13.6 TeV, with the spectral index changing from ~2.60 to ~2.85. Our result suggests the existence of a new spectral feature of cosmic rays at energies lower than the so-called knee and sheds new light on the origin of Galactic cosmic rays.
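To illustrate what a change of spectral index from ~2.60 to ~2.85 at ~13.6 TeV means, the sketch below evaluates a sharply broken power law. The sharp break, the arbitrary normalization, and the function name are assumptions made for illustration; the abstract does not state the functional form used in the fit, and the lower index only applies above the ~300 GeV hardening.

```python
def broken_power_law(energy_tev: float,
                     break_tev: float = 13.6,
                     index_below: float = 2.60,
                     index_above: float = 2.85,
                     norm: float = 1.0) -> float:
    """Flux (arbitrary units) of a sharply broken power law.

    The flux is continuous at the break energy and falls as
    E**(-index_below) below it and E**(-index_above) above it.
    """
    index = index_below if energy_tev < break_tev else index_above
    return norm * (energy_tev / break_tev) ** (-index)


# Doubling the energy above the break costs a factor 2**2.85 in flux,
# compared with 2**2.60 when doubling up to the break.
ratio_above = broken_power_law(27.2) / broken_power_law(13.6)
ratio_below = broken_power_law(13.6) / broken_power_law(6.8)
print(ratio_above, ratio_below)
```

The softening shows up as a steeper fall-off above the break: `ratio_above` is smaller than `ratio_below`.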
In his opening OFC plenary talk back in 2021, Alibaba Group's Yiqun Cai notably added in the follow-up Q&A that today's complex networks are more than computer science - they grow, they are life. This entails that future networks may be better viewed as techno-social systems that resemble biological superorganisms with brain-like cognitive capabilities. Fast-forwarding, there is now growing awareness that we have to completely change our networks from being static into being a living entity that would act as an AI-powered network `brain', as recently stated by Bruno Zerbib, Chief Technology and Innovation Officer of France's Orange, at the Mobile World Congress (MWC) 2025. Even though AI was front and center at both MWC and OFC 2025 and has been widely studied in the context of optical networks, there are currently no publications on active inference in optical (and less so mobile) networks available. Active inference is an ideal methodology for developing more advanced AI systems by biomimicking the way living intelligent systems work, while overcoming the limitations of today's AI related to training, learning, and explainability. Active inference is considered the key to true AI: Less artificial, more intelligent. The goal of this paper is twofold. First, we aim at enabling optical network researchers to conceptualize new research lines for future optical networks with human-AI interaction capabilities by introducing them to the main mathematical concepts of the active inference framework. Second, we demonstrate how to move AI research beyond the human brain toward the 6G world brain by exploring the role of mycorrhizal networks, the largest living organism on planet Earth, in the AI vision and R&D roadmap for the next decade and beyond laid out by Karl Friston, the father of active inference.
This paper introduces a cognitive Retrieval-Augmented Generator (RAG) architecture that transcends transformer context-length limitations through phase-coded memory and morphological-semantic resonance. Instead of token embeddings, the system encodes meaning as complex wave patterns with amplitude-phase structure. A three-tier design is presented: a Morphological Mapper that transforms inputs into semantic waveforms, a Field Memory Layer that stores knowledge as distributed holographic traces and retrieves it via phase interference, and a Non-Contextual Generator that produces coherent output guided by resonance rather than fixed context. This approach eliminates sequential token dependence, greatly reduces memory and computational overhead, and enables unlimited effective context through frequency-based semantic access. The paper outlines theoretical foundations, pseudocode implementation, and experimental evidence from related complex-valued neural models, emphasizing substantial energy, storage, and time savings.
We report a measurement of electron antineutrino oscillation from the Daya Bay Reactor Neutrino Experiment with nearly 4 million reactor $\bar{\nu}_e$ inverse β decay candidates observed over 1958 days of data collection. The installation of a flash analog-to-digital converter readout system and a special calibration campaign using different source enclosures reduce uncertainties in the absolute energy calibration to less than 0.5% for visible energies larger than 2 MeV. The uncertainty in the cosmogenic $^{9}$Li and $^{8}$He background is reduced from 45% to 30% in the near detectors. A detailed investigation of the spent nuclear fuel history improves its uncertainty from 100% to 30%. Analysis of the relative $\bar{\nu}_e$ rates and energy spectra among detectors yields $\sin^2 2\theta_{13} = 0.0856 \pm 0.0029$ and $\Delta m^2_{32} = (2.471^{+0.068}_{-0.070}) \times 10^{-3}\ \mathrm{eV}^2$ assuming the normal hierarchy, and $\Delta m^2_{32} = -(2.575^{+0.068}_{-0.070}) \times 10^{-3}\ \mathrm{eV}^2$ assuming the inverted hierarchy.
Traditional methods of controlling prosthetics frequently encounter difficulties regarding flexibility and responsiveness, which can substantially impact people with varying cognitive and physical abilities. Advancements in computational neuroscience and machine learning (ML) have recently led to the development of highly advanced brain-computer interface (BCI) systems that may be customized to meet individual requirements. To address these issues, we propose NeuroAssist, a sophisticated method for analyzing EEG data that merges state-of-the-art BCI technology with adaptable artificial intelligence (AI) algorithms. NeuroAssist's hybrid neural network design efficiently overcomes the constraints of conventional EEG data processing. Our methodology combines a Natural Language Processing (NLP) BERT model to extract complex features from numerical EEG data and utilizes LSTM networks to handle temporal dynamics. In addition, we integrate spiking neural networks (SNNs) and deep Q-networks (DQN) to improve decision-making and flexibility. Our preprocessing method classifies motor imagery (MI) one-versus-the-rest using a common spatial pattern (CSP) while preserving EEG temporal characteristics. The hybrid architecture of NeuroAssist serves as the DQN's Q-network, enabling continuous feedback-based improvement and adaptability. This enables it to acquire optimal actions through trial and error. This experimental analysis has been conducted on the GigaScience and BCI-competition-IV-2a datasets, which have shown exceptional effectiveness in categorizing MI-EEG signals, obtaining an impressive classification accuracy of 99.17%. NeuroAssist offers a crucial approach to current assistive technology by potentially enhancing the speed and versatility of BCI systems.
We report high statistics measurements of inclusive charged hadron production in Au+Au and p+p collisions at $\sqrt{s_{NN}} = 200$ GeV. A large, approximately constant hadron suppression is observed in central Au+Au collisions for $5 < p_T < 12$ GeV/$c$. The collision energy dependence of the yields and the centrality and $p_T$ dependence of the suppression provide stringent constraints on theoretical models of suppression. Models incorporating initial-state gluon saturation or partonic energy loss in dense matter are largely consistent with observations. We observe no evidence of $p_T$-dependent suppression, which may be expected from models incorporating jet attenuation in cold nuclear matter or scattering of fragmentation hadrons.
AMS-02 is a wide-acceptance high-energy physics experiment installed on the International Space Station in May 2011 and operating continuously since then. AMS-02 is able to precisely separate light cosmic-ray nuclei ($1 \le Z \le 8$) with contamination below $10^{-3}$. The Boron to Carbon flux ratio of light cosmic-ray nuclei is a well-known, sensitive observable for understanding the propagation of cosmic rays in the Galaxy, since Boron is a secondary product of the spallation of heavier primary elements, such as Carbon and Oxygen, on the interstellar medium. A precision measurement of the Boron to Carbon ratio in the rigidity range from 2 GV to 1.8 TV, based on 10 million events, is presented.
A measurement of electron antineutrino oscillation by the Daya Bay Reactor Neutrino Experiment is described in detail. Six 2.9 GW$_{\mathrm{th}}$ nuclear power reactors of the Daya Bay and Ling Ao nuclear power facilities served as intense sources of $\bar{\nu}_e$'s. Comparison of the $\bar{\nu}_e$ rate and energy spectrum measured by antineutrino detectors far from the nuclear reactors (∼1500–1950 m) relative to detectors near the reactors (∼350–600 m) allowed a precise measurement of $\bar{\nu}_e$ disappearance. More than 2.5 million $\bar{\nu}_e$ inverse beta-decay interactions were observed, based on the combination of 217 days of operation of six antineutrino detectors (December 2011–July 2012) with a subsequent 1013 days using the complete configuration of eight detectors (October 2012–July 2015). The $\bar{\nu}_e$ rate observed at the far detectors relative to the near detectors showed a significant deficit, $R = 0.949 \pm 0.002\,(\mathrm{stat}) \pm 0.002\,(\mathrm{syst})$. The energy dependence of $\bar{\nu}_e$ disappearance showed the distinct variation predicted by neutrino oscillation. Analysis using an approximation for the three-flavor oscillation probability yielded the flavor-mixing angle $\sin^2 2\theta_{13} = 0.0841 \pm 0.0027\,(\mathrm{stat}) \pm 0.0019\,(\mathrm{syst})$ and the effective neutrino mass-squared difference of $|\Delta m^2_{ee}| = (2.50 \pm 0.06\,(\mathrm{stat}) \pm 0.06\,(\mathrm{syst})) \times 10^{-3}\ \mathrm{eV}^2$. Analysis using the exact three-flavor probability found $\Delta m^2_{32} = (2.45 \pm 0.06\,(\mathrm{stat}) \pm 0.06\,(\mathrm{syst})) \times 10^{-3}\ \mathrm{eV}^2$ assuming the normal neutrino mass hierarchy and $\Delta m^2_{32} = (-2.56 \pm 0.06\,(\mathrm{stat}) \pm 0.06\,(\mathrm{syst})) \times 10^{-3}\ \mathrm{eV}^2$ for the inverted hierarchy.
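The size of the far/near deficit can be made concrete with the leading short-baseline term of the survival probability, $P \approx 1 - \sin^2 2\theta_{13}\, \sin^2(1.267\, \Delta m^2_{ee}\, L / E)$, with $L$ in metres, $E$ in MeV and $\Delta m^2_{ee}$ in eV$^2$. The sketch below is an illustration under stated assumptions, not the collaboration's analysis: it neglects the slower solar-driven term, uses the central values quoted above as defaults, and the function name, baselines and energy are chosen for the example.

```python
import math


def survival_probability(baseline_m: float,
                         energy_mev: float,
                         sin2_2theta13: float = 0.0841,
                         dm2_ee_ev2: float = 2.50e-3) -> float:
    """Leading-order electron-antineutrino survival probability.

    Only the theta_13-driven oscillation is kept; the solar term is
    neglected, which is an approximation suitable for illustration only.
    """
    phase = 1.267 * dm2_ee_ev2 * baseline_m / energy_mev
    return 1.0 - sin2_2theta13 * math.sin(phase) ** 2


# Illustrative baselines and antineutrino energy (assumed, not from the text).
p_near = survival_probability(baseline_m=500.0, energy_mev=4.0)
p_far = survival_probability(baseline_m=1600.0, energy_mev=4.0)
print(p_near, p_far)
```

At a fixed energy the far-detector probability is lower than the near-detector one, which is the qualitative origin of the measured rate deficit. The measured $R$ additionally integrates over the reactor spectrum and all reactor-detector baselines.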
A search for supersymmetry is presented based on multijet events with large missing transverse momentum produced in proton-proton collisions at a center-of-mass energy of $\sqrt{s} = 13$ TeV. The data, corresponding to an integrated luminosity of 35.9 fb$^{-1}$, were collected with the CMS detector at the CERN LHC in 2016. The analysis utilizes four-dimensional exclusive search regions defined in terms of the number of jets, the number of tagged bottom quark jets, the scalar sum of jet transverse momenta, and the magnitude of the vector sum of jet transverse momenta. No evidence for a significant excess of events is observed relative to the expectation from the standard model. Limits on the cross sections for the pair production of gluinos and squarks are derived in the context of simplified models. Assuming the lightest supersymmetric particle to be a weakly interacting neutralino, 95% confidence level lower limits on the gluino mass as large as 1800 to 1960 GeV are derived, and on the squark mass as large as 960 to 1390 GeV, depending on the production and decay scenario.
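Two of the four search variables, the scalar sum of jet transverse momenta and the magnitude of their vector sum, can be sketched as follows. The jet representation as `(pt, phi)` pairs and the function name are assumptions for illustration; the jet selection thresholds applied in the actual analysis are not given in the abstract and are omitted.

```python
import math
from typing import Iterable, Tuple


def ht_and_missing_ht(jets: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (scalar sum, magnitude of vector sum) of jet transverse momenta.

    jets -- iterable of (pt in GeV, azimuthal angle phi in radians)
    """
    jets = list(jets)
    scalar_sum = sum(pt for pt, _ in jets)
    px = sum(pt * math.cos(phi) for pt, phi in jets)
    py = sum(pt * math.sin(phi) for pt, phi in jets)
    return scalar_sum, math.hypot(px, py)


# Balanced dijet event: large scalar sum, essentially no momentum imbalance.
balanced = ht_and_missing_ht([(100.0, 0.0), (100.0, math.pi)])
# Unbalanced event: the jets recoil against something that is not a jet.
unbalanced = ht_and_missing_ht([(100.0, 0.0), (100.0, math.pi / 2)])
print(balanced, unbalanced)
```

A large vector-sum magnitude signals momentum carried away by undetected particles, which is why it serves as a proxy for missing transverse momentum in multijet events.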
Connor Bybee, Alexander Belsten, Friedrich T. Sommer
An open problem in neuroscience is to explain the functional role of oscillations in neural networks, contributing, for example, to perception, attention, and memory. Cross-frequency coupling (CFC) is associated with information integration across populations of neurons. Impaired CFC is linked to neurological disease. It is unclear what role CFC has in information processing and brain functional connectivity. We construct a model of CFC which predicts a computational role for observed $\theta$-$\gamma$ oscillatory circuits in the hippocampus and cortex. Our model predicts that the complex dynamics in recurrent and feedforward networks of coupled oscillators performs robust information storage and pattern retrieval. Based on phasor associative memories (PAM), we present a novel oscillator neural network (ONN) model that includes subharmonic injection locking (SHIL) and which reproduces experimental observations of CFC. We show that the presence of CFC increases the memory capacity of a population of neurons connected by plastic synapses. CFC enables error-free pattern retrieval whereas pattern retrieval fails without CFC. In addition, the trade-offs between sparse connectivity, capacity, and information per connection are identified. The associative memory is based on a complex-valued neural network, or phasor neural network (PNN). We show that for values of $Q$ which are the same as the ratio of $\gamma$ to $\theta$ oscillations observed in the hippocampus and the cortex, the associative memory achieves greater capacity and information storage than previous models. The novel contributions of this work are providing a computational framework based on oscillator dynamics which predicts the functional role of neural oscillations and connecting concepts in neural network theory and dynamical system theory.
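The kind of phasor associative memory this work builds on can be sketched with a generic complex-valued Hopfield-style network: patterns are vectors of $Q$ discrete phases, weights are Hebbian outer products, and recall snaps each unit's summed input back to the nearest of the $Q$ phases. This is not the authors' oscillator model (there are no oscillator dynamics and no SHIL here); the function names, network size, and patterns are assumptions for illustration.

```python
import cmath
import math
from typing import List


def phasor(index: int, q: int) -> complex:
    """Unit complex number for phase state `index` out of `q` states."""
    return cmath.exp(2j * math.pi * index / q)


def quantize(z: complex, q: int) -> int:
    """Index of the discrete phase state closest to the phase of z."""
    return round(cmath.phase(z) * q / (2 * math.pi)) % q


def hebbian_weights(patterns: List[List[int]], q: int) -> List[List[complex]]:
    """W[i][j] = (1/N) * sum over patterns of z_i * conj(z_j)."""
    n = len(patterns[0])
    w = [[0j] * n for _ in range(n)]
    for pattern in patterns:
        z = [phasor(k, q) for k in pattern]
        for i in range(n):
            for j in range(n):
                w[i][j] += z[i] * z[j].conjugate() / n
    return w


def recall_step(w: List[List[complex]], state: List[int], q: int) -> List[int]:
    """One synchronous update: weighted sum of inputs, then phase quantization."""
    z = [phasor(k, q) for k in state]
    return [quantize(sum(w_ij * z_j for w_ij, z_j in zip(row, z)), q) for row in w]


Q = 4
pattern_a = [i % Q for i in range(16)]          # two hand-built patterns that
pattern_b = [(i // Q) % Q for i in range(16)]   # are orthogonal as phasor vectors
weights = hebbian_weights([pattern_a, pattern_b], Q)

noisy = list(pattern_a)
noisy[0], noisy[5] = 2, 3                        # corrupt two of the 16 units
recovered = recall_step(weights, noisy, Q)
print(recovered == pattern_a)
```

Here a single update restores the corrupted units because the overlap with the stored pattern dominates the crosstalk. Capacity limits appear once many patterns are stored, which is the regime the paper analyzes.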
Savannah A. Lynn, Flavie Soubigou, Jennifer M. Dewing
et al.
Matrix metalloproteinase-9 (MMP9) and total amyloid-beta (Aβ) are prospective biomarkers of ocular ageing and retinopathy. These were quantified by ELISA in the vitreous and blood from controls (n = 55) and in a subset of age-related macular degeneration (AMD) patients (n = 12) for insights and possible additional links between the ocular and systemic compartments. Vitreous MMP9 levels in control and AMD groups were 932.5 ± 240.9 pg/mL and 813.7 ± 157.6 pg/mL, whilst serum levels were 2228 ± 193 pg/mL and 2386.8 ± 449.4 pg/mL, respectively. Vitreous Aβ levels in control and AMD groups were 1173.5 ± 117.1 pg/mL and 1275.6 ± 332.9 pg/mL, whilst plasma Aβ levels were 574.3 ± 104.8 pg/mL and 542.2 ± 139.9 pg/mL, respectively. MMP9 and Aβ showed variable levels across the lifecourse, with no correlation with each other, with age, or with AMD status, though the smaller AMD cohort was a limiting factor. Aβ and MMP9 levels in the vitreous and blood were unrelated to mean arterial pressure. Smoking, another modifiable risk, showed no association with vitreous Aβ. However, smoking may be linked with vitreous (p = 0.004) and serum (p = 0.005) MMP9 levels in control and AMD groups, though this did not reach our elevated (p = 0.001) significance threshold. A bioinformatics analysis revealed promising MMP9 and APP/Aβ partners for further scrutiny, many of which are already linked with retinopathy.
Can neural networks learn goal-directed behaviour using similar strategies to the brain, by combining the relationships between the current state of the organism and the consequences of future actions? Recent work has shown that recurrent neural networks trained on goal based tasks can develop representations resembling those found in the brain, entorhinal cortex grid cells, for instance. Here we explore the evolution of the dynamics of their internal representations and compare this with experimental data. We observe that once a recurrent network is trained to learn the structure of its environment solely based on sensory prediction, an attractor based landscape forms in the network's representation, which parallels hippocampal place cells in structure and function. Next, we extend the predictive objective to include Q-learning for a reward task, where rewarding actions are dependent on delayed cue modulation. Mirroring experimental findings in hippocampus recordings in rodents performing the same task, this training paradigm causes nonlocal neural activity to sweep forward in space at decision points, anticipating the future path to a rewarded location. Moreover, prevalent choice and cue-selective neurons form in this network, again recapitulating experimental findings. Together, these results indicate that combining predictive, unsupervised learning of the structure of an environment with reinforcement learning can help understand the formation of hippocampus-like representations containing both spatial and task-relevant information.
Neurofeedback is a non-invasive brain training with long-term medical and non-medical applications. Despite the existence of several emotion regulation studies using neurofeedback, further investigation is needed to understand interactions of the brain regions involved in the process. We implemented EEG neurofeedback with simultaneous fMRI using a modified happiness-inducing task through autobiographical memories to upregulate positive emotion. The results showed increased activity of prefrontal, occipital, parietal, and limbic regions and increased functional connectivity between prefrontal, parietal, limbic system, and insula in the experimental group. New connectivity links were identified by comparing the functional connectivity of different experimental conditions within the experimental group and between the experimental and control groups. The proposed multimodal approach quantified the changes in the brain activity (up to 1.9% increase) and connectivity (FDR-corrected for multiple comparison, q = 0.05) during emotion regulation in/between prefrontal, parietal, limbic, and insula regions. Psychometric assessments confirmed significant changes in positive and negative mood states by neurofeedback with a p-value smaller than 0.002 in the experimental group. This study quantifies the effects of EEG neurofeedback in changing functional connectivity of all brain regions involved in emotion regulation. For the brain regions involved in emotion regulation, we found significant BOLD and functional connectivity increases due to neurofeedback in the experimental group but no learning effect was observed in the control group. The results reveal the neurobiological substrate of emotion regulation by the EEG neurofeedback and separate the effect of the neurofeedback and the recall of the autobiographical memories.
Knowing how the effects of directed actions generalise to new situations (e.g. moving North, South, East and West, or turning left, right, etc.) is key to rapid generalisation across new situations. Markovian tasks can be characterised by a state space and a transition matrix and recent work has proposed that neural grid codes provide an efficient representation of the state space, as eigenvectors of a transition matrix reflecting diffusion across states, that allows efficient prediction of future state distributions. Here we extend the eigenbasis prediction model, utilising tools from Fourier analysis, to prediction over arbitrary translation-invariant directed transition structures (i.e. displacement and diffusion), showing that a single set of eigenvectors can support predictions over arbitrary directed actions via action-specific eigenvalues. We show how to define a "sense of direction" to combine actions to reach a target state (ignoring task-specific deviations from translation-invariance), and demonstrate that adding the Fourier representations to a deep Q network aids policy learning in continuous control tasks. We show the equivalence between the generalised prediction framework and traditional models of grid cell firing driven by self-motion to perform path integration, either using oscillatory interference (via Fourier components as velocity-controlled oscillators) or continuous attractor networks (via analysis of the update dynamics). We thus provide a unifying framework for the role of the grid system in predictive planning, sense of direction and path integration: supporting generalisable inference over directed actions across different tasks.
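The central idea, one shared set of Fourier eigenvectors with action-specific eigenvalues, can be sketched on a ring of states, where any translation-invariant transition is a circular convolution. The sketch below is a one-dimensional toy, not the authors' model: the ring size, the kernels, and the function names are assumptions.

```python
import cmath
import math
from typing import List


def dft(x: List[float]) -> List[complex]:
    """Discrete Fourier transform (projection onto the shared eigenvectors)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]


def idft(c: List[complex]) -> List[complex]:
    """Inverse discrete Fourier transform."""
    n = len(c)
    return [sum(c[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]


def predict(state_dist: List[float], kernel: List[float], steps: int) -> List[float]:
    """Distribution after `steps` applications of a translation-invariant action.

    kernel[d] is the probability of moving by displacement d (mod N).
    The eigenvalues of the action are the DFT of its kernel, so a multi-step
    prediction only raises each eigenvalue to the power `steps`.
    """
    eigenvalues = dft(kernel)
    coeffs = dft(state_dist)
    future = [c * lam ** steps for c, lam in zip(coeffs, eigenvalues)]
    return [v.real for v in idft(future)]


N = 8
start = [1.0] + [0.0] * (N - 1)                      # agent known to be at state 0
step_right = [0.0, 1.0] + [0.0] * (N - 2)            # directed action: move by +1
diffuse = [0.5, 0.25] + [0.0] * (N - 3) + [0.25]     # undirected diffusion

after_three_right = predict(start, step_right, 3)
after_three_diffuse = predict(start, diffuse, 3)
print([round(p, 3) for p in after_three_right])
```

Both actions reuse the same `dft`/`idft` basis; only `eigenvalues` differ. The directed action has unit-magnitude complex eigenvalues (pure phase shifts), while diffusion has real eigenvalues that damp high-frequency components.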
There is increasing realization in neuroscience that information is represented in the brain, e.g., neocortex, hippocampus, in the form of sparse distributed codes (SDCs), a kind of cell assembly. Two essential questions are: a) how are such codes formed on the basis of single trials, and b) how is similarity preserved during learning, i.e., how do more similar inputs get mapped to more similar SDCs. I describe a novel Modular Sparse Distributed Code (MSDC) that provides simple, neurally plausible answers to both questions. An MSDC coding field (CF) consists of Q WTA competitive modules (CMs), each comprising K binary units (analogs of principal cells). The modular nature of the CF makes possible a single-trial, unsupervised learning algorithm that approximately preserves similarity and, crucially, runs in fixed time, i.e., the number of steps needed to store an item remains constant as the number of stored items grows. Further, once items are stored as MSDCs in superposition and such that their intersection structure reflects input similarity, both fixed time best-match retrieval and fixed time belief update (updating the probabilities of all stored items) also become possible. The algorithm's core principle is simply to add noise, in proportion to the novelty of the input, into the process of choosing a code, i.e., choosing a winner in each CM. This causes the expected intersection of the code for an input, X, with the code of each previously stored input, Y, to be proportional to the similarity of X and Y. Results demonstrating these capabilities for spatial patterns are given in the appendix.
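The core principle, noise in winner selection that scales with input novelty, can be sketched as follows. This is a deliberately simplified stand-in, not the author's exact rule: it assumes a scalar `familiarity` in [0, 1] is already available, and it mixes a deterministic winner-take-all choice with a uniform draw in each module. All names and numbers are hypothetical.

```python
import random
from typing import List


def choose_code(support: List[List[float]],
                familiarity: float,
                rng: random.Random) -> List[int]:
    """Pick one winner per competitive module (CM).

    support[q][k] -- input support of unit k in module q
    familiarity   -- 1.0 for a fully familiar input, 0.0 for a fully novel one;
                     the noise level is the novelty, 1 - familiarity.
    """
    code = []
    for module in support:
        k = len(module)
        best = max(range(k), key=module.__getitem__)
        # Simplifying assumption: with probability `familiarity` take the
        # best-supported unit, otherwise draw a unit uniformly at random.
        probs = [(1.0 - familiarity) / k] * k
        probs[best] += familiarity
        code.append(rng.choices(range(k), weights=probs)[0])
    return code


support = [[0.1, 0.9, 0.2], [0.7, 0.3, 0.1]]   # Q = 2 modules, K = 3 units each
rng = random.Random(0)
familiar_code = choose_code(support, familiarity=1.0, rng=rng)
novel_code = choose_code(support, familiarity=0.0, rng=rng)
print(familiar_code, novel_code)
```

A familiar input reactivates the best-supported unit in every module, while a novel input gets winners chosen independently of the stored support, so its code overlaps stored codes only at chance level. Each call does a fixed amount of work per module, regardless of how many items have been stored.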
Aberrant activation of fibroblast growth factor receptor (FGFR) signalling contributes to progression and metastasis of many types of cancers including breast cancer. Accordingly, FGFR targeted tyrosine kinase inhibitors (TKIs) are currently under development. However, the efficacy of FGFR TKIs in the bone microenvironment, where breast cancer cells most frequently metastasize and where FGFR is biologically active, has not been clearly investigated. We investigated the FGFR-mediated interactions among cancer and the bone microenvironment stromal cells (osteoblasts and osteoclasts), and also the effects of FGFR inhibition in bone metastasis. We showed that addition of culture supernatant from the MDA-MB-134-VI FGFR-amplified breast cancer cells activated FGFR signalling in osteoblasts, including increased expression of RANKL, M-CSF, and osteoprotegerin (OPG). Further in vitro analyses showed that AZD4547, an FGFR TKI currently in clinical trials for breast cancer, decreased RANKL and M-CSF, and subsequently RANKL and M-CSF-dependent osteoclastogenesis of murine bone marrow monocytes. Moreover, AZD4547 suppressed osteoclastogenesis and tumor-induced osteolysis in an orthotopic breast cancer bone metastasis mouse model using FGFR non-amplified MDA-MB-231 cells. Collectively, our results support that FGFR inhibitors inhibit the bone microenvironment stromal cells including osteoblasts and osteoclasts, and effectively suppress both tumor and stromal compartments of bone metastasis.
Complementary Learning Systems (CLS) theory suggests that the brain uses a 'neocortical' and a 'hippocampal' learning system to achieve complex behavior. These two systems are complementary in that the 'neocortical' system relies on slow learning of distributed representations while the 'hippocampal' system relies on fast learning of pattern-separated representations. Both of these systems project to the striatum, which is a key neural structure in the brain's implementation of Reinforcement Learning (RL). Current deep RL approaches share similarities with a 'neocortical' system because they slowly learn distributed representations through backpropagation in Deep Neural Networks (DNNs). An ongoing criticism of such approaches is that they are data inefficient and lack flexibility. CLS theory suggests that the addition of a 'hippocampal' system could address these criticisms. In the present study we propose a novel algorithm known as Complementary Temporal Difference Learning (CTDL), which combines a DNN with a Self-Organising Map (SOM) to obtain the benefits of both a 'neocortical' and a 'hippocampal' system. Key features of CTDL include the use of Temporal Difference (TD) error to update a SOM and the combination of a SOM and DNN to calculate action values. We evaluate CTDL on grid worlds and the Cart-Pole environment, and show several benefits over the classic Deep Q-Network (DQN) approach. These results demonstrate (1) the utility of complementary learning systems for the evaluation of actions, (2) that the TD error signal is a useful form of communication between the two systems and (3) the biological plausibility of the proposed approach.
Baihan Lin, Guillermo Cecchi, Djallel Bouneffouf
et al.
Drawing inspiration from behavioral studies of human decision making, we propose here a more general and flexible parametric framework for reinforcement learning that extends standard Q-learning to a two-stream model for processing positive and negative rewards, and allows a wide range of reward-processing biases to be incorporated -- an important component of human decision making which can help us better understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems, as well as various neuropsychiatric conditions associated with disruptions in normal reward processing. From the computational perspective, we observe that the proposed Split-QL model and its clinically inspired variants consistently outperform standard Q-Learning and SARSA methods, as well as recently proposed Double Q-Learning approaches, on simulated tasks with particular reward distributions, a real-world dataset capturing human decision-making in gambling tasks, and the Pac-Man game in a lifelong learning setting across different reward stationarities.
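A two-stream variant of tabular Q-learning along these lines can be sketched by keeping separate tables for positive and negative rewards and acting on their sum. The abstract does not give the update rule, so this is an illustration only: the class name, the bias weights `w_pos` and `w_neg`, and the choice to bootstrap both streams on the action that is greedy under the combined value are all assumptions.

```python
from collections import defaultdict
from typing import Hashable, Sequence


class SplitQLearner:
    """Tabular Q-learning with separate positive and negative reward streams."""

    def __init__(self, actions: Sequence[Hashable], alpha: float = 0.1,
                 gamma: float = 0.9, w_pos: float = 1.0, w_neg: float = 1.0):
        self.actions = list(actions)
        self.alpha, self.gamma = alpha, gamma
        self.w_pos, self.w_neg = w_pos, w_neg   # hypothetical bias weights
        self.q_pos = defaultdict(float)         # stream fed by rewards > 0
        self.q_neg = defaultdict(float)         # stream fed by rewards < 0

    def value(self, state, action) -> float:
        """Combined action value used for decision making."""
        return self.q_pos[(state, action)] + self.q_neg[(state, action)]

    def greedy_action(self, state):
        return max(self.actions, key=lambda a: self.value(state, a))

    def update(self, state, action, reward: float, next_state) -> None:
        """Split the reward by sign, scale each part, and update both streams."""
        r_pos = self.w_pos * max(reward, 0.0)
        r_neg = self.w_neg * min(reward, 0.0)
        a_next = self.greedy_action(next_state)  # assumed bootstrap action
        for table, r in ((self.q_pos, r_pos), (self.q_neg, r_neg)):
            target = r + self.gamma * table[(next_state, a_next)]
            table[(state, action)] += self.alpha * (target - table[(state, action)])


# An agent that weighs losses at half their nominal size (w_neg = 0.5).
agent = SplitQLearner(actions=["a", "b"], alpha=0.5, w_neg=0.5)
agent.update("s", "a", reward=2.0, next_state="t")
agent.update("s", "a", reward=-4.0, next_state="t")
print(agent.q_pos[("s", "a")], agent.q_neg[("s", "a")], agent.value("s", "a"))
```

Setting `w_pos` and `w_neg` to different values is one simple way to express a reward-processing bias, for example over- or under-weighting losses relative to gains. With both weights equal to 1 the combined value follows the same update as ordinary Q-learning bootstrapped on the greedy action.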