This report describes the physics case, the resulting detector requirements, and the evolving detector concepts for the experimental program at the Electron-Ion Collider (EIC). The EIC will be a powerful new high-luminosity facility in the United States with the capability to collide high-energy electron beams with high-energy proton and ion beams, providing access to those regions in the nucleon and nuclei where their structure is dominated by gluons. Moreover, polarized beams in the EIC will give unprecedented access to the spatial and spin structure of the proton, neutron, and light ions. The studies leading to this document were commissioned and organized by the EIC User Group with the objective of advancing the state and detail of the physics program and developing detector concepts that meet the emerging requirements in preparation for the realization of the EIC. The effort aims to provide the basis for further development of concepts for experimental equipment best suited for the science needs, including the importance of two complementary detectors and interaction regions. This report consists of three volumes. Volume I is an executive summary of our findings and developed concepts. In Volume II we describe studies of a wide range of physics measurements and the emerging requirements on detector acceptance and performance. Volume III discusses general-purpose detector concepts and the underlying technologies to meet the physics requirements. These considerations will form the basis for a world-class experimental program that aims to increase our understanding of the fundamental structure of all visible matter.
This article summarizes technical advances contained in the fifth major release of the Q-Chem quantum chemistry program package, covering developments since 2015. A comprehensive library of exchange–correlation functionals, along with a suite of correlated many-body methods, continues to be a hallmark of the Q-Chem software. The many-body methods include novel variants of both coupled-cluster and configuration-interaction approaches along with methods based on the algebraic diagrammatic construction and variational reduced density-matrix methods. Methods highlighted in Q-Chem 5 include a suite of tools for modeling core-level spectroscopy, methods for describing metastable resonances, methods for computing vibronic spectra, the nuclear–electronic orbital method, and several different energy decomposition analysis techniques. High-performance capabilities including multithreaded parallelism and support for calculations on graphics processing units are described. Q-Chem boasts a community of well over 100 active academic developers, and the continuing evolution of the software is supported by an “open teamware” model and an increasingly modular design.
A precise measurement of the proton flux in primary cosmic rays with rigidity (momentum/charge) from 1 GV to 1.8 TV is presented based on 300 million events. Knowledge of the rigidity dependence of the proton flux is important in understanding the origin, acceleration, and propagation of cosmic rays. We present the detailed variation with rigidity of the flux spectral index for the first time. The spectral index progressively hardens at high rigidities.
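The spectral index discussed above can be recovered locally from any pair of flux points, assuming the flux follows a power law Φ ∝ R^γ over the interval between them; the following is a generic illustration of that relation, not the AMS analysis code, and the numbers are made up.

```python
import math

def local_spectral_index(flux1, flux2, rigidity1, rigidity2):
    """Local spectral index gamma = dlog(flux)/dlog(rigidity),
    assuming flux ~ rigidity**gamma between the two measured points."""
    return math.log(flux2 / flux1) / math.log(rigidity2 / rigidity1)

# Hypothetical points lying on a flux ~ R^-2.7 power law:
gamma = local_spectral_index(1.0, 2.0 ** -2.7, 1.0, 2.0)
```

A progressive hardening at high rigidities then shows up as γ becoming less negative when this quantity is evaluated in successive rigidity bins.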
High quality factor resonances are extremely promising for designing ultra-sensitive label-free refractive index sensors, since they allow intense interaction between electromagnetic waves and the analyte material. Metamaterial and plasmonic sensing have recently attracted a lot of attention due to the subwavelength confinement of electromagnetic fields in the resonant structures. However, the excitation of high quality factor resonances in these systems has been a challenge. We excite resonances with an order of magnitude higher quality factors in planar terahertz metamaterials and exploit them for ultrasensitive sensing. The low-loss quadrupole and Fano resonances with extremely narrow linewidths enable us to measure the minute spectral shift caused by the smallest change in the refractive index of the surrounding media. We achieve sensitivity levels of 7.75 × 10^3 nm/refractive index unit (RIU) with the quadrupole and 5.7 × 10^4 nm/RIU with the Fano resonances, which could be further enhanced by using thinner substrates. These findings would facilitate the design of ultrasensitive real-time chemical and biomolecular sensors in the fingerprint region of the terahertz regime.
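The quoted figures of merit are simple ratios of spectral shift to refractive index change, S = Δλ/Δn; a minimal sketch of that arithmetic, with made-up numbers rather than values from the measurement:

```python
def sensitivity_nm_per_riu(shift_nm, delta_n):
    """Refractive-index sensitivity S = delta_lambda / delta_n,
    in nm per refractive index unit (RIU)."""
    return shift_nm / delta_n

# Hypothetical example: a 5.7 nm resonance shift for an analyte
# index change of 1e-4 RIU gives S on the order of 5.7e4 nm/RIU.
s = sensitivity_nm_per_riu(5.7, 1e-4)
```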
Reinforcement Learning From Human Feedback (RLHF) has been critical to the success of the latest generation of generative AI models. In response to the complex nature of the classical RLHF pipeline, direct alignment algorithms such as Direct Preference Optimization (DPO) have emerged as an alternative approach. Although DPO solves the same objective as the standard RLHF setup, there is a mismatch between the two approaches. Standard RLHF deploys reinforcement learning in a specific token-level MDP, while DPO is derived as a bandit problem in which the whole response of the model is treated as a single arm. In this work we rectify this difference. We theoretically show that we can derive DPO in the token-level MDP as a general inverse Q-learning algorithm, which satisfies the Bellman equation. Using our theoretical results, we provide three concrete empirical insights. First, we show that because of its token level interpretation, DPO is able to perform some type of credit assignment. Next, we prove that under the token level formulation, classical search-based algorithms, such as MCTS, which have recently been applied to the language generation space, are equivalent to likelihood-based search on a DPO policy. Empirically we show that a simple beam search yields meaningful improvement over the base DPO policy. Finally, we show how the choice of reference policy causes implicit rewards to decline during training. We conclude by discussing applications of our work, including information elicitation in multi-turn dialogue, reasoning, agentic applications and end-to-end training of multi-model systems.
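For reference, the sequence-level DPO objective that the token-level analysis above starts from can be written in a few lines; this is the standard published loss, not the paper's derivation code, and β = 0.1 is only an illustrative default.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss: -log sigmoid(beta * margin), where the margin is
    the difference of implicit rewards log pi(y|x) - log pi_ref(y|x)
    between the chosen and rejected responses."""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Under the token-level MDP view, the same implicit reward β·log(π/π_ref) decomposes into per-token terms, which is what makes likelihood-based search (e.g. beam search) over a DPO policy meaningful.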
Fotios K. Anagnostopoulos, Spyros Basilakos, and Emmanuel N. Saridakis
Department of Physics, National & Kapodistrian University of Athens, Zografou Campus GR 157 73, Athens, Greece; National Observatory of Athens, Lofos Nymfon, 11852 Athens, Greece; Academy of Athens, Research Center for Astronomy and Applied Mathematics, Soranou Efesiou 4, 11527, Athens, Greece; CAS Key Laboratory for Researches in Galaxies and Cosmology, Department of Astronomy, University of Science and Technology of China, Hefei, Anhui 230026, P.R. China; School of Astronomy, School of Physical Sciences, University of Science and Technology of China, Hefei 230026, P.R. China
In this work, we present a scalable reinforcement learning method for training multi-task policies from large offline datasets that can leverage both human demonstrations and autonomously collected data. Our method uses a Transformer to provide a scalable representation for Q-functions trained via offline temporal difference backups. We therefore refer to the method as Q-Transformer. By discretizing each action dimension and representing the Q-value of each action dimension as separate tokens, we can apply effective high-capacity sequence modeling techniques for Q-learning. We present several design decisions that enable good performance with offline RL training, and show that Q-Transformer outperforms prior offline RL algorithms and imitation learning techniques on a large diverse real-world robotic manipulation task suite. The project's website and videos can be found at https://qtransformer.github.io
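The per-dimension action discretization described above can be sketched as follows; the bin count and action ranges are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def discretize(action, low, high, bins=256):
    """Map each continuous action dimension to an integer token in [0, bins)."""
    frac = np.clip((np.asarray(action) - low) / (high - low), 0.0, 1.0 - 1e-9)
    return (frac * bins).astype(int)

def undiscretize(tokens, low, high, bins=256):
    """Map tokens back to continuous actions at the bin centers."""
    return low + (np.asarray(tokens) + 0.5) / bins * (high - low)
```

In the Q-Transformer scheme, each such token receives its own Q-value from the sequence model, so greedy action selection reduces to an argmax over bins, one action dimension at a time.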
A precision measurement by AMS of the antiproton flux and the antiproton-to-proton flux ratio in primary cosmic rays in the absolute rigidity range from 1 to 450 GV is presented, based on 3.49 × 10^5 antiproton events and 2.42 × 10^9 proton events. The fluxes and flux ratios of charged elementary particles in cosmic rays are also presented. In the absolute rigidity range ∼60 to ∼500 GV, the antiproton p̄, proton p, and positron e^+ fluxes are found to have nearly identical rigidity dependence, while the electron e^- flux exhibits a different rigidity dependence. Below 60 GV, the (p̄/p), (p̄/e^+), and (p/e^+) flux ratios each reach a maximum. From ∼60 to ∼500 GV, the (p̄/p), (p̄/e^+), and (p/e^+) flux ratios show no rigidity dependence. These are new observations of the properties of elementary particles in the cosmos.
The straddle option is a financial trading strategy that captures volatility premiums in high-volatility markets without predicting price direction. Although deep reinforcement learning has emerged as a powerful approach to trading automation in financial markets, existing work has mostly focused on predicting price trends and making trading decisions by combining multi-dimensional datasets such as blogs and videos, which leads to high computational costs and unstable performance in high-volatility markets. To tackle this challenge, we develop automated straddle option trading based on reinforcement learning and attention mechanisms to handle the unpredictability of high-volatility markets. First, we leverage attention mechanisms in a Transformer-DDQN model, combining self-attention over time-series data with channel attention over multi-cycle information. Second, we design a novel reward function based on excess earnings that focuses on long-term profits and neglects short-term losses above a stop line. Third, we identify resistance levels to provide reference information when price movements become highly uncertain amid intensified contention between buyers and sellers. Through extensive experiments on the Chinese stock, Brent crude oil, and Bitcoin markets, our attention-based Transformer-DDQN model exhibits the lowest maximum drawdown across all markets, and outperforms the other models by 92.5% in average return, excluding the crude oil market due to its relatively low fluctuation.
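The excess-earnings reward is described only at a high level, so the following is a hypothetical sketch of the stated idea: positive excess returns over a benchmark pass through, small short-term losses above a stop line are neglected, and only losses breaching the stop line are penalized. The -5% stop line and the benchmark comparison are assumptions for illustration, not parameters reported in the work.

```python
def excess_earnings_reward(portfolio_return, benchmark_return, stop_line=-0.05):
    """Hypothetical excess-earnings reward:
    - positive excess return is rewarded as-is,
    - small losses above the stop line yield zero reward (neglected),
    - losses beyond the stop line are penalized in full."""
    excess = portfolio_return - benchmark_return
    if excess >= 0.0:
        return excess
    return excess if excess < stop_line else 0.0
```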
Public announcement dates are used in the green bond literature to measure equity market reactions to upcoming green bond issues. We find a sizeable number of green bond announcements were pre-dated by anonymous information leakages on the Bloomberg Terminal. From a candidate set of 2,036 'Bloomberg News' and 'Bloomberg First Word' headlines gathered between 2016 and 2022, we identify 259 instances of green bond-related information being released before being publicly announced by the issuing firm. These pre-announcement leaks significantly alter the equity trading dynamics of the issuing firms over intraday and daily event windows. Significant negative abnormal returns and increased trading volumes are observed following news leaks about upcoming green bond issues. These negative investor reactions are concentrated amongst financial firms, and leaks that arrive pre-market or early in market trading. We find equity price movements following news leaks can be explained to a greater degree than following public announcements. Sectoral differences are also observed in the key drivers behind investor reactions to green bond leaks by non-financials (Tobin's Q and free cash flow) and financials (ROA). Our results suggest that information leakages have a strong impact on market behaviour, and should be accounted for in green bond literature. Our findings also have broader ramifications for financial literature going forward. Privileged access to financially material information, courtesy of the ubiquitous use of Bloomberg Terminals by professional investors, highlights the need for event studies to consider wider sets of communication channels to confirm the date at which information first becomes available.
Divyalakshmi Bhaskaran, Joshua Savage, Amit Patel, et al.
Abstract Background Glioblastoma (GBM) is the most common adult malignant brain tumour, with an incidence of 5 per 100,000 per year in England. Patients with tumours showing O6-methylguanine-DNA methyltransferase (MGMT) promoter methylation represent around 40% of newly diagnosed GBM. Relapse/tumour recurrence is inevitable. There is no agreed standard treatment for recurrent GBM; treatment therefore aims to delay further tumour progression and maintain health-related quality of life (HRQoL). Limited clinical trial data exist on cannabinoids in combination with temozolomide (TMZ) in this setting, but early phase data demonstrate prolonged overall survival compared to TMZ alone, with few additional side effects. Jazz Pharmaceuticals (previously GW Pharma Ltd.) have developed nabiximols (trade name Sativex®), an oromucosal spray containing a blend of cannabis plant extracts, which we aim to assess for preliminary efficacy in patients with recurrent GBM. Methods ARISTOCRAT is a phase II, multi-centre, double-blind, placebo-controlled, randomised trial to assess cannabinoids in patients with recurrent MGMT methylated GBM who are suitable for treatment with TMZ. Patients who have relapsed ≥ 3 months after completion of initial first-line treatment will be randomised 2:1 to receive either nabiximols or placebo in combination with TMZ. The primary outcome is overall survival time, defined as the time in whole days from the date of randomisation to the date of death from any cause. Secondary outcomes include overall survival at 12 months, progression-free survival time, HRQoL (using patient reported outcomes from QLQ-C30, QLQ-BN20 and EQ-5D-5L questionnaires), and adverse events. Discussion Patients with recurrent MGMT promoter methylated GBM represent a relatively good prognosis sub-group of patients with GBM. However, their median survival remains poor and, therefore, more effective treatments are needed.
The phase II design of this trial was chosen, rather than phase III, due to the lack of data currently available on cannabinoid efficacy in this setting. A randomised, double-blind, placebo-controlled trial will ensure an unbiased robust evaluation of the treatment and will allow potential expansion of recruitment into a phase III trial should the emerging phase II results warrant this development. Trial registration ISRCTN: 11460478. ClinicalTrials.gov: NCT05629702.
Abstract Purpose Self-management can have clinical and quality-of-life benefits. However, people with lower-grade gliomas (LGG) may face chronic tumour- and/or treatment-related symptoms and impairments (e.g. cognitive deficits, seizures), which could influence their ability to self-manage. Our study aimed to identify and understand the barriers and facilitators to self-management in people with LGG. Methods We conducted semi-structured interviews with 28 people with LGG across the United Kingdom, who had completed primary treatment. Sixteen participants were male, mean age was 50.4 years, and mean time since diagnosis was 8.7 years. Interviews were audio-recorded and transcribed. Following inductive open coding, we deductively mapped codes to Schulman-Green et al.’s framework of factors influencing self-management, developed in chronic illness. Results Data suggested extensive support for all five framework categories (‘Personal/lifestyle characteristics’, ‘Health status’, ‘Resources’, ‘Environmental characteristics’, ‘Healthcare system’), encompassing all 18 factors influencing self-management. How people with LGG experience many of these factors appears somewhat distinct from other cancers; participants described multiple, often co-occurring, challenges, primarily with knowledge and acceptance of their incurable condition, the impact of seizures and cognitive deficits, transport difficulties, and access to (in)formal support. Several factors were on a continuum; for example, sufficient knowledge was a facilitator, whereas a lack thereof was a barrier to self-management. Conclusions People with LGG described distinctive experiences with wide-ranging factors influencing their ability to self-manage. Implications for cancer survivors These findings will improve awareness of the potential challenges faced by people with LGG around self-management and inform development of self-management interventions for this population.
Roberto Frota Decourt, Heitor Almeida, Philippe Protin, et al.
The purpose of this research was to build an index of informational asymmetry from market and firm proxies that reflects analysts' perception of the level of informational asymmetry of companies. The proposed method constructs an algorithm based on the Elo rating, capturing the perception of analysts who choose, between two firms, the one they consider to have better information. Once the informational asymmetry index is obtained, we run a regression with our rating as the dependent variable and proxies used in the literature as independent variables, yielding a model that other studies can use to measure a company's level of informational asymmetry. Our model presented a good fit between our index and the proxies used to measure informational asymmetry, and we found four significant variables: coverage, volatility, Tobin's Q, and size.
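The Elo-based construction can be sketched with the standard rating update: when an analyst picks one of two firms as better informed, that firm "wins" the pairwise contest and its rating rises. The K-factor of 32 and the 400-point logistic scale below are conventional chess defaults, assumed here for illustration rather than taken from the study.

```python
def elo_update(rating_winner, rating_loser, k=32.0):
    """Standard Elo update for one pairwise comparison: the winner gains
    k * (1 - expected win probability); the loser loses the same amount."""
    expected_win = 1.0 / (1.0 + 10.0 ** ((rating_loser - rating_winner) / 400.0))
    delta = k * (1.0 - expected_win)
    return rating_winner + delta, rating_loser - delta
```

Iterating this update over all of the analysts' pairwise choices yields a rating per firm, which serves as the informational asymmetry index.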