<p>Stratospheric aerosol injections have been proposed to mitigate the effects of global warming. The injection of sulphur dioxide into the stratosphere is one possible idea. However, depending on the latitude, high emission rates can lead to very low transmissions from the perspective of a typical satellite solar occultation instrument, resulting in the so-called zero transmission problem. Consequently, depending on the latitude and wavelength, it is highly unlikely that a physically meaningful retrieval of the stratospheric aerosol extinction profiles is possible. The current study analyses, using MAECHAM5-HAM and SCIATRAN, continuous injections of 30 <span class="inline-formula">Tg S yr<sup>−1</sup></span> as a hypothetical large-scale stratospheric aerosol injection scenario. For this purpose, sulphur dioxide was continuously injected at an altitude of 60 hPa (<span class="inline-formula">≈</span> 19 km) into one grid box (<span class="inline-formula">2.8<i>°</i>×2.8<i>°</i></span>) centred on the Equator at 121° E. Specifically, we investigate which wavelengths, as a function of latitude, are necessary for plausible aerosol extinction profile retrievals. While a wavelength of 520 nm is insufficient for retrievals at 5° N, the opposite can be concluded for 75° N and 75° S. For the latitudes 45° N and 45° S, a wavelength of at least 1543 nm is necessary, while for 15° N and 15° S, as well as 5° N, 1900 nm is sufficient. Simulation results for an emission rate of 10 <span class="inline-formula">Tg S yr<sup>−1</sup></span> show that a minimum wavelength of 1543 nm is already sufficient for 5° N. The results also emphasize that encountering the zero transmission problem at shorter wavelengths does not render solar occultation measurements impossible; rather, it requires appropriate wavelength selection based on the aerosol loading. Consistent with expectations, a longer wavelength is required in and near the latitude range of the injection. These findings are therefore also relevant for satellite solar occultation measurements after major volcanic eruptions.</p>
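<p>A minimal numerical sketch may make the zero transmission problem above concrete. It only applies the Beer-Lambert law with assumed, purely illustrative extinction values; it is not output of MAECHAM5-HAM or SCIATRAN, and the path length is a rough stand-in for a limb path through the aerosol layer.</p>
<pre><code>import numpy as np

# Beer-Lambert transmission along a limb path: T = exp(-tau), where
# tau is the integral of the aerosol extinction over the line of sight.
# Assumed extinction coefficients near the injection latitude; sulphate
# aerosol extinction decreases towards longer wavelengths.
extinction_per_km = {520: 5e-2, 1543: 8e-3, 1900: 5e-3}  # km^-1, assumed
path_length_km = 400.0  # effective limb path through the layer, assumed

for wavelength_nm, k in extinction_per_km.items():
    tau = k * path_length_km
    print(f"{wavelength_nm} nm: tau = {tau:.1f}, T = {np.exp(-tau):.2e}")
# 520 nm: T ~ 2e-9 (effectively zero transmission), while the SWIR
# wavelengths remain measurable, which is the wavelength-selection point.
</code></pre>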
<p>Calibration of lidar signals at 1064 nm from the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) onboard the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) satellite depends on the prior calibration of the primary 532 nm channel. However, the 1064 nm calibration procedure also requires knowledge of the ratio of stratospheric signal attenuations at 1064 and 532 nm, which is not available a priori and is thus assumed to be 1. This assumption introduces a potential bias in the computed 1064 nm calibration coefficients. In this work we assess this bias using independent multi-channel occultation retrievals of stratospheric aerosol extinction from the Stratospheric Aerosol and Gas Experiment (SAGE III) on the International Space Station (ISS) for the period from 2017 onwards. We also use the GLObal Space-based Stratospheric Aerosol Climatology (GloSSAC) to provide a historical background during the SAGE II era (1984 through 2005). The results show that the magnitude of the CALIOP 1064 nm calibration bias is less than 1 %–2 % within the tropics under stratospheric background conditions. However, recent biases can be as high as 5 % when volcanic perturbations and/or pyro-cumulonimbus (pyroCb) injections dominate the stratospheric aerosol loading. We explore the effects of this bias on CALIOP's level 2 science retrievals by estimating the anticipated perturbations in cloud-aerosol discrimination (CAD) performance and by quantifying the non-linear propagation of errors in CALIOP's 1064 nm extinction coefficients. This global characterization of the spectral attenuation differences should provide useful information for future spaceborne elastic lidars operating at 1064 nm.</p>
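<p>A minimal sketch of the quantity at stake: the bias is the departure from unity of the ratio of two-way stratospheric transmittances, which can be computed from multi-wavelength extinction profiles. The profiles below are assumed placeholders standing in for SAGE III/GloSSAC-style retrievals, not real data.</p>
<pre><code>import numpy as np

# Two-way stratospheric transmittance: T2 = exp(-2 * integral of sigma dz).
# CALIOP's 1064 nm calibration assumes T2(1064)/T2(532) = 1; the bias is
# the departure of that ratio from unity.
z = np.linspace(20.0, 36.0, 161)                 # altitude grid, km
dz = z[1] - z[0]
sigma_532 = 2e-3 * np.exp(-(z - 20.0) / 5.0)     # km^-1, assumed (perturbed)
sigma_1064 = 0.45 * sigma_532                    # assumed spectral ratio

t2_532 = np.exp(-2.0 * np.sum(sigma_532) * dz)
t2_1064 = np.exp(-2.0 * np.sum(sigma_1064) * dz)
print(f"calibration bias ~ {100.0 * (t2_1064 / t2_532 - 1.0):+.2f} %")
</code></pre>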
Current research on visual analytics systems largely follows the research paradigm of interactive system design in the field of Human-Computer Interaction (HCI), including key methodologies such as design requirement development based on user needs, interactive system design, and system evaluation. However, most studies under this paradigm contain a contradiction: there is a significant mismatch between research methods developed for simple cognitive behaviors (e.g., color perception, the perception of spatial relationships among interactive artifacts) and research goals targeting complex analytical behaviors (e.g., reasoning, problem-solving, decision-making). This mismatch may weaken the theoretical contributions of research studies, in particular the internal validity of a designed system and the external validity of design methods. To address this challenge, this paper argues for the need to go beyond traditional HCI theoretical foundations and proposes adopting complex cognition theories to build new theoretical foundations. Specifically, this paper analyzes how current design and evaluation methods in research on visual analytics systems constrain the internal and external validity of research, discusses the connections between complex cognition theories and visual analytics tasks, and explores how problem-solving theories from complex cognition can guide research on visual analytics systems.
<p>In this study we examine the performance of the 354.8 nm Rayleigh temperature channel of the Raman lidar at the Schneefernerhaus high-altitude research station (UFS) in the Bavarian Alps (at 2675 m a.s.l.). The temperature reference value of the retrieval is adjusted to match the temperature determined from the <span class="inline-formula">OH<sup>*</sup></span> airglow around 86 km by the GRIPS instruments at UFS. In this way the quality of the 1 h measurements of the lidar is improved above 70 km. Comparisons were made between the UFS lidar, the satellite-borne MLS (Microwave Limb Sounder) instrument and the 354.8 nm temperature channel of the Hohenpeißenberg (MOHp) differential-absorption ozone lidar. Between 35 and 70 km we see a positive offset of the UFS temperatures with respect to the MLS values of up to about 9 K. This behaviour only slightly exceeds the expectations from earlier work. Despite a horizontal distance of just 40 km between UFS and MOHp, acceptable agreement below 70 km was found in several cases. However, in general, the MOHp temperatures were slightly lower than those above UFS. We discuss potential technical issues and suggest solutions for upgrading the UFS lidar system. A significant enhancement of the laser repetition rate is recommended.</p>
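<p>To make the role of the reference value concrete, here is a minimal sketch of a top-down hydrostatic temperature integration of the Hauchecorne-Chanin type, seeded at the top bin by a reference temperature that plays the role of the OH<sup>*</sup> airglow value near 86 km. The density profile and all numbers are synthetic assumptions, not UFS data.</p>
<pre><code>import numpy as np

# Top-down hydrostatic Rayleigh temperature retrieval (sketch): the seed
# temperature at the top propagates downward, but its influence fades
# with depth because the relative density grows rapidly below the seed.
k_B = 1.380649e-23   # Boltzmann constant, J/K
M = 4.81e-26         # mean molecular mass of air, kg
g = 9.5              # gravity, m/s^2, taken roughly constant here

z = np.arange(30e3, 86e3 + 1.0, 500.0)   # altitude grid, m
n = 1e22 * np.exp(-z / 7000.0)           # assumed relative density, m^-3
T_ref = 190.0                            # K, seed from OH* (assumed value)

T = np.empty_like(z)
T[-1] = T_ref
dz = z[1] - z[0]
for i in range(len(z) - 2, -1, -1):
    # pressure at level i from hydrostatic balance of the column above
    column = np.sum(n[i:] * dz) * M * g
    T[i] = (n[-1] * T_ref + column / k_B) / n[i]
print(f"T at {z[0] / 1e3:.0f} km: {T[0]:.1f} K")
</code></pre>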
Alina Devkota, Annahita Amireskandari, Joel Palko
et al.
Gastrointestinal (GI) endoscopy is essential for identifying GI tract abnormalities in order to detect diseases in their early stages and improve patient outcomes. Although deep learning has shown success in supporting GI diagnostics and decision-making, these models require curated datasets with labels that are expensive to acquire. Foundation models offer a promising solution by learning general-purpose representations, which can be fine-tuned for specific tasks, overcoming data scarcity. Developing foundation models for medical imaging holds significant potential, but the sensitive and protected nature of medical data presents unique challenges. Foundation model training typically requires extensive datasets, and while hospitals generate large volumes of data, privacy restrictions prevent direct data sharing, making foundation model training infeasible in most scenarios. In this work, we propose a federated learning (FL) framework for training foundation models for gastroendoscopy imaging, enabling data to remain within local hospital environments while contributing to a shared model. We explore several established FL algorithms, assessing their suitability for training foundation models without relying on task-specific labels, and conduct experiments in both homogeneous and heterogeneous settings. We evaluate the trained foundation model on three critical downstream tasks (classification, detection, and segmentation) and demonstrate that it achieves improved performance across all tasks, highlighting the effectiveness of our approach in a federated, privacy-preserving setting.
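A minimal sketch of one round of federated averaging (FedAvg), a representative of the established FL algorithms referenced above (whether it is among those actually explored is an assumption): only model parameters, never patient images, leave each hospital. The hospital sizes and parameter vectors below are illustrative.

    import numpy as np

    def fedavg(client_weights, client_sizes):
        """One FedAvg aggregation: average parameters weighted by data size."""
        coeffs = np.array(client_sizes, dtype=float)
        coeffs /= coeffs.sum()
        return np.tensordot(coeffs, np.stack(client_weights), axes=1)

    # Three hypothetical hospitals after local self-supervised pretraining
    # (no task-specific labels needed); each holds a parameter vector.
    w_a, w_b, w_c = np.random.randn(3, 10)
    global_w = fedavg([w_a, w_b, w_c], client_sizes=[5000, 1200, 800])
    print(global_w.shape)  # (10,) -- the updated shared model parameters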
Marcin Kostrzewa, Oleksii Furman, Roman Furman
et al.
Foundation models have shown promise across various financial applications, yet their effectiveness for corporate bankruptcy prediction has not been systematically evaluated against established methods. We study bankruptcy forecasting using Llama-3.3-70B-Instruct and TabPFN, evaluated on large, highly imbalanced datasets of over one million company records from the Visegrád Group. We provide the first systematic comparison of foundation models against classical machine learning baselines for this task. Our results show that models such as XGBoost and CatBoost consistently outperform foundation models across all prediction horizons. LLM-based approaches suffer from unreliable probability estimates, undermining their use in risk-sensitive financial settings. TabPFN, while competitive with simpler baselines, requires substantial computational resources whose costs are not justified by the performance gains. These findings suggest that, despite their generality, current foundation models remain less effective than specialized methods for bankruptcy forecasting.
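To see operationally what unreliable probability estimates mean on data this imbalanced, here is a hedged sketch with synthetic scores (not the authors' data or pipeline) contrasting a roughly informative scorer with an uninformative one on ranking and calibration metrics.

    import numpy as np
    from sklearn.metrics import average_precision_score, brier_score_loss

    # At a ~2% positive rate, accuracy is uninformative; risk-sensitive use
    # depends on ranking quality (PR-AUC) and probability reliability (Brier).
    rng = np.random.default_rng(0)
    y = rng.binomial(1, 0.02, size=100_000)                  # assumed base rate
    p_good = np.clip(0.02 + 0.5 * y + 0.05 * rng.normal(size=y.size), 0, 1)
    p_flat = np.full_like(p_good, 0.5)                       # unreliable scores

    for name, p in [("informative", p_good), ("uninformative", p_flat)]:
        print(name,
              "PR-AUC:", round(average_precision_score(y, p), 3),
              "Brier:", round(brier_score_loss(y, p), 3))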
Vision-language models have demonstrated impressive capabilities as computer-use agents (CUAs) capable of automating diverse computer tasks. Yet even as their commercial potential grows, critical details of the most capable CUA systems remain closed. Because these agents will increasingly mediate digital interactions and execute consequential decisions on our behalf, the research community needs access to open CUA frameworks to study their capabilities, limitations, and risks. To bridge this gap, we propose OpenCUA, a comprehensive open-source framework for scaling CUA data and foundation models. Our framework consists of: (1) an annotation infrastructure that seamlessly captures human computer-use demonstrations; (2) AgentNet, the first large-scale computer-use task dataset spanning 3 operating systems and 200+ applications and websites; and (3) a scalable pipeline that transforms demonstrations into state-action pairs with reflective long chain-of-thought reasoning, sustaining robust performance gains as data scales. Our end-to-end agent models demonstrate strong performance across CUA benchmarks. In particular, OpenCUA-72B achieves an average success rate of 45.0% on OSWorld-Verified, establishing a new state-of-the-art (SOTA) among open-source models. Further analysis confirms that our approach generalizes well across domains and benefits significantly from increased test-time computation. We release our annotation tool, datasets, code, and models to build open foundations for further CUA research.
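To make the data representation concrete, here is a hypothetical sketch of what one state-action pair with reflective reasoning might look like; the field names and values are our assumptions for illustration, not the actual AgentNet schema.

    from dataclasses import dataclass

    # Hypothetical record for one step of a computer-use demonstration:
    # an observed state, the action taken, and a reflective rationale.
    @dataclass
    class CUAStep:
        screenshot_path: str   # observed state (screen capture)
        a11y_tree: str         # optional structured UI state
        action: str            # executable action, e.g. a click
        reflection: str        # long chain-of-thought rationale
        app: str               # application or website in focus
        os: str                # one of the operating systems covered

    step = CUAStep(
        screenshot_path="demo_0001/step_07.png",
        a11y_tree="[window 'Inbox' [button 'Compose'] ...]",
        action="click(x=412, y=88)",
        reflection="The task asks for a new email, so I open Compose first.",
        app="mail client",
        os="macOS",
    )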
Recent advances in diffusion models have revolutionized video generation, offering superior temporal consistency and visual quality compared to traditional approaches based on generative adversarial networks. While this emerging field shows tremendous promise in applications, it faces significant challenges in motion consistency, computational efficiency, and ethical considerations. This survey provides a comprehensive review of diffusion-based video generation, examining its evolution, technical foundations, and practical applications. We present a systematic taxonomy of current methodologies, analyze architectural innovations and optimization strategies, and investigate applications across low-level vision tasks such as denoising and super-resolution. Additionally, we explore the synergies between diffusion-based video generation and related domains, including video representation learning, question answering, and retrieval. Compared to existing surveys (Lei et al., 2024a;b; Melnik et al., 2024; Cao et al., 2023; Xing et al., 2024c), which focus on specific aspects of video generation, such as human video synthesis (Lei et al., 2024a) or long-form content generation (Lei et al., 2024b), our work provides a broader, more up-to-date, and more fine-grained perspective on diffusion-based approaches, with a special section on evaluation metrics, industry solutions, and training engineering techniques in video generation. This survey serves as a foundational resource for researchers and practitioners working at the intersection of diffusion models and video generation, providing insights into both the theoretical frameworks and practical implementations that drive this rapidly evolving field. A structured list of related works covered in this survey is also available at https://github.com/Eyeline-Research/Survey-Video-Diffusion.
This paper is an attempt at laying the foundations for the classification of queries on relational databases according to their structure and their computational complexity. Using the operations of composition and fixpoints, a Σ-Π hierarchy of height ω², called the fixpoint query hierarchy, is defined and its properties investigated. The hierarchy includes most of the queries considered in the literature, including those of Codd and of Aho and Ullman. The hierarchy up to level ω characterizes the first-order queries, and the levels up to ω are shown to be strict. Sets of queries larger than the fixpoint query hierarchy are obtained by considering the queries computable in polynomial time, queries computable in polynomial space, etc. It is shown that classes of queries defined from such complexity classes behave (with respect to containment) in a manner very similar to the corresponding complexity classes. Also, the set of second-order queries turns out to be the same as the set of queries defined from the polynomial-time hierarchy. Finally, these classes of queries are used to characterize a set of queries defined from language considerations: those expressible in a programming language with only typed (or ranked) relation variables. At the end of the paper is a list of symbols used therein.
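A standard illustration of the expressive power at issue (a textbook example, not drawn from the paper itself): the transitive closure of an edge relation E is definable with a single least fixpoint,

\[
\mathrm{TC} \;=\; \mu R.\ \bigl\{ (x,y) \;:\; E(x,y) \,\vee\, \exists z\,\bigl(E(x,z) \wedge R(z,y)\bigr) \bigr\},
\]

yet, as Aho and Ullman observed, it is not expressible by any first-order query; closing the first-order queries under fixpoints therefore strictly increases expressive power, which is the phenomenon the hierarchy stratifies level by level.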
The rapid advancement of foundation models in medical imaging represents a significant leap toward enhancing diagnostic accuracy and personalized treatment. However, the deployment of foundation models in healthcare necessitates a rigorous examination of their trustworthiness, encompassing privacy, robustness, reliability, explainability, and fairness. The current body of survey literature on foundation models in medical imaging reveals considerable gaps, particularly in the area of trustworthiness. Additionally, existing surveys on the trustworthiness of foundation models do not adequately address their specific variations and applications within the medical imaging domain. This survey aims to fill that gap by presenting a novel taxonomy of foundation models used in medical imaging and analyzing the key motivations for ensuring their trustworthiness. We review current research on foundation models in major medical imaging applications, focusing on segmentation, medical report generation, medical question answering (Q&A), and disease diagnosis. These areas are highlighted because they have seen a relatively mature and substantial number of foundation models compared to other applications. We focus on manuscripts that discuss trustworthiness in medical image analysis. We explore the complex challenges of building trustworthy foundation models for each application, summarizing current concerns and strategies for enhancing trustworthiness. Furthermore, we examine the potential of these models to revolutionize patient care. Our analysis underscores the imperative for advancing towards trustworthy AI in medical image analysis, advocating for a balanced approach that fosters innovation while ensuring ethical and equitable healthcare delivery.
Many decision problems concerning cellular automata are known to be decidable in the case of algebraic cellular automata, that is, when the state set has an algebraic structure and the automaton acts as a morphism. The most studied cases include finite fields, finite commutative rings and finite commutative groups. In this paper, we provide methods to generalize these results to the broader case of group cellular automata, that is, the case where the state set is a finite (possibly non-commutative) group. The configuration space is not even necessarily the full shift but a subshift — called a group shift — that is a subgroup of the full shift on ℤ^d, for any number d of dimensions. We show, in particular, that injectivity, surjectivity, equicontinuity, sensitivity and nilpotency are decidable for group cellular automata, and non-transitivity is semi-decidable. Injectivity always implies surjectivity, and jointly periodic points are dense in the limit set. The Moore direction of the Garden-of-Eden theorem holds for all group cellular automata, while the Myhill direction fails in some cases. The proofs are based on effective projection operations on group shifts that are, in particular, applied to the set of valid space-time diagrams of group cellular automata. This allows one to effectively construct the traces and the limit sets of group cellular automata. A preliminary version of this work was presented at the conference Mathematical Foundations of Computer Science 2020.
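A minimal concrete instance (a standard example, not taken from the paper): over the group Z/2Z, the XOR rule x'[i] = x[i-1] + x[i+1] (mod 2) defines a group cellular automaton, and the morphism property of its global map can be checked directly.

    import numpy as np

    # Rule 90 on a cyclic configuration over Z/2Z: a group cellular
    # automaton, since the global map F is a group homomorphism:
    # F(x + y) = F(x) + F(y), with addition componentwise mod 2.
    def F(x):
        return (np.roll(x, 1) + np.roll(x, -1)) % 2

    rng = np.random.default_rng(1)
    x = rng.integers(0, 2, size=32)
    y = rng.integers(0, 2, size=32)
    assert np.array_equal(F((x + y) % 2), (F(x) + F(y)) % 2)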
Foundations published its inaugural issue in 2021, establishing itself as a new international open access, peer-reviewed, multidisciplinary journal of science and technology, covering mathematics, physics, chemistry, biology, engineering, earth sciences, materials, information sciences, and medical sciences [...]
This paper investigates foundation models tailored for music informatics, a domain currently challenged by the scarcity of labeled data and generalization issues. To this end, we conduct an in-depth comparative study among various foundation model variants, examining key determinants such as model architectures, tokenization methods, temporal resolution, data, and model scalability. This research aims to bridge the existing knowledge gap by elucidating how these individual factors contribute to the success of foundation models in music informatics. Employing a careful evaluation framework, we assess the performance of these models across diverse downstream tasks in music information retrieval, with a particular focus on token-level and sequence-level classification. Our results reveal that our model demonstrates robust performance, surpassing existing models on several key metrics. These findings contribute to the understanding of self-supervised learning in music informatics and pave the way for developing more effective and versatile foundation models in the field. A pretrained version of our model is publicly available to foster reproducibility and future research.
The recent release of large language model (LLM) based chatbots, such as ChatGPT, has attracted huge interest in foundation models. It is widely believed that foundation models will serve as the fundamental building blocks for future AI systems. As foundation models are in their early stages, the design of foundation model based systems has not yet been systematically explored. There is limited understanding of the impact of introducing foundation models on software architecture. Therefore, in this paper, we propose a taxonomy of foundation model based systems, which classifies and compares the characteristics of foundation models and the design options of foundation model based systems. Our taxonomy comprises three categories: the pretraining and adaptation of foundation models, the architecture design of foundation model based systems, and responsible-AI-by-design. This taxonomy can serve as concrete guidance for making major architectural design decisions when designing foundation model based systems and highlights trade-offs arising from design decisions.
The proliferation of Large Language Models (LLMs) has fueled a shift in robot learning from automation towards general embodied Artificial Intelligence (AI). Adopting foundation models alongside traditional learning methods in robot learning has gained increasing interest in the research community and has shown potential for real-life applications. However, few works in the literature comprehensively review these relatively new techniques in combination with robotics. The purpose of this review is to systematically assess state-of-the-art foundation model techniques in robot learning and to identify promising future directions. Specifically, we first summarize the technical evolution of robot learning and identify the preliminary infrastructure needed for foundation models, including simulators, datasets, and foundation model frameworks. We then focus on four mainstream areas of robot learning (manipulation, navigation, planning, and reasoning) and demonstrate how foundation model techniques can be adopted in each. Furthermore, we discuss critical issues that are neglected in the current literature, including robot hardware and software decoupling, dynamic data, and generalization performance in the presence of humans. This review highlights the state-of-the-art progress of foundation models in robot learning and argues that future research should focus on multimodal interaction (especially dynamic data), foundation models built exclusively for robots, and AI alignment.
A measurement performed on a quantum system is an act of gaining information about its state, a view that is widespread in practical and foundational work in quantum theory. However, the concept of information in quantum theory reconstructions is multiply defined, and its conceptual foundations remain surprisingly under-explored. In this paper, we investigate the gain of information in quantum measurements from an operational viewpoint. We show that the continuous extension of the Shannon entropy naturally admits two distinct measures of information gain, differential information gain and relative information gain, and that these have radically different characteristics. In particular, while differential information gain can increase or decrease as additional data is acquired, relative information gain consistently grows and, moreover, exhibits asymptotic indifference to the data and to the choice of Bayesian prior. In order to make a principled choice between these measures, we articulate a Principle of Information Increase, which incorporates Summhammer's proposal that more data from measurements leads to more knowledge about the system, and also takes into consideration black swan events. This principle favors differential information gain as the more relevant metric in two-outcome quantum systems, and guides the selection of priors for these information measures. Finally, we show that, among the beta distribution priors, the Jeffreys binomial prior is the prior that ensures maximal robustness of information gain to the particular data sequence obtained in a run of experiments.
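One plausible formalization of the two measures in a two-outcome setting (our assumption for illustration; the paper's exact definitions may differ): with a Beta(a, b) prior over the outcome probability θ, data D consisting of k successes in n trials yield the posterior Beta(a + k, b + n − k), and one may take

\[
I_{\mathrm{diff}} \;=\; h\bigl(p(\theta)\bigr) - h\bigl(p(\theta \mid D)\bigr),
\qquad
I_{\mathrm{rel}} \;=\; \int_0^1 p(\theta \mid D)\,\log \frac{p(\theta \mid D)}{p(\theta)}\,\mathrm{d}\theta,
\]

where h denotes the differential entropy obtained as the continuous extension of the Shannon entropy. On this reading, I_diff can decrease when newly acquired data are surprising, whereas I_rel is a Kullback-Leibler divergence and hence non-negative, matching the contrast drawn above; the Jeffreys binomial prior corresponds to a = b = 1/2.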
Theoretical frames for analyzing information in biological and molecular multicomponent structures are proposed, and the mathematical foundations of the proposal are presented. The information encoded in structures is defined, and a method for calculating the amount of this information is introduced. The proposed approach is applied to the operation of a molecular multicomponent machine.
É. R. Silva, J. A. Baldovino, Ronaldo Luis dos Santos Izzo
The use of traditional binders such as lime and cement in soil stabilization needs to be reduced to avoid carbon dioxide emissions. Infrastructure development, including paved structures, buildings, foundations, and dams, involves exhaustive use of natural resources and raises serious environmental concerns. Reinforcing soil with natural fibers is a novel green technique that improves soil properties such as strength, deformation, and expansion behaviour. This paper therefore evaluates the performance of soil reinforcement using natural fiber. Natural Curauá fiber was used to improve the unconfined compressive strength (qu), splitting tensile strength (qt), direct shear strength, and deformation of a sedimentary silt from southern Brazil. Based on the strength results, a fiber content of 0.50% and a fiber length of 6 mm were selected as the optimum values. In the direct shear tests, the strength increase at low confining stresses is proportional to the increases in fiber content and fiber length. To counter the natural biodegradation of the fiber, an expanded polystyrene (EPS)-based treatment was employed, which reduced the fiber's water absorption by 10% and increased its tensile strength by 5%. Finally, the results demonstrate that natural fiber is an environmentally friendly alternative for enhancing soil properties in rammed earth and some short-service-life earthworks in green construction.