<p>Diurnal variations in planetary boundary layer height (PBLH) are closely linked to weather, climate, and environmental processes. However, challenges persist in estimating its diurnal behavior at a large scale due to insufficient observations and limitations of operational retrieval algorithms. This study proposes a deep learning framework based on an attention-augmented residual neural network to estimate diurnal variations in near-global PBLH, incorporating profiles from a non-sun-synchronous lidar (Cloud-Aerosol Transport System: CATS) and meteorological fields. The framework largely addresses the issue of multi-layer structures in space-borne lidar signals, significantly improving the accuracy of PBLH retrieval during morning and evening (with accuracy improvements approaching 30 % relative to the traditional algorithm). Because few observations are aligned with CATS orbits, a model was first pre-trained using pseudo-labels from reanalysis and then transferred to observation-based target labels. The transfer model demonstrates superior performance in most regions and periods, outperforming the classical algorithm in capturing PBLH magnitude and its diurnal variations. Further assessments over different land covers show that the PBLH and diurnal patterns estimated by the transfer model are highly consistent with those from radiosondes, surpassing reanalysis outputs. In terms of model interpretability, the candidate PBLH derived from the wavelet covariance transform and the temperature profiles emerged as dominant factors, with contributions exhibiting diurnal patterns. Overall, this work proposes a novel framework for large-scale PBLH estimation and provides insights for improving retrieval algorithms, particularly through integrating remote sensing and machine learning.</p>
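The wavelet covariance transform named above as a dominant predictor is a standard way to locate the sharp backscatter gradient at the boundary layer top. A minimal sketch with a Haar wavelet on a synthetic profile follows; the grid, dilation, and profile shape are illustrative, not the study's implementation:

```python
import numpy as np

def haar_wct(profile, z, a):
    """Haar wavelet covariance transform of a backscatter profile.

    W(a, b) = (1/a) * integral f(z) h((z - b)/a) dz, where the Haar wavelet
    h is +1 just below the dilation centre b and -1 just above, so W peaks
    where the profile drops sharply (a candidate boundary layer top).
    """
    dz = z[1] - z[0]
    w = np.zeros_like(z)
    for i, b in enumerate(z):
        lower = (z >= b - a / 2) & (z < b)    # h = +1
        upper = (z >= b) & (z <= b + a / 2)   # h = -1
        w[i] = (profile[lower].sum() - profile[upper].sum()) * dz / a
    return w

# Synthetic profile: high backscatter in the mixed layer, sharp drop near 1.2 km.
z = np.linspace(0.0, 3.0, 301)                       # height in km
profile = 1.0 / (1.0 + np.exp((z - 1.2) / 0.05))

w = haar_wct(profile, z, a=0.3)                      # 300 m dilation
pblh = z[np.argmax(w)]                               # candidate PBLH at WCT maximum
print(f"candidate PBLH: {pblh:.2f} km")
```

In real retrievals the WCT is evaluated at several dilations and the resulting local maxima give multiple candidate layers, which is exactly the ambiguity the learning framework is meant to resolve.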
This paper presents a comprehensive analysis of the strategic imperative for healthcare organizations to develop proprietary foundation models rather than relying exclusively on commercial alternatives. We examine four fundamental considerations driving this imperative: the domain-specific requirements of healthcare data representation, critical data sovereignty and governance considerations unique to healthcare, strategic competitive advantages afforded by proprietary AI infrastructure, and the transformative potential of healthcare-specific foundation models for patient care and organizational operations. Through analysis of empirical evidence, economic frameworks, and organizational case studies, we demonstrate that proprietary multimodal foundation models enable healthcare organizations to achieve superior clinical performance, maintain robust data governance, create sustainable competitive advantages, and accelerate innovation pathways. While acknowledging implementation challenges, we present evidence showing organizations with proprietary AI capabilities demonstrate measurably improved outcomes, faster innovation cycles, and stronger strategic positioning in the evolving healthcare ecosystem. This analysis provides healthcare leaders with a comprehensive framework for evaluating build-versus-buy decisions regarding foundation model implementation, positioning proprietary foundation model development as a cornerstone capability for forward-thinking healthcare organizations.
Physical AI needs to be trained digitally first. It needs a digital twin of itself, the policy model, and a digital twin of the world, the world model. In this paper, we present the Cosmos World Foundation Model Platform to help developers build customized world models for their Physical AI setups. We position a world foundation model as a general-purpose world model that can be fine-tuned into customized world models for downstream applications. Our platform covers a video curation pipeline, pre-trained world foundation models, examples of post-training of pre-trained world foundation models, and video tokenizers. To help Physical AI builders solve the most critical problems of our society, we make Cosmos open-source and our models open-weight with permissive licenses available via https://github.com/nvidia-cosmos/cosmos-predict1.
Brian Pulfer, Yury Belousov, Vitaliy Kinakh
et al.
The study of security in machine learning mainly focuses on downstream task-specific attacks, where the adversarial example is obtained by optimizing a loss function specific to the downstream task. At the same time, it has become standard practice for machine learning practitioners to adopt publicly available pre-trained vision foundation models, effectively sharing a common backbone architecture across a multitude of applications such as classification, segmentation, depth estimation, retrieval, question-answering and more. The study of attacks on such foundation models and their impact on multiple downstream tasks remains largely unexplored. This work proposes a general framework that forges task-agnostic adversarial examples by maximally disrupting the feature representation obtained with foundation models. We extensively evaluate the security of the feature representations obtained by popular vision foundation models by measuring the impact of this attack on multiple downstream tasks and its transferability between models.
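The attack objective described, maximizing disruption of the feature representation rather than any task loss, can be sketched as projected gradient ascent on the feature-space distance. The toy tanh "encoder" below stands in for a real vision backbone, and every name and hyperparameter is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 32)) / np.sqrt(32)   # toy encoder weights

def encode(x):
    """Stand-in for a pre-trained backbone: a single tanh layer."""
    return np.tanh(W @ x)

def feature_attack(x, eps=0.1, step=0.02, iters=50):
    """Task-agnostic attack: maximize ||f(x+d) - f(x)||^2 s.t. ||d||_inf <= eps."""
    f0 = encode(x)
    # Random start: d = 0 is a stationary point of the objective (zero gradient).
    d = rng.uniform(-eps, eps, size=x.shape)
    for _ in range(iters):
        f = np.tanh(W @ (x + d))
        # Analytic gradient of the squared feature distance w.r.t. d.
        grad = 2.0 * W.T @ ((f - f0) * (1.0 - f ** 2))
        d = np.clip(d + step * np.sign(grad), -eps, eps)  # PGD step + projection
    return d

x = rng.standard_normal(32)
d = feature_attack(x)
shift = np.linalg.norm(encode(x + d) - encode(x))
print(f"feature-space shift: {shift:.3f}")
```

Because the perturbation is chosen without reference to any head or label, the same adversarial example degrades every downstream task that consumes these features, which is the threat model the paper evaluates.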
David Piorkowski, Michael Hind, John Richards
et al.
As foundation models grow in both popularity and capability, researchers have uncovered a variety of ways that the models can pose a risk to the model's owner, user, or others. Despite the efforts of measuring these risks via benchmarks and cataloging them in AI risk taxonomies, there is little guidance for practitioners on how to determine which risks are relevant for a given foundation model use. In this paper, we address this gap and develop requirements and an initial design for a risk identification framework. To do so, we look to prior literature to identify challenges for building a foundation model risk identification framework and adapt ideas from usage governance to synthesize four design requirements. We then demonstrate how a candidate framework can address these design requirements and provide a foundation model use example to show how the framework works in practice for a small subset of risks.
Mohammad Areeb Qazi, Munachiso S Nwadike, Ibrahim Almakky
et al.
Foundational models are trained on extensive datasets to capture the general trends of a domain. However, in medical imaging, the scarcity of data makes pre-training for every domain, modality, or task challenging. Continual learning offers a solution by fine-tuning a model sequentially on different domains or tasks, enabling it to integrate new knowledge without requiring large datasets for each training phase. In this paper, we propose UNIfied CONtinual Learning for Medical Foundational Models (UNICON), a framework that enables the seamless adaptation of foundation models to diverse domains, tasks, and modalities. Unlike conventional adaptation methods that treat these changes in isolation, UNICON provides a unified, perpetually expandable framework. Through careful integration, we show that foundation models can dynamically expand across imaging modalities, anatomical regions, and clinical objectives without catastrophic forgetting or task interference. Empirically, we validate our approach by adapting a chest CT foundation model initially trained for classification to a prognosis and segmentation task. Our results show improved performance across both additional tasks. Furthermore, we continually incorporated PET scans and achieved a 5% improvement in Dice score compared to respective baselines. These findings establish that foundation models are not inherently constrained to their initial training scope but can evolve, paving the way toward generalist AI models for medical imaging.
All aspects of our society, including the life sciences, need a mechanism for people working within them to represent the concepts they employ to carry out their research. For the information systems being designed and developed to support researchers and scientists in conducting their work, conceptual models of the relevant domains are usually designed as both blueprints for a system being developed and as a means of communication between the designer and developer. Most conceptual modelling concepts are generic in the sense that they are applied with the same understanding across many applications. Problems in the life sciences, however, are especially complex and important, because they deal with humans, their well-being, and their interactions with the environment as well as other organisms. This work proposes a systemist perspective for creating a conceptual model of a life scientist's problem. We introduce the notion of a system and then show how it can be applied to the development of an information system for handling genomic-related information. We extend our discussion to show how the proposed systemist perspective can support the modelling of precision medicine. This research addresses a challenge in life sciences research: how to model problems so as to better represent the connections between physical and digital worlds. We propose a new notation that explicitly incorporates systemist thinking, as well as the components of systems based on recent ontological foundations. The new notation captures important semantics in the domain of life sciences. It may be used to facilitate understanding, communication and problem-solving more broadly. We also provide a precise, sound, ontologically supported characterization of the term system, as a basic construct for conceptual modelling in life sciences.
Foundation models for tabular data are rapidly evolving, with increasing interest in extending them to support additional modalities such as free-text features. However, existing benchmarks for tabular data rarely include textual columns, and identifying real-world tabular datasets with semantically rich text features is non-trivial. We propose a series of simple yet effective ablation-style strategies for incorporating text into conventional tabular pipelines. Moreover, we benchmark how state-of-the-art tabular foundation models can handle textual data by manually curating a collection of real-world tabular datasets with meaningful textual features. Our study is an important step towards improving benchmarking of foundation models for tabular data with text.
Finslerian extensions of Special and General Relativity -- commonly referred to as Very Special and Very General Relativity -- necessitate the development of a unified Lorentz-Finsler geometry. However, the scope of this geometric framework extends well beyond relativistic physics. Indeed, it offers powerful tools for modeling wave propagation in classical mechanics, discretizing spacetimes in classical and relativistic settings, and supporting effective theories in fundamental physics. Moreover, Lorentz-Finsler geometry provides a versatile setting that facilitates the resolution of problems within Riemannian, Lorentzian, and Finslerian geometries individually. This work presents a plain introduction to the subject, reviewing foundational concepts, key applications, and future prospects. The reviewed topics include (i) basics on the setting of cones, Finsler and Lorentz-Finsler metrics and their (nonlinear, anisotropic and linear) connections, (ii) the global structure of Lorentz-Finsler manifolds and its space of null geodesics, (iii) links among Riemannian, Finsler and Lorentz geometries, (iv) applications in classical settings such as wildfire and seismic wave propagation, and discretization in classical and relativistic settings with quantum prospects, and (v) a Finslerian variational approach to Einstein equations. The new results include the splitting of globally hyperbolic Finsler spacetimes, in addition to the analysis of several extensions of the Lorentz setting, such as the case of timelike boundaries.
<p>Water vapour isotopes are important tools to better understand processes governing the atmospheric hydrological cycle. Their measurement in polar regions is crucial to improve the interpretation of water isotopic records in ice cores. In situ water vapour isotopic monitoring remains challenging, especially in dry places of the East Antarctic Plateau, where water mixing ratios can be as low as 10 ppm. We present in this article new commercial laser spectrometers based on the optical-feedback cavity-enhanced absorption spectroscopy (OF–CEAS) technique, adapted for water vapour isotopic measurements in dry regions. We characterise a first instrument adapted for Antarctic coastal monitoring with an optical cavity finesse of 64 000 (ring-down time of 54 <span class="inline-formula">µ</span>s), installed at Dumont d'Urville Station during the summer campaign 2022–2023, and a second instrument with a high finesse of 116 000 (98 <span class="inline-formula">µ</span>s ring-down time), to be deployed inland of East Antarctica. With a drift calibration every 24 h, the stability demonstrated by the high-finesse instrument allows one to study isotopic diurnal cycles down to 10 ppm humidity for <span class="inline-formula"><i>δ</i></span>D and 100 ppm for <span class="inline-formula"><i>δ</i><sup>18</sup></span>O.</p>
Nikolaos Dionelis, Casper Fibaek, Luke Camilleri
et al.
When we are primarily interested in solving several problems jointly, each with a prescribed high target accuracy, Foundation Models should in most cases be used rather than problem-specific models. We focus on the specific Computer Vision application of Foundation Models for Earth Observation (EO) and geospatial AI. These models can solve important problems we are tackling, including for example land cover classification, crop type mapping, flood segmentation, building density estimation, and road regression segmentation. In this paper, we show that for a limited number of labelled data, Foundation Models achieve improved performance compared to problem-specific models. In this work, we also present our proposed evaluation benchmark for Foundation Models for EO. Benchmarking the generalization performance of Foundation Models is important as it has become difficult to standardize a fair comparison across the many different models that have been proposed recently. We present the results using our evaluation benchmark for EO Foundation Models and show that Foundation Models are label efficient in the downstream tasks and help us solve problems we are tackling in EO and remote sensing.
Hypergraphs generalize classical graphs by allowing a single edge to connect multiple vertices, providing a natural language for modeling higher-order interactions. Superhypergraphs extend this paradigm further by accommodating nested, set-valued entities and relations, enabling the representation of hierarchical, multi-level structures beyond the expressive reach of ordinary graphs or hypergraphs. In parallel, neural networks, especially Graph Neural Networks (GNNs), have become a standard tool for learning from relational data, and recent years have seen rapid progress on Hypergraph Neural Networks (HGNNs) and their theoretical properties. To model uncertainty and multi-aspect attributes in complex networks, several graded and multi-valued graph frameworks have been developed, including fuzzy graphs and neutrosophic graphs. The plithogenic graph framework unifies and refines these approaches by incorporating multi-valued attributes together with membership and contradiction mechanisms, offering a flexible representation for heterogeneous and partially inconsistent information. This book develops the theoretical foundations of SuperHyperGraph Neural Networks (SHGNNs) and Plithogenic Graph Neural Networks, with the goal of extending message-passing principles to these advanced higher-order structures. We provide rigorous definitions, establish fundamental structural properties, and prove well-definedness results for key constructions, with particular emphasis on strengthened formulations of Soft Graph Neural Networks and Rough Graph Neural Networks.
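The message-passing principle that the book generalizes to superhypergraphs can be illustrated with a plain hypergraph convolution: vertex features are averaged into hyperedges and back, then passed through a linear map and nonlinearity. The incidence matrix, dimensions, and weights below are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Incidence matrix H: 5 vertices, 3 hyperedges (a hyperedge may join >2 vertices).
H = np.array([[1, 0, 1],
              [1, 1, 0],
              [1, 0, 0],
              [0, 1, 1],
              [0, 1, 0]], dtype=float)

X = rng.standard_normal((5, 4))       # vertex features
Theta = rng.standard_normal((4, 4))   # layer weights (learnable; random here)

def hypergraph_conv(H, X, Theta):
    """One message-passing step on a hypergraph:
    vertices -> hyperedges (mean) -> vertices (mean), then linear map + ReLU."""
    De = H.sum(axis=0)                        # hyperedge degrees
    Dv = H.sum(axis=1)                        # vertex degrees
    edge_msg = (H.T @ X) / De[:, None]        # average features per hyperedge
    vert_msg = (H @ edge_msg) / Dv[:, None]   # average incident hyperedge messages
    return np.maximum(vert_msg @ Theta, 0.0)  # ReLU

X1 = hypergraph_conv(H, X, Theta)
print(X1.shape)
```

Superhypergraphs replace the flat vertex and hyperedge sets with nested, set-valued ones, so the incidence structure above becomes hierarchical; the aggregation pattern, however, stays the same in spirit.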
The realization of universal robots is an ultimate goal of researchers. However, a key hurdle in achieving this goal lies in the robots' ability to manipulate objects in their unstructured surrounding environments according to different tasks. The learning-based approach is considered an effective way to address generalization. The impressive performance of foundation models in the fields of computer vision and natural language suggests the potential of embedding foundation models into manipulation tasks as a viable path toward achieving general manipulation capability. However, we believe achieving general manipulation capability requires an overarching framework akin to that of autonomous driving. This framework should encompass multiple functional modules, with different foundation models assuming distinct roles in facilitating general manipulation capability. This survey focuses on the contributions of foundation models to robot learning for manipulation. We propose a comprehensive framework and detail how foundation models can address challenges in each module of the framework. Moreover, we examine current approaches, outline challenges, suggest future research directions, and identify potential risks associated with integrating foundation models into this domain.
This article discusses the opportunities, applications and future directions of large-scale pre-trained models, i.e., foundation models, for analyzing medical images. Medical foundation models have immense potential in solving a wide range of downstream tasks, as they can help to accelerate the development of accurate and robust models, reduce the large amounts of required labeled data, and preserve the privacy and confidentiality of patient data. Specifically, we illustrate the "spectrum" of medical foundation models, ranging from general vision models, modality-specific models, to organ/task-specific models, highlighting their challenges, opportunities and applications. We also discuss how foundation models can be leveraged in downstream medical tasks to enhance the accuracy and efficiency of medical image analysis, leading to more precise diagnosis and treatment decisions.
Brandon Duderstadt, Hayden S. Helm, Carey E. Priebe
Recent advances in self-supervised learning and neural network scaling have enabled the creation of large models, known as foundation models, which can be easily adapted to a wide range of downstream tasks. The current paradigm for comparing foundation models involves evaluating them with aggregate metrics on various benchmark datasets. This method of model comparison is heavily dependent on the chosen evaluation metric, which makes it unsuitable for situations where the ideal metric is either not obvious or unavailable. In this work, we present a methodology for directly comparing the embedding space geometry of foundation models, which facilitates model comparison without the need for an explicit evaluation metric. Our methodology is grounded in random graph theory and enables valid hypothesis testing of embedding similarity on a per-datum basis. Further, we demonstrate how our methodology can be extended to facilitate population level model comparison. In particular, we show how our framework can induce a manifold of models equipped with a distance function that correlates strongly with several downstream metrics. We remark on the utility of this population level model comparison as a first step towards a taxonomic science of foundation models.
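The idea of comparing embedding spaces directly, without a downstream metric, can be illustrated with a simpler stand-in for the authors' random-graph hypothesis test: align two embedding matrices with an orthogonal Procrustes rotation and read off per-datum residuals. All shapes and noise levels here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two "foundation models" embedding the same 100 datapoints in 8 dimensions.
A = rng.standard_normal((100, 8))
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))   # a hidden rotation
B = A @ Q + 0.01 * rng.standard_normal((100, 8))   # model B = rotated A + noise

def procrustes_align(A, B):
    """Orthogonal Procrustes: the rotation R minimizing ||A @ R - B||_F."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

R = procrustes_align(A, B)
per_datum = np.linalg.norm(A @ R - B, axis=1)      # residual per datapoint
print(f"mean residual after alignment: {per_datum.mean():.4f}")
```

A small per-datum residual says the two models encode that datapoint in geometrically equivalent ways; the paper's contribution is to make such per-datum comparisons statistically valid via random graph theory rather than this ad hoc alignment.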
As the potential of foundation models in visual tasks has garnered significant attention, pretraining these models before downstream tasks has become a crucial step. The three key factors in pretraining foundation models are the pretraining method, the size of the pretraining dataset, and the number of model parameters. Recently, research in the remote sensing field has focused primarily on the pretraining method and the size of the dataset, with limited emphasis on the number of model parameters. This paper addresses this gap by examining the effect of increasing the number of model parameters on the performance of foundation models in downstream tasks such as rotated object detection and semantic segmentation. We pretrained foundation models with varying numbers of parameters, including 86M, 605.26M, 1.3B, and 2.4B, to determine whether performance in downstream tasks improved with an increase in parameters. To the best of our knowledge, this is the first billion-scale foundation model in the remote sensing field. Furthermore, we propose an effective method for scaling up and fine-tuning a vision transformer in the remote sensing field. To evaluate general performance in downstream tasks, we employed the DOTA v2.0 and DIOR-R benchmark datasets for rotated object detection, and the Potsdam and LoveDA datasets for semantic segmentation. Experimental results demonstrated that, across all benchmark datasets and downstream tasks, the performance of the foundation models and data efficiency improved as the number of parameters increased. Moreover, our models achieve state-of-the-art performance on several datasets including DIOR-R, Potsdam, and LoveDA.
We describe a reformulation (following Hales (2017)) of a 1934 conjecture of Reinhardt on pessimal packings of convex domains in the plane as a problem in optimal control theory. Several structural results of this problem, including its Hamiltonian structure and Lax pair formalism, are presented. General solutions of this problem for constant control are presented and are used to prove that the Pontryagin extremals of the control problem are constrained to lie in a compact domain of the state space. We further describe the structure of the control problem near its singular locus, and prove that we recover the Pontryagin system of the multi-dimensional Fuller optimal control problem (with two dimensional control) in this case. We show how this system admits logarithmic spiral trajectories when the control set is the circumscribing disk of the 2-simplex with the associated control performing an infinite number of rotations on the boundary of the disk in finite time. We also describe formalization projects in foundational optimal control, viz. model-based and model-free Reinforcement Learning theory. Key ingredients that make these formalizations novel, viz. the Giry monad and contraction coinduction, are considered, and some applications are discussed.
<p>Accurate particle classification plays a vital role in
aerosol studies. Differential mobility analyzers (DMAs), centrifugal particle
mass analyzers (CPMAs) and aerodynamic aerosol classifiers (AACs) are commonly
used to select particles with a specific mobility diameter, aerodynamic
diameter or mass, respectively. However, multiple charging effects cannot be
entirely avoided when using either individual techniques or tandem systems
such as DMA–CPMA, especially when selecting soot particles with fractal
structures. In this study, we calculate the transfer functions of the
DMA–CPMA and DMA–AAC in static configurations for flame-generated soot
particles. We propose an equation that constrains the resolutions of the DMA and
CPMA to eliminate the multiple charging effect when selecting particles with
a certain mass–mobility relationship using the DMA–CPMA system. The equation
for the DMA–AAC system is also derived. For DMA–CPMA in a static
configuration, our results show that the ability to remove multiply charged
particles mainly depends on the particle morphology and resolution settings
of the DMA and CPMA. Using measurements from soot experiments and literature
data, we observe a general trend in which the multiple charging effect
appears with decreasing size when selecting aspherical particles. As for
DMA–AAC in a static configuration, the ability to eliminate particles with
multiple charges is mainly related to the resolutions of the classifiers. In
most cases, the DMA–AAC in a static configuration can eliminate the multiple
charging effect regardless of the particle morphology, but multiply charged
particles will be selected when decreasing the resolution of the DMA or AAC.
We propose that the potential influence of the multiple charging effect
should be considered when using the DMA–CPMA or DMA–AAC systems in
estimating size- and mass-resolved optical properties in field and lab
experiments.</p>
G. Ancellet, S. Godin-Beekmann, H. G. J. Smit
et al.
<p>The Observatoire de Haute Provence (OHP) weekly electrochemical concentration cell (ECC) ozonesonde data have been homogenized for the period 1991–2021 according to the recommendations of the Ozonesonde Data Quality Assessment (O3S-DQA) panel. The assessment of the ECC homogenization
benefit has been carried out using comparisons with other ozone-measuring ground-based instruments at the same station (lidar, surface measurements)
and with colocated satellite observations of the <span class="inline-formula">O<sub>3</sub></span> vertical profile by Microwave Limb Sounder (MLS). The major differences between
uncorrected and homogenized ECC data are related to a change of ozonesonde type in 1997, removal of the pressure dependency of the ECC background
current and correction of internal pump temperature. The original 3–4 <span class="inline-formula">ppbv</span> positive bias between ECC and lidar in the troposphere is
corrected with the homogenization. The ECC 30-year trends of the seasonally adjusted ozone concentrations are also significantly improved in both
the troposphere and the stratosphere after the ECC homogenization, as shown by the ECC/lidar or ECC/surface ozone trend comparisons. A <span class="inline-formula">−0.19</span> % yr<span class="inline-formula"><sup>−1</sup></span> negative trend of the normalization factor (<span class="inline-formula"><i>N</i><sub>T</sub></span>) calculated using independent measurements of the total ozone column (TOC) at
OHP disappears after homogenization of the ECC data. There is, however, a remaining <span class="inline-formula">−3.7</span> % negative bias in the TOC which is likely related to
an underestimate of the ECC concentrations in the stratosphere above 50 <span class="inline-formula">hPa</span>. Differences between TOC measured by homogenized ECC and
satellite observations show a smaller bias of <span class="inline-formula">−1</span> %. Comparisons between homogenized ECC and OHP stratospheric lidar and MLS observations below 26 <span class="inline-formula">km</span> are slightly negative (<span class="inline-formula">−2</span> %) or positive (<span class="inline-formula">+2</span> %), respectively. The comparisons with both lidar and satellite observations
suggest that homogenization increases the negative bias of the ECC to values lower than <span class="inline-formula">−6</span> % above 28 <span class="inline-formula">km</span>. The reason for this bias is still unclear, but a possible explanation might be related to freezing or evaporation of the sonde solution in the stratosphere.</p>