Francesca Fati, Alberto Rota, Adriana V. Gregory
et al.
Adnexal mass evaluation via ultrasound is a challenging clinical task, often hindered by subjective interpretation and significant inter-observer variability. While automated segmentation is a foundational step for quantitative risk assessment, traditional fully supervised convolutional architectures frequently require large amounts of pixel-level annotations and struggle with domain shifts common in medical imaging. In this work, we propose a label-efficient segmentation framework that leverages the robust semantic priors of a pretrained DINOv3 foundational vision transformer backbone. By integrating this backbone with a Dense Prediction Transformer (DPT)-style decoder, our model hierarchically reassembles multi-scale features to combine global semantic representations with fine-grained spatial details. Evaluated on a clinical dataset of 7,777 annotated frames from 112 patients, our method achieves state-of-the-art performance compared to established fully supervised baselines, including U-Net, U-Net++, DeepLabV3, and MAnet. Specifically, we obtain a Dice score of 0.945 and improved boundary adherence, reducing the 95th-percentile Hausdorff Distance by 11.4% relative to the strongest convolutional baseline. Furthermore, we conduct an extensive efficiency analysis demonstrating that our DINOv3-based approach retains significantly higher performance under data starvation regimes, maintaining strong results even when trained on only 25% of the data. These results suggest that leveraging large-scale self-supervised foundations provides a promising and data-efficient solution for medical image segmentation in data-constrained clinical environments. Project Repository: https://github.com/FrancescaFati/MESA
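To make the architecture concrete, below is a minimal, self-contained sketch of a DPT-style decoder that reassembles multi-level ViT patch tokens into a segmentation mask. All module names, channel sizes, and the random stand-in for DINOv3 tokens are illustrative assumptions, not the authors' released implementation (see the linked MESA repository for that).

```python
# Minimal sketch of a DPT-style decoder over frozen ViT features.
# Shapes and module choices are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReassembleBlock(nn.Module):
    """Project flat patch tokens into a 2D feature map at a chosen scale."""
    def __init__(self, embed_dim, out_ch, scale):
        super().__init__()
        self.proj = nn.Conv2d(embed_dim, out_ch, kernel_size=1)
        self.scale = scale

    def forward(self, tokens, grid_hw):
        h, w = grid_hw
        x = tokens.transpose(1, 2).reshape(tokens.size(0), -1, h, w)
        return F.interpolate(self.proj(x), scale_factor=self.scale,
                             mode="bilinear", align_corners=False)

class DPTStyleDecoder(nn.Module):
    """Fuse multi-level ViT features into a full-resolution mask logit."""
    def __init__(self, embed_dim=768, out_ch=256, n_levels=4):
        super().__init__()
        # Shallower levels are upsampled more to recover spatial detail.
        self.reassemble = nn.ModuleList(
            ReassembleBlock(embed_dim, out_ch, s) for s in (4, 2, 1, 0.5))
        self.fuse = nn.ModuleList(
            nn.Conv2d(out_ch, out_ch, 3, padding=1) for _ in range(n_levels - 1))
        self.head = nn.Conv2d(out_ch, 1, kernel_size=1)  # binary mask logit

    def forward(self, level_tokens, grid_hw):
        # level_tokens: list of (B, N, C) patch tokens from 4 ViT blocks,
        # ordered shallow -> deep (class tokens already dropped).
        feats = [r(t, grid_hw) for r, t in zip(self.reassemble, level_tokens)]
        x = feats[-1]
        for f, conv in zip(reversed(feats[:-1]), self.fuse):
            x = F.interpolate(x, size=f.shape[-2:], mode="bilinear",
                              align_corners=False)
            x = conv(x + f)
        x = F.interpolate(x, scale_factor=4, mode="bilinear",
                          align_corners=False)
        return self.head(x)

# Toy usage: 224x224 input, 16x16 patches -> 14x14 token grid per level.
tokens = [torch.randn(2, 14 * 14, 768) for _ in range(4)]
print(DPTStyleDecoder()(tokens, grid_hw=(14, 14)).shape)  # (2, 1, 224, 224)
```

The key design point the abstract describes is visible here: deep tokens carry global semantics at coarse resolution, while shallower tokens are upsampled more aggressively and fused back in to recover fine boundary detail.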
AI-communication integration is widely regarded as a core enabling technology for 6G. Most existing AI-based physical-layer designs rely on task-specific models that are separately tailored to individual modules, resulting in poor generalization. In contrast, communication systems are inherently general-purpose and should support broad applicability and robustness across diverse scenarios. Foundation models offer a promising solution through strong reasoning and generalization, yet wireless-system constraints hinder a direct transfer of large language model (LLM)-style success to the wireless domain. Therefore, we introduce the concept of large wireless foundation models (LWFMs) and present a novel framework for empowering the physical layer with foundation models under wireless constraints. Specifically, we propose two paradigms for realizing LWFMs, including leveraging existing general-purpose foundation models and building novel wireless foundation models. Based on recent progress, we distill two roadmaps for each paradigm and formulate design principles under wireless constraints. We further provide case studies of LWFM-empowered wireless systems to intuitively validate their advantages. Finally, we characterize the notion of "large" in LWFMs through a multidimensional analysis of existing work and outline promising directions for future research.
Observations of contrails are vital for improving our understanding of contrail formation and life cycle, informing models, and assessing mitigation strategies. Here, we developed a methodology that utilises ground-based cameras for tracking and analysing young contrails (< 35 min) formed under clear-sky conditions, comparing these observations against reanalysis meteorology and simulations from the contrail cirrus prediction model (CoCiP) with actual flight trajectories. Our observations consist of 14 h of video footage recorded over 5 different days in Central London, capturing 1582 flight waypoints from 281 flights. The simulation correctly predicted contrail formation and absence for around 75 % of these waypoints, with incorrect contrail predictions occurring at warmer temperatures than those with true-positive predictions (7.8 K vs. 12.8 K below the Schmidt–Appleman criterion threshold temperature). When evaluating contrails with observed lifetimes of at least 2 min, the simulation's correct prediction rate for contrail formation increases to over 85 %. Among all waypoints with contrail observations, 78 % of short-lived contrails (observed lifetimes < 2 min) formed under ice-subsaturated conditions, whereas 75 % of persistent contrails (observed lifetimes > 10 min) formed under ice-supersaturated conditions. On average, the simulated contrail geometric width was around 100 m smaller than the observed (visible) width over its observed lifetime, with the mean underestimation reaching up to 280 m within the first 5 min. Discrepancies between the observed and simulated contrail formation, lifetime, and width can be associated with uncertainties in reanalysis meteorology due to known model limitations and sub-grid-scale variabilities, contrail model simplifications, uncertainties in aircraft performance estimates, and observational challenges, among other possible factors. Overall, this study demonstrates the potential of ground-based cameras to create essential observational and benchmark datasets for validating and improving existing weather and contrail models.
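For readers unfamiliar with the Schmidt–Appleman criterion referenced above, its standard form (Schumann, 1996; not restated in the abstract) compares ambient temperature against a threshold set by the slope of the exhaust mixing line in vapour-pressure/temperature space:

```latex
% Slope of the exhaust mixing line in (e, T) space; contrails can form when
% the ambient temperature lies below the threshold T_SA implied by G and
% the ambient humidity.
G = \frac{\mathrm{EI}_{\mathrm{H_2O}}\; c_p\; p}{\varepsilon\, Q\,(1-\eta)}
```

where EI_H2O is the fuel's water-vapour emission index, c_p the specific heat capacity of air, p the ambient pressure, ε ≈ 0.622 the ratio of molar masses of water vapour and dry air, Q the fuel's lower heating value, and η the overall propulsion efficiency. The "K below the threshold temperature" figures in the abstract measure how far ambient conditions sit below this T_SA.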
Hondamunige Prasanna Silva, Federico Becattini, Lorenzo Seidenari
Foundation models represent the most prominent and recent paradigm shift in artificial intelligence. Foundation models are large models, trained on broad data, that deliver high accuracy on many downstream tasks, often without fine-tuning. For this reason, models such as CLIP, DINO, or Vision Transformers (ViTs) are becoming the bedrock of many industrial AI-powered applications. However, the reliance on pre-trained foundation models also introduces significant security concerns, as these models are vulnerable to adversarial attacks. Such attacks involve deliberately crafted inputs designed to deceive AI systems, jeopardizing their reliability. This paper studies the vulnerabilities of vision foundation models, focusing specifically on CLIP and ViTs, and explores the transferability of adversarial attacks to downstream tasks. We introduce a novel attack, targeting the structure of transformer-based architectures in a task-agnostic fashion. We demonstrate the effectiveness of our attack on several downstream tasks: classification, captioning, image/text retrieval, segmentation and depth estimation. Code available at: https://github.com/HondamunigePrasannaSilva/attack-attention
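To make the threat model concrete, here is a generic sketch of a task-agnostic feature-space PGD attack that perturbs an image so a frozen backbone's embedding drifts as far as possible from the clean embedding, with no downstream labels involved. This is a stand-in illustration, not the paper's attention-targeting attack (see the linked repository for that); the toy backbone and parameters are assumptions.

```python
# Illustrative task-agnostic feature-space PGD: maximize the displacement of
# the backbone's embedding under an L-infinity budget. Not the paper's loss.
import torch

def feature_pgd(backbone, x, eps=8 / 255, alpha=2 / 255, steps=10):
    backbone.eval()
    with torch.no_grad():
        clean_feat = backbone(x)                   # (B, D) clean embeddings
    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        adv_feat = backbone((x + delta).clamp(0, 1))
        # Maximize embedding displacement -> descend on its negative.
        loss = -torch.norm(adv_feat - clean_feat, dim=-1).mean()
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)                # stay inside the budget
        delta.grad.zero_()
    return (x + delta).detach().clamp(0, 1)

# Toy usage with a stand-in backbone producing flat embeddings:
backbone = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 128))
x_adv = feature_pgd(backbone, torch.rand(2, 3, 32, 32))
```

Because the objective only touches the shared representation, a successful perturbation of this kind transfers to any downstream head built on that backbone, which is the transferability concern the paper studies.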
Vishal Nedungadi, Xingguo Xiong, Aike Potze
et al.
Food security remains a global concern as the population grows and climate change intensifies, demanding innovative solutions for sustainable agricultural productivity. Recent advances in foundation models have demonstrated remarkable performance in remote sensing and the climate sciences, and therefore offer new opportunities for agricultural monitoring. However, their application to challenges related to agriculture, such as crop type mapping, crop phenology estimation, and crop yield estimation, remains under-explored. In this work, we quantitatively evaluate existing foundation models to assess their effectiveness for a representative set of agricultural tasks. From an agricultural domain perspective, we describe a requirements framework for an ideal agricultural foundation model (CropFM). We then survey and compare existing general-purpose foundation models within this framework and empirically evaluate two exemplary models on three representative agriculture-specific tasks. Finally, we highlight the need for a dedicated foundation model tailored specifically to agriculture.
The morphological complexity of urban environments results in a high spatial and temporal variability of the urban microclimate. The consequent demand for high-resolution atmospheric data remains a challenge for atmospheric research and operational application. The recent widespread availability and increasing adoption of low-cost mobile sensing offer the opportunity to integrate observations from conventional monitoring networks with microclimatic and air pollution data at a finer spatial and temporal scale. So far, the relatively low quality of the measurements and outdoor performance compared to conventional instrumentation has discouraged the full deployment of mobile sensors for routine monitoring. The present study addresses the performance of a commercial mobile sensor, the MeteoTracker (IoTopon Srl), recently launched on the market to quantify the microclimatic characteristics of the outdoor environment. The sensor follows the philosophy of Internet of Things technology, being low cost, having an automatic data flow via personal smartphones and online data sharing, supporting user-friendly software, and having the potential to be deployed in large quantities. In this paper, the outdoor performance is evaluated through tests aimed at quantifying (i) the intra-sensor variability under similar atmospheric conditions and (ii) the outdoor accuracy compared to a reference weather station under sub-optimal (fixed location) and optimal (mobile) sensor usage. Data-driven corrections are developed and successfully applied to improve the MeteoTracker data quality. In particular, a recursive method for the simultaneous improvement of relative humidity, dew point, and humidex index proves to be crucial for increasing the data quality. The results mark an intra-sensor variability of approximately ±0.5 °C for air temperature and ±1.2 % for the corrected relative humidity, both of which are within the declared sensor accuracy. The sensor captures the same atmospheric variability as the reference sensor during both fixed and mobile tests, showing positive biases (overestimation) for both variables. Through the mobile test, the outdoor accuracy is observed to be between ±0.3 and ±0.5 °C for air temperature and between ±3 % and ±5 % for the relative humidity, ranking the MeteoTracker in the real accuracy range of similar commercial sensors from the literature and making it a valid solution for atmospheric monitoring.
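The recursive correction scheme itself is specific to the paper, but the diagnostic quantities it ties together are standard. Below is a minimal sketch of dew point (Magnus formula) and humidex computed from air temperature and relative humidity, with commonly used constants; these are our choices for illustration, not necessarily the study's.

```python
# Standard dew-point (Magnus formula) and humidex diagnostics from air
# temperature (degC) and relative humidity (%). Constants follow common
# published choices; this is illustrative, not the paper's correction scheme.
import math

def dew_point(t_c, rh):
    a, b = 17.62, 243.12  # Magnus coefficients over water
    gamma = math.log(rh / 100.0) + a * t_c / (b + t_c)
    return b * gamma / (a - gamma)

def humidex(t_c, rh):
    td_k = dew_point(t_c, rh) + 273.15
    # Vapour pressure (hPa) from dew point, then Environment Canada humidex.
    e = 6.11 * math.exp(5417.7530 * (1 / 273.16 - 1 / td_k))
    return t_c + 0.5555 * (e - 10.0)

print(round(dew_point(30.0, 60.0), 1))  # ~21.4 degC
print(round(humidex(30.0, 60.0), 1))    # ~38.8
```

The coupling visible here (humidex depends on dew point, which depends on relative humidity) is why the paper's correction must adjust the three quantities simultaneously rather than one at a time.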
It is a challenge to obtain accurate measurements of the microphysical properties of delicate, structurally complex, frozen, and semi-frozen hydrometeors. We present a new technique for the real-time measurement of the density of freshly fallen individual snowflakes. A new thermal-imaging instrument, the Differential Emissivity Imaging Disdrometer (DEID), has been shown through laboratory and field experiments to be capable of providing accurate estimates of individual snowflake and bulk snow hydrometeor density (which can be interpreted as the snow-to-liquid ratio or SLR). The method exploits the rate of heat transfer during the melting of a hydrometeor on a heated metal plate, which is a function of the temperature difference between the hotplate surface and the top of the hydrometeor. The product of the melting speed and melting time yields an effective particle thickness normal to the hotplate surface, which can then be used in combination with the particle mass and area on the plate to determine a particle density. Uncertainties in estimates of particle density are approximately 4 % based on calibrations with laboratory-produced particles made from water and frozen solutions of salt and water and field comparisons with both high-resolution imagery of falling snow and traditional snowpack density measurements obtained at 12 h intervals. For 17 storms, individual particle densities vary from 19 to 495 kg m⁻³, and storm mean snow densities vary from 40 to 100 kg m⁻³. We observe probability distribution functions for hydrometeor density that are nearly Gaussian with kurtosis of ≈ 3 and skewness of ≈ 0.01.
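In symbols (our notation, following the description above directly), the retrieval is:

```latex
% Effective thickness from melting speed and melting time, then density
% from particle mass m and plate-contact area A (notation ours):
h = v_{\mathrm{melt}}\, t_{\mathrm{melt}}, \qquad
\rho = \frac{m}{A\, h}
```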
Volodymyr Rashkivskyi, Mykola Prystailo, Bohdan Fedyshyn
et al.
The purpose of this article is the development and construction of a mechanized mobile platform for serving people, motivated by the need to increase the operational safety of such technical means, particularly where mass customer service is required. The methodology is based on search, research, and creative approaches, using development analysis, patent search, synthesis of technical solutions, and simulation modelling. Scientific novelty. Studying the features of various approaches to creating effective mechanized moving platforms, together with an analysis of existing solutions and patenting dynamics, made it possible to substantiate directions for developing technical solutions and to assess the prospects of such developments. The authors propose constructive solutions for mobile platforms, develop approaches to the technical implementation of safer operation, and propose energy-saving measures aimed at reducing the energy consumption of mechanized means, which is especially relevant for the mass deployment of platforms for serving people. Research results. The article addresses important safety issues in serving people, particularly in the entertainment industry, through the development of structural parts, drives, and operating rules for mechanized moving platforms. It presents synthesized constructive solutions obtained through patent research, analysis of modern technical solutions, rational technical design, and expert evaluation. It was determined that the operational safety of mechanized moving platforms intended for transporting people in the tourism sector depends on effective approaches to the design of the technical system as a whole: the moving platform, its structural components, the elements of its mechanism, and the drive system. Optimizing the technical indicators of these components, the stability of the overall system, the smoothness of platform movement and braking, and the material efficiency of the structure together has a qualitative effect on improving the safety of operation for people.
Electroencephalography (EEG) signals provide critical insights for applications in disease diagnosis and healthcare. However, the scarcity of labeled EEG data poses a significant challenge. Foundation models offer a promising solution by leveraging large-scale unlabeled data through pre-training, enabling strong performance across diverse tasks. While both temporal dynamics and inter-channel relationships are vital for understanding EEG signals, existing EEG foundation models primarily focus on the former, overlooking the latter. To address this limitation, we propose Graph-Enhanced EEG Foundation Model (GEFM), a novel foundation model for EEG that integrates both temporal and inter-channel information. Our architecture combines Graph Neural Networks (GNNs), which effectively capture relational structures, with a masked autoencoder to enable efficient pre-training. We evaluated our approach using three downstream tasks and experimented with various GNN architectures. The results demonstrate that our proposed model, particularly when employing the GCN architecture with optimized configurations, consistently outperformed baseline methods across all tasks. These findings suggest that our model serves as a robust foundation model for EEG analysis.
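As a rough illustration of the ingredients named above (a GCN over the EEG channel graph plus masked-autoencoder pretraining), here is a minimal self-contained sketch. The layer sizes, channel-level masking scheme, and random adjacency are our assumptions, not the paper's exact configuration.

```python
# Minimal sketch of GCN + masked-reconstruction pretraining over EEG channels.
# Architecture details are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, h, a_hat):
        return torch.relu(a_hat @ self.lin(h))

def normalize_adj(a):
    """Symmetric normalization A_hat = D^-1/2 (A + I) D^-1/2."""
    a = a + torch.eye(a.size(0))
    d = a.sum(-1).rsqrt().diag()
    return d @ a @ d

class MaskedGraphAE(nn.Module):
    def __init__(self, n_times=256, hidden=128):
        super().__init__()
        self.enc1 = GCNLayer(n_times, hidden)
        self.enc2 = GCNLayer(hidden, hidden)
        self.dec = nn.Linear(hidden, n_times)

    def forward(self, x, a_hat, mask_ratio=0.5):
        # x: (B, C, T) raw EEG windows; zero out whole channels, reconstruct them.
        mask = torch.rand(x.size(1)) < mask_ratio
        if not mask.any():
            mask[0] = True                        # ensure something is masked
        x_in = x.clone()
        x_in[:, mask] = 0.0
        z = self.enc2(self.enc1(x_in, a_hat), a_hat)
        recon = self.dec(z)
        return ((recon[:, mask] - x[:, mask]) ** 2).mean()  # masked-only loss

# Toy usage: 8 channels; adjacency would come from e.g. electrode proximity.
a_hat = normalize_adj((torch.rand(8, 8) > 0.5).float())
loss = MaskedGraphAE()(torch.randn(4, 8, 256), a_hat)
loss.backward()
```

The point of the graph term is that reconstructing a masked channel forces the model to exploit its neighbours, which is exactly the inter-channel structure the abstract argues temporal-only EEG foundation models miss.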
Rishi Bommasani, Kevin Klyman, Sayash Kapoor
et al.
Foundation models are increasingly consequential yet extremely opaque. To characterize the status quo, the Foundation Model Transparency Index (FMTI) was launched in October 2023 to measure the transparency of leading foundation model developers. FMTI 2023 assessed 10 major foundation model developers (e.g. OpenAI, Google) on 100 transparency indicators (e.g. does the developer disclose the wages it pays for data labor?). At the time, developers publicly disclosed very limited information, with the average score being 37 out of 100. To understand how the status quo has changed, we conduct a follow-up study after 6 months: we score 14 developers against the same 100 indicators. While in FMTI 2023 we searched for publicly available information, in FMTI 2024 developers submit reports on the 100 transparency indicators, potentially including information that was not previously public. We find that developers now score 58 out of 100 on average, a 21-point improvement over FMTI 2023. Much of this increase is driven by developers disclosing information during the FMTI 2024 process: on average, developers disclosed information related to 16.6 indicators that was not previously public. We observe regions of sustained (i.e. across 2023 and 2024) and systemic (i.e. across most or all developers) opacity, such as on copyright status, data access, data labor, and downstream impact. We publish transparency reports for each developer that consolidate information disclosures; these reports are based on the information disclosed to us by developers. Our findings demonstrate that transparency can be improved in this nascent ecosystem, that the Foundation Model Transparency Index likely contributes to these improvements, and that policymakers should consider interventions in areas where transparency has not improved.
This book introduces the theoretical foundations of FMCW radar systems, including range and velocity estimation, signal processing techniques, and the generation of radar point clouds. A detailed discussion of Python and MATLAB as the primary programming tools for radar signal processing is provided, including the integration of libraries like NumPy, Matplotlib, and SciPy for data analysis and visualization. In addition, the book covers advanced techniques such as deep learning applications for radar signal processing, focusing on Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Transformers for analyzing radar data. Furthermore, it highlights state-of-the-art methods for human activity recognition using radar, leveraging a combination of traditional signal processing techniques and machine learning models. The book is designed to cater to both beginners and experts in radar signal processing, offering practical examples, code implementations, and insights into the future of radar technology in various domains, including autonomous systems and security applications.
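In the spirit of the book's NumPy-based examples, here is a minimal sketch of range estimation from the beat-frequency FFT of a single FMCW chirp. The radar parameters and simulated target are invented for demonstration.

```python
# Illustrative FMCW range estimation via the beat-frequency FFT.
# Parameters are made-up demo values, not tied to any specific radar.
import numpy as np

c = 3e8          # speed of light (m/s)
B = 150e6        # chirp bandwidth (Hz)
T_chirp = 40e-6  # chirp duration (s)
fs = 2e6         # ADC sample rate (Hz)
R_true = 30.0    # simulated target range (m)

# Beat frequency for a static target: f_b = 2 * R * B / (c * T_chirp)
f_b = 2 * R_true * B / (c * T_chirp)
t = np.arange(int(fs * T_chirp)) / fs
beat = np.cos(2 * np.pi * f_b * t) + 0.1 * np.random.randn(t.size)

# Range FFT: peak bin -> beat frequency -> range
spec = np.abs(np.fft.rfft(beat * np.hanning(t.size)))
freqs = np.fft.rfftfreq(t.size, d=1 / fs)
R_est = freqs[np.argmax(spec)] * c * T_chirp / (2 * B)
print(f"estimated range: {R_est:.1f} m")  # ~30 m (within one range bin)
```

Velocity estimation follows the same pattern with a second FFT across chirps (the Doppler dimension), which is how the range-Doppler maps fed to the CNN/LSTM/Transformer models discussed in the book are produced.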
Lorenzo Zangari, Candida M. Greco, Davide Picca
et al.
Moral values have deep roots in early civilizations, codified within norms and laws that regulated societal order and the common good. They play a crucial role in understanding the psychological basis of human behavior and cultural orientation. The Moral Foundation Theory (MFT) is a well-established framework that identifies the core moral foundations underlying the manner in which different cultures shape individual and social lives. Recent advancements in natural language processing, particularly Pre-trained Language Models (PLMs), have enabled the extraction and analysis of moral dimensions from textual data. This survey presents a comprehensive review of MFT-informed PLMs, providing an analysis of moral tendencies in PLMs and their application in the context of the MFT. We also review relevant datasets and lexicons and discuss trends, limitations, and future directions. By providing a structured overview of the intersection between PLMs and MFT, this work bridges moral psychology insights within the realm of PLMs, paving the way for further research and development in creating morally aware AI systems.
Björn Schuller, Adria Mallol-Ragolta, Alejandro Peña Almansa
et al.
The dawn of Foundation Models has, on the one hand, revolutionised a wide range of research problems and, on the other, democratised access to and use of AI-based tools by the general public. We even observe an incursion of these models into disciplines related to human psychology, such as the Affective Computing domain, suggesting their emerging affective capabilities. In this work, we aim to raise awareness of the power of Foundation Models in the field of Affective Computing by synthetically generating and analysing multimodal affective data, focusing on vision, linguistics, and speech (acoustics). We also discuss some fundamental problems, such as ethical issues and regulatory aspects, related to the use of Foundation Models in this research area.
The bulk of the particulate matter (PM) emissions generated during construction projects is released during the earthwork and foundation stages. To reduce and control these emissions, reliable data on their characteristics are needed. However, construction PM is poorly characterized because its composition depends on several factors (e.g., weather and reduction measures) and on various on-site activities whose effects may interact. To address these challenges, a long-term quantitative empirical study using advanced statistical methods was performed on a real construction project over the whole of the earthwork and foundation stages. The upwind-downwind method was used to collect data on PM emissions throughout the earthwork and foundation construction process, and correlation analysis, paired-samples t-tests, and partial least squares (PLS) regression were used to analyze TSP, PM10, and PM2.5 emissions and their relationships with various influencing factors. The results showed that both the earthwork and foundation stages generate substantial PM emissions, as statistically significant differences were found between the PM levels measured upwind and downwind of the construction site. TSP and PM10 emissions correlated moderately with humidity and wind speed. However, temperature and atmospheric pressure did not correlate significantly with any of the measured emissions. The main activities responsible for PM emissions during the earthwork and foundation construction stages were hammer piling, waste stacking, and materials transportation. Water spraying was found to effectively reduce TSP and PM10 emissions, while the use of a fog cannon more effectively reduced PM2.5 emissions. Construction PM is an important source of atmospheric pollution in cities; the findings presented herein provide foundational knowledge to guide efforts for reducing its impact.
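As an illustration of the statistical pipeline described above, the sketch below fits a partial least squares regression relating a PM response to candidate factors using scikit-learn. The predictors and data are invented placeholders, not the study's measurements.

```python
# Illustrative PLS regression of PM concentration on candidate factors.
# Data and coefficients are invented placeholders for demonstration.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
n = 200
# Predictors: humidity (%), wind speed (m/s), activity intensity (a.u.)
X = np.column_stack([
    rng.uniform(20, 90, n),
    rng.uniform(0, 8, n),
    rng.uniform(0, 1, n),
])
# Response: TSP concentration with a made-up linear dependence plus noise
y = 120 - 0.6 * X[:, 0] + 9.0 * X[:, 1] + 40.0 * X[:, 2] + rng.normal(0, 5, n)

pls = PLSRegression(n_components=2)
pls.fit(X, y)
print("R^2:", round(pls.score(X, y), 3))
print("coefficients:", np.round(pls.coef_.ravel(), 2))
```

PLS is a natural choice in this setting because the meteorological and activity predictors are correlated with one another, which destabilizes ordinary least squares.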
Since its inception more than 2 decades ago, proton-transfer-reaction mass spectrometry (PTR-MS) has established itself as a powerful technique for the measurement of a wide range of volatile organic compounds (VOCs) with high time resolution and low detection limits and without the need for any sample pre-treatment. As this technology has matured and its application become more widespread, there is a growing need for accurate and traceable calibration to ensure measurement comparability. As a result of the large number of VOCs detectable with PTR-MS, it is impractical to have a calibration standard or standards that cover all observable compounds. However, recent work has demonstrated that quantitative measurements of uncalibrated compounds are possible provided that the transmission curve is accurately constrained. To enable this, a novel traceable multi-component gas reference material containing 20 compounds spanning a mass range of 32 to 671 has been developed. The development and compositional evolution of this reference material are described, along with an evaluation of its accuracy and stability. This work demonstrates that for the majority of components the accuracy is < 5 % (most < 3 %; < 10 % for hexamethylcyclotrisiloxane (D3-siloxane) and 1,2,4-trichlorobenzene – 1,2,4-TCB) with stabilities of > 2 years (> 1 year for acetonitrile, methanol and perfluorotributylamine – PFTBA).
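As a loose illustration of the transmission-curve idea (our simplification: real PTR-MS quantification also involves reaction rate constants and drift-tube conditions), one can fit a smooth curve through the relative transmission of calibrated ions and interpolate it at an uncalibrated m/z:

```python
# Illustrative transmission-curve interpolation. All numbers are invented
# placeholders; this is a sketch of the concept, not a validated procedure.
import numpy as np

# (m/z, relative transmission) for calibrated compounds (made-up values)
mz_cal = np.array([33, 42, 59, 79, 107, 137, 181, 223])
trans_cal = np.array([0.35, 0.55, 0.80, 0.95, 1.00, 0.97, 0.90, 0.82])

# Smooth fit in log(m/z); a low-order polynomial is one simple choice.
coeffs = np.polyfit(np.log(mz_cal), trans_cal, deg=3)
transmission = lambda mz: np.polyval(coeffs, np.log(mz))

# An uncalibrated ion's sensitivity can then be scaled by the curve:
mz_unknown = 93.0
print(f"interpolated transmission at m/z {mz_unknown}: "
      f"{transmission(mz_unknown):.2f}")
```

A multi-component reference material spanning m/z 32 to 671, as described above, is what anchors such a curve across the instrument's full mass range.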
The Global Navigation Satellite System (GNSS) radio occultation (RO) technique has proven to be an effective tool for Earth atmosphere profiling. Traditional spaceborne RO satellite constellations are expensive, with relatively low sampling density for specific regions of interest. In contrast, in-atmosphere RO platforms can provide much higher spatial and temporal sampling of ROs around regional weather events. This study explores the capability of a low-cost and scalable commercial off-the-shelf (COTS) GNSS receiver on board high-altitude balloons. The refractivity retrievals from balloon-borne RO payloads obtained from two flight campaigns (World View and ZPM-1) are presented. The balloon-borne RO soundings from the World View campaign show refractivity profiles between 6 and 19 km, with overall near-zero median difference from colocated ECMWF ERA5 reanalysis data and variability comparable to spaceborne RO missions (∼ 2.3 % median absolute deviation or MAD). Soundings from the ZPM-1 campaign show a relatively large positive refractivity bias (∼ 2.5 %). In summary, low-cost COTS RO payloads on board balloon platforms are worth further engineering and study in order to provide capabilities for dense, targeted atmospheric soundings that can improve regional weather forecasts via data assimilation.
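For reference, the refractivity compared in such retrievals is conventionally linked to atmospheric state through the Smith–Weintraub relation (standard form, not restated in the abstract):

```latex
% Smith--Weintraub relation: refractivity N from pressure p (hPa),
% temperature T (K), and water-vapour partial pressure e (hPa).
N = 77.6\,\frac{p}{T} + 3.73\times 10^{5}\,\frac{e}{T^{2}}
```

The dry first term dominates in the upper troposphere, while the moist second term makes lower-tropospheric refractivity sensitive to humidity, which is where reanalysis comparisons like those above are most informative.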
Heejong Kim, Victor Ion Butoi, Adrian V. Dalca
et al.
Most state-of-the-art techniques for medical image segmentation rely on deep-learning models. These models, however, are often trained on narrowly defined tasks in a supervised fashion, which requires expensive labeled datasets. Recent advances in several machine learning domains, such as natural language generation, have demonstrated the feasibility and utility of building foundation models that can be customized for various downstream tasks with little to no labeled data. This likely represents a paradigm shift for medical imaging, where we expect that foundation models may shape the future of the field. In this paper, we consider a recently developed foundation model for medical image segmentation, UniverSeg. We conduct an empirical evaluation study in the context of prostate imaging and compare it against the conventional approach of training a task-specific segmentation model. Our results and discussion highlight several important factors that will likely be important in the development and adoption of foundation models for medical image segmentation.
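For concreteness, the kind of overlap metric typically used in such comparisons is the Dice coefficient; below is a minimal sketch (illustrative, not the paper's evaluation harness).

```python
# Minimal Dice-overlap sketch for comparing predicted and reference masks.
import numpy as np

def dice(pred, gt, eps=1e-7):
    """Dice coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

# Toy example with two independent random masks:
rng = np.random.default_rng(0)
a, b = rng.random((64, 64)) > 0.5, rng.random((64, 64)) > 0.5
print(round(float(dice(a, b)), 3))  # ~0.5 for independent random masks
```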
Organizations typically train large models individually. This is costly and time-consuming, particularly for large-scale foundation models. Such vertical production is known to be suboptimal. Inspired by this economic insight, we ask whether it is possible to leverage others' expertise by trading the constituent parts in models, i.e., sets of weights, as if they were market commodities. While recent advances in aligning and interpolating models suggest that doing so may be possible, a number of fundamental questions must be answered to create viable parameter markets. In this work, we address these basic questions, propose a framework containing the infrastructure necessary for market operations to take place, study strategies for exchanging parameters, and offer means for agents to monetize parameters. Excitingly, compared to agents who train siloed models from scratch, we show that it is possible to mutually gain by using the market, even in competitive settings. This suggests that the notion of parameter markets may be a useful paradigm for improving large-scale model training in the future.
Vision foundation models exhibit impressive power, benefiting from extremely large model capacity and broad training data. In practice, however, downstream scenarios may only support a small model due to limited computational resources or efficiency considerations. Moreover, the data used for pretraining foundation models are usually invisible and very different from the target data of downstream tasks. This brings a critical challenge for the real-world application of foundation models: one has to transfer the knowledge of a foundation model to a downstream task whose architecture is quite different, using only the downstream target data. Existing transfer learning or knowledge distillation methods depend on either the same model structure or fine-tuning of the foundation model, so naively introducing these methods can be either infeasible or very inefficient. To address this, we propose a Task-Driven Model Reprogramming (TDMR) framework. Specifically, we reprogram the foundation model to project its knowledge into a proxy space, which alleviates the adverse effects of task mismatch and domain inconsistency. Then, we reprogram the target model via progressive distillation from the proxy space to efficiently learn the knowledge of the reprogrammed foundation model. TDMR is compatible with different pre-trained model types (CNN, transformer, or their mix) and limited target data, and promotes the wide application of vision foundation models to downstream tasks in a cost-effective manner. Extensive experiments on different downstream classification tasks and target model structures demonstrate the effectiveness of our methods with both CNN and transformer foundation models.
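A hedged sketch of the proxy-space idea follows, with stand-in linear modules in place of real backbones; every shape and loss term here is our illustrative assumption, not the paper's TDMR implementation.

```python
# Sketch: a frozen teacher's features are mapped into a proxy space by a
# learned head, and the student is trained to match that space while also
# solving the downstream task. Stand-in modules only; not the TDMR code.
import torch
import torch.nn as nn

teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512))  # frozen stand-in
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
to_proxy_t = nn.Linear(512, 64)   # maps teacher features -> proxy space
to_proxy_s = nn.Linear(128, 64)   # maps student features -> proxy space
classifier = nn.Linear(128, 10)

for p in teacher.parameters():
    p.requires_grad_(False)

params = [*student.parameters(), *to_proxy_t.parameters(),
          *to_proxy_s.parameters(), *classifier.parameters()]
opt = torch.optim.Adam(params, lr=1e-3)
ce = nn.CrossEntropyLoss()

x = torch.randn(16, 3, 32, 32)          # toy downstream batch
y = torch.randint(0, 10, (16,))

feat_s = student(x)
with torch.no_grad():
    feat_t = teacher(x)
# Distill in the proxy space while supervising on the (small) target data.
loss = ce(classifier(feat_s), y) + \
       nn.functional.mse_loss(to_proxy_s(feat_s), to_proxy_t(feat_t))
loss.backward()
opt.step()
```

The design point the abstract emphasizes shows up here: because matching happens in a learned proxy space rather than in either model's native feature space, the teacher and student are free to have entirely different architectures.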