Recent LLM-based data agents aim to automate data science tasks ranging from data analysis to deep learning. However, the open-ended nature of real-world data science problems, which often span multiple taxonomies and lack standard answers, poses a significant challenge for evaluation. To address this, we introduce DSAEval, a benchmark comprising 641 real-world data science problems grounded in 285 diverse datasets, covering both structured and unstructured data (e.g., vision and text). DSAEval incorporates three distinctive features: (1) Multimodal Environment Perception, which enables agents to interpret observations from multiple modalities including text and vision; (2) Multi-Query Interactions, which mirror the iterative and cumulative nature of real-world data science projects; and (3) Multi-Dimensional Evaluation, which provides a holistic assessment across reasoning, code, and results. We systematically evaluate 11 advanced agentic LLMs using DSAEval. Our results show that Claude-Sonnet-4.5 achieves the strongest overall performance, GPT-5.2 is the most efficient, and MiMo-V2-Flash is the most cost-effective. We further demonstrate that multimodal perception consistently improves performance on vision-related tasks, with gains ranging from 2.04% to 11.30%. Overall, while current data science agents perform well on structured data and routine data analysis workflows, substantial challenges remain in unstructured domains. Finally, we offer critical insights and outline future research directions to advance the development of data science agents.
Jorge Martinez-Palomera, Amy Tuson, TESS Science Support Center
3I/ATLAS is the third known interstellar object to pass through our Solar System. NASA's Transiting Exoplanet Survey Satellite (TESS) made dedicated observations of 3I/ATLAS between 15 -- 22 January 2026 (Sector 1751), capturing high-cadence data at 200 s and 20 s cadence. We present two High Level Science Products (HLSPs): (1) comet-centered image time series, corrected for background scattered light and stars; and (2) aperture light curves extracted from the corrected images. We created these data products from the official TESS products, and they are publicly available at the Mikulski Archive for Space Telescopes (MAST). TESS's high-precision, near-continuous photometry will provide unique insights into the comet's activity following its closest approach to the Sun. The TESS Science Support Center (TSSC) has created these data products to facilitate scientific analyses by the TESS and Solar System communities.
Abstract Background Non-contrast CT (NCCT) is first-line imaging for suspected acute ischemic stroke (AIS) but has limited early sensitivity; deep learning (DL) may improve patient-level detection. Objectives To estimate the diagnostic accuracy of DL applied to NCCT for patient-level AIS detection and to examine prespecified sources of between-study heterogeneity. Methods We searched MEDLINE, Embase, and Web of Science (January 2010–May 2025). Eligible prospective or retrospective diagnostic studies evaluated DL on NCCT against an appropriate reference standard and reported (or allowed reconstruction of) patient-level 2 × 2 data. Two-gate case–control and lesion-only reports were excluded. Dual reviewers screened/extracted data; risk of bias was assessed with QUADAS-2, and AI-reporting against items adapted from STARD-AI/CLAIM/CONSORT-AI. Bivariate random-effects/HSROC models summarized sensitivity and specificity. Prespecified moderators were posterior-fossa inclusion, reference-standard robustness, and validation type. Sensitivity analyses included external-only cohorts, robust standards, posterior-fossa inclusion, and a “Direct AIS” construct subset. Results Of 1,899 records, 16 studies met inclusion; 13 contributed patient-level data to meta-analysis. Summary sensitivity was 0.91 (95% CI, 0.81–0.96) and specificity 0.90 (0.85–0.94). Sensitivity was lower for externally validated models than internally validated ones (0.82 [0.67–0.91] vs. 0.95 [0.89–0.98]) with similar specificity (0.88 [0.83–0.92] vs. 0.93 [0.82–0.97]). Findings were directionally robust across sensitivity analyses. QUADAS-2 frequently indicated concerns in patient selection and index-test domains; AI-reporting quality was mostly moderate, and explicit external validation remained uncommon. Conclusions DL applied to NCCT shows high accuracy for patient-level AIS detection. 
However, generalizability is the principal gap; broader external validation and guideline-concordant reporting are needed to support safe clinical adoption.
Artificial Intelligence (AI) is rapidly transforming engineering fields, from robotics to aerospace, with applications in control systems for UAVs and satellites. This work builds on a previously developed AI attitude controller for the InnoCube 3U nanosatellite. Deploying complex Neural Networks (NNs) on resource-limited microcontrollers presents a significant challenge. To overcome this, we propose distilling a Multi-Layer Perceptron (MLP) trained with Deep Reinforcement Learning (DRL) for attitude control into a Kolmogorov–Arnold Network (KAN). We convert this numeric KAN into a symbolic KAN, where each edge represents a learnable mathematical function, and finally extract a concise symbolic formula. This symbolic representation dramatically reduces memory usage and computational complexity, making it ideal for pico- and nanosatellites. We evaluate and demonstrate the feasibility of this approach for inertial pointing with reaction wheels in simulation using a realistic model of the InnoCube satellite. Our results show that the highly compressed KANs successfully solve the attitude control problem, while reducing the required memory footprint and inference time on the InnoCube ADCS hardware by over an order of magnitude. Beyond attitude control, we believe symbolic KANs hold great potential in aerospace for neural network compression and interpretable, data-driven modeling and system identification in future space missions.
Abstract The application of the $$\Phi$$ -OTDR (Phase-Optical Time Domain Reflectometry) system in real-time monitoring of power grid infrastructure has been proven effective in identifying and classifying various anomalies, such as digging, watering, and shaking. However, previous deep learning-based methods for $$\Phi$$ -OTDR event classification are primarily designed for balanced classification problems, where the numbers of abnormal and normal event samples are relatively equal. In practical scenarios, the data for abnormal events are often much scarcer than those for normal events (noise), resulting in a long-tailed distribution problem that poses significant challenges for accurate classification. To address this long-tailed imbalance issue in the practical application of $$\Phi$$ -OTDR data, we introduce the Controllable Diffusion (ConDiff) framework, which aims to generate high-quality synthetic samples for abnormal situations. The ConDiff framework is composed of three essential components: the Feedback-guided $$\Phi$$ -OTDR Augmenter, the High-Quality Sample Selection module, and the Dynamic Threshold Adjustment module. The Feedback-guided $$\Phi$$ -OTDR Augmenter utilizes a diffusion model to generate synthetic samples that simulate abnormal events. The High-Quality Sample Selection module evaluates the quality of the generated synthetic samples and selects high-quality ones. The Dynamic Threshold Adjustment module provides real-time feedback to dynamically control the sample generation process of the Feedback-guided $$\Phi$$ -OTDR Augmenter. Compared to current state-of-the-art baselines, our proposed ConDiff framework achieves a notable improvement in classification accuracy, with an increase ranging from 3.7% to 7.2% on the BJTU-OTDR-LT dataset. This improvement demonstrates the effectiveness of the proposed ConDiff framework in addressing the long-tailed imbalance problem in $$\Phi$$ -OTDR event classification. The code will be released upon acceptance.
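The generate-select-adjust loop described in this abstract can be illustrated with a minimal, purely hypothetical sketch. Nothing here reproduces the paper's actual components: `generate_sample` returns random stand-ins for the diffusion augmenter's outputs, and the scalar `quality` score stands in for the selection module's judgment. Only the overall pattern, filtering samples against a quality threshold and adjusting that threshold from feedback on the acceptance rate, follows the ideas named above.

```python
import random

random.seed(0)

def generate_sample():
    """Stand-in for the diffusion-based augmenter: returns a synthetic
    sample with a hypothetical quality score in [0, 1]."""
    return {"signal": [random.gauss(0, 1) for _ in range(8)],
            "quality": random.random()}

def condiff_round(n_candidates, threshold, target_accept=0.5, step=0.05):
    """One generation round: keep samples above the quality threshold,
    then nudge the threshold toward a target acceptance rate."""
    candidates = [generate_sample() for _ in range(n_candidates)]
    accepted = [s for s in candidates if s["quality"] >= threshold]
    accept_rate = len(accepted) / n_candidates
    # Dynamic threshold adjustment: loosen if too few pass, tighten if too many.
    if accept_rate < target_accept:
        threshold -= step
    else:
        threshold += step
    return accepted, threshold

threshold = 0.8
pool = []
for _ in range(10):
    accepted, threshold = condiff_round(100, threshold)
    pool.extend(accepted)
```

Starting from a strict threshold, the loop relaxes it until the acceptance rate settles near the target, which is the kind of self-regulating generation the Dynamic Threshold Adjustment module is described as providing.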
<p>Air pollution adversely affects health, ecosystems, and infrastructure. In the <i>Western Balkans</i> (Albania, Bosnia and Herzegovina, Kosovo<span class="note-anchor" id="fna_Ch1.Footn1"><a href="#fn_Ch1.Footn1"><sup>1</sup></a></span>, Montenegro, the Republic of North Macedonia, and Serbia), air pollution is more severe than in the European Union in general. Understanding the air quality situation requires high-quality emission data with high-resolution spatial distribution, especially for enabling remediation efforts; such data are lacking in the Western Balkan region.</p>
<p>In this work, we have calculated air pollution emissions from the heating of individual housing units in the Western Balkan region. The basis for the dataset is a geographical dataset of buildings detected from satellite imagery by artificial intelligence (AI) methods. The building data have been combined with geospatial land-use datasets and statistical data for heating needs for residential buildings in the countries included and finally with emission factors to calculate the heating emissions.</p>
<p>Using this novel approach, the resulting datasets provide high-resolution heating emission data for common pollutants and are published as open data (<a href="https://doi.org/10.5281/zenodo.13906810">https://doi.org/10.5281/zenodo.13906810</a>, <span class="cit" id="xref_altparen.1"><a href="#bib1.bibx2">Asker</a>, <a href="#bib1.bibx2">2024</a></span>). When comparing national totals for emissions, the datasets in this work are comparable to other, spatially coarser datasets, though the agreement strongly depends on the fuel usage data for each country/region.</p>
Hyperspectral anomaly detection (HAD) is challenging, especially when anomalies are presented in sub-pixel form. The spectral signatures of anomalies in mixed pixels are blended with those of the background, making anomalies difficult to distinguish from the background. Most existing methods detect sub-pixel targets in abundance space via spectral unmixing. However, since abundance feature extraction and anomaly detection are decoupled, the learned features are not well suited for the subsequent detection. Moreover, these methods neglect the negative effect of anomalies on spectral unmixing, which degrades detection performance. To tackle these problems, we propose a cascaded autoencoder (AE) unmixing network for HAD. First, based on the observation that anomalies have larger spectral reconstruction errors than the background, a background estimation approach is proposed to alleviate the negative effect of anomalies on spectral unmixing. Second, a cascaded AE is designed to perform spectral unmixing on the estimated background, simultaneously obtaining the endmembers and abundance vectors. Third, a deep Gaussian mixture model is leveraged to estimate the density distributions of spectral features, since anomalies usually lie in low-density areas. In this way, spectral unmixing and detection are jointly optimized within a unified detection framework. Experimental results demonstrate that our method achieves superior detection performance to existing state-of-the-art HAD methods.
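As a loose illustration of the first step, reconstruction-error-based background estimation can be sketched in plain Python. This is not the paper's cascaded AE: the per-band mean spectrum is a deliberately crude stand-in for a learned reconstruction, and the pixel data are synthetic. Only the core idea follows the abstract: pixels with large reconstruction error are excluded from the estimated background before unmixing.

```python
import random
import statistics

random.seed(1)

N_BANDS = 5

# Synthetic scene: 50 background pixels near a flat spectrum of 1.0,
# plus 2 anomalous pixels near 5.0 (hypothetical values for illustration).
background = [[random.gauss(1.0, 0.05) for _ in range(N_BANDS)] for _ in range(50)]
anomalies = [[random.gauss(5.0, 0.05) for _ in range(N_BANDS)] for _ in range(2)]
pixels = background + anomalies

# Crude stand-in for an autoencoder: "reconstruct" every pixel as the
# per-band mean spectrum, so typical background pixels reconstruct well.
mean_spectrum = [statistics.fmean(p[b] for p in pixels) for b in range(N_BANDS)]

def recon_error(pixel):
    """Squared reconstruction error against the mean spectrum."""
    return sum((x - m) ** 2 for x, m in zip(pixel, mean_spectrum))

errors = [recon_error(p) for p in pixels]

# Background estimation: keep the ~90% of pixels with the smallest
# reconstruction error; unmixing would then be fit on this
# anomaly-suppressed set.
cutoff = sorted(errors)[int(0.9 * len(errors))]
estimated_background = [p for p, e in zip(pixels, errors) if e < cutoff]
```

Because the two anomalous pixels reconstruct poorly against the scene-wide mean, they fall above the cutoff and are removed, leaving an anomaly-suppressed background set.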
Wan Hussain Wan Ishak, Fadhilah Yamin, Siti Sarah Maidin
et al.
Rainfall data is essential for applications such as climate monitoring, agricultural planning, flood forecasting, and water resource management. However, the interpretation of this data is often hindered by its high volume, variability, and multi-scale temporal nature. Effective visualization is critical not only for summarizing complex datasets but also for uncovering patterns, detecting anomalies, and facilitating informed decision-making. Despite the availability of numerous visualization techniques, selecting the most suitable method for rainfall data, especially across varying temporal resolutions, is a challenging task. This study presents a comparative analysis of widely used data visualization techniques in the context of rainfall data. The methodology was structured into three phases: understanding the nature of rainfall data, reviewing relevant visualization techniques, and conducting a comparative content analysis. A SWOT (Strengths, Weaknesses, Opportunities, and Threats) evaluation was used to assess each technique’s analytical potential, while a temporal suitability comparison was performed across five time granularities: yearly, monthly, weekly, daily, and hourly. Findings show that no single technique is universally effective. Instead, each method demonstrates specific strengths and limitations depending on the temporal scale and analytical objective. Line charts and bar charts are well-suited for lower-frequency data, while heat maps and scatter plots are more effective for high-resolution, time-sensitive patterns. Box plots and histograms provide valuable insights into data distribution and variability, whereas map-based visualizations excel in spatial analysis but require enhancements for temporal exploration. The study concludes that visualization effectiveness depends on aligning method selection with data characteristics and analytical goals.
A thoughtful combination of techniques is often necessary to achieve clarity, reduce misinterpretation, and enhance decision support in rainfall data analysis.
Scientific research faces high costs and inefficiencies with traditional methods, but the rise of deep learning and large language models (LLMs) offers innovative solutions. This survey reviews transformer-based LLM applications across scientific fields such as biology, medicine, chemistry, and meteorology, underscoring their role in advancing research. However, the continuous expansion of model size has led to significant memory demands, hindering further development and application of LLMs for science. This survey systematically reviews and categorizes memory-efficient pre-training techniques for large-scale transformers, including algorithm-level, system-level, and hardware-software co-optimization. Using AlphaFold 2 as an example, we demonstrate how tailored memory optimization methods can reduce storage needs while preserving prediction accuracy. By bridging model efficiency and scientific application needs, we hope to provide insights for scalable and cost-effective LLM training in AI for science.
Grace Wolf-Chase, Charles Kerton, Kathryn Devine
et al.
We review participatory science programs that have contributed to the understanding of star formation. The Milky Way Project (MWP), one of the earliest participatory science projects launched on the Zooniverse platform, produced the largest catalog to date of "bubbles" associated with feedback from hot young stars, and enabled the identification of a new class of compact star-forming regions (SFRs) known as "yellowballs" (YBs). The analysis of YBs through their infrared colors and catalog cross-matching led to the discovery that YBs are compact photodissociation regions generated by intermediate- and high-mass young stellar objects embedded in clumps that range in mass from 10 - 10,000 solar masses and in luminosity from 10 - 1,000,000 solar luminosities. The MIRION catalog, assembled from 6176 YBs identified by citizen scientists, increases the number of candidate intermediate-mass SFRs by nearly two orders of magnitude. Ongoing work utilizing data from the Spitzer, Herschel, and WISE missions involves analyzing infrared color trends to predict physical properties and ages of YB environments. Methods include applying summary statistics to histograms and color-color plots, as well as SED fitting. Students in introductory astronomy classes contribute toward continued efforts to refine photometric measurements of YBs while learning fundamental concepts in astronomy through a classroom-based participatory science experience, the PERYSCOPE project. We also describe an initiative that engaged seminaries, family groups, and interfaith communities in a wide variety of science projects on the Zooniverse platform. This initiative produced important guidance on attracting audiences that are underserved, underrepresented, or apprehensive about science.
Abstract Background Academic publishing is a cornerstone of scholarly communications, yet it is unfortunately open to abuse, having given rise to ‘predatory publishers’: groups that employ aggressive marketing tactics, are deficient in methods and ethics, and bypass peer review. Preventing these predatory publishers from infiltrating scholarly activity is of high importance, and students must be trained in this area to increase awareness and reduce use. The scope of this issue in the context of medical students remains unknown, and therefore this review sought to examine the breadth of the current literature base. Methods A rapid scoping review was undertaken, adhering to adapted PRISMA guidelines. Six databases (ASSIA, EBSCO, Ovid, PubMed, Scopus, Web of Science) were systematically searched for content related to predatory publishing and medical students. Results were single-screened, facilitated by online reviewing software. Resultant data were narratively described, with common themes identified. Results After searching and screening, five studies were included, representing a total of 1338 students. Two predominant themes were identified: understanding, and utilisation, of predatory publishers. These themes revealed that medical students were broadly unaware of the issue of predatory publishing, and that a small number had already used, or would consider using, their services. Conclusion There remains a lack of understanding among medical students of the threat that predatory publishers pose. Future research and education in this domain will need to focus on informing medical students about the issue and the implications of engaging with predatory publishers.
It is becoming increasingly important that physics educators equip their students with the skills to work with data effectively. However, many educators may lack the necessary training and expertise in data science to teach these skills. To address this gap, we created the Data Science Education Community of Practice (DSECOP), bringing together graduate students and physics educators from different institutions and backgrounds to share best practices and lessons learned from integrating data science into undergraduate physics education. In this article we present insights and experiences from this community of practice, highlighting key strategies and challenges in incorporating data science into the introductory physics curriculum. Our goal is to provide guidance and inspiration to educators who seek to integrate data science into their teaching, helping to prepare the next generation of physicists for a data-driven world.
Abstract. Quality data remain elusive while data access freedoms disappear. Serious mismatches between data availability and human need should attract societal attention.
Background: Disruption of the mirror neuron system in children on the autism spectrum has been examined by researchers in psychology and motor behavior science. Methods based on the mirror neuron system, such as cooperative observational learning, can be used to increase imitation in children with autism and can be applied in teaching. However, few studies have investigated the effect of this educational method on the learning of children with autism.
Aims: The purpose of this study was to investigate the effect of observational learning through individual and dyad training on mirror neuron function and the learning of movement skills in children with autism spectrum disorder.
Methods: This semi-experimental study was designed and implemented in 2017. The statistical population comprised 7- to 14-year-old boys on the autism spectrum (Asperger's and high-functioning) in Tehran, selected through purposive sampling. The participants were allocated to individual and dyad training groups and trained in breaststroke skills twice a week for a period of 8 weeks. The data collection tools included the GARS Autism Severity Scale (1994), the chest crawl skill assessment checklist (Gallahue and Ozmun, 2005), and an electroencephalograph (EEG) sensor used to record brain waves. Data were analyzed using repeated-measures analysis of covariance in SPSS.
Results: The covariance analysis showed that both groups improved in mirror neuron function, but the dyad training group improved significantly more at post-test; the difference between the two groups was significantly in favor of the dyad training group (p < 0.001). The quality of the swimming skill (chest) also improved significantly in the dyad training group, and the between-group difference in swimming skill improvement was significantly in favor of the dyad group (p < 0.001).
Conclusion: Training through observational learning, imitation, and observational feedback improves the activity of mirror neurons, which is effective in improving motor actions. Accordingly, one suggestion of this study is to focus on enhancing the functional capabilities of mirror neurons through observational learning in order to improve motor skills in children with autism and support neuropsychiatric rehabilitation.
In the present paper, using a variational approach, we discuss Neumann problems with a $p(x)$-Laplacian-like operator and nonstandard growth condition, originating from capillary phenomena. By using the least action principle and the fountain theorem, we prove the existence and multiplicity of solutions to this class of Neumann problems under suitable assumptions.
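For concreteness, one operator commonly called "$p(x)$-Laplacian-like" in the capillarity literature (an assumption here; the abstract does not spell out the exact form) leads to Neumann problems of the following shape, where the boundary condition is the vanishing of the conormal derivative:

```latex
% Capillary-type p(x)-Laplacian-like Neumann problem (assumed form):
\begin{aligned}
-\operatorname{div}\!\left(\left(1+\frac{|\nabla u|^{p(x)}}
  {\sqrt{1+|\nabla u|^{2p(x)}}}\right)|\nabla u|^{p(x)-2}\nabla u\right)
  &= f(x,u) && \text{in } \Omega,\\
\left(1+\frac{|\nabla u|^{p(x)}}{\sqrt{1+|\nabla u|^{2p(x)}}}\right)
  |\nabla u|^{p(x)-2}\,\frac{\partial u}{\partial \nu} &= 0 && \text{on } \partial\Omega.
\end{aligned}
```

The extra factor in parentheses, which tends to 2 for large gradients and to 1 for small ones, is what distinguishes this capillarity-driven operator from the plain $p(x)$-Laplacian.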
Heni Siswantari, Lovandri Dwanda Putra, Fatih Ridlwan Munier
et al.
Sufi dance is considered one of the most Islamic dances and represents spiritual values. One of the Islamic boarding schools that still develops Sufi dance is the Maulana Rumi Islamic Boarding School, Yogyakarta. The method used is qualitative, with a performance studies approach to Sufi dance. Data collection techniques were observation, interviews, and documentation. Survey data were obtained using a multidisciplinary approach: art-social science and art-religion. Data analysis followed Miles & Huberman, namely data collection, data reduction, data presentation, and conclusion drawing. The results showed that the performance of Sufi dance at the Maulana Rumi Islamic Boarding School is as follows: (1) Sufi dance performance is carried out routinely every selapanan night, or once every 40 days, at the Maulana Rumi Islamic boarding school. Another regularly held event is the Sufi dance performance during the study session held at the Basa Basi cafe in Yogyakarta every Wednesday from 20.00-22.00 WIB. Sufi dance performances are performed between recitations of books, both by Maulana Rumi and others, such as the book Ihya Ulumuddin by Imam Gozali, the book Nurudh Dholam by Syekh Muhammad Nawawi As-Syafi'ie, and the book Hidayatul Azkiya Ila Thariqil Auliya by Zeinuddin ibn Ali Al-Ma' Bari Al-Malibari. (2) There is social interaction between dancers and study participants, who are present in the same space and time at the Maulana Rumi Islamic boarding school and the Basa Basi cafe. (3) The Sufi dance performance events were marked by an interrelationship between dancers and audience, where the audience feels involved in spiritual events in the form of the Sufi dance.
Matipa Ricky Ngandu, David Risinamhodzi, Godwin Pedzisai Dzvapatsva
et al.
Abstract ICT tools in education are widely used to support the achievement of learning outcomes by improving critical areas such as student engagement, participation, and motivation. In this study, we examine the literature to explore how game elements are used to capture students’ interest, which the study suggests is fundamental to the teaching and learning of Software Engineering in higher education. Given the potential of alternative ICT tools such as flipped classrooms to increase interest in learning activities, there is a gap in similar literature on capturing interest in gamified environments, which has the potential to improve the achievement of learning outcomes. We applied flow theory as a guiding frame for our study. Following a systematic literature review, we analysed 15 papers from an initial 342 articles extracted from the IEEE Xplore and Science Direct databases. The main finding in the reviewed papers underscores the positive impact of gamified learning environments on capturing student interest when teaching and learning Software Engineering. While the reviewed papers were not conclusive in identifying the best game elements for capturing students’ interest, we found that game elements such as points and leaderboards were the most common mechanisms used to advance students' interest in Software Engineering courses. The findings also suggest that different game elements are used in gamified environments to increase participation and engagement. The paper adds to the discussion of the practical implications of gamification for teaching and learning. Although our study requires empirical evidence to validate our claims, we believe it sets the stage for further discussion. In the future, comparative studies of game elements in similar environments will be beneficial for identifying those that are more engaging and assessing their long-term impacts.