The way science is currently practiced shows conclusions but hides how they were reached. Researchers work privately, polish their results, publish a finished paper, and defend it. Errors are punished by retraction rather than corrected by amendment. Alternative directions are pursued through competing papers with no shared history. The reasoning, the dead ends, the trade-offs, the corrections: everything that would let others understand how a conclusion was reached is invisible. Two decades of open science reform have addressed this by opening specific artifacts: papers, data, code, notebooks, protocols. Each is valuable, but the unit remains a finished product. None opens the thinking process itself: the evolving sequence of questions, interpretations, dead ends, and direction changes that constitutes the actual scientific contribution. This paper argues that opening the process of science (not just its outputs) would produce a step change in the speed of scientific progress, the accessibility of scientific reasoning, the trustworthiness of scientific claims, and the scalability of scientific quality assurance. We identify three properties the workflow needs: visible (the process is open, not just the product), trackable (every change is recorded and attributable), and forkable (anyone can branch from any point with shared history preserved). A visible, trackable flow is inherently verifiable: by humans, by automated tools, by AI agents. Software development adopted this flow decades ago, and the results (faster correction, broader contribution, maintained quality at scale) demonstrate the opportunity for science.
I describe my experience writing the first original, modern Computer Science research paper expressed entirely in an Indian language. The paper is in Telugu, a language with approximately 100 million speakers. The paper is in the field of distributed computing and it introduces a technique for proving epistemic logic based lower bounds for multiprocessor algorithms. A key hurdle to writing the paper was developing technical terminology for advanced computer science concepts, including those in algorithms, distributed computing, and discrete mathematics. I overcame this challenge by deriving and coining native language scientific terminology through the powerful, productive, Pāninian grammar of Samskrtam. The typesetting of the paper was an additional challenge, since mathematical typesetting in Telugu is underdeveloped. I overcame this problem by developing a Telugu XeLaTeX template, which I call TeluguTeX. Leveraging this experience of writing an original computer science research paper in an Indian language, I lay out a vision for how to ameliorate the state of scientific writing at all levels in Indic languages -- languages whose native speakers exceed one billion people -- through the further development of the Sanskrit technical lexicon and through technological internationalization.
Active learning (AL) plays a critical role in materials science, enabling applications such as the construction of machine-learning interatomic potentials for atomistic simulations and the operation of self-driving laboratories. Despite its widespread use, the reliability and effectiveness of AL workflows depend on implicit design assumptions that are rarely examined systematically. Here, we critically assess AL workflows deployed in materials science and investigate how key design choices, such as surrogate models, sampling strategies, uncertainty quantification and evaluation metrics, relate to their performance. By identifying common pitfalls and discussing practical mitigation strategies, we provide guidance to practitioners for the efficient design, assessment, and interpretation of AL workflows in materials science.
The past few years have witnessed an increasing use of machine learning (ML) systems in science. Paul Humphreys has argued that, because of specific characteristics of ML systems, human scientists are pushed out of the loop of science. In this chapter, I investigate to what extent this is true. First, I express these concerns in terms of what I call epistemic control. I identify two conditions for epistemic control, called tracking and tracing, drawing on works in philosophy of technology. With this new understanding of the problem, I then argue against Humphreys' pessimistic view. Finally, I construct a more nuanced view of epistemic control in ML-based science.
Following in the footsteps of the success of Mathlib - the centralised library of formalised mathematics in Lean - CSLib is a rapidly-growing centralised library of formalised computer science and software. In this paper, we present its founding technical principles, operation, abstractions, and semantic framework. We contribute reusable semantic interfaces (reduction and labelled transition systems), proof automation, CI/testing support for maintaining automation and compatibility with Mathlib, and the first substantial developments of languages and models.
Amid the prolonged aggression against Ukraine, ensuring military security is closely tied to the effective functioning of the state's energy system. Massive missile strikes on energy infrastructure facilities, the threat of fuel shortages, and interruptions in energy supply directly affect the combat capability of the Armed Forces of Ukraine, logistics, and the resilience of the defense-industrial complex. Hence there is a need for an integrated toolkit capable of quantifying the impact of energy security on the state's military security. The aim of the article is to develop a methodological apparatus for assessing and forecasting the impact of the state of energy security on the level of the state's military security by building formalized models of the interrelations between components of the energy infrastructure and defense potential. The research is based on methods of systems analysis, modified cognitive modeling, energy flow analysis, and multifactor expert assessment, as well as methods for constructing indicative models (indices of resilience, risk, and impact). A scenario approach to forecasting military-critical situations is also applied. The article describes the structure of a comprehensive methodological apparatus comprising four interrelated methods: cognitive modeling of the functioning of the energy system; modeling of energy flows that accounts for the needs of the military sphere; a method for assessing and forecasting the impact of energy security on military security; and a method for determining requirements for the energy system under crisis conditions. Directions for further research are: improving models of the energy system's adaptation to wartime conditions, developing algorithms for allocating reserve capacity in emergency conditions, and integrating the models into the strategic planning systems of Ukraine's security sector. The results of the study will be useful to researchers, analysts, and representatives of public authorities involved in developing energy policy.
Ryozo Masukawa, Sanggeon Yun, Sungheon Jeong
et al.
Network security faces major challenges from sophisticated cyber attacks that exploit lateral movement and evade traditional network intrusion detection mechanisms. To address these challenges, micro-segmentation has proven to be an effective defense strategy for isolating network components and limiting breach propagation. This paper presents TriageHD, a novel framework that integrates graph-based Hyper-Dimensional Computing (HDC) with a learning-to-rank algorithm to strengthen zero-trust network security. TriageHD constructs dynamic scene graphs from time-based network flow data, integrating feature representations extracted via a self-attention-based payload encoder. It employs a learning-to-rank algorithm with an approximated nDCG loss function, incorporating time-aware relevance and graph-aware HDC to prioritize nodes for segregation, thereby mitigating attack propagation. Experiments on the CIC-IDS-2017 dataset demonstrate that TriageHD outperforms state-of-the-art graph neural networks, including graph convolutional networks, graph attention networks, and graph transformer models, in threat prioritization accuracy. By providing a dynamic micro-segmentation approach, TriageHD significantly enhances automated threat detection and response. This work bridges traditional network security measures with zero-trust paradigms, laying the groundwork for future advancements in dynamic micro-segmentation.
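For readers unfamiliar with the ranking metric named above, the following is a minimal sketch of the exact (non-differentiable) nDCG that an approximated nDCG loss is designed to stand in for. It is not the TriageHD implementation: the function names `dcg` and `ndcg`, the exponential gain `2^rel - 1`, and the example scores are assumptions chosen for illustration, and the abstract does not specify which gain or discount the authors use.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance labels listed in ranked order.

    Assumes the common exponential gain (2^rel - 1) and log2 position discount.
    """
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)
        for pos, rel in enumerate(relevances)
    )


def ndcg(scores, relevances):
    """Normalized DCG of the ranking induced by `scores` (higher score ranks first)."""
    # Order item indices by descending predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [relevances[i] for i in order]
    # The ideal ranking places the most relevant items first.
    ideal_dcg = dcg(sorted(relevances, reverse=True))
    # If no item is relevant, nDCG is defined here as 0 to avoid dividing by zero.
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0
```

In a node-prioritization setting, `scores` would be the model's per-node priority and `relevances` a ground-truth urgency label; a ranking that puts the most relevant nodes first yields an nDCG of 1. Because sorting is not differentiable, training typically replaces the hard ranks with a smooth surrogate, which is what an approximated nDCG loss refers to.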
A mature, effective civil society is the foundation for the development of statehood and public institutions. In Ukrainian society today, examples of unity and mutual aid and the values of a community of free and responsible citizens are relevant and timely, which makes the demand for developing a culture of citizenship important. Fundamental to understanding this culture is the concept of the citizen as a person who defends their country, values the memory and heritage of their ancestors, and respects the place where they live. Such a person is not indifferent to fellow citizens, is capable of being sensitive and humane, is tolerant of the opinions, actions, and lifestyles of others, and is able to take responsibility for their actions and make decisions independently. The person is the goal of state policy and an active participant in social change. The success and potential of social development depend on the scale of that person's thinking, their level of education, and the development of their values and virtues. A set of good laws and public initiatives is unlikely to be effective without a critical mass of responsible and free citizens.
Ideas from the philosophy of Bildung are relevant to developing a culture of citizenship. This philosophy aims to foster the individual maturation and the moral and emotional growth of large numbers of people, “cultivating” them through education and the development of civic virtues and moral values, among other things. The ideas of this philosophy are understood as an instrument of personal self-development that offers opportunities for revealing a person's uniqueness and individuality within a shared system of cultural values. Bildung is both a process and a result, directed toward self-transformation.
In terms of practical implementation, the philosophy of Bildung can offer Ukrainian society productive ideas for resolving crisis situations, and can develop in a person the capacity to think strategically and flexibly, to conduct productive dialogue on difficult and sensitive topics, to find effective long-term solutions, to organize and carry out their implementation, and so on. This is especially relevant in wartime, when it is important to maintain moral and emotional maturity and to accumulate and develop resources for rebuilding the country in the postwar period. Bildung as a “secular form of inner development” has great potential as a unifying force in society. Concepts of Bildung create concepts and a brand of national identity, which is especially relevant under conditions of existential crisis, when a person needs to feel a sense of belonging to family and social communities.
Mourad Feindiri, Hakima Kabbaj, Mouna Salihoun
et al.
Abstract Background Hepatitis B is a silently devastating disease that presents a public health concern and affects millions of people. Recent estimates indicate that 254 million individuals globally are afflicted with chronic HBV infection, with around 1.2 million new cases emerging annually and roughly 1.1 million fatalities primarily resulting from long-term consequences. Like other developing countries, Morocco still lacks consolidated epidemiological data on the real burden of hepatitis B. In this context, our work aims to fill this gap through a systematic review and a meta-analysis of studies conducted over the last 25 years. Methods We conducted a comprehensive search in databases such as PubMed, Scopus, Web of Science, and Google Scholar, focusing on studies related to Hepatitis B in Morocco. Our inclusion criteria encompassed all observational studies conducted and published between 2000 and 2024 reporting HBsAg prevalence among individuals residing in Morocco regardless of age, sex, or subpopulation. The prevalence studies included were assessed using the JBI Critical Appraisal Checklist for Studies Reporting Prevalence Data. Where appropriate, a pooled prevalence was calculated using a DerSimonian-Laird random effects model. Results Thirty studies met the inclusion criteria, encompassing 34 cohorts and 750,784 individuals with prevalence ranging between 0.07% and 7.98%. HBsAg prevalence was assessed across four endemicity levels. About 74% of cohorts showed low prevalence (< 2%), and seven studies fell in the 2–4.99% band. Only two studies reached the 5–7.99% band, while no study showed high endemicity (≥ 8%). The pooled prevalence of HBsAg in Morocco was 1.33% (95% CI: 1.07–1.61). Subgroup analysis revealed the lowest prevalence among military populations (0.36%) and the highest in vulnerable high-risk groups (5.43%), underscoring subpopulation-specific disparities in HBV burden.
Conclusions The pooled prevalence observed across included studies reinforces the hypothesis that Morocco is transitioning towards a low-endemic HBV status, which reflects the comprehensive national measures taken, along with the positive impact of the systematic vaccination of newborns, introduced in 1999, and of certain other groups. However, additional efforts remain necessary to sustain this progress and adequately address the weaknesses observed at the national level in managing HBV infection, particularly among certain at-risk groups.
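The pooling step named in the Methods can be illustrated with a minimal sketch of the DerSimonian-Laird random-effects estimator. It is not the authors' analysis code: the function name `dersimonian_laird`, the use of untransformed proportions with binomial variances, and the normal-approximation 95% interval are assumptions made for illustration, since the abstract does not state which transformation or software was used. Each study must have a prevalence strictly between 0 and 1 for the variance to be nonzero.

```python
def dersimonian_laird(events, totals):
    """Pool study prevalences with a DerSimonian-Laird random-effects model.

    `events[i]` is the number of positive cases and `totals[i]` the sample size
    of study i. Returns (pooled prevalence, (ci_low, ci_high), tau^2).
    Assumes raw proportions with binomial within-study variance.
    """
    p = [e / n for e, n in zip(events, totals)]
    v = [pi * (1 - pi) / n for pi, n in zip(p, totals)]  # within-study variances
    w = [1 / vi for vi in v]                             # fixed-effect weights
    sw = sum(w)

    # Fixed-effect pooled estimate and Cochran's Q heterogeneity statistic.
    fixed = sum(wi * pi for wi, pi in zip(w, p)) / sw
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, p))

    # Method-of-moments estimate of the between-study variance tau^2.
    k = len(p)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to every within-study variance.
    w_star = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * pi for wi, pi in zip(w_star, p)) / sum(w_star)
    se = (1 / sum(w_star)) ** 0.5
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2
```

Calling `dersimonian_laird([5, 40, 90], [1000, 1000, 1000])` on these made-up counts returns a pooled prevalence between the smallest and largest study prevalence, with a positive `tau2` reflecting the heterogeneity. Published prevalence meta-analyses often apply a logit or double-arcsine transformation before pooling, which this sketch omits.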
Preprints have become increasingly essential in the landscape of open science, facilitating not only the exchange of knowledge within the scientific community but also bridging the gap between science and technology. However, the impact of preprints on technological innovation, given their unreviewed nature, remains unclear. This study fills this gap by conducting a comprehensive scientometric analysis of patent citations to bioRxiv preprints submitted between 2013 and 2021, measuring and assessing the contribution of preprints to accelerating knowledge transfer from science to technology. Our findings reveal a growing trend of patent citations to bioRxiv preprints, with a notable surge in 2020, primarily driven by the COVID-19 pandemic. Preprints play a critical role in accelerating innovation: they not only expedite the transfer of scientific knowledge into technological innovation but also enhance the visibility of early research results in the patenting process, while journals remain essential for academic rigor and reliability. The substantial number of post-online-publication patent citations highlights the critical role of the open science model, particularly the "open access" effect of preprints, in amplifying the impact of science on technological innovation. This study provides empirical evidence that open science policies encouraging the early sharing of research outputs, such as preprints, contribute to more efficient linkage between science and technology, suggesting an acceleration in the pace of innovation, higher innovation quality, and economic benefits.
Rania Abdelghani, Kou Murayama, Celeste Kidd
et al.
Generative AI (GenAI) tools allow for effortless task completion, potentially fostering cognitive and metacognitive laziness in students. While surveys indicate widespread GenAI use among students as young as 11, their interaction strategies remain under-explored. A critical indicator of these interactions' quality is the ability to lead Question-Asking (QA) cycles: initiating goal-oriented inquiries, critically evaluating AI responses, and regulating subsequent strategies. While these behaviors predict robust learning in traditional settings, their role in AI-mediated environments remains unclear. Addressing this gap, this study investigates middle school students' (N=63, aged 14--15) capacity to adopt these behaviors with GenAI during science investigation tasks. We analyzed their proficiency in distinguishing efficient goal-oriented prompts from inefficient ones, their critical evaluation of AI responses, and their ability to generate follow-up questions to regulate learning in alignment with their informational needs. Findings reveal a pattern of over-reliance: students struggled to discriminate between prompt types, failed to detect vague AI explanations, and frequently terminated inquiry prematurely, without follow-up. Consequently, task performance remained moderate despite unrestricted AI access and high self-reported prior knowledge. Notably, positive AI attitudes were negatively associated with interaction quality, suggesting a disconnect between perceived and actual competence, whereas higher metacognitive skills predicted superior sensitivity to prompt quality. These results underscore the necessity for AI literacy interventions that move beyond technical understanding to explicitly train the metacognitive regulation strategies required for meaningful and sustainable QA-based learning with GenAI.
The Ultraviolet (UV) Type Ia Supernova Mission (UVIa) is a CubeSat/SmallSat concept that stands to test critical space-borne UV technology for future missions like the Habitable Worlds Observatory (HWO) while elucidating long-standing questions about the explosion mechanisms of Type Ia supernovae (SNe Ia). UVIa will observe whether any SNe Ia emit excess UV light shortly after explosion to test progenitor/explosion models and provide follow-up over many days to characterize their UV and optical flux variations over time, assembling a comprehensive multi-band UV and optical low-redshift anchor sample for upcoming high-redshift SNe Ia surveys (e.g., Euclid, Vera Rubin Observatory, Nancy Roman Space Telescope). UVIa's mission profile requires it to perform rapid and frequent visits to newly discovered SNe Ia, simultaneously observing each SN Ia in two UV bands (FUV: 1500-1800A and NUV: 1800-2400A) and one optical band (u-band: 3000-4200A). In this study, we describe the science motivation and basic mission design of the UVIa mission concept. The UVIa mission concept has been submitted to the CubeSats category of the NASA ROSES Astrophysics Research & Analysis (APRA) program (\$10M cost cap) and NASA Astrophysics Pioneers program (\$20M cost cap).
The advent of foundation models (FMs) such as large language models (LLMs) has led to a cultural shift in data science, both in medicine and beyond. This shift involves moving away from specialized predictive models trained for specific, well-defined domain questions to generalist FMs pre-trained on vast amounts of unstructured data, which can then be adapted to various clinical tasks and questions. As a result, the standard data science workflow in medicine has been fundamentally altered; the foundation model lifecycle (FMLC) now includes distinct upstream and downstream processes, in which computational resources, model and data access, and decision-making power are distributed among multiple stakeholders. At their core, FMs are fundamentally statistical models, and this new workflow challenges the principles of Veridical Data Science (VDS), hindering the rigorous statistical analysis expected in transparent and scientifically reproducible data science practices. We critically examine the medical FMLC in light of the core principles of VDS: predictability, computability, and stability (PCS), and explain how it deviates from the standard data science workflow. Finally, we propose recommendations for a reimagined medical FMLC that expands and refines the PCS principles for VDS including considering the computational and accessibility constraints inherent to FMs.
A major challenge of AI + Science lies in their inherent incompatibility: today's AI is primarily based on connectionism, while science depends on symbolism. To bridge the two worlds, we propose a framework to seamlessly synergize Kolmogorov-Arnold Networks (KANs) and science. The framework highlights KANs' usage for three aspects of scientific discovery: identifying relevant features, revealing modular structures, and discovering symbolic formulas. The synergy is bidirectional: science to KAN (incorporating scientific knowledge into KANs), and KAN to science (extracting scientific insights from KANs). We highlight major new functionalities in the pykan package: (1) MultKAN: KANs with multiplication nodes. (2) kanpiler: a KAN compiler that compiles symbolic formulas into KANs. (3) tree converter: convert KANs (or any neural networks) to tree graphs. Based on these tools, we demonstrate KANs' capability to discover various types of physical laws, including conserved quantities, Lagrangians, symmetries, and constitutive laws.
Background: Soft-tissue sarcomas (STSs) are a rare type of cancer, accounting for about 1% of all adult cancers. Treatments for STSs can be difficult to implement because of their diverse histological and molecular features, which lead to variations in tumor behavior and response to therapy. Despite the growing importance of NETosis in cancer diagnosis and treatment, research on its role in STSs remains limited compared to other cancer types. Methods: The study thoroughly investigated NETosis-related genes (NRGs) in STSs using large cohorts from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. The Least Absolute Shrinkage and Selection Operator (LASSO) regression analysis and Support Vector Machine Recursive Feature Elimination (SVM-RFE) were employed for screening NRGs. Utilizing a single-cell RNA-seq (scRNA-seq) dataset, we elucidated the expression profiles of NRGs within distinct cellular subpopulations. Several NRGs were validated by quantitative PCR (qPCR) and our proprietary sequencing data. To ascertain the impact of NRGs on the sarcoma phenotype, we conducted a series of in vitro experimental investigations. Employing unsupervised consensus clustering analysis, we established the NETosis clusters and respective NETosis subtypes. By analyzing differentially expressed genes (DEGs) between NETosis clusters, a NETosis scoring system was developed. Results: By comparing the outcomes obtained from LASSO regression analysis and SVM-RFE, 17 common NRGs were identified. The expression levels of the majority of NRGs differed notably between STS and normal tissues. The correlation with immune cell infiltration was demonstrated by the network comprising the 17 NRGs. Patients within various NETosis clusters and subtypes exhibited different clinical and biological features. The scoring system was effective at predicting prognosis and immune cell infiltration.
Furthermore, the scoring system demonstrated potential for predicting immunotherapy response. Conclusion: The current study presents a systematic analysis of NETosis-related gene patterns in STS. The results of our study highlight the critical role NRGs play in tumor biology and the potential for personalized therapeutic approaches through the application of the NETosis score model in STS patients.
Douglas Beck, Joseph Carlson, Zohreh Davoudi
et al.
In preparation for the 2023 NSAC Long Range Plan (LRP), members of the Nuclear Science community gathered to discuss the current state of, and plans for further leveraging opportunities in, quantum information science and technology (QIST) in nuclear physics (NP) research at the Quantum Information Science for U.S. Nuclear Physics Long Range Planning workshop, held in Santa Fe, New Mexico on January 31 - February 1, 2023. The workshop included 45 in-person participants and 53 remote attendees. The workshop identified strategic plans and requirements for the next 5-10 years to advance quantum sensing and quantum simulations within NP, and to develop a diverse quantum-ready workforce. The plans include resolutions endorsed by the participants to address the compelling scientific opportunities at the intersections of NP and QIST. These endorsements are aligned with similar affirmations by the LRP Computational Nuclear Physics and AI/ML Workshop, the Nuclear Structure, Reactions, and Astrophysics LRP Town Hall, and the Fundamental Symmetries, Neutrons, and Neutrinos LRP Town Hall communities.
Natália Dal Pizzol, Eduardo Dos Santos Barbosa, Soraia Raupp Musse
This study presents an automated bibliometric analysis of 6569 research papers published in thirteen Brazilian Computer Science Society (SBC) conferences from 1999 to 2021. Our primary goal was to gather data to understand the gender representation in publications in the field of Computer Science. We systematically assigned gender to 23,573 listed paper authorships, finding that the gender gap for women is significant, with female authors being under-represented in all years of the study.