Systemic Gendered Citation Imbalance in Computer Science: Evidence from Conferences and Journals
Kazuki Nakajima, Yuya Sasaki, Sohei Tokuno
et al.
Gender imbalance persists across science, technology, engineering, and mathematics (STEM) fields, including computer science, where it appears in researcher demographics, productivity, recognition, hiring, and career progression. Given computer science's rapid expansion and global influence, addressing this imbalance is essential for broadening participation and fueling innovation. Although journal-oriented disciplines exhibit consistent gender imbalances in citation practices, it remains unclear whether similar patterns arise in the conference-centric culture of computer science. Here, we systematically investigate gender imbalance in citations of conference and journal papers in computer science. We find that papers for which a woman is listed as either first or last author receive fewer citations than expected, partly because of homophilic citation tendencies (i.e., authors tend to cite papers that share specific attributes). This imbalance is especially pronounced for conference papers--particularly those published at top-tier venues--relative to journals. Moreover, we find that the prominence of the first or last author and the structure of their local co-authorship networks are potential drivers of these imbalances. By exploring how conference-centric publishing practices can amplify systemic imbalances in computer science, our study offers insights that may inform efforts to foster more equitable representation in academia.
Science Literacy: Generative AI as Enabler of Coherence in the Teaching, Learning, and Assessment of Scientific Knowledge and Reasoning
Xiaoming Zhai, James W. Pellegrino, Matias Rojas
et al.
This chapter examines the potential of generative AI in enhancing science literacy across the K-16+ grade span, including its benefits as well as the conceptual and practical challenges that doing so presents. It begins with a discussion of what defines science literacy in the era of AI, including how AI has changed science and the demand for future citizens to be scientifically literate when AI is applied in their careers and lives. The chapter further discusses why science literacy presents such a challenge in K-16+ educational settings. It then develops an argument for the type of architecture needed for AI to assist in solving the problem by bringing coherence to the teaching, learning, and assessment of science knowledge and reasoning. Components of this architecture are illustrated with respect to the AI tools and capabilities needed for design and implementation. The chapter concludes with a consideration of what has been learned regarding both science literacy and AI, as well as what remains to be learned, including the research and development (R&D) needed, and the generalizability of this science literacy case to other disciplinary learning and knowledge domains.
A non-field analytical solution of the Luikov equations for simultaneous heat and mass transfer
Vladimir Kulish, Bui Thanh Phan, Vladimír Horák
Non-field solutions to the Luikov equations for simultaneous heat and mass transfer have been derived using the method of Kulish. These solutions provide simple-to-use formulae for estimating transient values of temperature and mass (moisture) content as functions of heat and mass fluxes. The approach simplifies the modelling process by bypassing complex boundary conditions and eigenvalue calculations. Validation was performed using two limiting cases of geometry – planar and spherical – under constant heat flux conditions. The results demonstrated good agreement with experimental data, underscoring the accuracy and practical applicability of the derived solutions for engineering purposes such as drying processes.
Investigating the effect of NiO and NiF2 on boron carbide combustion
Siyi Zhang, Yue Jiang, Dunhui Xu
et al.
Boron-based fuels, recognized for their high energy density and potential in energetic applications, encounter challenges such as long ignition delays and incomplete combustion, which result in reduced combustion efficiency and limited performance in aerospace propulsion. In this study, boron carbide (B4C) is investigated as an alternative fuel to pristine boron due to its favorable gas-phase combustion. Both a metal oxide (nickel oxide, NiO) and a metal fluoride (nickel fluoride, NiF2) are selected as oxidizing modifiers to enhance the reactivity of B4C. A method combining laser ignition with optical diagnostics is employed to investigate the enhancing effects of different oxidizers on the ignition and combustion characteristics of B4C. Both NiO and NiF2 can significantly increase the combustion radiation intensity and reduce the time to maximum intensity of B4C. Differential scanning calorimetry, in-situ X-ray diffraction, and Fourier transform infrared spectroscopy were used for simultaneous thermal analysis of the B4C composite powders. Combined thermal analysis showed that the effect of NiO and NiF2 in promoting B4C combustion is mainly achieved via the formation of NimBn and the release of a large number of gas products. It is reasonable to speculate that the phase separation at the B2O3/NimBn interface forms new pathways for oxygen diffusion and reaction with the B core. The difference in the combustion mechanism of B4C with NiO and NiF2 lies in the gas-phase products, i.e., CO2 and BF3, respectively, thus leading to significant differences in their reaction processes.
Applying Deep Learning to Anomaly Detection of Russian Satellite Activity for Indications Prior to Military Activity
David Kurtenbach, Megan Manly, Zach Metzinger
We apply deep learning techniques for anomaly detection to analyze activity of Russian-owned resident space objects (RSO) prior to the Ukraine invasion and assess the results for any findings that can be used as indications and warnings (I&W) of aggressive military behavior for future conflicts. Through analysis of anomalous activity, an understanding of possible tactics and procedures can be established to assess the existence of statistically significant changes in Russian RSO pattern of life/pattern of behavior (PoL/PoB) using publicly available two-line element (TLE) data. This research looks at statistical and deep learning approaches to assess anomalous activity. The deep learning methods assessed are isolation forest (IF), traditional autoencoder (AE), variational autoencoder (VAE), Kolmogorov-Arnold Network (KAN), and a novel anchor-loss based autoencoder (Anchor AE). Each model is used to establish a baseline of on-orbit activity based on a five-year data sample. The primary investigation period focuses on the six months leading up to the invasion date of February 24, 2022. Additional analysis looks at RSO activity during an active combat period by sampling TLE data after the invasion date. The deep learning autoencoder models identify anomalies based on reconstruction errors that surpass a threshold sigma. To capture the nuance and unique characteristics of each RSO, an individual model was trained for each observed space object. The research prioritized explainability and interpretability of the model results; thus, each observation was assessed for anomalous behavior in each of the six orbital elements individually, rather than analyzing the input data as a single monolithic observation. The results not only demonstrate statistically significant anomalies in Russian RSO activity but also attribute anomalous findings to the individual orbital element.
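The per-element thresholding described in this abstract can be illustrated with a minimal sketch. It assumes that reconstruction errors have already been produced by some autoencoder, that the threshold is the baseline mean plus k standard deviations, and that elements are tracked by name. The function names, the value of k, and the sample numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def fit_thresholds(baseline_errors, k=3.0):
    """Return one threshold per orbital element: baseline mean + k * std.

    baseline_errors maps an element name to a list of reconstruction
    errors observed during the baseline period (assumed input format).
    """
    return {
        name: mean(errors) + k * stdev(errors)
        for name, errors in baseline_errors.items()
    }


def flag_anomalies(observation_errors, thresholds):
    """List the elements whose reconstruction error exceeds its threshold."""
    return [
        name
        for name, error in observation_errors.items()
        if error > thresholds[name]
    ]


# Hypothetical baseline errors for two of the six orbital elements.
baseline = {
    "inclination": [0.10, 0.12, 0.11, 0.09, 0.10],
    "mean_motion": [0.20, 0.22, 0.21, 0.19, 0.20],
}
thresholds = fit_thresholds(baseline)

# A new observation: only the mean-motion error is far above its baseline.
flagged = flag_anomalies({"inclination": 0.11, "mean_motion": 0.90}, thresholds)
```

Keeping a separate threshold per element is what lets a flagged observation be traced back to a specific orbital element instead of a single aggregate score.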
Model Science: getting serious about verification, explanation and control of AI systems
Przemyslaw Biecek, Wojciech Samek
The growing adoption of foundation models calls for a paradigm shift from Data Science to Model Science. Unlike data-centric approaches, Model Science places the trained model at the core of analysis, aiming to interact with, verify, explain, and control its behavior across diverse operational contexts. This paper introduces a conceptual framework for a new discipline called Model Science, along with a proposal for its four key pillars: Verification, which requires strict, context-aware evaluation protocols; Explanation, which is understood as various approaches to exploring internal model operations; Control, which integrates alignment techniques to steer model behavior; and Interface, which develops interactive and visual explanation tools to improve human calibration and decision-making. The proposed framework aims to guide the development of credible, safe, and human-aligned AI systems.
Computational Protein Science in the Era of Large Language Models (LLMs)
Wenqi Fan, Yi Zhou, Shijie Wang
et al.
Considering the significance of proteins, computational protein science has always been a critical scientific field, dedicated to revealing knowledge and developing applications within the protein sequence-structure-function paradigm. In the last few decades, Artificial Intelligence (AI) has made significant impacts in computational protein science, leading to notable successes in specific protein modeling tasks. However, those previous AI models still meet limitations, such as the difficulty in comprehending the semantics of protein sequences, and the inability to generalize across a wide range of protein modeling tasks. Recently, LLMs have emerged as a milestone in AI due to their unprecedented language processing & generalization capability. They can promote comprehensive progress in fields rather than solving individual tasks. As a result, researchers have actively introduced LLM techniques in computational protein science, developing protein Language Models (pLMs) that skillfully grasp the foundational knowledge of proteins and can be effectively generalized to solve a diversity of sequence-structure-function reasoning problems. While witnessing prosperous developments, it's necessary to present a systematic overview of computational protein science empowered by LLM techniques. First, we summarize existing pLMs into categories based on their mastered protein knowledge, i.e., underlying sequence patterns, explicit structural and functional information, and external scientific languages. Second, we introduce the utilization and adaptation of pLMs, highlighting their remarkable achievements in promoting protein structure prediction, protein function prediction, and protein design studies. Then, we describe the practical application of pLMs in antibody design, enzyme design, and drug discovery. Finally, we specifically discuss the promising future directions in this fast-growing field.
WebDS: An End-to-End Benchmark for Web-based Data Science
Ethan Hsu, Hong Meng Yam, Ines Bouissou
et al.
Many real-world data science tasks involve complex web-based interactions: finding appropriate data available on the internet, synthesizing multimodal data from different locations, and producing summarized analyses. Existing web benchmarks often focus on simplistic interactions and often do not require diverse tool-using capabilities. Conversely, traditional data science benchmarks typically concentrate on static, highly structured datasets and do not assess end-to-end workflows that encompass data acquisition, cleaning, analysis, and insight generation. In response, we introduce WebDS, the first end-to-end web-based data science benchmark. It comprises 870 web-based data science tasks across 29 diverse websites from structured government data portals to unstructured news media, challenging agents to perform complex, multi-step, tool-based operations, across heterogeneous data formats, to better reflect the realities of modern data analytics. Evaluations of current SOTA LLM agents indicate significant performance gaps in accomplishing these tasks. For instance, Browser Use, which accomplishes 80% of tasks on WebVoyager, completes only 15% of tasks in WebDS, which our analysis suggests is due to new failure modes, such as poor information grounding, repetitive behavior and shortcut-taking that agents performing WebDS's tasks display. By contrast, humans achieve around 90% accuracy, highlighting a substantial gap between current agents and human performance. By providing a more robust and realistic testing ground, WebDS sets the stage for significant advances in the development of practically useful LLM-based data science.
Ukrainian Right-Wing Extremists: Exploring Their Involvement in the Ongoing War and Outlining Potential Threats for Post-War Ukraine
Martin Zilvar
As the Russia-Ukraine war constitutes the most severe security challenge Europe has faced since the Cold War's end, many states have realized the fragility of statehood, which an aggressor can destroy overnight. Although this concern is valid, it should not overshadow other security threats. Unlike other authors addressing the phenomenon of foreign fighters in the war, the present article investigates the involvement of Ukrainian right-wing extremists regarding the pre-2022 development, during which the growing sociocultural nationalism, militarism, and tolerance of ultranationalist and ethnonational groups helped shrink their isolation. While they might have played an important role in Ukraine's territorial defence, heavily armed and combat-skilled right-wing extremists might pose a severe threat to Ukraine's post-war restoration. Initially, whereas a literature review indicates the hitherto research and positions the article's inquiry within it, existing theoretical approaches define the observed actors. Based on the open-source intelligence data collection from Telegram and content analysis, the article identified several Ukrainian and also foreign right-wing extremists involved in the war despite its focus on the former. It concludes that the predecessor authors' debated threats associated with the latter, i.e., physical threats, organizational challenges, and wider societal consequences, should be primarily applied to Ukraine.
Scientific and Technological Advances as Current Challenges to the Biological Weapons Non-Proliferation Regime
D. L. Poklonskii
The recent advances in biological sciences and biotechnology have resulted in new knowledge and capabilities that challenge existing understandings of biological threats and biological weapons (BW). The purpose of the article is to evaluate scientific and engineering decisions that pose potential challenges to the biological weapons non-proliferation regime and can reduce barriers to their development, production and use. Materials and methods. The scientific articles available through the PubMed, Google Scholar and Russian Electronic Library databases were used in the research. The method of analysis is descriptive. The results of the research. The success of biotechnology provides impetus for experimentation with biological weapons, particularly by non-state actors such as terrorist organizations and extremist groups. Transformative changes are occurring in areas not directly related to microbiology. However, the potential for their malicious use is no less of a concern than the development, production and stockpiling of biological weapons. The transformation of the concept of "biological threat" is traced. It becomes more complex and includes elements from other fields outside of biotechnology and the traditional understanding of biological weapons. In addition to biotechnology and synthetic biology, technologies directly related to the Biological and Toxin Weapons Convention (BTWC) issue may include: additive manufacturing based on 3D printing technologies; big data analysis and artificial intelligence technologies; nanotechnology and materials science; and biological research automation and robotics. Conclusion. Many dual-use technologies have received close attention from the scientific community and international experts, but this does not always contribute to an accurate and balanced understanding of their potential in the context of BTWC issues. The convergence of new and emerging disciplines is creating new areas of scientific knowledge that bear on the problem of non-proliferation of biological weapons, which requires the expert community to make a balanced assessment from the point of view of both dual use and the risk of excessive prohibition and negative impact on further scientific and technological progress.
PyTond: Efficient Python Data Science on the Shoulders of Databases
Hesam Shahrokhi, Amirali Kaboli, Mahdi Ghorbani
et al.
Python data science libraries such as Pandas and NumPy have recently gained immense popularity. Although these libraries are feature-rich and easy to use, their scalability limitations require more robust computational resources. In this paper, we present PyTond, an efficient approach to push the processing of data science workloads down into the database engines that are already known for their big data handling capabilities. Compared to the previous work, by introducing TondIR, our approach can capture a more comprehensive set of workloads and data layouts. Moreover, by doing IR-level optimizations, we generate better SQL code that improves the query processing by the underlying database engine. Our evaluation results show promising performance improvement compared to Python and other alternatives for diverse data science workloads.
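The idea of pushing a data science workload down into a database engine can be illustrated with a minimal sketch. It is not PyTond, TondIR, or the SQL that PyTond generates: the table, the column names, and the use of SQLite from the Python standard library are assumptions chosen only to keep the example self-contained. The same filter-and-aggregate step is written once as in-memory Python and once as a single SQL query executed by the engine.

```python
import sqlite3
from collections import defaultdict

# Hypothetical workload: (key, value) rows to filter and aggregate.
ROWS = [("a", 10), ("b", 5), ("a", 7), ("b", 1), ("c", 3)]


def total_in_python(rows, min_value):
    """Filter and group-by-sum performed row by row in the Python process."""
    totals = defaultdict(int)
    for key, value in rows:
        if value >= min_value:
            totals[key] += value
    return sorted(totals.items())


def total_in_database(rows, min_value):
    """The same computation expressed as one SQL query run by the engine."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (k TEXT, v INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT k, SUM(v) FROM t WHERE v >= ? GROUP BY k ORDER BY k",
            (min_value,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


python_result = total_in_python(ROWS, 3)
database_result = total_in_database(ROWS, 3)
```

Both paths return the same rows; the difference is where the work happens, which is the property a pushdown approach relies on when the data no longer fits comfortably in the Python process.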
Towards a Healthy AI Tradition: Lessons from Biology and Biomedical Science
Simon Kasif
AI is a magnificent field that directly and profoundly touches on numerous disciplines ranging from philosophy, computer science, engineering, mathematics, decision and data science and economics, to cognitive science, neuroscience and more. The number of applications and impact of AI is second to none and the potential of AI to broadly impact future science developments is particularly thrilling. While attempts to understand knowledge, reasoning, cognition and learning go back centuries, AI remains a relatively new field. In part because it overlaps so widely with other disparate fields, it appears to have trouble developing a robust identity and culture. Here we suggest that contrasting the fast-moving AI culture with the biological and biomedical sciences is both an insightful and a useful way to inaugurate a healthy tradition needed to envision and manage our ascent to AGI and beyond (independent of the AI platforms used). The co-evolution of AI and biomedical science offers many benefits to both fields. In a previous perspective, we suggested that biomedical laboratories or centers can usefully embrace logistic traditions in AI labs that will allow them to be highly collaborative, improve the reproducibility of research, reduce risk aversion and produce faster mentorship pathways for PhDs and fellows. This perspective focuses on the benefits to AI of adapting features of biomedical science at higher, primarily cultural levels.
Computational Thought Experiments for a More Rigorous Philosophy and Science of the Mind
Iris Oved, Nikhil Krishnaswamy, James Pustejovsky
et al.
We offer philosophical motivations for a method we call Virtual World Cognitive Science (VW CogSci), in which researchers use virtual embodied agents that are embedded in virtual worlds to explore questions in the field of Cognitive Science. We focus on questions about mental and linguistic representation and the ways that such computational modeling can add rigor to philosophical thought experiments, as well as the terminology used in the scientific study of such representations. We find that this method forces researchers to take a god's-eye view when describing dynamical relationships between entities in minds and entities in an environment in a way that eliminates the need for problematic talk of belief and concept types, such as the belief that cats are silly, and the concept CAT, while preserving belief and concept tokens in individual cognizers' minds. We conclude with some further key advantages of VW CogSci for the scientific study of mental and linguistic representation and for Cognitive Science more broadly.
Promotional Language and the Adoption of Innovative Ideas in Science
Hao Peng, Huilian Sophie Qiu, Henrik Barslund Fosse
et al.
How are the merits of innovative ideas communicated in science? Here we conduct semantic analyses of grant application success with a focus on scientific promotional language, which has been growing in frequency in many contexts and purportedly may convey an innovative idea's originality and significance. Our analysis attempts to surmount limitations of prior studies by examining the full text of tens of thousands of both funded and unfunded grants from three leading public and private funding agencies: the NIH, the NSF, and the Novo Nordisk Foundation, one of the world's largest private science foundations. We find a robust association between promotional language and the support and adoption of innovative ideas by funders and other scientists. First, the percentage of promotional language in a grant proposal is associated with up to a doubling of the grant's probability of being funded. Second, a grant's promotional language reflects its intrinsic level of innovativeness. Third, the percentage of promotional language predicts the expected citation and productivity impact of publications that are supported by funded grants. Lastly, a computer-assisted experiment that manipulates the promotional language in our data demonstrates how promotional language can communicate the merit of ideas through cognitive activation. With the incidence of promotional language in science steeply rising, and the pivotal role of grants in converting promising and aspirational ideas into solutions, our analysis provides empirical evidence that promotional language is associated with effectively communicating the merits of innovative scientific ideas.
G-RAG: Knowledge Expansion in Material Science
Radeen Mostafa, Mirza Nihal Baig, Mashaekh Tausif Ehsan
et al.
In the field of Material Science, effective information retrieval systems are essential for facilitating research. Traditional Retrieval-Augmented Generation (RAG) approaches in Large Language Models (LLMs) often encounter challenges such as outdated information, hallucinations, limited interpretability due to context constraints, and inaccurate retrieval. To address these issues, Graph RAG integrates graph databases to enhance the retrieval process. Our proposed method processes Material Science documents by extracting key entities (referred to as MatIDs) from sentences, which are then utilized to query external Wikipedia knowledge bases (KBs) for additional relevant information. We implement an agent-based parsing technique to achieve a more detailed representation of the documents. Our improved version of Graph RAG called G-RAG further leverages a graph database to capture relationships between these entities, improving both retrieval accuracy and contextual understanding. This enhanced approach demonstrates significant improvements in performance for domains that require precise information retrieval, such as Material Science.
Sleep in the United States Military
C. Good, A. Brager, V. Capaldi
et al.
The military lifestyle often includes continuous operations whether in training or deployed environments. These stressful environments present unique challenges for service members attempting to achieve consolidated, restorative sleep. The significant mental and physical derangements caused by degraded metabolic, cardiovascular, skeletomuscular, and cognitive health often result from insufficient sleep and/or circadian misalignment. Insufficient sleep and resulting fatigue compromise personal safety, mission success, and even national security. In the long term, chronic insufficient sleep and circadian rhythm disorders have been associated with other sleep disorders (e.g., insomnia, obstructive sleep apnea, and parasomnias). Other physiologic and psychologic diagnoses such as post-traumatic stress disorder, cardiovascular disease, and dementia have also been associated with chronic, insufficient sleep. Increased co-morbidity and mortality are compounded by traumatic brain injury resulting from blunt trauma, blast exposure, and highly physically demanding tasks under load. We present the current state of science in human and animal models specific to service members during and after their military careers. We focus on mission requirements of night shift work, sustained operations, and rapid re-entrainment to time zones. We then propose targeted pharmacological and non-pharmacological countermeasures to optimize performance that are mission- and symptom-specific. We recognize a critical gap in research involving service members, but provide tailored interventions for military health care providers based on the large body of research in health care and public service workers.
Analysis of Surgical Volume in Military Medical Treatment Facilities and Clinical Combat Readiness of US Military Surgeons.
Michael K. Dalton, K. Remick, Michael Mathias
et al.
Importance: Low surgical volume in the US Military Health System (MHS) has been identified as a challenge to military surgeon readiness. The Uniformed Services University of Health Sciences, in partnership with the American College of Surgeons, developed the Knowledge, Skills, and Abilities (KSA) Clinical Readiness Program that includes a tool for quantifying the clinical readiness value of surgeon workload, known as the KSA metric. Objective: To describe changes in US military general surgeon procedural volume and readiness using the KSA metric. Design, Setting, and Participants: This cohort study analyzed general surgery workload performed across the MHS, including military and civilian facilities, between fiscal year 2015 and 2019 and the calculated KSA metric value. The surgeon-level readiness among military general surgeons was calculated based on the KSA metric readiness threshold. Data were obtained from TRICARE, the US Department of Defense health insurance product. Main Outcomes and Measures: The main outcomes were general surgery procedural volumes and the KSA metric point value of those procedures across the MHS as well as the number of military general surgeons meeting the KSA metric readiness threshold. Aggregate facility and regional market-level claims data were used to calculate the procedural volumes and KSA metric readiness value of those procedures. Annual adjusted KSA metric points earned were used to determine the number of individual US military general surgeons meeting the readiness threshold. Results: The number of general surgery procedures generating KSAs in military hospitals decreased 25.6%, from 128 377 in 2015 to 95 461 in 2019, with a 19.1% decrease in the number of general surgeon KSA points (from 7 155 563 to 5 790 001). From 2015 to 2019, there was a 3.2% increase in both the number of procedures (from 419 980 to 433 495) and KSA points (from 21 071 033 to 21 748 984) in civilian care settings. The proportion of military general surgeons meeting the KSA metric readiness threshold decreased from 16.7% (n = 97) in 2015 to 10.1% (n = 68) in 2019. Conclusions and Relevance: This study noted that the number of KSA metric points and procedural volume in military hospitals has been decreasing since 2015, whereas both measures have increased in civilian facilities. The findings suggest that loss of surgical workload has resulted in further decreases in military surgeon readiness and may require substantial changes in patient care flow in the MHS to reverse the change.
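The percentage changes reported in the Results can be recomputed from the before and after counts given in the abstract. The following is a small arithmetic sketch; the function and dictionary names are illustrative, and only the counts themselves come from the abstract.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# (2015 value, 2019 value) pairs as reported in the abstract.
reported_counts = {
    "military procedures": (128_377, 95_461),
    "military KSA points": (7_155_563, 5_790_001),
    "civilian procedures": (419_980, 433_495),
    "civilian KSA points": (21_071_033, 21_748_984),
}

# Round to one decimal place, matching the precision used in the abstract.
changes = {
    label: round(percent_change(before, after), 1)
    for label, (before, after) in reported_counts.items()
}
```

Under this calculation the four values come out to -25.6, -19.1, 3.2, and 3.2 percent, matching the reported 25.6% and 19.1% decreases and the 3.2% increases.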
HoneyBee: Progressive Instruction Finetuning of Large Language Models for Materials Science
Yu Song, Santiago Miret, Huan Zhang
et al.
We propose an instruction-based process for trustworthy data curation in materials science (MatSci-Instruct), which we then apply to finetune a LLaMa-based language model targeted for materials science (HoneyBee). MatSci-Instruct helps alleviate the scarcity of relevant, high-quality materials science textual data available in the open literature, and HoneyBee is the first billion-parameter language model specialized to materials science. In MatSci-Instruct we improve the trustworthiness of generated data by prompting multiple commercially available large language models for generation with an Instructor module (e.g. Chat-GPT) and verification from an independent Verifier module (e.g. Claude). Using MatSci-Instruct, we construct a dataset of multiple tasks and measure the quality of our dataset along multiple dimensions, including accuracy against known facts, relevance to materials science, as well as completeness and reasonableness of the data. Moreover, we iteratively generate more targeted instructions and instruction-data in a finetuning-evaluation-feedback loop leading to progressively better performance for our finetuned HoneyBee models. Our evaluation on the MatSci-NLP benchmark shows HoneyBee's outperformance of existing language models on materials science tasks and iterative improvement in successive stages of instruction-data refinement. We study the quality of HoneyBee's language modeling through automatic evaluation and analyze case studies to further understand the model's capabilities and limitations. Our code and relevant datasets are publicly available at https://github.com/BangLab-UdeM-Mila/NLP4MatSci-HoneyBee.
en
cs.CL, cond-mat.mtrl-sci
Robotic autonomous systems for earthmoving in military applications
Q. Ha, L. Yen, C. Balaguer
Along with increasing innovations in frontier engineering sciences, the advancement in Robotic Autonomous Systems (RAS) has brought about a new horizon in earthmoving processes for construction. In the military domain, there is also an increasing interest in utilising RAS technologies. In particular, ground-based forces are frequently called upon to conduct earthmoving tasks as part of military operations, tasks which could be partially or fully aided by the employment of RAS technologies. There have been rapid developments in military construction automation using high-mobility ground-based platforms, human-machine and machine-machine interfaces, teleoperation and control systems, data transmission systems, machine perception and manipulation capabilities, as well as advances in networked robotics and cyberphysical systems. Given these developments it is timely to undertake a comprehensive overview on the topic of interest to the research community and the authority. This paper presents an overview of the RAS development for platform-centric earthworks together with an analysis of the technical feasibility, maturity, key technical challenges, and future directions for the application of RAS technologies to earthmoving tasks of interest to the army.
106 citations
en
Engineering
Effect of the FA2H Gene on cashmere fineness of Jiangnan cashmere goats based on transcriptome sequencing
Cuiling Wu, Jianying Li, Xinming Xu
et al.
Background: Cashmere goats are mammals with heterogeneous hair. The fineness of cashmere can affect its economic value. Therefore, in this study, we used transcriptome sequencing techniques to analyze the gene expression profiles of the skin tissues of cashmere goats with different cashmere fineness. The selected candidate genes were functionally verified with the secondary hair follicle hair papillary cells of cashmere goats. Results: We identified 479 DEGs, of which 238 mRNAs were up-regulated in the fine velvet group and 241 mRNAs were down-regulated. Based on functional annotation and protein interaction network analysis, we found some genes that may affect the fineness of cashmere, including SOX18, SOX4, WNT5A, IGFBP4, KAP8, KRT36, and FA2H. Using qRT-PCR, Western blot, CCK-8 cell viability detection, EDU cell proliferation detection, and flow cytometry, we found that overexpression of the FA2H gene could promote the proliferation of secondary hair follicle DPCs in cashmere goats. At the same time, we showed that FA2H could regulate the expression levels of the FGF5 and BMP2 genes in DPCs. Conclusion: The results of this study provide a useful reference for the genetics and breeding of Jiangnan cashmere goats and goat genome annotation, and provide an experimental basis for improving the cashmere quality of cashmere goats.