Towards Comprehensive Benchmarking Infrastructure for LLMs In Software Engineering
Daniel Rodriguez-Cardenas, Xiaochang Li, Marcos Macedo
et al.
Large language models for code are advancing fast, yet our ability to evaluate them lags behind. Current benchmarks focus on narrow tasks and single metrics, which hide critical gaps in robustness, interpretability, fairness, efficiency, and real-world usability. They also suffer from inconsistent data engineering practices, limited software engineering context, and widespread contamination issues. To understand these problems and chart a path forward, we combined an in-depth survey of existing benchmarks with insights gathered from a dedicated community workshop. We identified three core barriers to reliable evaluation: the absence of software-engineering-rich datasets, overreliance on ML-centric metrics, and the lack of standardized, reproducible data pipelines. Building on these findings, we introduce BEHELM, a holistic benchmarking infrastructure that unifies software-scenario specification with multi-metric evaluation. BEHELM provides a structured way to assess models across tasks, languages, input and output granularities, and key quality dimensions. Our goal is to reduce the overhead currently required to construct benchmarks while enabling a fair, realistic, and future-proof assessment of LLMs in software engineering.
A Comparison of Energy Consumption and Quality of Solutions in Evolutionary Algorithms
Francisco Javier Luque-Hernández, Sergio Aquino-Britez, Josefa Díaz-Álvarez
et al.
Evolutionary algorithms are extensively used to solve optimisation problems. However, it is important to consider and reduce their energy consumption, bearing in mind that programming languages also significantly affect energy efficiency. This research work compares the execution of four frameworks—ParadisEO (C++), ECJ (Java), DEAP and Inspyred (Python)—running on two different architectures: a laptop and a server. The study follows a design that combines three population sizes (<inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msup><mn>2</mn><mn>6</mn></msup></semantics></math></inline-formula>, <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msup><mn>2</mn><mn>10</mn></msup></semantics></math></inline-formula>, <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msup><mn>2</mn><mn>14</mn></msup></semantics></math></inline-formula> individuals) and three crossover probabilities (<inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mn>0.01</mn></mrow></semantics></math></inline-formula>; <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mn>0.2</mn></mrow></semantics></math></inline-formula>; <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mn>0.8</mn></mrow></semantics></math></inline-formula>) applied to four benchmarks (OneMax, Sphere, Rosenbrock and Schwefel). This work makes a relevant methodological contribution by providing a consistent implementation of the metric <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>η</mi><mo>=</mo><mi>f</mi><mi>i</mi><mi>t</mi><mi>n</mi><mi>e</mi><mi>s</mi><mi>s</mi><mo>/</mo><mi>k</mi><mi>W</mi><mi>h</mi></mrow></semantics></math></inline-formula>.
This metric has been systematically applied across the four frameworks, thereby establishing a standardized and replicable protocol for evaluating the energy efficiency of evolutionary algorithms. Energy consumption was estimated with the CodeCarbon software, which reads the hardware's RAPL counters. This unified metric also indicates algorithmic productivity. The experimental results show that the server speeds up the number of generations by a factor of approximately <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mn>2.5</mn></mrow></semantics></math></inline-formula>, but its energy consumption is four to seven times higher. On average, the energy efficiency of the laptop is therefore five times higher. The results confirm two conclusions: computing power alone does not guarantee sustainability, and population size is a key factor in balancing solution quality and energy.
Industrial engineering. Management engineering, Electronic computers. Computer science
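The η = fitness/kWh metric from the abstract above can be sketched in a few lines; the fitness and energy values below are hypothetical, and in the study itself energy is estimated with CodeCarbon (reading RAPL counters) while the algorithm runs.

```python
def efficiency(fitness: float, energy_kwh: float) -> float:
    """Fitness units obtained per kWh consumed (higher is better)."""
    if energy_kwh <= 0:
        raise ValueError("energy must be positive")
    return fitness / energy_kwh

# Hypothetical runs: the server reaches slightly better fitness but
# consumes several times more energy than the laptop.
laptop_eta = efficiency(fitness=0.95, energy_kwh=0.002)
server_eta = efficiency(fitness=0.97, energy_kwh=0.010)
```

A higher η on the laptop, despite its slower runs, is exactly the pattern the study reports.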
Sensing Out-of-Equilibrium and Quantum Non-Gaussian Environments via Induced Time-Reversal Symmetry Breaking on the Quantum-Probe Dynamics
Martin Kuffer, Analia Zwick, Gonzalo A. Álvarez
Advancing quantum sensing tools for investigating atomic and nanoscale systems is crucial for the progress of quantum technologies. While many protocols use quantum probes to extract information from stationary or weakly coupled environments, challenges intensify at the atomic scale and the nanoscale, where the environment is inherently out of equilibrium and/or strongly coupled with the sensor. In this work, we demonstrate that time-reversal symmetry in the control dynamics of a quantum sensor is broken when the qubit sensor interacts with environments that are out of equilibrium (with nonstationary fluctuations), contain quantum non-Gaussian correlations, or exhibit intrinsic time-reversal symmetry breaking. Leveraging this phenomenon, we introduce a quantum sensing paradigm based on time-asymmetric dynamical control sequences, enabling the probing of the distance of an environment from equilibrium, its time-reversal symmetry-breaking behavior, or its quantum non-Gaussian nature. Additionally, we propose sensing strategies to distinguish between these different sources of symmetry breaking. We validate our approach through proof-of-principle experimental quantum simulations using solid-state nuclear magnetic resonance, where we drive the environment of a qubit sensor out of equilibrium and use our protocol to quantify the nonstationary characteristics of the generated states. Our findings highlight the potential of this framework for quantifying nonstationary behavior, designing tailored pump-probe experiments (including measurements of quantum information scrambling), and detecting quantum states that exhibit time-reversal symmetry breaking, time-crystal structures, or quantum non-Gaussian fluctuations. Overall, this work constitutes a step forward in designing quantum devices for atomic scale and nanoscale sensing, broadening their applicability to complex and dynamic quantum environments.
Physics, Computer software
Novel transfer learning based bone fracture detection using radiographic images
Aneeza Alam, Ahmad Sami Al-Shamayleh, Nisrean Thalji
et al.
A bone fracture is a medical condition characterized by a partial or complete break in the continuity of the bone. Fractures are primarily caused by injuries and accidents, affecting millions of people worldwide. The healing process for a fracture can take anywhere from one month to one year, leading to significant economic and psychological challenges for patients. The detection of bone fractures is crucial, and radiographic images are often relied on for accurate assessment. An efficient neural network method is essential for the early detection and timely treatment of fractures. In this study, we propose a novel transfer learning-based approach called MobLG-Net for feature engineering purposes. Initially, the spatial features are extracted from bone X-ray images using a transfer model, MobileNet, and then input into a tree-based light gradient boosting machine (LGBM) model for the generation of class probability features. Several machine learning (ML) techniques are applied to the subsets of newly generated transfer features to compare the results. K-nearest neighbor (KNN), LGBM, logistic regression (LR), and random forest (RF) are implemented using the novel features with optimized hyperparameters. The LGBM and LR models trained on proposed MobLG-Net (MobileNet-LGBM) based features outperformed others, achieving an accuracy of 99% in predicting bone fractures. A cross-validation mechanism is used to evaluate the performance of each model. The proposed study can improve the detection of bone fractures using X-ray images.
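The feature-engineering idea at the heart of MobLG-Net, appending a first-stage model's class probabilities to each feature vector before training the final classifiers, can be illustrated with a toy sketch. This is not the paper's implementation: a nearest-centroid scorer stands in for MobileNet and LGBM here, and all vectors and centroids are hypothetical.

```python
import math

def class_probabilities(x, centroids):
    """Softmax over negative squared distances to each class centroid."""
    scores = [-sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def augment(features, centroids):
    """Concatenate each feature vector with its class-probability features."""
    return [x + class_probabilities(x, centroids) for x in features]

centroids = [[0.0, 0.0], [1.0, 1.0]]      # two toy classes, e.g. no-fracture / fracture
features = [[0.1, 0.2], [0.9, 1.1]]       # stand-ins for extracted spatial features
augmented = augment(features, centroids)  # each row gains one probability per class
```

The downstream KNN, LR, and RF models then train on these augmented vectors rather than the raw features.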
Design of Minimal Spanning Tree and Analytic Hierarchical Process (SAHP) Based Hybrid Technique for Software Requirements Prioritization
Muhammad Yaseen, Esraa Ali, Nadeem Sarwar
et al.
Prioritizing software requirements in a sustainable manner can significantly contribute to the success of a software project, adding substantial value throughout its development lifecycle. The analytic hierarchical process (AHP) is considered to yield more accurate prioritization results, but its large number of pairwise comparisons makes it unscalable when many requirements must be prioritized. To address this scalability issue, a hybrid approach of minimal spanning trees (MSTs) and AHP, called spanning tree and AHP (SAHP), is designed for prioritizing large sets of functional requirements (FRs) with fewer comparisons. In this research, FRs of the on-demand open object (ODOO) enterprise resource planning (ERP) system are prioritized, and the results are compared with AHP. The case study shows that SAHP is more scalable and can prioritize requirements using only n–1 pairwise comparisons per spanning tree. A total of 100 FRs from ODOO were considered, from which 18 spanning trees were constructed. With only 90 pairwise comparisons, these FRs were prioritized more consistently than with AHP. Full AHP would require 4950 pairwise comparisons, 55 times more than SAHP. Consistency is measured by the average consistency index (CI), which was below 0.1; a consistency ratio (CR) below 0.1 indicates that the results are consistent and acceptable.
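The comparison-count savings behind SAHP can be sketched directly: full AHP compares every pair of requirements, while a spanning tree over n requirements needs only n–1 comparisons, one per edge. The exact total for several trees depends on how the FRs are partitioned among them, which the abstract does not detail, so the tree sizes below are illustrative.

```python
def ahp_comparisons(n: int) -> int:
    """Pairwise comparisons for full AHP over n requirements: n*(n-1)/2."""
    return n * (n - 1) // 2

def spanning_tree_comparisons(tree_sizes):
    """One comparison per edge: n_i - 1 for each spanning tree of size n_i."""
    return sum(n - 1 for n in tree_sizes)

print(ahp_comparisons(100))              # 4950, matching the case study
print(spanning_tree_comparisons([100]))  # 99 for a single tree over all FRs
```

With the case study's 90 comparisons, the 4950 required by full AHP is indeed 55 times more.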
Identifying Fraud Sellers in E-Commerce Platform
Lovesh Anand, Hui-Ngo Goh, Choo-Yee Ting
et al.
Identifying fake reviews in e-commerce is crucial, as they can affect buyers' purchasing decisions and overall satisfaction. This work investigates the effectiveness of machine learning and transformer-based models for detecting fake reviews on the Amazon Fake Review Labelled Dataset. The dataset contains 20,000 computer-generated and 20,000 original reviews across various product categories, with no missing values. In this study, machine learning and transformer-based models were compared, revealing that the transformer-based models performed best, achieving an accuracy of 98% with the DistilBERT model. This work also examines the impact of word embeddings on machine learning models for fake review detection. The results show that the Word2Vec embedding model yields notable improvements, achieving accuracies of 92% with SVM and 90% with Random Forest and Logistic Regression. Furthermore, a comparison against transformer models from previous work that used the same full dataset found that DistilBERT produced comparable accuracy despite its lighter architecture. In summary, this study underscores the effectiveness of transformer-based and machine learning models in detecting fake reviews while highlighting the importance of word embedding techniques in enhancing the performance of machine learning models. This work is hoped to contribute to combating fake reviews and fostering trust in e-commerce platforms.
Self-Admitted GenAI Usage in Open-Source Software
Tao Xiao, Youmei Fan, Fabio Calefato
et al.
The widespread adoption of generative AI (GenAI) tools such as GitHub Copilot and ChatGPT is transforming software development. Since generated source code is virtually impossible to distinguish from manually written code, their real-world usage and impact on open-source software (OSS) development remain poorly understood. In this paper, we introduce the concept of self-admitted GenAI usage, that is, developers explicitly referring to the use of GenAI tools for content creation in software artifacts. Using this concept as a lens to study how GenAI tools are integrated into OSS projects, we analyze a curated sample of more than 200,000 GitHub repositories, identifying 1,292 such self-admissions across 156 repositories in commit messages, code comments, and project documentation. Using a mixed methods approach, we derive a taxonomy of 32 tasks, 10 content types, and 11 purposes associated with GenAI usage based on 1,292 qualitatively coded mentions. We then analyze 13 documents with policies and usage guidelines for GenAI tools and conduct a developer survey to uncover the ethical, legal, and practical concerns behind them. Our findings reveal that developers actively manage how GenAI is used in their projects, highlighting the need for project-level transparency, attribution, and quality control practices in AI-assisted software development. Finally, we examine the longitudinal impact of GenAI adoption on code churn in 151 repositories with self-admitted GenAI usage and find no general increase, contradicting popular narratives on the impact of GenAI on software development.
Software Dependencies 2.0: An Empirical Study of Reuse and Integration of Pre-Trained Models in Open-Source Projects
Jerin Yasmin, Wenxin Jiang, James C. Davis
et al.
Pre-trained models (PTMs) are machine learning models that have been trained in advance, often on large-scale data, and can be reused for new tasks, thereby reducing the need for costly training from scratch. Their widespread adoption introduces a new class of software dependency, which we term Software Dependencies 2.0, extending beyond conventional libraries to learned behaviors embodied in trained models and their associated artifacts. The integration of PTMs as software dependencies in real projects remains unclear, potentially threatening maintainability and reliability of modern software systems that increasingly rely on them. Objective: In this study, we investigate Software Dependencies 2.0 in open-source software (OSS) projects by examining the reuse of PTMs, with a focus on how developers manage and integrate these models. Specifically, we seek to understand: (1) how OSS projects structure and document their PTM dependencies; (2) what stages and organizational patterns emerge in the reuse pipelines of PTMs within these projects; and (3) the interactions among PTMs and other learned components across pipeline stages. We conduct a mixed-methods analysis of a statistically significant random sample of 401 GitHub repositories from the PeaTMOSS dataset (28,575 repositories reusing PTMs from Hugging Face and PyTorch Hub). We quantitatively examine PTM reuse by identifying patterns and qualitatively investigate how developers integrate and manage these models in practice.
MobileNet Backbone Based Approach for Quality Classification of Straw Mushrooms (Volvariella volvacea) Using Convolutional Neural Networks (CNN)
Bayu Priyatna, Titik Khawa Abdul Rahman, April Lia Hananto
et al.
Straw mushrooms (Volvariella volvacea) are a crucial commodity in Indonesia, with consumption on the rise due to their nutritional value and increasing demand for healthy food options. Despite this growth, farmers often struggle with accurately assessing the post-harvest quality of mushrooms according to market standards, which can diminish their economic value. Manual classification, which relies on human judgment and estimation, is frequently inefficient and susceptible to errors such as inconsistencies in quality assessment and limitations in detecting subtle variations. This study aims to automate the classification of straw mushrooms based on quality using deep learning, specifically by employing MobileNetv3 as the backbone for classifying mushrooms based on their shape and color according to the Indonesian National Standards (SNI). The MobileNet-CNN Backbone model implemented in this study demonstrated exceptional performance, achieving a classification accuracy of 99%, thus proving its effectiveness and reliability in replacing traditional manual methods. The results of this research indicate significant potential for applying deep learning models to enhance the efficiency and precision of mushroom quality assessment. However, there remain challenges that require further development, including adding more diverse background data, improving image resolution, and refining data augmentation techniques. Addressing these challenges is essential for achieving optimal results in varying environmental conditions, ensuring the model can be broadly implemented in the agricultural industry. Such advancements could lead to more consistent and accurate quality assessments, benefiting producers and consumers in the mushroom market.
A systematic review on EEG-based neuromarketing: recent trends and analyzing techniques
Md. Fazlul Karim Khondakar, Md. Hasib Sarowar, Mehdi Hasan Chowdhury
et al.
Neuromarketing is an emerging research field that aims to understand consumers’ decision-making processes when choosing which product to buy. This information is highly sought after by businesses looking to improve their marketing strategies by understanding what leaves a positive or negative impression on consumers. It has the potential to revolutionize the marketing industry by enabling companies to offer engaging experiences, create more effective advertisements, avoid the wrong marketing strategies, and ultimately save millions of dollars for businesses. Therefore, good documentation is necessary to capture the current research situation in this vital sector. In this article, we present a systematic review of EEG-based Neuromarketing. We aim to shed light on the research trends, technical scopes, and potential opportunities in this field. We reviewed recent publications from valid databases and divided the popular research topics in Neuromarketing into five clusters to present the current research trend in this field. We also discuss the brain regions that are activated when making purchase decisions and their relevance to Neuromarketing applications. The article provides appropriate illustrations of marketing stimuli that can elicit authentic impressions from consumers' minds, the techniques used to process and analyze recorded brain data, and the current strategies employed to interpret the data. Finally, we offer recommendations to upcoming researchers to help them investigate the possibilities in this area more efficiently in the future.
Computer applications to medicine. Medical informatics, Computer software
A Large-Scale Study of Model Integration in ML-Enabled Software Systems
Yorick Sens, Henriette Knopp, Sven Peldszus
et al.
The rise of machine learning (ML) and its integration into software systems has drastically changed development practices. While software engineering traditionally focused on manually created code artifacts with dedicated processes and architectures, ML-enabled systems require additional data-science methods and tools to create ML artifacts -- especially ML models and training data. However, integrating models into systems, and managing the many different artifacts involved, is far from trivial. ML-enabled systems can easily have multiple ML models that interact with each other and with traditional code in intricate ways. Unfortunately, while challenges and practices of building ML-enabled systems have been studied, little is known about the characteristics of real-world ML-enabled systems beyond isolated examples. Improving engineering processes and architectures for ML-enabled systems requires improving the empirical understanding of these systems. We present a large-scale study of 2,928 open-source ML-enabled software systems. We classified and analyzed them to determine system characteristics, model and code reuse practices, and architectural aspects of integrating ML models. Our findings show that these systems still mainly consist of traditional source code, and that ML model reuse through code duplication or pre-trained models is common. We also identified different ML integration patterns and related implementation practices. We hope that our results help improve practices for integrating ML models, bringing data science and software engineering closer together.
Code Ownership: The Principles, Differences, and Their Associations with Software Quality
Patanamon Thongtanunam, Chakkrit Tantithamthavorn
Code ownership -- an approximation of the degree of ownership of a software component -- is one of the important software measures used in quality improvement plans. However, prior studies have proposed different variants of code ownership approximations, and little is known about how these approximations differ and how each is associated with software quality. In this paper, we investigate the differences between the commonly used ownership approximations (i.e., commit-based and line-based) in terms of the set of developers, the approximated code ownership values, and the expertise level. Then, we analyze the association of each code ownership approximation with defect-proneness. Through an empirical study of 25 releases spanning real-world open-source software systems, we find that commit-based and line-based ownership approximations produce different sets of developers, different code ownership values, and different sets of major developers. In addition, we find that the commit-based approximation has a stronger association with software quality than the line-based approximation. Based on our analysis, we recommend that line-based code ownership be used for accountability purposes (e.g., authorship attribution, intellectual property), while commit-based code ownership should be used for rapid bug-fixing and charting quality improvement plans.
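The contrast between the two approximations can be sketched as follows. This is a minimal illustration, not the paper's exact definitions: commit-based ownership is taken as a developer's share of commits to a component, line-based ownership as their share of lines authored, and the commit log is hypothetical.

```python
from collections import Counter

commits = [  # (author, lines_changed) for one component -- hypothetical log
    ("alice", 5), ("alice", 3), ("bob", 200), ("alice", 2),
]

def commit_ownership(log):
    """Share of commits each developer authored."""
    counts = Counter(author for author, _ in log)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}

def line_ownership(log):
    """Share of changed lines each developer authored."""
    lines = Counter()
    for author, n in log:
        lines[author] += n
    total = sum(lines.values())
    return {a: c / total for a, c in lines.items()}
```

On this log the two approximations disagree on the major developer: alice authored 75% of the commits, while bob authored about 95% of the lines, which is the kind of divergence the study measures.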
Systematic Mapping Study on Requirements Engineering for Regulatory Compliance of Software Systems
Oleksandr Kosenkov, Parisa Elahidoost, Tony Gorschek
et al.
Context: As the diversity and complexity of regulations affecting Software-Intensive Products and Services (SIPS) is increasing, software engineers need to address the growing regulatory scrutiny. As with any other non-negotiable requirements, SIPS compliance should be addressed early in SIPS engineering - i.e., during requirements engineering (RE). Objectives: In the conditions of the expanding regulatory landscape, existing research offers scattered insights into regulatory compliance of SIPS. This study addresses the pressing need for a structured overview of the state of the art in software RE and its contribution to regulatory compliance of SIPS. Method: We conducted a systematic mapping study to provide an overview of the current state of research regarding challenges, principles and practices for regulatory compliance of SIPS related to RE. We focused on the role of RE and its contribution to other SIPS lifecycle phases. We retrieved 6914 studies published from 2017 until 2023 from four academic databases, which we filtered down to 280 relevant primary studies. Results: We identified and categorized the RE-related challenges in regulatory compliance of SIPS and their potential connection to six types of principles and practices. We found that about 13.6% of the primary studies considered the involvement of both software engineers and legal experts. About 20.7% of primary studies considered RE in connection to other process areas. Most primary studies focused on a few popular regulation fields and application domains. Our results suggest that there can be differences in terms of challenges and involvement of stakeholders across different fields of regulation. Conclusion: Our findings highlight the need for an in-depth investigation of stakeholders' roles, relationships between process areas, and specific challenges for distinct regulatory fields to guide research and practice.
Say NO to Optimization: A Nonorthogonal Quantum Eigensolver
Unpil Baek, Diptarka Hait, James Shee
et al.
A balanced description of both static and dynamic correlations in electronic systems with nearly degenerate low-lying states presents a challenge for multiconfigurational methods on classical computers. We present here a quantum algorithm utilizing the action of correlating cluster operators to provide high-quality wave-function ansätze employing a nonorthogonal multireference basis that captures a significant portion of the exact wave function in a highly compact manner and that allows computation of the resulting energies and wave functions at polynomial cost with a quantum computer. This enables a significant improvement over the corresponding classical nonorthogonal solver, which incurs an exponential cost when evaluating off-diagonal matrix elements between the ansatz states and is therefore intractable. We implement the nonorthogonal quantum eigensolver (NOQE) here with an efficient ansatz parametrization inspired by classical quantum chemistry methods that succeed in capturing significant amounts of electronic correlation accurately. Crucially, we avoid the need to perform any optimization of the ansatz on the quantum device. By taking advantage of such classical approaches, NOQE provides a flexible, compact, and rigorous description of both static and dynamic electronic correlation, making it an attractive method for the calculation of electronic states of a wide range of molecular systems.
Physics, Computer software
Reflecting on the Use of the Policy-Process-Product Theory in Empirical Software Engineering
Kelechi G. Kalu, Taylor R. Schorlemmer, Sophie Chen
et al.
The primary theory of software engineering is that an organization's Policies and Processes influence the quality of its Products. We call this the PPP Theory. Although empirical software engineering research has grown common, it is unclear whether researchers are trying to evaluate the PPP Theory. To assess this, we analyzed half (33) of the empirical works published over the last two years in three prominent software engineering conferences. In this sample, 70% focus on policies/processes or products, not both. Only 33% provided measurements relating policy/process and products. We make four recommendations: (1) Use PPP Theory in study design; (2) Study feedback relationships; (3) Diversify the studied feedforward relationships; and (4) Disentangle policy and process. Let us remember that research results are in the context of, and with respect to, the relationship between software products, processes, and policies.
Spatially resolved transcriptomics in immersive environments
Denis Bienroth, Hieu T. Nim, Dimitar Garkov
et al.
Spatially resolved transcriptomics is an emerging class of high-throughput technologies that enable biologists to systematically investigate the expression of genes along with spatial information. Upon data acquisition, one major hurdle is the subsequent interpretation and visualization of the acquired datasets. To address this challenge, VR-Cardiomics is presented, a novel data visualization system with interactive functionalities designed to help biologists interpret spatially resolved transcriptomic datasets. By implementing the system in two separate immersive environments, fish tank virtual reality (FTVR) and head-mounted display virtual reality (HMD-VR), biologists can interact with the data in novel ways not previously possible, such as visually exploring the gene expression patterns of an organ and comparing genes based on their 3D expression profiles. Further, a biologist-driven use case is presented, in which immersive environments enable biologists to explore and compare the heart expression profiles of different genes.
Drawing. Design. Illustration, Computer applications to medicine. Medical informatics
Intelligent Monitoring Model for Aggregated Infection Risk Against the Background of COVID-19 Epidemic
CHUN Yutong, HAN Feiteng, HE Mingke
The Corona Virus Disease 2019 (COVID-19) epidemic is a serious threat to people's lives. Supervising the density of crowds and the wearing of masks is key to controlling the virus. Public places are characterized by dense, highly mobile flows of people. Manual monitoring can easily increase the risk of infection, and existing deep-learning-based mask detection algorithms are limited to a single function and a single type of scene; as such, they cannot achieve multi-category detection across multiple scenes, and their accuracy needs to be improved. The Cascade-Attention R-CNN target detection algorithm is proposed to realize the automatic detection of crowd aggregations, pedestrians, and face masks. To address the large variation in target scale in this task, the high-precision two-stage Cascade R-CNN target detection algorithm is selected as the basic detection framework. By designing multiple cascaded candidate classification-regression networks and adding a spatial attention mechanism, the important features of the candidate regions are highlighted and noise features are suppressed, improving detection accuracy. On this basis, an intelligent monitoring model for aggregated infection risk is constructed, and the infection risk level is determined by combining the outputs of the proposed algorithm. The experimental results show that the model has high accuracy and robustness for multi-category target images with different scenes and perspectives. The average accuracy of the Cascade-Attention R-CNN algorithm reaches 89.4%, which is 2.6 percentage points higher than that of the original Cascade R-CNN algorithm, and 10.1 and 8.4 percentage points higher than those of the classic two-stage target detection algorithm Faster R-CNN and the single-stage target detection framework RetinaNet, respectively.
Computer engineering. Computer hardware, Computer software
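One way the detector outputs above might be combined into an infection-risk level is a simple threshold rule over crowd density and mask-wearing ratio. The paper's actual decision rule and thresholds are not given in the abstract, so everything below is a hypothetical sketch.

```python
def risk_level(person_count: int, area_m2: float, mask_ratio: float) -> str:
    """Combine crowd density (people per m^2) and mask-wearing ratio.

    Thresholds are illustrative, not taken from the paper.
    """
    density = person_count / area_m2
    if density > 0.5 and mask_ratio < 0.5:
        return "high"
    if density > 0.5 or mask_ratio < 0.8:
        return "medium"
    return "low"

print(risk_level(person_count=40, area_m2=50.0, mask_ratio=0.3))  # high
```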
An RSE Group Model: Operational and Organizational Approaches From Princeton University's Central Research Software Engineering Group
Ian A. Cosden
The Princeton Research Software Engineering Group has grown rapidly since its inception in late 2016. The group, housed in the central Research Computing Department and composed of professional Research Software Engineers (RSEs), works directly with researchers to create high-quality research software that enables new scientific advances. As the group has matured, so has the need for formalizing operational details and procedures. The RSE group uses an RSE partnership model, where Research Software Engineers work long-term with a designated academic department, institute, center, consortium, or individual principal investigator (PI). This article describes the operation of the central Princeton RSE group, including funding, partner and project selection, and best practices for defining expectations for a successful partnership with researchers.
The General Index of Software Engineering Papers
Zeinab Abou Khalil, Stefano Zacchiroli
We introduce the General Index of Software Engineering Papers, a dataset of fulltext-indexed papers from the most prominent scientific venues in the field of Software Engineering. The dataset includes both complete bibliographic information and indexed n-grams (sequences of contiguous words after removal of stopwords and non-words, for a total of 577 276 382 unique n-grams in this release) with length 1 to 5 for 44 581 papers retrieved from 34 venues over the 1971-2020 period. The dataset serves use cases in the field of meta-research, allowing researchers to introspect the output of software engineering research even when access to papers or scholarly search engines is not possible (e.g., due to contractual reasons). The dataset also contributes to making such analyses reproducible and independently verifiable, as opposed to what happens when they are conducted using third-party, non-open scholarly indexing services. The dataset is available as a portable Postgres database dump and is released as open data.
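The n-gram extraction described above, removing stopwords and non-words and then emitting contiguous word sequences of length 1 to 5, can be sketched in plain Python. The stopword list here is a tiny illustrative subset, not the one used to build the dataset.

```python
STOPWORDS = {"the", "of", "a", "an", "and", "in", "to", "is"}

def ngrams(text: str, max_n: int = 5):
    """Emit n-grams of length 1..max_n after dropping stopwords and non-words."""
    words = [w for w in text.lower().split()
             if w.isalpha() and w not in STOPWORDS]
    out = []
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            out.append(" ".join(words[i:i + n]))
    return out

grams = ngrams("the general index of software engineering papers")
# 5 surviving words yield 5 unigrams, 4 bigrams, ..., 1 five-gram
```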
Intelligent Assignment and Positioning Algorithm of Moving Target Based on Fuzzy Neural Network
QU Li-cheng, LYU Jiao, QU Yi-hua, WANG Hai-fei
In order to solve the problems of limited monitoring range, unreasonable allocation of monitoring resources, and untimely detection of moving targets in intelligent video surveillance systems under special application scenarios, radar detection is exploited: electromagnetic waves offer strong penetration, a large search range, and insensitivity to weather and lighting conditions. Combined with the flexibility and maneuverability of unmanned aerial vehicles and automatic navigation vehicles, this paper proposes a radar-directed integrated linkage video surveillance model and, on this basis, studies a unified coordinate positioning system based on geodetic coordinates and an intelligent assignment and positioning algorithm for moving targets based on a fuzzy neural network optimized by particle swarm optimization. The algorithm automatically solves the control parameters of each camera in the three dimensions of horizontal, vertical, and zoom according to the radar detection signal, and works with the linkage control system to achieve real-time positioning and tracking of moving targets. In field tests at a cultural relics protection site, the target positioning accuracy of the geodetic positioning system reaches 99.6%, and the accuracy of the fuzzy-neural-network-based intelligent assignment algorithm for moving targets reaches 95%, achieving precise positioning and intelligent allocation of monitoring resources with high practical application value.
Computer software, Technology (General)