Results for "Computer software"

Showing 20 of ~2,828,091 results · from CrossRef, arXiv

arXiv Open Access 2026
CIAO - Code In Architecture Out - Automated Software Architecture Documentation with Large Language Models

Marco De Luca, Tiziano Santilli, Domenico Amalfitano et al.

Software architecture documentation is essential for system comprehension, yet it is often unavailable or incomplete. While recent LLM-based techniques can generate documentation from code, they typically address local artifacts rather than producing coherent, system-level architectural descriptions. This paper presents a structured process for automatically generating system-level architectural documentation directly from GitHub repositories using Large Language Models. The process, called CIAO (Code In Architecture Out), defines an LLM-based workflow that takes a repository as input and produces system-level architectural documentation following a template derived from ISO/IEC/IEEE 42010, SEI Views & Beyond, and the C4 model. The resulting documentation can be directly added to the target repository. We evaluated the process through a study with 22 developers, each reviewing the documentation generated for a repository they had contributed to. The evaluation shows that developers generally perceive the produced documentation as valuable, comprehensible, and broadly accurate with respect to the source code, while also highlighting limitations in diagram quality, high-level context modeling, and deployment views. We also assessed the operational cost of the process, finding that generating a complete architectural document requires only a few minutes and is inexpensive to run. Overall, the results indicate that a structured, standards-oriented approach can effectively guide LLMs in producing system-level architectural documentation that is both usable and cost-effective.

en cs.SE, cs.AI
arXiv Open Access 2025
Guidelines for Empirical Studies in Software Engineering involving Large Language Models

Sebastian Baltes, Florian Angermeir, Chetan Arora et al.

Large Language Models (LLMs) are now ubiquitous in software engineering (SE) research and practice, yet their non-determinism, opaque training data, and rapidly evolving models threaten the reproducibility and replicability of empirical studies. We address this challenge through a collaborative effort of 22 researchers, presenting a taxonomy of seven study types that organizes the landscape of LLM involvement in SE research, together with eight guidelines for designing and reporting such studies. Each guideline distinguishes requirements (must) from recommended practices (should) and is contextualized by the study types it applies to. Our guidelines recommend that researchers: (1) declare LLM usage and role; (2) report model versions, configurations, and customizations; (3) document the tool architecture beyond the model; (4) disclose prompts, their development, and interaction logs; (5) validate LLM outputs with humans; (6) include an open LLM as a baseline; (7) use suitable baselines, benchmarks, and metrics; and (8) articulate limitations and mitigations. We complement the guidelines with an applicability matrix mapping guidelines to study types and a reporting checklist for authors and reviewers. We maintain the study types and guidelines online as a living resource for the community to use and shape (llm-guidelines.org).

en cs.SE
arXiv Open Access 2025
Automated Code Generation and Validation for Software Components of Microcontrollers

Sebastian Haug, Christoph Böhm, Daniel Mayer

This paper proposes a method for generating software components for embedded systems, integrating seamlessly into existing implementations without developer intervention. We demonstrate this by automatically generating hardware abstraction layer (HAL) code for GPIO operations on the STM32F407 microcontroller. Using Abstract Syntax Trees (AST) for code analysis and Retrieval-Augmented Generation (RAG) for component generation, our approach enables autonomous code completion for embedded applications.

en cs.SE, cs.LG
arXiv Open Access 2024
Automatic Data Labeling for Software Vulnerability Prediction Models: How Far Are We?

Triet H. M. Le, M. Ali Babar

Background: Software Vulnerability (SV) prediction needs large-sized and high-quality data to perform well. Current SV datasets mostly require expensive labeling efforts by experts (human-labeled) and thus are limited in size. Meanwhile, there are growing efforts in automatic SV labeling at scale. However, the fitness of auto-labeled data for SV prediction is still largely unknown. Aims: We quantitatively and qualitatively study the quality and use of the state-of-the-art auto-labeled SV data, D2A, for SV prediction. Method: Using multiple sources and manual validation, we curate clean SV data from human-labeled SV-fixing commits in two well-known projects for investigating the auto-labeled counterparts. Results: We discover that 50+% of the auto-labeled SVs are noisy (incorrectly labeled), and they hardly overlap with the publicly reported ones. Yet, SV prediction models utilizing the noisy auto-labeled SVs can perform up to 22% and 90% better in Matthews Correlation Coefficient and Recall, respectively, than the original models. We also reveal the promises and difficulties of applying noise-reduction methods for automatically addressing the noise in auto-labeled SV data to maximize the data utilization for SV prediction. Conclusions: Our study informs the benefits and challenges of using auto-labeled SVs, paving the way for large-scale SV prediction.

en cs.SE, cs.CR
arXiv Open Access 2023
Battle of the Blocs: Quantity and Quality of Software Engineering Research by Origin

Lorenz Graf-Vlachy

Software engineering capabilities are increasingly important to the success of economic and political blocs. This paper analyzes the quantity and quality of software engineering research output originating from the US, Europe, and China over time. The results indicate that the quantity of research is increasing across the board, with Europe leading the field. Depending on the scope of the analysis, either the US or China comes in second. Regarding research quality, Europe appears to be lagging the other blocs, with China having caught up to and even having overtaken the US over time.

arXiv Open Access 2023
Evaluation and Measurement of Software Process Improvement -- A Systematic Literature Review

Michael Unterkalmsteiner, Tony Gorschek, A. K. M. Moinul Islam et al.

BACKGROUND: Software Process Improvement (SPI) is a systematic approach to increase the efficiency and effectiveness of a software development organization and to enhance software products. OBJECTIVE: This paper aims to identify and characterize evaluation strategies and measurements used to assess the impact of different SPI initiatives. METHOD: The systematic literature review includes 148 papers published between 1991 and 2008. The selected papers were classified according to SPI initiative, applied evaluation strategies, and measurement perspectives. Potential confounding factors interfering with the evaluation of the improvement effort were assessed. RESULTS: Seven distinct evaluation strategies were identified, wherein the most common one, "Pre-Post Comparison", was applied in 49 percent of the inspected papers. Quality was the most measured attribute (62 percent), followed by Cost (41 percent) and Schedule (18 percent). Looking at measurement perspectives, "Project" represents the majority, with 66 percent. CONCLUSION: The evaluation validity of SPI initiatives is challenged by the scarce consideration of potential confounding factors, particularly given that "Pre-Post Comparison" was identified as the most common evaluation strategy, and by inaccurate descriptions of the evaluation context. Measurements to assess the short- and mid-term impact of SPI initiatives prevail, whereas long-term measurements in terms of customer satisfaction and return on investment tend to be less used.

arXiv Open Access 2023
The Effect of Stereotypes on Perceived Competence of Indigenous Software Practitioners: A Professional Photo

Mary Sánchez-Gordón, Ricardo Colomo-Palacios, Cathy Guevara-Vega et al.

Context: Potential employers can readily find job candidates' photos through various online sources such as former employers' websites or professional and social networks. The alignment or 'fit' between a candidate and an organization is inferred in online photos through dress style and presentations of self. On the other hand, for candidates from under-represented groups like Indigenous people, traditional clothing is an important and lively aspect that allows them to express belonging, enter ceremony, and show resistance. Objective: This exploratory study aims to empirically demonstrate whether traditional clothing in a picture affects the evaluation of candidates' competence for a position like software developer, in which clothing should not be crucial. Method: We plan a quasi-experimental design with both candidates (photo models) and participants (evaluators) from IT companies. It follows a 2 x 2 x 2 design with dress style (traditional / non-traditional clothing), gender, and race/ethnicity of the candidates as within-subjects factors. In addition, we will explore the evaluator's gender and experience in hiring as between-subjects factors.

en cs.SE
arXiv Open Access 2022
On the Use of Fine-grained Vulnerable Code Statements for Software Vulnerability Assessment Models

Triet H. M. Le, M. Ali Babar

Many studies have developed Machine Learning (ML) approaches to detect Software Vulnerabilities (SVs) in functions and the fine-grained code statements that cause such SVs. However, there is little work on leveraging such detection outputs for data-driven SV assessment to give information about the exploitability, impact, and severity of SVs. This information is important for understanding SVs and prioritizing their fixing. Using large-scale data from 1,782 functions of 429 SVs in 200 real-world projects, we investigate ML models for automating function-level SV assessment tasks, i.e., predicting seven Common Vulnerability Scoring System (CVSS) metrics. We particularly study the value and use of vulnerable statements as inputs for developing the assessment models, because SVs in functions originate in these statements. We show that vulnerable statements are 5.8 times smaller in size, yet exhibit 7.5-114.5% stronger assessment performance (Matthews Correlation Coefficient (MCC)) than non-vulnerable statements. Incorporating the context of vulnerable statements further increases the performance by up to 8.9% (0.64 MCC and 0.75 F1-Score). Overall, we provide initial yet promising ML-based baselines for function-level SV assessment, paving the way for further research in this direction.

en cs.SE, cs.CR
arXiv Open Access 2021
Section 108 and Software Collections, A User's Guide

Ana Enriquez

This user's guide explains Section 108 of the U.S. Copyright Act, a set of rights for libraries and archives, in the context of software collections. It also addresses the interaction between Section 108 and fair use (Section 107) in this context. The guide will help library and archives workers who preserve and provide access to software collections to navigate U.S. copyright law.

arXiv Open Access 2021
On the impact of Performance Antipatterns in multi-objective software model refactoring optimization

Vittorio Cortellessa, Daniele Di Pompeo, Vincenzo Stoico et al.

Software quality estimation is a challenging and time-consuming activity, and models are crucial to face the complexity of such activity on modern software applications. One main challenge is that the improvement of distinctive quality attributes may require contrasting refactoring actions on an application, as for trade-off between performance and reliability. In such cases, multi-objective optimization can provide the designer with a wider view on these trade-offs and, consequently, can lead to identify suitable actions that take into account independent or even competing objectives. In this paper, we present an approach that exploits the NSGA-II multi-objective evolutionary algorithm to search optimal Pareto solution frontiers for software refactoring while considering as objectives: i) performance variation, ii) reliability, iii) amount of performance antipatterns, and iv) architectural distance. The algorithm combines randomly generated refactoring actions into solutions (i.e., sequences of actions) and compares them according to the objectives. We have applied our approach on a train ticket booking service case study, and we have focused the analysis on the impact of performance antipatterns on the quality of solutions. Indeed, we observe that the approach finds better solutions when antipatterns enter the multi-objective optimization. In particular, performance antipatterns objective leads to solutions improving the performance by up to 15% with respect to the case where antipatterns are not considered, without affecting the solution quality on other objectives.

en cs.SE, cs.PF
arXiv Open Access 2021
How Tertiary Studies perform Quality Assessment of Secondary Studies in Software Engineering

Dolors Costal, Carles Farré, Xavier Franch et al.

Context: Tertiary studies are becoming increasingly popular in software engineering as an instrument to synthesise evidence on a research topic in a systematic way. In order to understand and contextualize their findings, it is important to assess the quality of the selected secondary studies. Objective: This paper aims to provide a state of the art on the assessment of secondary studies' quality as conducted in tertiary studies in the area of software engineering, reporting the frameworks used as instruments, the facets examined in these frameworks, and the purposes of the quality assessment. Method: We designed this study as a systematic mapping responding to four research questions derived from the objective above. We applied a rigorous search protocol over the Scopus digital library, resulting in 47 papers after application of inclusion and exclusion criteria. The extracted data was synthesised using content analysis. Results: A majority of tertiary studies perform quality assessment. It is not often used for excluding studies, but to support some kind of investigation. The DARE quality assessment framework is the most frequently used, with customizations in some cases to cover missing facets. We outline the first steps towards building a new framework to address the shortcomings identified. Conclusion: This paper is a step forward in establishing a foundation for researchers in two different ways: as authors of tertiary studies, understanding the different possibilities in which they can perform quality assessment of secondary studies; and as readers, having an instrument to understand the methodological rigor upon which tertiary studies may claim their findings.

en cs.SE
arXiv Open Access 2020
Synergizing Domain Expertise with Self-Awareness in Software Systems: A Patternized Architecture Guideline

Tao Chen, Rami Bahsoon, Xin Yao

To promote engineering self-aware and self-adaptive software systems in a reusable manner, architectural patterns and the related methodology provide a unified solution to handle the recurring problems in the engineering process. However, in existing patterns and methods, domain knowledge and engineers' expertise that are built over time are not explicitly linked to the self-aware processes. This linkage is important, as the knowledge is a valuable asset for the related problems, and its absence would cause unnecessary overhead, possibly misleading results, and an unwise waste of the tremendous benefit that could have been brought by the domain expertise. This paper highlights the importance of synergizing domain expertise and self-awareness to enable better self-adaptation in software systems, relying on well-defined expertise representation, algorithms, and techniques. In particular, we present a holistic framework of notions, enriched patterns, and methodology, dubbed DBASES, that offers a principled guideline for engineers to perform difficulty and benefit analysis on possible synergies, in an attempt to keep "engineers-in-the-loop". Through three tutorial case studies, we demonstrate how DBASES can be applied in different domains, within which a carefully selected set of candidates with different synergies can be used for quantitative investigation, providing more informed decisions on the design choices.

en cs.SE, cs.AI
arXiv Open Access 2020
Systematic Literature Reviews in Software Engineering -- Enhancement of the Study Selection Process using Cohen's Kappa Statistic

Jorge Pérez, Jessica Díaz, Javier Garcia-Martin et al.

Context: Systematic literature reviews (SLRs) rely on a rigorous and auditable methodology for minimizing biases and ensuring reliability. A common kind of bias arises when selecting studies using a set of inclusion/exclusion criteria. This bias can be decreased through dual revision, which makes the selection process more time-consuming and remains prone to generating bias depending on how each researcher interprets the inclusion/exclusion criteria. Objective: To reduce the bias and time spent in the study selection process, this paper presents a process for selecting studies based on the use of Cohen's Kappa statistic. We have defined an iterative process based on the use of this statistic during which the criteria are refined until almost perfect agreement (k>0.8) is obtained. At this point, the two researchers interpret the selection criteria in the same way, and thus the bias is reduced. Starting from this agreement, dual review can be eliminated; consequently, the time spent is drastically shortened. Method: The feasibility of this iterative process for selecting studies is demonstrated through a tertiary study in the area of software engineering on works that were published from 2005 to 2018. Results: The time saved in the study selection process was 28% (for 152 studies), and if the number of studies is sufficiently large, the time saved tends asymptotically to 50%. Conclusions: Researchers and students may take advantage of this iterative process for selecting studies when conducting SLRs to reduce bias in the interpretation of inclusion and exclusion criteria. It is especially useful for research with few resources.

arXiv Open Access 2019
SaaS CloudQual: A Quality Model for Evaluating Software as a Service on the Cloud Computing Environment

Dhanamma Jagli, Seema Purohit, N. Subash Chandra

Cloud computing is a key computing approach adopted by many organizations in order to share resources. It provides Everything-As-A-Service (XaaS). Software-As-A-Service (SaaS) is an important resource in the cloud computing environment: without installing any software locally, service users can use software as a utility and enjoy the benefits of the SaaS model. As SaaS usage has increased drastically, the demand for assessing its quality has also increased. This paper presents a novel quality model intended for evaluating software as a service, based on the key features of SaaS, because these key features play a critical role in quality and differentiate SaaS from conventional software.

arXiv Open Access 2019
Replicated Computational Results (RCR) Report for "Code Generation for Generally Mapped Finite Elements"

Neil Lindquist

"Code Generation for Generally Mapped Finite Elements" includes performance results for the finite element methods discussed in that manuscript. The authors provided a Zenodo archive with the Firedrake components and dependencies used, as well as the scripts that generated the results. The software was installed on two similar platforms; then, new results were gathered and compared to the original results. After completing this process, the results have been deemed replicable by the reviewer.

arXiv Open Access 2019
Towards Using Data to Inform Decisions in Agile Software Development: Views of Available Data

Christoph Matthies, Guenter Hesse

Software development comprises complex tasks which are performed by humans. It involves problem solving, domain understanding and communication skills as well as knowledge of a broad variety of technologies, architectures, and solution approaches. As such, software development projects include many situations where crucial decisions must be made. Making the appropriate organizational or technical choices for a given software team building a product can make the difference between project success or failure. Software development methods have introduced frameworks and sets of best practices for certain contexts, providing practitioners with established guidelines for these important choices. Current Agile methods employed in modern software development have highlighted the importance of the human factors in software development. These methods rely on short feedback loops and the self-organization of teams to enable collaborative decision making. While Agile methods stress the importance of empirical process control, i.e. relying on data to make decisions, they do not prescribe in detail how this goal should be achieved. In this paper, we describe the types and abstraction levels of data and decisions within modern software development teams and identify the benefits that usage of this data enables. We argue that the principles of data-driven decision making are highly applicable, yet underused, in modern Agile software development.

arXiv Open Access 2019
An Experience Report On Applying Software Testing Academic Results In Industry: We Need Usable Automated Test Generation

Andrea Arcuri

What is the impact of software engineering research on current practices in industry? In this paper, I report on my direct experience as a PhD/post-doc working in software engineering research projects, and then spending the following five years as an engineer in two different companies (the first one being the same I worked in collaboration with during my post-doc). Given a background in software engineering research, what cutting-edge techniques and tools from academia did I use in my daily work when developing and testing the systems of these companies? Regarding validation and verification (my main area of research), the answer is rather short: as far as I can tell, only FindBugs. In this paper, I report on why this was the case, and discuss all the challenging, complex open problems we face in industry and which somehow are "neglected" in the academic circles. In particular, I will first discuss what actual tools I could use in my daily work, such as JaCoCo and Selenium. Then, I will discuss the main open problems I faced, particularly related to environment simulators, unit and web testing. After that, popular topics in academia are presented, such as UML, regression and mutation testing. Their lack of impact on the type of projects I worked on in industry is then discussed. Finally, from this industrial experience, I provide my opinions about how this situation can be improved, in particular related to how academics are evaluated, and advocate for a greater involvement into open-source projects.

arXiv Open Access 2014
Software for Computing the Spheroidal Wave Functions Using Arbitrary Precision Arithmetic

Ross Adelman, Nail A. Gumerov, Ramani Duraiswami

The spheroidal wave functions, which are the solutions to the Helmholtz equation in spheroidal coordinates, are notoriously difficult to compute. Because of this, practically no programming language comes equipped with the means to compute them. This makes problems that require their use hard to tackle. We have developed computational software for calculating these special functions. Our software is called spheroidal and includes several novel features, such as: using arbitrary precision arithmetic; adaptively choosing the number of expansion coefficients to compute and use; and using the Wronskian to choose from several different methods for computing the spheroidal radial functions to improve their accuracy. There are two types of spheroidal wave functions: the prolate kind, when prolate spheroidal coordinates are used; and the oblate kind, when oblate spheroidal coordinates are used. In this paper, we describe both kinds, methods for computing them, and our software. We have made our software freely available on our webpage.

en cs.MS, math.NA
arXiv Open Access 2013
Are Happy Developers more Productive? The Correlation of Affective States of Software Developers and their self-assessed Productivity

Daniel Graziotin, Xiaofeng Wang, Pekka Abrahamsson

For decades now, it has been claimed that a way to improve software developers' productivity is to focus on people. Indeed, while human factors have been recognized in Software Engineering research, few empirical investigations have attempted to verify the claim. Development tasks are undertaken through cognitive processing abilities. Affective states - emotions, moods, and feelings - have an impact on work-related behaviors, cognitive processing activities, and the productivity of individuals. In this paper, we report an empirical study on the impact of affective states on software developers' performance while programming. Two affective-state dimensions are positively correlated with self-assessed productivity. We demonstrate the value of applying psychometrics in Software Engineering studies and echo a call to valorize the human, individualized aspects of software developers. We introduce and validate a measurement instrument and a linear mixed-effects model to study the correlation of affective states and the productivity of software developers.

en cs.SE, cs.HC

Page 51 of 141,405