Jia-Wei Bian, Ruo-Shan Tseng, Chih-Min Hsieh
et al.
Coastal sediment transport is a dynamic process influenced by the continuous interplay of waves and currents. However, traditional models based on linear wave theory often fail to capture the full complexity of nearshore turbulence. In this study, we present the first field-based application of nonlinear wave theories—including third-order Stokes and cnoidal wave formulations—to quantify wave-induced shear stress and evaluate its impact on turbidity. Utilizing in situ measurements from four extreme hydrodynamic events—Typhoon Dujuan (2015), Extreme Cold Surge (2016), Prolonged Heavy Rainfall (2018), and Typhoon Wipha (2019)—off the coast of Houwan, Taiwan, we reveal a compelling pattern: third-order Stokes theory not only predicts higher magnitudes of shear stress but also exhibits remarkable temporal alignment with observed turbidity surges. In contrast, current-induced shear stress remains relatively low. These findings challenge the prevalent reliance on linear wave assumptions and establish a validated nonlinear modeling framework for coastal morphodynamics. By capturing the episodic chaos associated with storm-driven seas, this study provides critical insights for predicting sediment transport in light of intensifying climatic extremes.
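For readers unfamiliar with wave-induced bed shear stress, a minimal sketch below computes it from the near-bed orbital velocity of linear wave theory combined with the Soulsby rough-bed friction factor. This is the linear baseline the study argues against, not the paper's third-order Stokes or cnoidal formulation, and all parameter values (roughness, water depth, wave heights) are illustrative.

```python
import math

def wavenumber(T, h, g=9.81, iters=50):
    """Solve the linear dispersion relation omega^2 = g k tanh(k h) by fixed-point iteration."""
    omega = 2 * math.pi / T
    k = omega**2 / g  # deep-water initial guess
    for _ in range(iters):
        k = omega**2 / (g * math.tanh(k * h))
    return k

def wave_bed_shear_stress(H, T, h, z0=0.001, rho=1025.0):
    """Bed shear stress under waves: tau = 0.5 * rho * f_w * U_b^2,
    with the Soulsby (1997) rough-bed friction factor f_w = 1.39 (A/z0)^-0.52.
    Orbital velocity U_b comes from *linear* theory here (the baseline)."""
    k = wavenumber(T, h)
    U_b = math.pi * H / (T * math.sinh(k * h))   # near-bed orbital velocity amplitude
    A = U_b * T / (2 * math.pi)                  # orbital excursion amplitude
    f_w = 1.39 * (A / z0) ** -0.52
    return 0.5 * rho * f_w * U_b**2

# Storm-scale waves in the same water depth produce far larger bed stress,
# consistent with the episodic turbidity surges the study observes.
tau_calm = wave_bed_shear_stress(H=0.5, T=6.0, h=10.0)
tau_storm = wave_bed_shear_stress(H=3.0, T=10.0, h=10.0)
```

Nonlinear theories modify U_b (sharper crests, flatter troughs), which is why third-order Stokes predictions diverge most during storms.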
Simon Ghyselincks, Valeriia Okhmak, Stefano Zampini
et al.
Reconstructing the structural geology and mineral composition of the first few kilometers of the Earth's subsurface from sparse or indirect surface observations remains a long-standing challenge with critical applications in mineral exploration, geohazard assessment, and geotechnical engineering. This inherently ill-posed problem is often addressed by classical geophysical inversion methods, which typically yield a single maximum-likelihood model that fails to capture the full range of plausible geology. The adoption of modern deep learning methods has been limited by the lack of large 3D training data sets. We address this gap with StructuralGeo, a geological simulation engine that mimics eons of tectonic, magmatic, and sedimentary processes to generate a virtually limitless supply of realistic synthetic 3D lithological models. Using this data set, we train both unconditional and conditional generative flow-matching models with a 3D attention U-Net architecture. The resulting foundation model can reconstruct multiple plausible 3D scenarios from surface topography and sparse borehole data, depicting structures such as layers, faults, folds, and dikes. By sampling many reconstructions from the same observations, we introduce a probabilistic framework for estimating the size and extent of subsurface features. While the realism of the output is bounded by the fidelity of the training data to true geology, this combination of simulation and generative AI offers a flexible prior for probabilistic modeling, regional fine-tuning, and use as an AI-based regularizer in traditional geophysical inversion workflows.
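The flow-matching objective behind such generative models can be sketched in a few lines: training pairs are interpolated linearly between noise and data, and a model is regressed onto the target velocity along that path. The linear least-squares fit below is a toy stand-in for the paper's 3D attention U-Net and is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def cfm_pair(x0, x1, t):
    """Linear interpolation path used in (rectified) flow matching:
    x_t = (1 - t) * x0 + t * x1, with regression target v = x1 - x0."""
    xt = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return xt, v_target

# Toy setup: map N(0, 1) "noise" to a shifted "data" distribution, so the
# true velocity field is a constant offset that a linear model can recover.
x0 = rng.normal(size=(512, 2))           # noise samples
x1 = x0 + np.array([3.0, -1.0])          # "data": noise plus a fixed shift
t = rng.uniform(size=(512, 1))           # random times in [0, 1]
xt, v = cfm_pair(x0, x1, t)

# Least-squares fit of v ~ [x_t, t, 1] stands in for the neural network.
feats = np.hstack([xt, t, np.ones((512, 1))])
W, *_ = np.linalg.lstsq(feats, v, rcond=None)
mse = float(np.mean((feats @ W - v) ** 2))   # training loss of the fit
```

Sampling then integrates dx/dt = v(x, t) from noise; drawing many such samples per observation is what yields the probabilistic reconstructions described above.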
Geophysics, Cosmic physics, Information technology
We explore the concept of folklore within software engineering, drawing from folklore studies to define and characterize narratives, myths, rituals, humor, and informal knowledge that circulate within software development communities. Using a literature review and thematic analysis, we curated exemplar folklore items (e.g., beliefs about where defects occur, the 10x developer legend, and technical debt). We analyzed their narrative form, symbolic meaning, occupational relevance, and links to knowledge areas in software engineering. To ground these concepts in practice, we conducted semi-structured interviews with 12 industrial practitioners in Sweden to explore how such narratives are recognized or transmitted within their daily work and how they affect it. Synthesizing these results, we propose a working definition of software engineering folklore as informally transmitted, traditional, and emergent narratives and heuristics enacted within occupational folk groups that shape identity, values, and collective knowledge. We argue that making the concept of software engineering folklore explicit provides a foundation for subsequent ethnography and folklore studies and for reflective practice that can preserve context-effective heuristics while challenging unhelpful folklore.
Offshore drilling rigs equipped with cylinder lifting systems have been increasingly adopted in newly constructed drilling vessels, offering significant advantages over traditional winch-based lifting systems. These advantages include improved control over the ship's center of gravity, reduced installed power requirements, enhanced maintainability, and greater energy efficiency. In this study, the feasibility and performance of the drilling string compensation function provided by the new cylinder lifting system are investigated. The mechanism underlying the drilling string compensation function is analyzed, and the longitudinal vibration characteristics of the drilling string are examined during offshore deep-hole drilling operations. To further assess the system's capabilities, a detailed simulation model of the drilling string compensation function is developed using the AMESim software platform. This model allows for the evaluation of the system's performance under varying operating conditions and different drilling depths. The results demonstrate that the cylinder lifting system is capable of achieving both passive and semi-active compensation of the drilling string. Notably, the system can effectively control compensation load and mitigate fluctuations in bottomhole pressure, thereby meeting operational requirements. While operating conditions have a greater impact on passive compensation, the introduction of semi-active compensation significantly reduces the influence of these conditions, ensuring more stable and efficient drilling operations.
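To first order, passive heave compensation can be viewed as a soft gas spring (the cylinder and accumulator) placed in series with the drill string, lowering the effective axial stiffness seen by vessel heave. The sketch below uses invented stiffness values, not figures from the paper's AMESim model.

```python
def load_fluctuation(heave_amp, k_string, k_comp=None):
    """Quasi-static hook-load fluctuation under sinusoidal heave.
    A passive compensator acts, to first order, as a soft spring in series
    with the drill string: k_eff = k_s * k_c / (k_s + k_c) << k_s."""
    if k_comp is None:
        k_eff = k_string                          # rigid connection, no compensation
    else:
        k_eff = k_string * k_comp / (k_string + k_comp)
    return k_eff * heave_amp                      # peak load variation, N

# Illustrative numbers (not from the paper's simulation model)
k_string = 2.0e6   # N/m, axial stiffness of a long drill string
k_comp = 1.0e5     # N/m, soft compensator cylinder/accumulator
heave = 1.5        # m, vessel heave amplitude

dF_rigid = load_fluctuation(heave, k_string)
dF_passive = load_fluctuation(heave, k_string, k_comp)
```

Semi-active schemes add a controlled force on top of this passive spring, which is how they suppress the residual load and bottomhole-pressure fluctuations across operating conditions.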
The intensification of the urban heat island (UHI) effect poses a serious threat to public health, particularly in densely built cities. Effectively mitigating UHI has been a focus of national and international academic research over recent decades. However, most contemporary research has focused on land use mitigation measures within urban areas, with less emphasis on suburban land use. To address this research gap and explore spatial characteristics, we analyzed the driving mechanism of suburban land use patterns on UHI intensity (UHII) within the main urban area of Shenyang City based on high spatial resolution raster data, such as Landsat remote sensing images and land use data, combined with extreme gradient boosting and SHapley Additive exPlanations models. The landscape fragmentation index of overall suburban land use contributed more strongly to the UHII in urban areas than the aggregation index. Increased cropland fragmentation and aggregation enhance UHII mitigation, whereas increased aggregation of impervious surfaces intensifies UHII. No significant difference was observed between the effects of various suburban gradient landscapes on UHII; however, the effects on different gradients in urban areas increased with decreasing distance from the countryside, with a minimal effect observed at the extreme center of the city (U1). The study provides a theoretical reference for mitigating land use pressure and reducing the UHI in urban areas based on suburban land use.
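The SHapley Additive exPlanations used here satisfy local accuracy: feature attributions plus a base value reconstruct each prediction exactly. For a linear model with independent features this has a closed form, sketched below with invented effect sizes (not the study's fitted gradient-boosting model).

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-in: a linear "UHII model" over two landscape metrics
# (fragmentation index, aggregation index). Coefficients are illustrative.
X = rng.normal(size=(200, 2))
w = np.array([0.8, -0.3])       # hypothetical effect sizes
b = 1.5                         # intercept
y = X @ w + b                   # predicted UHII (arbitrary units)

def linear_shap(x, w, b, X_background):
    """Exact Shapley values for a linear model with independent features:
    phi_i = w_i * (x_i - E[x_i]); the base value is the mean prediction."""
    mu = X_background.mean(axis=0)
    phi = w * (x - mu)
    base = mu @ w + b
    return phi, base

x = X[0]
phi, base = linear_shap(x, w, b, X)
reconstructed = base + phi.sum()   # local accuracy: equals the prediction
```

Tree-based SHAP generalizes this additive decomposition to the nonlinear gradient-boosting model, which is what lets the study rank fragmentation against aggregation as UHII drivers.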
Deconvoluting drug targets is crucial in modern drug development, yet both traditional and artificial intelligence (AI)-driven methods face challenges in terms of completeness, accuracy, and efficiency. Identifying drug targets, especially within complex systems such as the p53 pathway, remains a formidable task. The regulation of this pathway by myriad stress signals and regulatory elements adds layers of complexity to the discovery of effective p53 pathway activators. Recent insights into p53 activation have led to two main screening strategies for p53 activators. The target-based approach focuses on p53 and its regulators (MDM2, MDMX, USP7, Sirt proteins), but requires separate systems for each target and may miss multi-target compounds. Phenotype-based screening can reveal new targets but involves a lengthy process to elucidate mechanisms and targets, hindering drug development. Knowledge graphs have emerged as powerful tools that offer strengths in link prediction and knowledge inference to address these issues. In this study, we constructed a protein-protein interaction knowledge graph (PPIKG) and pioneered an integrated drug target deconvolution system that combines AI with molecular docking techniques. Analysis based on the PPIKG narrowed down candidate proteins from 1088 to 35, significantly saving time and cost. Subsequent molecular docking led us to pinpoint USP7 as a direct target for the p53 pathway activator UNBS5162. Leveraging knowledge graphs and a multidisciplinary approach allows us to streamline the laborious and expensive process of reverse targeting drug discovery through phenotype screening. Our findings have the potential to revolutionize drug screening and open new avenues in pharmacological research, increasing the speed and efficiency of pursuing novel therapeutics. The code is available at https://github.com/Xiong-Jing/PPIKG.
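The graph-based narrowing step (1088 candidates down to 35) can be illustrated with a toy neighborhood filter over a protein-protein interaction graph: only candidates with an edge into the pathway seed set survive for docking. The miniature graph below is invented for illustration and is not part of the actual PPIKG.

```python
# Hypothetical miniature PPI graph (adjacency sets); gene symbols are
# real pathway members plus two housekeeping genes, but the edge set is
# invented for illustration.
ppi = {
    "TP53": {"MDM2", "USP7", "SIRT1"},
    "MDM2": {"TP53", "USP7", "MDMX"},
    "USP7": {"TP53", "MDM2"},
    "SIRT1": {"TP53"},
    "MDMX": {"MDM2"},
    "GAPDH": {"ACTB"},
    "ACTB": {"GAPDH"},
}

def filter_candidates(candidates, seeds, graph):
    """Keep only candidates with at least one PPI edge into the seed set."""
    return sorted(c for c in candidates if graph.get(c, set()) & seeds)

seeds = {"TP53"}
candidates = set(ppi) - seeds
shortlist = filter_candidates(candidates, seeds, ppi)
# Direct TP53 interactors survive; unconnected proteins are pruned before
# the expensive molecular-docking stage.
```

In the study this pruning is what makes exhaustive docking against the shortlist tractable, ultimately singling out USP7.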
Sentiment analysis is an essential technique for investigating the emotional climate within developer teams, contributing to both team productivity and project success. Existing sentiment analysis tools in software engineering primarily rely on English or non-German gold-standard datasets. To address this gap, our work introduces a German dataset of 5,949 unique developer statements, extracted from the German developer forum Android-Hilfe.de. Each statement was annotated with one of six basic emotions, based on the emotion model by Shaver et al., by four German-speaking computer science students. Evaluation of the annotation process showed high interrater agreement and reliability. These results indicate that the dataset is sufficiently valid and robust to support sentiment analysis in the German-speaking software engineering community. Evaluation with existing German sentiment analysis tools confirms the lack of domain-specific solutions for software engineering. We also discuss approaches to optimize annotation and present further use cases for the dataset.
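Interrater agreement of the kind reported for this dataset is commonly quantified with Cohen's kappa, which corrects raw agreement for agreement expected by chance. A minimal sketch with invented annotations (the emotion labels follow the Shaver-style basic-emotion scheme but are not dataset entries):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two annotators:
    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e the agreement expected from each rater's label frequencies."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    p_e = sum((ca[l] / n) * (cb[l] / n) for l in set(ca) | set(cb))
    return (p_o - p_e) / (1 - p_e)

# Toy annotations from two of the four annotators (illustrative only)
a = ["joy", "anger", "joy", "sadness", "joy", "fear", "anger", "joy"]
b = ["joy", "anger", "joy", "sadness", "love", "fear", "anger", "joy"]
kappa = cohens_kappa(a, b)   # 7/8 raw agreement, corrected for chance
```

With four annotators, a multi-rater generalization such as Fleiss' kappa is typically reported, but the chance-correction idea is the same.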
Paris Avgeriou, Nauman bin Ali, Marcos Kalinowski
et al.
Increasingly, courses on Empirical Software Engineering research methods are being offered in higher education institutes across the world, mostly at the M.Sc. and Ph.D. levels. While the need for such courses is evident and in line with modern software engineering curricula, educators designing and implementing such courses have so far been reinventing the wheel; every course is designed from scratch with little to no reuse of ideas or content across the community. Due to the nature of the topic, it is rather difficult to get it right the first time when defining the learning objectives, selecting the material, compiling a reader, and, more importantly, designing relevant and appropriate practical work. This leads to substantial effort (through numerous iterations) and poses risks to the course quality. This chapter attempts to support educators in the first and most crucial step in their course design: creating the syllabus. It does so by consolidating the collective experience of the authors as well as of members of the Empirical Software Engineering community; the latter was mined through two working sessions and an online survey. Specifically, it offers a list of the fundamental building blocks for a syllabus, namely course aims, course topics, and practical assignments. The course topics are also linked to the subsequent chapters of this book, so that readers can dig deeper into those chapters and get support on teaching specific research methods or cross-cutting topics. Finally, we guide educators on how to take these building blocks as a starting point and consider a number of relevant aspects to design a syllabus to meet the needs of their own program, students, and curriculum.
The rapid advancement of AI-assisted software engineering has brought transformative potential to the field of software engineering, but existing tools and paradigms remain limited by cognitive overload, inefficient tool integration, and the narrow capabilities of AI copilots. In response, we propose Compiler.next, a novel search-based compiler designed to enable the seamless evolution of AI-native software systems as part of the emerging Software Engineering 3.0 era. Unlike traditional static compilers, Compiler.next takes human-written intents and automatically generates working software by searching for an optimal solution. This process involves dynamic optimization of cognitive architectures and their constituents (e.g., prompts, foundation model configurations, and system parameters) while finding the optimal trade-off between several objectives, such as accuracy, cost, and latency. This paper outlines the architecture of Compiler.next and positions it as a cornerstone in democratizing software development by lowering the technical barrier for non-experts, enabling scalable, adaptable, and reliable AI-powered software. We present a roadmap to address the core challenges in intent compilation, including developing quality programming constructs, effective search heuristics, reproducibility, and interoperability between compilers. Our vision lays the groundwork for fully automated, search-driven software development, fostering faster innovation and more efficient AI-driven systems.
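The search over cognitive-architecture configurations can be sketched as a scalarized multi-objective search over a discrete space. Compiler.next's actual programming constructs are not public, so the configuration space, scores, and weights below are invented stand-ins for the accuracy/cost/latency trade-off described above.

```python
import itertools

# Hypothetical search space for a cognitive architecture (all names invented)
prompts = ["terse", "chain-of-thought", "few-shot"]
models = ["small-fm", "large-fm"]
temps = [0.0, 0.7]

def evaluate(cfg):
    """Stand-in objective scores; a real compiler would measure these
    by executing candidate architectures against the user's intent."""
    prompt, model, temp = cfg
    accuracy = 0.6 + 0.2 * (model == "large-fm") + 0.1 * (prompt == "few-shot")
    cost = 1.0 + 4.0 * (model == "large-fm")       # relative API cost
    latency = 0.5 + 1.5 * (model == "large-fm") + 0.2 * (prompt == "few-shot")
    return accuracy, cost, latency

def scalarize(scores, w=(1.0, 0.05, 0.1)):
    """Collapse the objectives into one figure of merit: reward accuracy,
    penalize cost and latency by user-chosen weights."""
    acc, cost, lat = scores
    return w[0] * acc - w[1] * cost - w[2] * lat

best = max(itertools.product(prompts, models, temps),
           key=lambda cfg: scalarize(evaluate(cfg)))
```

Under these invented weights the cheap model with a few-shot prompt wins; in practice the search would be heuristic rather than exhaustive, and the weights would encode the user's stated intent.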
While mastered by some, good scientific writing practices within Empirical Software Engineering (ESE) research appear to be seldom discussed and documented. Despite this, these practices are implicit or even explicit evaluation criteria of typical software engineering conferences and journals. In this pragmatic, educational-first document, we want to provide guidance to those who may feel overwhelmed or confused by writing ESE papers, but also those more experienced who still might find an opinionated collection of writing advice useful. The primary audience we had in mind for this paper were our own BSc, MSc, and PhD students, but also students of others. Our documented advice therefore reflects a subjective and personal vision of writing ESE papers. By no means do we claim to be fully objective, generalizable, or representative of the whole discipline. With that being said, writing papers in this way has worked pretty well for us so far. We hope that this guide can at least partially do the same for others.
Objective: As the traditional ship trajectory prediction method is prone to gradient explosion and long calculation times, this paper seeks to improve its accuracy and calculation efficiency by proposing a ship trajectory prediction model based on an improved Bayesian optimization algorithm (IBOA) and a temporal convolution network (TCN). Method: A temporal pattern attention (TPA) mechanism is introduced to extract the weights of each input feature and preserve the temporal ordering of the historical track data. At the same time, a reversible residual network (RevNet) is introduced to reduce the memory occupied by TCN model training. The IBOA is then used to optimize the hyperparameters of the TCN (kernel size K and dilation coefficient d). The model is finally validated using five-fold cross-validation, and trajectory prediction is carried out after obtaining the optimal model. Result: The trajectory data is collected by the automatic identification system (AIS) and verified. The root mean square error (RMSE) is found to be increased by 5.5×10⁻⁵, 3.5×10⁻⁴ and 6×10⁻⁴ in weak-coupling, medium-coupling and strong-coupling track prediction, respectively. Conclusion: The proposed network adapts well to complex trajectories and achieves higher accuracy than the traditional model and the long short-term memory (LSTM) model, while maintaining high prediction accuracy for strongly coupled trajectories.
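The TCN hyperparameters being tuned (kernel size K and dilation coefficient d) govern the receptive field of each dilated causal convolution layer, (K − 1) × d + 1 time steps. A minimal sketch of that building block with toy data:

```python
import numpy as np

def causal_dilated_conv(x, w, d):
    """1-D causal convolution with dilation d: y[t] = sum_k w[k] * x[t - k*d].
    This is the TCN building block; each layer sees (K - 1) * d + 1 past steps."""
    K = len(w)
    y = np.zeros(len(x))
    for t in range(len(x)):
        for k in range(K):
            idx = t - k * d          # causal: only current and past samples
            if idx >= 0:
                y[t] += w[k] * x[idx]
    return y

x = np.arange(8, dtype=float)        # toy track feature (e.g., longitude series)
w = np.array([0.5, 0.3, 0.2])        # kernel of size K = 3
y = causal_dilated_conv(x, w, d=2)   # receptive field (3 - 1) * 2 + 1 = 5 steps
```

Stacking layers with growing d expands the receptive field exponentially, which is why (K, d) are the natural targets for the IBOA hyperparameter search.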
Large Language Models (LLMs) have recently shown remarkable capabilities in various software engineering tasks, spurring the rapid growth of the Large Language Models for Software Engineering (LLM4SE) area. However, limited attention has been paid to developing efficient LLM4SE techniques that demand minimal computational cost, time, and memory resources, as well as green LLM4SE solutions that reduce energy consumption, water usage, and carbon emissions. This paper aims to redirect the focus of the research community towards the efficiency and greenness of LLM4SE, while also sharing potential research directions to achieve this goal. It commences with a brief overview of the significance of LLM4SE and highlights the need for efficient and green LLM4SE solutions. Subsequently, the paper presents a vision for a future where efficient and green LLM4SE revolutionizes the LLM-based software engineering tool landscape, benefiting various stakeholders, including industry, individual practitioners, and society. The paper then delineates a roadmap for future research, outlining specific research paths and potential solutions for the research community to pursue. While not intended to be a definitive guide, the paper aims to inspire further progress, with the ultimate goal of establishing efficient and green LLM4SE as a central element in the future of software engineering.
Ethnography has become one of the established methods for empirical research on software engineering. Although there is a wide variety of introductory books available, there has been no material targeting software engineering students in particular, until now. In this chapter we provide an introduction to teaching and learning ethnography, aimed at faculty teaching ethnography to software engineering graduate students and at the students of such courses themselves. The content of the chapter focuses on what we consider the core knowledge for newcomers to ethnography as a research method. We complement the text with proposals for exercises, tips for teaching, and pitfalls that we and our students have experienced. The chapter is designed to support part of a course on empirical software engineering and provides pointers and literature for further reading.
Juan M. Murillo, Jose Garcia-Alonso, Enrique Moguel
et al.
As quantum computers advance, the complexity of the software they can execute increases as well. To ensure this software is efficient, maintainable, reusable, and cost-effective (key qualities of any industry-grade software), mature software engineering practices must be applied throughout its design, development, and operation. However, the significant differences between classical and quantum software make it challenging to directly apply classical software engineering methods to quantum systems. This challenge has led to the emergence of Quantum Software Engineering as a distinct field within the broader software engineering landscape. In this work, a group of active researchers analyse in depth the current state of quantum software engineering research. From this analysis, the key areas of quantum software engineering are identified and explored in order to determine the most relevant open challenges that should be addressed in the coming years. These challenges help identify necessary breakthroughs and future research directions for advancing Quantum Software Engineering.
Ranim Khojah, Mazen Mohamad, Philipp Leitner
et al.
Large Language Models (LLMs) are frequently discussed in academia and the general public as support tools for virtually any use case that relies on the production of text, including software engineering. Currently there is much debate, but little empirical evidence, regarding the practical usefulness of LLM-based tools such as ChatGPT for engineers in industry. We conduct an observational study of 24 professional software engineers who have been using ChatGPT over a period of one week in their jobs, and qualitatively analyse their dialogues with the chatbot as well as their overall experience (as captured by an exit survey). We find that, rather than expecting ChatGPT to generate ready-to-use software artifacts (e.g., code), practitioners more often use ChatGPT to receive guidance on how to solve their tasks or learn about a topic in more abstract terms. We also propose a theoretical framework for how (i) purpose of the interaction, (ii) internal factors (e.g., the user's personality), and (iii) external factors (e.g., company policy) together shape the experience (in terms of perceived usefulness and trust). We envision that our framework can be used by future research to further the academic discussion on LLM usage by software engineering practitioners, and to serve as a reference point for the design of future empirical LLM research in this domain.