PURPOSE OR GOAL: This study investigates how GenAI can be integrated with a criterion-referenced grading framework to improve the efficiency and quality of grading for mathematical assessments in engineering. It specifically explores the challenges demonstrators face with manual, model solution-based grading and how a GenAI-supported system can be designed to reliably identify student errors, provide high-quality feedback, and support human graders. The research also examines human graders' perceptions of the effectiveness of this GenAI-assisted approach. ACTUAL OR ANTICIPATED OUTCOMES: The study found that GenAI achieved an overall grading accuracy of 92.5%, comparable to two experienced human graders. The two researchers, who also served as subject demonstrators, perceived the GenAI as a helpful second reviewer that improved accuracy by catching small errors and provided more complete feedback than they could manually. A central outcome was the significant enhancement of formative feedback. However, they noted the GenAI tool is not yet reliable enough for autonomous use, especially with unconventional solutions. CONCLUSIONS/RECOMMENDATIONS/SUMMARY: This study demonstrates that GenAI, when paired with a structured, criterion-referenced framework using binary questions, can grade engineering mathematical assessments with an accuracy comparable to human experts. Its primary contribution is a novel methodological approach that embeds the generation of high-quality, scalable formative feedback directly into the assessment workflow. Future work should investigate student perceptions of GenAI grading and feedback.
As edge computing environments become increasingly dynamic, the need for efficient job scheduling and proactive fault prevention is becoming paramount. In such environments, minimizing machine downtime and maintaining productivity are critical challenges. In this paper, we propose an integrated approach to scheduling optimization that combines deep learning-based fault prediction with Satisfiability Modulo Theories (SMT)-based scheduling techniques. The proposed system predicts fault probabilities for machines in real time by leveraging operational state features such as temperature, vibration, tool wear, and operating hours. These fault predictions are then used as inputs to the SMT solver, which dynamically optimizes job scheduling. The system ensures task completion within deadlines while minimizing fault risks and optimizing resource utilization. To achieve this, the deep learning model continuously updates fault probabilities through a rolling prediction mechanism, allowing the scheduling system to proactively adapt to changing machine conditions. The SMT solver incorporates these predictions into its optimization process, ensuring that the schedule dynamically reflects the latest system state. The proposed method has been evaluated in simulated production line scenarios, demonstrating significant reductions in machine faults, improved scheduling efficiency, and enhanced overall system reliability. By integrating predictive maintenance with optimization techniques, this research contributes to the development of robust and adaptive scheduling systems for dynamic production environments.
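The coupling described above, in which predicted fault probabilities feed a deadline-constrained scheduler, can be illustrated with a small sketch. It is not the authors' implementation. An exhaustive search stands in for the SMT solver, and the fault probabilities are fixed hypothetical numbers rather than outputs of a deep learning model. The job names, durations, deadlines, and the duration-weighted risk objective are all assumptions made for illustration.

```python
from itertools import product


def schedule(jobs, machines, fault_prob):
    """Pick the lowest-risk job-to-machine assignment that meets all deadlines.

    jobs:       dict job -> (duration, deadline), hypothetical time units
    machines:   list of machine names
    fault_prob: dict machine -> predicted fault probability in [0, 1]
                (assumed to come from an external prediction model)

    Returns (risk, assignment), or None when no assignment is feasible.
    Exhaustive search is used here in place of an SMT solver.
    """
    names = list(jobs)
    best = None
    for choice in product(machines, repeat=len(names)):
        finish = {m: 0 for m in machines}
        risk = 0.0
        feasible = True
        # Jobs sharing a machine run back to back, earliest deadline first.
        ordered = sorted(zip(names, choice), key=lambda jm: jobs[jm[0]][1])
        for job, machine in ordered:
            duration, deadline = jobs[job]
            finish[machine] += duration
            if finish[machine] > deadline:
                feasible = False
                break
            # Assumed objective: exposure time weighted by fault probability.
            risk += fault_prob[machine] * duration
        if feasible and (best is None or risk < best[0]):
            best = (risk, dict(zip(names, choice)))
    return best


if __name__ == "__main__":
    jobs = {"j1": (2, 4), "j2": (3, 4), "j3": (1, 6)}
    print(schedule(jobs, ["m1", "m2"], {"m1": 0.1, "m2": 0.4}))
```

Calling `schedule` again whenever the predicted probabilities change would mimic the rolling update in a very simplified way. A real SMT encoding would express the same deadline constraints and objective declaratively instead of enumerating assignments.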
To study how rubber particle size influences the frost resistance of concrete, this paper systematically evaluates the freeze–thaw characteristics of rubber concrete containing different particle sizes. The specimens are subjected to 25, 50, 75, 100, and 125 freeze–thaw cycles. After the freeze–thaw cycles, the specimens are observed or measured for appearance, mass change rate, relative dynamic elastic modulus, internal damage degree, compressive strength, and tensile strength. The results show that incorporating rubber of different particle sizes improves the frost resistance of concrete, and that the specimen surfaces show different degrees of spalling after different numbers of freeze–thaw cycles. However, due to the presence of rubber, the compressive and tensile strengths of rubberized concrete are significantly lower. Finally, the microscopic scanning results reveal the mechanism behind these effects: incorporating rubber effectively limits the development of internal pores in the concrete. In summary, incorporating rubber into concrete is a method worth considering for improving the frost resistance of engineering materials.
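Two of the indicators listed above are often computed as shown in the sketch below. The abstract does not state the formulas used, so both definitions are assumptions based on common rapid freeze-thaw test practice: mass loss relative to the initial mass, and relative dynamic elastic modulus as the squared ratio of transverse fundamental frequencies. The function names and all numbers are hypothetical.

```python
def mass_loss_rate(initial_mass, current_mass):
    """Mass loss after n freeze-thaw cycles, in percent of the initial mass.

    Assumed definition: (m_0 - m_n) / m_0 * 100. A positive value means
    the specimen lost mass, for example through surface spalling.
    """
    return (initial_mass - current_mass) / initial_mass * 100.0


def relative_dynamic_modulus(initial_frequency, current_frequency):
    """Relative dynamic elastic modulus after n cycles, in percent.

    Assumed definition: (f_n / f_0) ** 2 * 100, where f is the transverse
    fundamental frequency of the specimen.
    """
    return (current_frequency / initial_frequency) ** 2 * 100.0


if __name__ == "__main__":
    # Hypothetical readings for one specimen before and after cycling.
    print(mass_loss_rate(10.0, 9.8))               # mass in kg
    print(relative_dynamic_modulus(2000.0, 1800.0))  # frequency in Hz
```

A falling relative dynamic modulus indicates accumulating internal damage, which is why it is usually tracked alongside mass change over the cycle counts.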
From its first adoption in the late 80s, qualitative research has slowly but steadily made a name for itself in what was, and perhaps still is, the predominantly quantitative software engineering (SE) research landscape. As part of our regular column on empirical software engineering (ACM SIGSOFT SEN-ESE), we reflect on the state of qualitative SE research with a focus group of experts. Among other things, we discuss why qualitative SE research is important, how it evolved over time, common impediments faced while practicing it today, and what the future of qualitative SE research might look like. Joining the conversation are Rashina Hoda (Monash University, Australia), Carolyn Seaman (University of Maryland, United States), and Klaas Stol (University College Cork, Ireland). The content of this paper is a faithful account of our conversation from October 25, 2025, which we moderated and edited for our column.
Applications of Large Language Models (LLMs) are rapidly growing in industry and academia for various software engineering (SE) tasks. As these models become more integral to critical processes, ensuring their reliability and trustworthiness becomes essential. Consequently, the concept of trust in these systems is becoming increasingly critical. Well-calibrated trust is important, as excessive trust can lead to security vulnerabilities and risks, while insufficient trust can hinder innovation. However, the landscape of trust-related concepts in LLMs in SE is relatively unclear, with concepts such as trust, distrust, and trustworthiness lacking clear conceptualizations in the SE community. To bring clarity to the current research status and identify opportunities for future work, we conducted a comprehensive review of 88 papers: a systematic literature review of 18 papers focused on LLMs in SE, complemented by an analysis of 70 papers from broader trust literature. Additionally, we conducted a survey study with 25 domain experts to gain insights into practitioners' understanding of trust and identify gaps between existing literature and developers' perceptions. The result of our analysis serves as a roadmap that covers trust-related concepts in LLMs in SE and highlights areas for future exploration.
Foundation models (FMs), particularly large language models (LLMs), have shown significant promise in various software engineering (SE) tasks, including code generation, debugging, and requirement refinement. Despite these advances, existing evaluation frameworks are insufficient for assessing model performance in iterative, context-rich workflows characteristic of SE activities. To address this limitation, we introduce SWE-Arena, an interactive platform designed to evaluate FMs in SE tasks. SWE-Arena provides a transparent, open-source leaderboard, supports multi-round conversational workflows, and enables end-to-end model comparisons. The platform introduces novel metrics, including model consistency score that measures the consistency of model outputs through self-play matches, and conversation efficiency index that evaluates model performance while accounting for the number of interaction rounds required to reach conclusions. Moreover, SWE-Arena incorporates a new feature called RepoChat, which automatically injects repository-related context (e.g., issues, commits, pull requests) into the conversation, further aligning evaluations with real-world development processes. This paper outlines the design and capabilities of SWE-Arena, emphasizing its potential to advance the evaluation and practical application of FMs in software engineering.
Silvânia Alves Braga de Castro, André Carlos Silva
The modeling of mineral deposits has improved over the years with the incorporation of mineralogical and metallurgical information obtained from drilling samples, which form the pillars for the construction of resource models. However, sampling data is being made available in large quantities, causing current databases to grow exponentially. Machine learning (ML) algorithms have been applied to deal with multidimensional data problems. Principal component analysis (PCA) is a multivariate analysis (MA) technique whose aim is to reduce the dimension of multivariate data. Studies show that results obtained with the reduction of variables were satisfactory in different areas of activity. The purpose of this article is to test variable selection criteria using PCA for geometallurgical data and to check the feasibility of the technique for simplifying variable types and defining typological domains.
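To make the idea of PCA-based variable selection concrete, the sketch below ranks variables by the absolute size of their loadings on the first principal component of the correlation matrix. This is only one simple criterion chosen for illustration; the abstract does not say which criteria the authors tested. The component is found by power iteration so that the example needs no external libraries, and the variable names and sample values are hypothetical.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit sample standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_component(matrix, iterations=500):
    """Dominant eigenvalue and unit eigenvector by power iteration."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v


def rank_variables(names, loadings):
    """Order variable names by decreasing absolute loading."""
    return [n for n, _ in sorted(zip(names, loadings),
                                 key=lambda nl: abs(nl[1]), reverse=True)]


if __name__ == "__main__":
    names = ["var_a", "var_b", "var_c"]  # hypothetical assay variables
    samples = [[1, 2.1, 5], [2, 3.9, 3], [3, 6.2, 6],
               [4, 8.1, 2], [5, 9.8, 7], [6, 12.2, 4]]
    eigenvalue, loadings = first_component(correlation_matrix(standardize(samples)))
    print("explained variance share:", eigenvalue / len(names))
    print("ranking:", rank_variables(names, loadings))
```

In this toy data, `var_a` and `var_b` are nearly collinear, so one of them could be dropped with little loss of information, while `var_c` contributes little to the first component and would need a later component to be represented.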
Ali Raza, Khaled Mohamed Elhadi, Muhammad Abid et al.
Waste tyre rubber has become an environmental and health concern that needs to be sustainably managed to avoid fire hazards and save natural resources. This research studies the structural behavior of glass fiber reinforced polymer (glass-FRP) reinforced rubberized concrete (GRC) compressive elements under monotonic axial compression loads. Nine GRC circular compressive elements with different axial and crosswise reinforcement ratios were fabricated. All the elements were 300 mm in diameter and 1200 mm in height. A 3D nonlinear finite element model (FEM) of the GRC compressive elements was developed using the commercial package ABAQUS. A parametric study was carried out to examine the effect of various parameters of the GRC elements. The test results revealed that the ductility of the GRC elements improved as the spacing of the glass-FRP ties decreased. The use of rubberized concrete also improved the ductility of the GRC elements. Damage to the GRC elements occurred through vertical cracking along their height. The FEM estimates were in close agreement with the test results. The proposed empirical equation, which is based on 600 test elements and considers the lateral confinement effect of FRP ties, was more accurate than previous equations.
Diana Robinson, Christian Cabrera, Andrew D. Gordon et al.
What if end users could own the software development lifecycle from conception to deployment using only requirements expressed in language, images, video or audio? We explore this idea, building on the capabilities that generative Artificial Intelligence brings to software generation and maintenance techniques. How could designing software in this way better serve end users? What are the implications of this process for the future of end-user software engineering and the software development lifecycle? We discuss the research needed to bridge the gap between where we are today and these imagined systems of the future.
Poor feature extraction and feature matching degrade the three‐dimensional reconstruction of metal cultural relics. Therefore, an optimization method for the three‐dimensional reconstruction of metal cultural relics based on 3D laser scanning data reduction is proposed. The overall technical route of the method is as follows. 3D laser scanning is used to collect 3D images of metal artefacts, colour space and 2D entropy detection methods are combined to pre‐process the 3D images, and feature matching of point clouds is carried out to extract and optimize the saliency values of superpixels, from which a 3D reconstructed visual model is constructed. Affine transformation is used to obtain the affine invariant moments of the structured light parameters of the visual feature lines of metal cultural relics. Light stability adjustment for the three‐dimensional reconstruction is realized by using the linear structured light adjustment method in the HSV colour space. Rigid and non‐rigid registration methods are introduced to match the point cloud. The product quantization algorithm is used to linearize the error function, yielding a block feature detection and matching model for spatial images. Noise is identified using a threshold value, and three‐dimensional reconstruction technology is then applied to reconstruct the metal relics. The simulation results show that this method has good visual expression ability and a high feature recognition rate, and that it improves the three‐dimensional reconstruction of metal relics.
Vision is extremely important in our lives, and the loss of sight is a serious issue for anyone. According to the World Health Organization (WHO), one-sixth of the world's population suffers from vision impairment. WHO statistics published in December 2021 report that more than 283 million people worldwide suffer from sight problems, including 39 million blind people and 228 million people with low vision. Navigation in unfamiliar environments is a significant challenge for the partially sighted and visually impaired, and better information on object location and content can aid such navigation. Many efforts have been made over the years to develop devices that assist the visually impaired and improve their quality of life by helping them build skills. Many navigation alternatives exist, but in practice they are infrequently adopted: many of these gadgets are either too heavy or too expensive for universal use. Taking these strengths and limitations into account, a low-cost assistive device for people with visual disabilities is needed. The proposed model provides an efficient solution that lets visually impaired people (VIPs) move from place to place by themselves through smart applications with AI and sensor technology. The smart application captures and classifies images, obstacles are detected through ultrasonic sensors, and the user is informed of obstacles in the path by voice. The proposed model is very helpful for VIPs in terms of qualitative and quantitative performance measures. This enables a ranking of the evaluated systems according to their potential influence on the lives of visually impaired people.
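The ultrasonic obstacle detection mentioned above typically rests on a time-of-flight calculation, sketched below under stated assumptions. The abstract gives no hardware or software details, so the function names, the 1.5 m warning distance, and the message wording are hypothetical, and the speed of sound is an approximate room-temperature value.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value in air at about 20 degrees C


def echo_to_distance_m(echo_seconds, speed=SPEED_OF_SOUND_M_PER_S):
    """Convert a round-trip ultrasonic echo time to a one-way distance.

    The pulse travels to the obstacle and back, hence the division by two.
    """
    return speed * echo_seconds / 2.0


def obstacle_alert(echo_seconds, warn_within_m=1.5):
    """Return a message to be spoken, or None if the path is clear.

    warn_within_m is a hypothetical warning threshold.
    """
    distance = echo_to_distance_m(echo_seconds)
    if distance <= warn_within_m:
        return f"Obstacle ahead at {distance:.1f} meters"
    return None


if __name__ == "__main__":
    for echo in (0.005, 0.02):  # hypothetical echo times in seconds
        print(echo, obstacle_alert(echo))
```

In a complete device the returned string would be passed to a text-to-speech component, and the classifier output from the camera could be appended to describe what the obstacle is.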
The estimated impact of the built environment on ride-hailing ridership depends on the spatial grid scale, yet existing research on ride-hailing demand models ignores the modifiable areal unit problem (MAUP). Taking Chengdu as an example, we used the density of pick-ups and drop-offs as dependent variables and selected 12 explanatory variables according to the “5D” built environment theory. The nugget–sill ratio (NSR) method and the optimal parameter-based geographical detector (OPGD) model were used to determine the optimal grid scale for aggregating the built environment variables and ride-hailing ridership. At the optimal grid scale, the optimal data discretization method for the explanatory variables was determined by comparing the results of the geographic detector under different discretization methods (such as the natural break method, k-means clustering method, equidistant method, and quantile method). We then used the geographic detector model to explore the relative importance and the interactive impacts of the explanatory variables on ride-hailing ridership under the optimal grid scale and optimal data discretization method.
The results indicated that: (1) the suggested grid scale for the aggregation of the built environment and ride-hailing ridership in Chengdu is 1100 m; (2) the optimal data discretization method is the quantile method; (3) the floor area ratio (FAR), distance from the nearest subway station, and residential POI (point of interest) density were the explanatory variables with relatively high importance for ride-hailing ridership; and (4) the interactions of the diversity index of mixed land use ∩ FAR, distance to the nearest subway station ∩ FAR, transportation POI density ∩ FAR, and distance to the central business district (CBD) ∩ FAR made a higher contribution to ride-hailing ridership than the single-factor effect of FAR, which had the highest contribution among the explanatory variables. The proposed grid scale can provide the basis for the partitioning management and scheduling optimization of ride-hailing. When adjusting ride-hailing demand, the importance and interaction rankings of the built-environment explanatory variables offer valuable references for setting renewal priorities and proposing a well-founded combination of built-environment factors.
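The geographic detector used above measures how much of the variance in a dependent variable is explained by stratifying on a discretized explanatory variable, through the q-statistic q = 1 - SSW/SST (within-strata sum of squares over total sum of squares). The sketch below shows the factor detector with rank-based quantile discretization and an interaction formed by overlaying two stratifications. It is a minimal illustration, not the OPGD implementation used in the study; the data and variable names are hypothetical, and ties are split by rank rather than kept in one stratum.

```python
def quantile_bins(values, k):
    """Assign each value to one of k equal-frequency strata (0..k-1) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    strata = [0] * len(values)
    for rank, i in enumerate(order):
        strata[i] = rank * k // len(values)
    return strata


def q_statistic(y, strata):
    """Factor detector: q = 1 - SSW / SST, a value between 0 and 1."""
    mean = sum(y) / len(y)
    sst = sum((v - mean) ** 2 for v in y)
    groups = {}
    for v, s in zip(y, strata):
        groups.setdefault(s, []).append(v)
    ssw = 0.0
    for g in groups.values():
        g_mean = sum(g) / len(g)
        ssw += sum((v - g_mean) ** 2 for v in g)
    return 1.0 - ssw / sst


def interaction_q(y, strata_a, strata_b):
    """Interaction detector: q for the overlay (intersection) of two strata."""
    return q_statistic(y, list(zip(strata_a, strata_b)))


if __name__ == "__main__":
    ridership = [1, 2, 3, 10, 11, 12]         # hypothetical pick-up density
    far = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]      # hypothetical floor area ratio
    poi = [1, 2, 1, 2, 1, 2]                  # hypothetical POI density
    s_far, s_poi = quantile_bins(far, 2), quantile_bins(poi, 2)
    print("q(FAR):", q_statistic(ridership, s_far))
    print("q(POI):", q_statistic(ridership, s_poi))
    print("q(FAR ∩ POI):", interaction_q(ridership, s_far, s_poi))
```

Because overlaying two stratifications can only refine the strata, the interaction q is never smaller than either single-factor q, which is why interaction results are usually interpreted by how much they exceed the larger single-factor value.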
Elizabeth Bjarnason, Mirko Morandini, Markus Borg et al.
The RET (Requirements Engineering and Testing) workshop series provides a meeting point for researchers and practitioners from the two separate fields of Requirements Engineering (RE) and Testing. The goal is to improve the connection and alignment of these two areas through an exchange of ideas, challenges, practices, experiences and results. The long-term aim is to build a community and a body of knowledge within the intersection of RE and Testing, i.e. RET. The 2nd workshop was co-located with ICSE 2015 in Florence, Italy. The workshop continued in the same interactive vein as the 1st one and included a keynote, paper presentations with ample time for discussions, and a group exercise. For true impact and relevance, this cross-cutting area requires contributions from both RE and Testing, and from both researchers and practitioners. A range of papers was presented, from short experience papers to full research papers, covering connections between the two fields. One of the main outputs of the 2nd workshop was a categorization of the presented workshop papers according to an initial definition of the area of RET, which identifies the aspects RE, Testing and coordination effect.