Research Software Engineers (RSEs) have become indispensable to computational research and scholarship. Yet the rapid rise of RSEs in higher education, combined with universities' tendency to be slow in creating or adopting models for new technology roles, has left a gap: structured career pathways that recognize technical mastery, scholarly impact, and leadership growth. Facing immense demand for RSEs at Princeton University, along with dedicated funding to at least double the size of the RSE group, Princeton had to define job descriptions cohesive enough to support rapid hiring yet flexible enough to recognize the unique nature of each position. This case study describes our design and implementation of a comprehensive RSE career ladder spanning Associate through Principal levels, with parallel team-lead and managerial tracks. We outline the guiding principles, competency framework, Human Resources (HR) alignment, and implementation process, including engagement with external consultants and mapping to a standard job-leveling framework using market benchmarks. We share early lessons learned and outcomes, including improved hiring efficiency, clearer promotion pathways, and positive reception among staff.
Pulmonary tuberculosis (TB), caused by Mycobacterium tuberculosis (Mtb), requires early diagnosis for effective patient care. Despite advancements in TB diagnostics, there remains an urgent need for innovative non-sputum-based methods that detect Mtb-specific antigens for TB patient identification. We have developed a polymer-based electrochemical biosensor for detecting an Mtb-specific antigen, the 6-kDa early secreted antigenic target (ESAT-6), in blood. The biosensor is built on a gold electrode (Au) by electropolymerizing carboxyl-functionalized poly(3,4-ethylenedioxythiophene) (PEDOT-COOH), activating it with 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride and N-hydroxysuccinimide (EDC-NHS), conjugating an ESAT-6 polyclonal antibody (Ab), and blocking non-specific binding with bovine serum albumin (BSA), yielding BSA/Ab-EDC-NHS/PEDOT-COOH/Au. In differential pulse voltammetry measurements, the electrode demonstrated an excellent linear response (R² = 0.99) for ESAT-6 detection across a concentration range of 24.2 pM (0.81 ng/mL) to 50 nM (1.69 μg/mL), with a low detection limit of 1.39 pM (0.047 ng/mL) and a rapid detection time of under 4 min. The biosensor effectively distinguished pulmonary TB patients from healthy individuals, achieving 95.0% sensitivity and 100% specificity at a cut-off value of 97.0 ng/mL, and a diagnostic accuracy of 97.1%, outperforming the 82.9% achieved by a commercial ELISA kit. Moreover, biosensor-detected ESAT-6 levels were significantly higher in smear-positive TB patients than in the smear-negative group (p = 0.014), whereas ELISA-based detection showed no significant difference (p = 0.197). In conclusion, the PEDOT-COOH biosensor enables rapid and effective detection of plasma ESAT-6, facilitates TB diagnosis, and correlates with Mtb bacterial burden, highlighting its potential for disease monitoring.
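As a quick arithmetic check, the paired molar and mass concentrations quoted above (detection limit and both ends of the linear range) are mutually consistent: each pair implies the same effective molar mass for the detected species, M = c_mass / c_molar. A short sketch of the conversion:

```python
# Consistency check of the detection figures quoted above: each
# (mass, molar) concentration pair implies an effective molar mass
# M = c_mass / c_molar, and all three pairs agree (~33.8 kDa).
pairs_ng_mL_vs_pM = [
    (0.047, 1.39),        # detection limit
    (0.81, 24.2),         # lower end of the linear range
    (1690.0, 50000.0),    # upper end: 1.69 ug/mL, 50 nM
]
for ng_mL, pM in pairs_ng_mL_vs_pM:
    g_per_L = ng_mL * 1e-6        # ng/mL -> g/L
    mol_per_L = pM * 1e-12        # pM   -> mol/L
    kDa = g_per_L / mol_per_L / 1000
    print(f"{ng_mL} ng/mL at {pM} pM -> M ~ {kDa:.1f} kDa")
```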
Precise segmentation of individual teeth from digital three-dimensional (3D) tooth models is critical in computer-assisted orthodontic surgery. This study explores the application of Point Multi-Layer Perceptron (PointMLP) to 3D tooth models and introduces an innovative integration of a Graph Attentional Convolution (GAC) layer with a graph attention mechanism. By incorporating the GAC layer into PointMLP, the model can focus on key local regions of the 3D tooth model and dynamically adjust the attention applied to these areas. This enhanced attention mechanism allows the model to better capture subtle surface structures, facilitating the accurate extraction of valuable local features. Compared with traditional segmentation algorithms, the proposed method shows improvements of 1.1, 2.04, 1.06, 2.2, and 1.8 percentage points in Overall Accuracy (OA), Sensitivity (SEN), Positive Predictive Value (PPV), and Intersection over Union (IoU), respectively. For the same number of training epochs, our method outperforms both GAC and PointMLP in segmentation performance.
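To make the attention mechanism concrete, below is a minimal sketch of a graph-attention convolution over k-nearest-neighbour point neighbourhoods, in the spirit of the GAC layer described above. All module and parameter names are illustrative assumptions, not the authors' implementation.

```python
# A minimal graph-attention convolution over point neighbourhoods.
import torch
import torch.nn as nn
import torch.nn.functional as F

def knn(x, k):
    # x: (B, N, 3) point coordinates; returns (B, N, k) neighbour indices
    d = torch.cdist(x, x)                    # pairwise distances
    return d.topk(k, largest=False).indices  # k nearest neighbours

class GraphAttentionConv(nn.Module):
    def __init__(self, in_dim, out_dim, k=16):
        super().__init__()
        self.k = k
        self.proj = nn.Linear(in_dim, out_dim)
        self.attn = nn.Linear(2 * out_dim, 1)  # scores a (centre, neighbour) pair

    def forward(self, xyz, feats):
        # xyz: (B, N, 3); feats: (B, N, C)
        idx = knn(xyz, self.k)                            # (B, N, k)
        h = self.proj(feats)                              # (B, N, D)
        B, N, D = h.shape
        nbrs = torch.gather(
            h.unsqueeze(1).expand(B, N, N, D), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, D))    # (B, N, k, D)
        centre = h.unsqueeze(2).expand_as(nbrs)
        a = F.leaky_relu(self.attn(torch.cat([centre, nbrs], -1)))
        w = torch.softmax(a, dim=2)                       # attention over neighbours
        return (w * nbrs).sum(dim=2)                      # (B, N, D)

# toy usage: 1024 points, coordinates doubling as initial features
pts = torch.rand(2, 1024, 3)
out = GraphAttentionConv(3, 64)(pts, pts)   # -> (2, 1024, 64)
```

The softmax over each point's neighbourhood is what lets the layer weight "key local regions" dynamically rather than averaging neighbours uniformly.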
Control engineering systems. Automatic machinery (General), Systems engineering
Khaled Hamdaoui, Ali Benzaamia, Billal Sari Ahmed
et al.
This study introduces a novel, interpretable machine learning framework for predicting the compression index (Cc) of clay soils by integrating three advanced gradient boosting algorithms (XGBoost, CatBoost, and LightGBM) with SHapley Additive exPlanations (SHAP). A comprehensive dataset of 1,243 clay samples, compiled from peer-reviewed literature, includes four geotechnical input variables: plastic limit (PL), plasticity index (PI), initial void ratio (e₀), and water content (w). Data were standardized and partitioned into training (70%) and testing (30%) subsets. Model development employed fivefold cross-validation and Optuna-based hyperparameter optimization. Among the models, XGBoost demonstrated the highest generalization capability, achieving an R² of 0.913, an RMSE of 0.197, and an MAE of 0.100 on the test set. SHAP analysis revealed that initial void ratio (e₀) and water content (w) were the most influential features, with mean SHAP values of 0.20 and 0.10, respectively, aligning with established geotechnical principles. The proposed framework enhances transparency in machine learning predictions by making the model's decision process understandable, thereby addressing the limitations of traditional "black-box" AI. It offers a reliable and efficient alternative to conventional oedometer testing, particularly beneficial for preliminary geotechnical design, where timely and interpretable predictions are essential.
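A minimal sketch of this kind of pipeline (gradient boosting, five-fold cross-validation, Optuna tuning, SHAP attribution) is shown below. The synthetic data and hyperparameter ranges are illustrative stand-ins for the paper's dataset and search space.

```python
# Gradient boosting + Optuna tuning + SHAP attribution, in miniature.
import numpy as np
import optuna
import shap
import xgboost as xgb
from sklearn.model_selection import train_test_split, cross_val_score

rng = np.random.default_rng(0)
X = rng.random((500, 4))          # stand-in columns for [PL, PI, e0, w]
y = 0.5 * X[:, 2] + 0.3 * X[:, 3] + 0.05 * rng.standard_normal(500)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 500),
        "max_depth": trial.suggest_int("max_depth", 2, 8),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
    }
    model = xgb.XGBRegressor(**params)
    # five-fold CV on the training split, as in the paper
    return cross_val_score(model, X_tr, y_tr, cv=5, scoring="r2").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=25)

best = xgb.XGBRegressor(**study.best_params).fit(X_tr, y_tr)
shap_values = shap.TreeExplainer(best).shap_values(X_te)
print("mean |SHAP| per feature:", np.abs(shap_values).mean(axis=0))
```

The mean absolute SHAP value per feature is the quantity the abstract reports (0.20 for e₀, 0.10 for w).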
Alexandra Mazak-Huemer, Christian Huemer, Michael Vierhauser
et al.
With the increasing significance of Research, Technology, and Innovation (RTI) policies in recent years, the demand for detailed information about the performance of these sectors has surged, yet many current tools serve only a narrow application purpose. To address this, we introduce a requirements engineering process to identify stakeholders and elicit requirements, from which we derive a system architecture for a web-based, interactive, open-access RTI monitoring tool. Building on several core modules, we introduce a multi-tier software architecture that shows, from a software engineering perspective, how such a tool can generally be implemented. A cornerstone of this architecture is the user-facing dashboard module; we describe its requirements in detail and illustrate them with the real example of the Austrian RTI Monitor.
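To illustrate the multi-tier idea, here is a minimal sketch with a data tier, a service tier, and the user-facing dashboard module on top. All class names, methods, and the sample indicator are illustrative assumptions, not the Austrian RTI Monitor's actual design.

```python
# A toy three-tier layout for an RTI monitoring tool.
from dataclasses import dataclass

@dataclass
class Indicator:
    name: str            # e.g. "R&D expenditure, % of GDP"
    year: int
    value: float

class DataTier:
    """Persistence layer: serves harmonised RTI indicator records."""
    def fetch(self, name: str) -> list[Indicator]:
        return [Indicator(name, 2022, 3.2), Indicator(name, 2023, 3.3)]

class ServiceTier:
    """Business logic: aggregation, filtering, comparisons."""
    def __init__(self, data: DataTier):
        self.data = data
    def latest(self, name: str) -> Indicator:
        return max(self.data.fetch(name), key=lambda i: i.year)

class DashboardModule:
    """User-facing tier: renders indicators for the interactive view."""
    def __init__(self, service: ServiceTier):
        self.service = service
    def render(self, name: str) -> str:
        ind = self.service.latest(name)
        return f"{ind.name} ({ind.year}): {ind.value}"

print(DashboardModule(ServiceTier(DataTier())).render("R&D expenditure, % of GDP"))
```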
Jannatul Bushra, Md Habibor Rahman, Mohammed Shafae
et al.
Reverse engineering can be used to derive a 3D model of an existing physical part when such a model is not readily available. For parts fabricated with subtractive and formative manufacturing processes, existing reverse engineering techniques can be readily applied, but parts produced with additive manufacturing present new challenges due to high levels of process-induced distortion and unique part attributes. This paper introduces an integrated 3D scanning and process-simulation data-driven framework to minimize distortions in reverse-engineered additively manufactured components. The framework employs iterative finite element simulations to predict geometric distortions and minimize the error between predicted and measured deviations of the part's key dimensional characteristics. The effectiveness of this approach is demonstrated by reverse engineering two Inconel-718 components manufactured using laser powder bed fusion. The resulting remanufacturing process combines reverse engineering and additive manufacturing, leveraging geometric feature-based part compensation through process simulation, and can generate both compensated STL and parametric CAD models, eliminating laborious experimentation during reverse engineering. We evaluate the merits of the STL-based and CAD-based approaches by quantifying the errors induced at each step of the proposed approach and analyzing the influence of varying part geometries. Using the proposed CAD-based method, the average absolute percent error between simulation-predicted distorted dimensions and actual measured dimensions of the manufactured parts was 0.087%, better accuracy than the STL-based method.
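The iterative compensation idea can be sketched in a few lines: pre-distort the geometry opposite to the predicted distortion until the simulated part matches the nominal dimensions. The simulate_distortion stub below is a hypothetical stand-in for the finite element process simulation, not the paper's solver.

```python
# Iterative, simulation-driven geometry compensation in miniature.
import numpy as np

def simulate_distortion(dims):
    # hypothetical toy model: each key dimension grows by ~0.1%
    # plus a fixed offset (mm); a real FE simulation goes here
    return dims * 1.001 + 0.02

nominal = np.array([50.0, 25.0, 10.0])   # key dimensional characteristics (mm)
compensated = nominal.copy()

for it in range(20):
    predicted = simulate_distortion(compensated)
    error = predicted - nominal           # deviation from target geometry
    if np.max(np.abs(error)) < 1e-3:      # convergence tolerance (mm)
        break
    compensated -= error                  # pre-distort opposite to the error

print(f"converged after {it} iterations; compensated dims: {compensated}")
```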
Davide Venturelli, Erik Gustafson, Doga Kurkcuoglu
et al.
We review the prospects for building quantum processors based on superconducting transmons and radiofrequency cavities for testing applications in the NISQ era. We identify engineering opportunities and challenges for the implementation of algorithms for simulation, combinatorial optimization, and quantum machine learning on qudit-based quantum computers.
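For readers unfamiliar with qudits, the d-level generalisation of the qubit is captured by the shift and clock operators. The small numpy illustration below is a textbook construction added for context, not material from the review.

```python
# Generalised shift (X) and clock (Z) operators for a d-level system,
# here d = 3, satisfying the Weyl relation Z X = omega X Z.
import numpy as np

d = 3
omega = np.exp(2j * np.pi / d)
X = np.roll(np.eye(d), 1, axis=0)            # X|j> = |j+1 mod d>
Z = np.diag(omega ** np.arange(d))           # Z|j> = omega^j |j>

assert np.allclose(Z @ X, omega * X @ Z)     # Weyl commutation relation
print(np.round(X @ np.array([1, 0, 0]), 3))  # |0> -> |1>
```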
Modern engineering, spanning electrical, mechanical, aerospace, civil, and computer disciplines, is a cornerstone of human civilization. Engineering design, however, poses a fundamentally different challenge for large language models (LLMs) than traditional textbook-style problem solving or factual question answering. Although existing benchmarks have driven progress in areas such as language understanding, code synthesis, and scientific problem solving, real-world engineering design demands the synthesis of domain knowledge, navigation of complex trade-offs, and management of the tedious processes that consume much of practicing engineers' time. Despite these shared challenges across engineering disciplines, no benchmark currently captures the unique demands of engineering design work. In this work, we introduce EngDesign, an Engineering Design benchmark that evaluates LLMs' abilities to perform practical design tasks across nine engineering domains. Unlike existing benchmarks that focus on factual recall or question answering, EngDesign emphasizes LLMs' ability to synthesize domain knowledge, reason under constraints, and generate functional, objective-oriented engineering designs. Each task in EngDesign represents a real-world engineering design problem, accompanied by a detailed task description specifying design goals, constraints, and performance requirements. EngDesign pioneers a simulation-based evaluation paradigm that moves beyond textbook knowledge, shifting evaluation from static answer checking to dynamic, simulation-driven functional verification and marking a crucial step toward the vision of engineering Artificial General Intelligence (AGI).
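The shape of such a simulation-driven check can be sketched briefly: a task specification, a candidate design, and a simulator that verifies the design against the stated constraints. The RC low-pass task and every name below are illustrative assumptions, not an actual EngDesign task.

```python
# A toy simulation-driven verification loop: pass/fail is decided by
# simulating the candidate design, not by matching a reference answer.
import math

task = {
    "goal": "first-order RC low-pass filter",
    "constraints": {"cutoff_hz": (900.0, 1100.0)},  # acceptable band
}

def llm_generated_design():
    # stand-in for an LLM's proposed component values
    return {"R_ohm": 1.6e3, "C_farad": 100e-9}

def simulate(design):
    # closed-form "simulation": f_c = 1 / (2*pi*R*C)
    return 1.0 / (2 * math.pi * design["R_ohm"] * design["C_farad"])

def verify(task, design):
    lo, hi = task["constraints"]["cutoff_hz"]
    return lo <= simulate(design) <= hi

design = llm_generated_design()
print("cutoff:", round(simulate(design), 1), "Hz ->",
      "PASS" if verify(task, design) else "FAIL")
```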
Daniel R. Clarkson, Lawrence A. Bull, Chandula T. Wickramarachchi
et al.
Regression, a fundamental prediction task common in data-centric engineering applications, involves learning mappings between continuous variables. In many engineering applications (e.g. structural health monitoring), the feature-label pairs used to learn such mappings are of limited availability, which hinders traditional supervised machine learning approaches. The current paper proposes a methodology for overcoming data scarcity by combining active learning with hierarchical Bayesian modelling. Active learning is an approach for preferentially acquiring feature-label pairs in a resource-efficient manner. In particular, the current work adopts a risk-informed approach that leverages contextual information associated with regression-based engineering decision-making tasks (e.g. inspection and maintenance). Hierarchical Bayesian modelling allows multiple related regression tasks to be learned over a population, capturing local and global effects. The information sharing facilitated by this modelling approach means that information acquired for one engineering system can improve predictive performance across the population. The proposed methodology is demonstrated using an experimental case study. Specifically, multiple regressions are performed over a population of machining tools, where the quantity of interest is the surface roughness of the workpieces. An inspection and maintenance decision process is defined using these regression tasks, which in turn is used to construct the active-learning algorithm. The proposed methodology is benchmarked against an uninformed approach to label acquisition and against independent modelling of the regression tasks, and it shows superior performance in terms of expected cost, maintaining predictive performance while reducing the number of inspections required.
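A minimal single-task sketch of risk-informed label acquisition is shown below: a Gaussian-process surrogate predicts roughness, a simple inspect/ignore cost model converts predictive uncertainty into expected decision cost, and the riskiest candidate is queried. The cost values, threshold, and toy roughness function are illustrative assumptions, and the sketch omits the paper's hierarchical, population-level model.

```python
# Risk-informed active learning for regression, in miniature.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

X_pool = np.linspace(0, 10, 200).reshape(-1, 1)   # unlabelled candidates
f = lambda x: np.sin(x).ravel() + 1.5             # hidden "true" roughness

X_lab = X_pool[[0, -1]]                           # two seed labels
y_lab = f(X_lab)

THRESH, C_MISS, C_FALSE = 1.5, 10.0, 1.0          # decision threshold and costs

for _ in range(10):
    gp = GaussianProcessRegressor().fit(X_lab, y_lab)
    mu, sd = gp.predict(X_pool, return_std=True)
    sd = np.maximum(sd, 1e-9)
    p_exceed = 1 - norm.cdf(THRESH, mu, sd)       # P(roughness above threshold)
    # expected cost of the cheaper action (inspect vs ignore) at each point
    risk = np.minimum(C_MISS * p_exceed, C_FALSE * (1 - p_exceed))
    i = int(np.argmax(risk))                      # query the riskiest candidate
    X_lab = np.vstack([X_lab, X_pool[i]])
    y_lab = np.append(y_lab, f(X_pool[i : i + 1]))

print(f"{len(y_lab)} labels acquired; remaining max expected cost: {risk.max():.3f}")
```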
Autonomous vehicles (AVs) hold great potential to increase road safety, reduce traffic congestion, and improve mobility systems. However, deploying AVs introduces new liability challenges when they are involved in car accidents, and a new legal framework is needed to address them. This paper proposes such a framework, applying liability rules to rear-end crashes in mixed-traffic platoons of AVs and human-driven vehicles (HVs). We leverage a matrix game approach to understand interactions among players whose utilities capture drivers' crash losses under the liability rules. We investigate how liability rules may impact the game equilibrium between vehicles and whether moral hazard arises for human drivers if liability is not designed properly. We find that, compared with the no-fault liability rule, contributory and comparative negligence rules give road users incentives to adopt shorter reaction times, improving road safety. Moral hazard arises for human drivers when risk-averse AV players are present in the platoon.
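To make the matrix-game setup concrete, below is a 2x2 sketch in which the AV and the HV each choose a short or long reaction time and each entry is an expected crash loss (to be minimized). The payoff numbers are illustrative assumptions meant to mimic a comparative-negligence rule, not values from the paper.

```python
# Pure-strategy Nash equilibria of a 2x2 reaction-time game.
import numpy as np

# rows: AV strategy, cols: HV strategy (0 = short, 1 = long reaction time)
# entries: expected crash losses under a stylised comparative rule (assumed)
AV_loss = np.array([[2.0, 3.0],
                    [4.0, 6.0]])
HV_loss = np.array([[2.0, 5.0],
                    [1.0, 6.0]])

def pure_nash(A, B):
    eq = []
    for i in range(2):
        for j in range(2):
            # neither player can cut its loss by deviating unilaterally
            if A[i, j] <= A[1 - i, j] and B[i, j] <= B[i, 1 - j]:
                eq.append(("short" if i == 0 else "long",
                           "short" if j == 0 else "long"))
    return eq

print("pure-strategy equilibria (AV, HV):", pure_nash(AV_loss, HV_loss))
```

With these assumed losses the unique equilibrium is (short, short), mirroring the finding that fault-based rules incentivize shorter reaction times.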
[Objective] Restoring coal mining waste dumps is one way to alleviate their detrimental environmental impact. This paper presents the results of an experimental study on the distribution and stability of soil aggregates in a reclaimed coal mining overburden dump. [Method] The experiment was carried out at a reclaimed coal mine dump site in a grassland region of northern China. We measured the development of fissures from zone I (GFI) to zone III (GFIII) of the fissure zone. The composition and distribution of soil aggregates in these zones were determined using dry- and wet-sieving methods, and aggregate stability and its relationship with the fissures were analyzed. [Result] The content of >0.25 mm air-dried aggregates across the fissure zones was 23.02%–42.70%, and the content of >0.25 mm water-stable aggregates was 16.9%–29.52%; the difference between air-dried and water-stable aggregates was not significant. The content of >0.25 mm water-stable aggregates in the 0–60 cm soil layer in GFI, GFII, and GFIII was 25.26%, 26.57%, and 23.62%, respectively, while the percentage of aggregate destruction in the three fissure zones was 20.77%–36.17%, 20.52%–25.00%, and 26.58%–40.56%, respectively. The percentage of aggregate destruction in the 0–10, 10–20, 20–30, 30–40, 40–50, and 50–60 cm soil layers was 28.81%, 29.96%, 26.19%, 23.50%, 24.91%, and 29.38%, respectively. The fractal dimension of air-dried and water-stable aggregates was 2.847–2.919 and 2.898–2.942, respectively; small aggregates and fine particles dominated. The mean weight diameter (MWD) and geometric mean diameter (GMD) of air-dried aggregates in the three fissure zones were 1.11, 1.05, and 1.28 mm and 0.45, 0.44, and 0.49 mm, respectively; the MWD and GMD of water-stable aggregates were 0.67, 0.73, and 0.72 mm and 0.36, 0.38, and 0.37 mm, respectively. Soil in GFII had good structure and aggregate stability, but most water-stable aggregates in the fissure zones were unstable owing to the formation and development of fissures. [Conclusion] The formation and development of fissures in the reclaimed coal mining overburden dump reduced the stability of soil aggregates, thereby causing aggregate fragmentation. The larger and wider the fissures, the less stable the soil aggregates.
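For reference, the two stability indices reported above are computed from sieve-fraction data as MWD = Σ xᵢwᵢ and GMD = exp(Σ wᵢ ln xᵢ / Σ wᵢ), where xᵢ is the mean diameter of size class i and wᵢ its mass fraction. A worked example with illustrative fractions:

```python
# Mean weight diameter (MWD) and geometric mean diameter (GMD)
# from sieve-fraction data; the mass fractions below are illustrative.
import numpy as np

x = np.array([3.5, 1.5, 0.75, 0.375, 0.125])   # class mean diameters (mm)
w = np.array([0.10, 0.15, 0.20, 0.25, 0.30])   # mass fractions, sum to 1

mwd = np.sum(x * w)                              # MWD = sum(x_i * w_i)
gmd = np.exp(np.sum(w * np.log(x)) / np.sum(w))  # GMD = exp(sum w_i ln x_i / sum w_i)

print(f"MWD = {mwd:.2f} mm, GMD = {gmd:.2f} mm")
```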
Agriculture (General), Irrigation engineering. Reclamation of wasteland. Drainage
Reza Saeidi, Younes Noorollahi, Soowon Chang
et al.
A horizontal ground heat exchanger is less expensive to install than a vertical one, but it requires more land. The required pipe length and land area can be decreased by improving the system, and adding fins is one strategy for increasing heat transfer. In this paper, cylindrical fins are explored in horizontal ground heat exchangers for the first time to improve heat transfer and the overall efficiency of the ground-source heat pump. The fins were examined by varying parameters such as length, diameter, position, and material, and the change in heat transfer rate with and without fins was also investigated as soil properties changed. Heat transfer simulations in cooling mode for a 1D-3D model in COMSOL Multiphysics, using the non-isothermal pipe flow (niofl) interface for the pipe, revealed that changing the fin diameter directly affects the outlet temperature. Increasing the fin length beyond 1 m yields minimal additional improvement in the heat transfer rate. Moreover, the distance between the ground heat exchanger and the fin is critical: beyond 5 cm the fin loses some effectiveness, although it remains useful for soil recovery. In general, the fin increases the soil contact area, and a fin one meter long can increase heat transfer per unit of pipe length by up to 20.7%. Comparing horizontal and vertical ground heat exchangers in finned and non-finned modes revealed that both benefit from the fin, but the horizontal pipe gains about 3% more performance than the vertical spiral with the same number of fins.
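The diminishing return beyond roughly 1 m of fin length is consistent with classical pin-fin theory, where fin efficiency is η = tanh(mL)/(mL) with m = sqrt(hP/(kA)). The back-of-the-envelope check below uses illustrative property values, not the paper's COMSOL model:

```python
# Efficiency of a cylindrical (pin) fin versus length: eta = tanh(mL)/(mL).
import math

h = 15.0        # soil-side heat transfer coefficient, W/(m^2 K) (assumed)
k = 400.0       # fin conductivity (copper), W/(m K)              (assumed)
d = 0.05        # fin diameter, m                                 (assumed)

P = math.pi * d            # fin perimeter
A = math.pi * d**2 / 4     # fin cross-section
m = math.sqrt(h * P / (k * A))

for L in (0.5, 1.0, 2.0):
    eta = math.tanh(m * L) / (m * L)
    print(f"L = {L:3.1f} m -> fin efficiency = {eta:.2f}")
```

Efficiency drops steeply past about 1 m, so extra length adds surface area that transfers progressively less heat.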
Three levels, namely the device level, the connection level, and the systems management level, are frequently used to conceptualize intelligent machinery and Industry 4.0 [...]
In software development, code comments play a crucial role in enhancing code comprehension and collaboration. This paper addresses the challenge of objectively classifying code comments as "Useful" or "Not Useful." We propose a solution that harnesses contextualized embeddings, particularly BERT, to automate this classification, and we augment the task with generated code-comment pairs. The initial dataset comprised 9,048 pairs of C code and comments, each labeled as Useful or Not Useful. To augment this dataset, we sourced 739 additional code-comment pairs and generated their labels using a large language model architecture, specifically BERT. The primary objective was to build classification models that effectively differentiate useful from not-useful code comments. Various machine learning algorithms were employed, including Logistic Regression, Decision Tree, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Gradient Boosting, Random Forest, and a Neural Network, and each was evaluated using precision, recall, and F1-score on both the original seed dataset and the augmented dataset. This study showcases the potential of generative AI for enhancing binary code-comment quality classification models, providing valuable insights for software developers and researchers in natural language processing and software engineering.
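A minimal sketch of the embedding-plus-classifier idea: pool BERT's [CLS] representation of each code-comment pair and feed it to one of the classical classifiers listed above (Logistic Regression here). The two toy pairs are illustrative stand-ins for the dataset:

```python
# BERT [CLS] embeddings feeding a classical usefulness classifier.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = bert(**batch)
    return out.last_hidden_state[:, 0].numpy()   # [CLS] token embedding

pairs = ["/* increment i */ i++;",                              # restates the code
         "/* retries until lock acquired */ while(!lock(f)) sleep(1);"]
labels = [0, 1]                                  # 0 = Not Useful, 1 = Useful

clf = LogisticRegression().fit(embed(pairs), labels)
print(clf.predict(embed(["/* loop */ for(;;);"])))
```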
Exploiting recent advances in artificial intelligence, showcased by ChatGPT and DALL-E, in real-world applications requires vast, domain-specific, and publicly accessible datasets. Unfortunately, the scarcity of such datasets poses a significant challenge for researchers aiming to apply these breakthroughs in engineering design, and synthetic datasets emerge as a viable alternative. However, practitioners are often uncertain how to generate high-quality datasets that accurately represent real-world data and suit the intended downstream applications. This study fills this knowledge gap by proposing comprehensive guidelines for generating, annotating, and validating synthetic datasets, elaborating the trade-offs and methods associated with each aspect. The practical implications of these guidelines are illustrated through the creation of a turbo-compressor dataset. The study underscores the importance of thoughtful sampling methods to ensure the appropriate size, diversity, utility, and realism of a dataset, and it highlights that design diversity does not equate to performance diversity or realism. By employing test sets that represent uniform, real, or task-specific samples, the influence of sample size and sampling strategy is scrutinized. Overall, this paper offers valuable insights for researchers intending to create and publish synthetic datasets for engineering design, paving the way for more effective applications of AI advancements in the field. The code and data for the dataset and methods are publicly accessible at https://github.com/cyrilpic/radcomp.
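The sampling trade-off can be illustrated in a few lines: compare uniform random sampling of a design space against Latin hypercube sampling, using mean pairwise distance as a crude diversity measure. The dimensions and sample counts below are arbitrary:

```python
# Uniform random vs Latin hypercube sampling of a design space,
# scored with a simple diversity proxy (mean pairwise distance).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import qmc

rng = np.random.default_rng(42)
n, d = 128, 4                                   # samples, design variables

uniform = rng.random((n, d))
lhs = qmc.LatinHypercube(d=d, seed=42).random(n)

for name, X in [("uniform random", uniform), ("latin hypercube", lhs)]:
    print(f"{name:16s} mean pairwise distance: {pdist(X).mean():.3f}")
```

Note that this measures diversity in the design space only; as the abstract stresses, design diversity does not guarantee diversity or realism in the resulting performance space.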