Accurate crop yield prediction relies on diverse data streams, including satellite, meteorological, soil, and topographic information. However, despite rapid advances in machine learning, existing approaches remain crop- or region-specific and require substantial data-engineering effort, which limits scalability, reproducibility, and operational deployment. This study introduces UniCrop, a universal and reusable data pipeline designed to automate the acquisition, cleaning, harmonisation, and engineering of multi-source environmental data for crop yield prediction. For any given location, crop type, and temporal window, UniCrop automatically retrieves, harmonises, and engineers over 200 environmental variables (Sentinel-1/2, MODIS, ERA5-Land, NASA POWER, SoilGrids, and SRTM), reducing them to a compact, analysis-ready feature set through a structured feature-reduction workflow based on minimum redundancy maximum relevance (mRMR). To validate the pipeline, UniCrop was applied to a rice yield dataset comprising 557 field observations. Using only the 15 selected features, four baseline machine learning models (LightGBM, Random Forest, Support Vector Regression, and Elastic Net) were trained. LightGBM achieved the best single-model performance (RMSE = 465.1 kg/ha, $R^2 = 0.6576$), while a constrained ensemble of all baselines further improved accuracy (RMSE = 463.2 kg/ha, $R^2 = 0.6604$). UniCrop contributes a scalable and transparent data-engineering framework that addresses the primary bottleneck in operational crop yield modelling: the preparation of consistent and harmonised multi-source data. By decoupling data specification from implementation and supporting any crop, region, and time frame through simple configuration updates, UniCrop provides a practical foundation for scalable agricultural analytics. The code and implementation documentation are available at https://github.com/CoDIS-Lab/UniCrop.
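The greedy mRMR step at the heart of such a feature-reduction workflow can be illustrated with a numpy-only sketch; the correlation-based relevance/redundancy measures and the `mrmr_select` name are illustrative assumptions, not UniCrop's actual implementation:

```python
import numpy as np

def mrmr_select(X, y, k):
    """Greedy minimum-redundancy maximum-relevance (mRMR) selection.

    Relevance: absolute Pearson correlation of a feature with the target.
    Redundancy: mean absolute correlation with already-selected features.
    Score: relevance - redundancy (the mRMR "difference" criterion).
    """
    n_features = X.shape[1]
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    selected = [int(np.argmax(relevance))]  # start with the most relevant feature
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = np.mean(
                [abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) for s in selected]
            )
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Tiny worked example: feature 1 duplicates feature 0, feature 2 adds new signal.
a = np.array([1.0, -1.0, 1.0, -1.0])
b = np.array([1.0, 1.0, -1.0, -1.0])  # orthogonal to a
y = 2 * a + b
X = np.column_stack([a, a, b])
print(mrmr_select(X, y, 2))  # → [0, 2]: the redundant duplicate column is skipped
```

The duplicate column scores `relevance - 1.0` and loses to the weaker but non-redundant feature, which is exactly the behaviour that lets mRMR compress 200+ correlated environmental variables into a compact set.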
Robin Kimmel, Judith Michael, Andreas Wortmann
et al.
Digital twins promise a better understanding and use of complex systems. To this end, they represent these systems at runtime and may interact with them to control their processes. Software engineering is a wicked challenge in which stakeholders from many domains collaborate to produce software artifacts together. Given the shortage of skilled software engineers, our vision is to leverage digital twins as a means for better representing, understanding, and optimizing software engineering processes to (i) enable software experts to make the best use of their time and (ii) support domain experts in producing high-quality software. This paper outlines why this would be beneficial, what such a digital twin could look like, and what is missing for realizing and deploying software engineering digital twins.
Unsupervised Domain Adaptation (UDA) focuses on transferring knowledge from a labeled source domain to an unlabeled target domain, addressing the challenge of domain shift. Significant domain shifts hinder effective knowledge transfer, leading to negative transfer and deteriorating model performance. Therefore, mitigating negative transfer is essential. This study revisits negative transfer through the lens of causally disentangled learning, emphasizing cross-domain discriminative disagreement on non-causal environmental features as a critical factor. Our theoretical analysis reveals that overreliance on non-causal environmental features as the environment evolves can cause discriminative disagreements (termed environmental disagreement), thereby resulting in negative transfer. To address this, we propose Reducing Environmental Disagreement (RED), which disentangles each sample into domain-invariant causal features and domain-specific non-causal environmental features by adversarially training domain-specific environmental feature extractors in the opposite domains. Subsequently, RED estimates and reduces environmental disagreement based on domain-specific non-causal environmental features. Experimental results confirm that RED effectively mitigates negative transfer and achieves state-of-the-art performance.
Abhik Roychoudhury, Corina Pasareanu, Michael Pradel
et al.
Large Language Models (LLMs) have shown surprising proficiency in generating code snippets, promising to automate large parts of software engineering via artificial intelligence (AI). We argue that successfully deploying AI software engineers requires a level of trust equal to or even greater than the trust established by human-driven software engineering practices. The recent trend toward LLM agents offers a path toward integrating the power of LLMs to create new code with the power of analysis tools to increase trust in the code. This opinion piece comments on whether LLM agents could dominate software engineering workflows in the future and whether the focus of programming will shift from programming at scale to programming with trust.
Wildfires are an integral component of Mediterranean ecosystems. The forest management practices implemented after such fires can significantly influence soil chemistry and metal dynamics. This study investigates the effects of different forest management strategies, including natural regeneration, grading (e.g., gradoni terrace making), and subsoiling with a ripper, on soil levels of major, trace, and heavy metals in a fire-affected forest in southwestern Türkiye. Soil samples were collected 2.5 years after containment of the wildfire and analyzed for concentrations of selected metals (Fe, Ca, Al, Mn, Cr, Ni, Zn, Cu, Pb, Co, As, and Hg). The findings indicated that subsoiling with a ripper resulted in elevated levels of multiple potentially toxic metals, including Cr (223.22 ± 60.47 mg/kg), Ni (150.54 ± 27.33 mg/kg), Zn (156.18 ± 66.14 mg/kg), and As (6.72 ± 1.30 mg/kg), compared to the other treatments. These findings demonstrate that management interventions such as subsoiling with a ripper can significantly alter the distribution and concentration of trace metals. Future research integrating topographic variation and earlier sampling would further strengthen our understanding of post-fire metal dynamics.
Ali Nasiri, Esmaeil Salimi, Morteza Delfan Azari
et al.
Flood zoning has extensive applications in flood management and is considered one of the fundamental and critical inputs to flood risk management. Flood zoning in urban areas is much more challenging than modeling in floodplain and river areas because of the two-dimensional nature of the flow and the density of urban features such as buildings, streets, boulevards, and public pathways. In this study, flood zoning for Districts 21 and 22 of Tehran was conducted under current conditions, in which the area is almost devoid of surface water collection channels, using a physically based rainfall-runoff model coupled with two-dimensional hydraulic routing, which constitutes the novel aspect of this study. For this purpose, the HEC-HMS model was used to estimate runoff from the mountains, and the MIKE model was used to simulate urban rainfall-runoff. According to the modeling results, the areas affected by a 50-year flood event were identified using the integrated modeling approach and cover 8% of Districts 21 and 22. Within these areas, the maximum flood depth is 11.8 meters in the Vardavard River, and the maximum flow velocity is 4.5 meters per second at the beginning of Hashemzadeh Street (south of Kharrazi Highway). The results indicate that during extreme events such as a 50-year rainfall, a significant portion of the highways and main communication arteries of Tehran leading westward would be disrupted, and traffic would be impossible. Moreover, various land uses would fall within the flood zone, and given the absence of a surface water network, waterlogging conditions throughout Districts 21 and 22 are predictable. Therefore, developing a surface water collection network is one of the main priorities for reducing flood risk in these areas.
Risk in industry; risk management; industrial safety; industrial accident prevention.
Wetting a fabric changes its color. Therefore, to control fabric color in the dyeing process, predicting the color of the fabric in its wet state is very important. In this paper, a geometric model was used to predict the reflectance spectrum of wet nylon fabric from its dry-state reflectance spectrum. For this purpose, nylon fabric samples were dyed with red, blue, and yellow acid dyes, both individually and in mixtures. Colorimetric analysis of the samples showed that wetting causes a color change, reduces lightness, and increases color depth. A geometric model together with Kubelka-Munk theory was used to predict the reflectance spectrum of the wet nylon fabric. To predict the wet-state reflectance with the geometric model, four quantities were used: the molar absorption coefficient of the dye (ε), the modified molar absorption coefficient, unit k/s, and modified unit k/s. The prediction errors in terms of color difference (ΔECMC) for these four methods were 18.69, 15.51, 6.87, and 5.71, respectively. The best prediction was obtained by the geometric model using the modified unit k/s.
Activating ground-state molecular oxygen (O2) without added oxidants or external energy is a central challenge in aerobic catalysis because triplet O2 imposes spin and electron-transfer constraints. Herein, we report a high-rate, energy-neutral O2 activation platform that converts ambient-air O2 directly to singlet oxygen (1O2) under room-temperature, bias-free conditions. By engineering atomically adjacent Co-Mo dual sites, Co-Mo d-d coupling and electron delocalization create a short-range electron transfer pathway that strengthens O2 adsorption, weakens the O-O bond via π* orbital population, and limits solvent-induced dissipation, thereby favoring selective 1O2 formation. These features give the catalyst 1O2 productivity and pollutant degradation rates up to three orders of magnitude higher than previously reported air-fed O2 heterogeneous catalysts and comparable to oxidant-driven processes, yet without chemical inputs or an energy bias. The catalyst is robust and versatile across diverse applications, including the degradation of organic contaminants, the transformation of inorganic ions, and antibacterial use. This work establishes a new approach for sustainable O2 activation, pointing toward next-generation energy-neutral catalytic technologies.
Selective perchlorate (ClO4−) removal from surface water is a pressing need given the stringent perchlorate drinking water limits around the world. Herein, we anchored N+–C–H hydrogen bond donors in hydrophobic cavities via interactions of cationic surfactants with montmorillonite to prioritize perchlorate bonding. The prepared adsorbent exhibited high selectivity over commonly occurring competing anions, including SO42−, NO3−, PO43−, HCO3−, and halide anions. High adsorption capacity, fast adsorption kinetics, and excellent regeneration ability (removal efficiency ≥ 80% after 20 cycles) were confirmed via batch experiments. Unconventional CH···O hydrogen bonding was verified as the primary driving force for perchlorate adsorption, benefiting from a higher bond energy (∼80 kcal·mol−1) than conventional hydrogen bonding. The removal efficiency of anions followed the order of the Hofmeister series, demonstrating the importance of the hydrophobic cavities formed by the tail groups of the cationic surfactants. The hydrophobic cavities sheltered the C–H bonds from interacting with anions of low hydration energy (e.g., perchlorate). Furthermore, a fixed-bed column test demonstrated that about 2900 bed volumes of the feed stream (∼500 μg·L−1) could be treated to ≤ 70 μg·L−1, with an enrichment factor of 10.3. Overall, on the basis of the hydrophobicity-induced hydrogen bonding mechanism, a series of low-cost adsorbents can be synthesized and applied for selective perchlorate removal.
Muhammad Afaq Hussain, Zhanlong Chen, Biswajeet Pradhan
et al.
Study region: The National Highways 85 and 50, key routes of the China–Pakistan Economic Corridor (CPEC) in Balochistan, Pakistan. Study focus: Flooding is a natural disaster that is becoming increasingly frequent and severe. National Highways 85 and 50 are vulnerable, necessitating accurate flood susceptibility mapping (FSM). Current machine learning (ML) models for FSM often suffer from low efficiency and overfitting. This study introduces an innovative hybrid FSM approach using four heterogeneous ensemble learning (HEL) techniques combined with three ML models: Random Forest (RF), Support Vector Machine (SVM), and Light Gradient Boosting Machine (LGBM). The proposed method was tested using satellite data from Sentinel-1, Sentinel-2, and Landsat-8, analyzing 1371 flood locations and 12 contributing variables. RF, the variance inflation factor (VIF), and the information gain ratio (IGR) were applied to assess multicollinearity and variable importance. The dataset was split (70:30) for model training and testing, with HEL-based models achieving superior performance over single ML models. New hydrological insights for the region: The stacking model yielded the highest AUROC (0.98), Kappa (0.82), accuracy (0.927), precision (0.963), Matthews correlation coefficient (0.820), and F1-score (0.950). HEL-based models proved more stable and resistant to overfitting. IGR analysis identified slope and distance from streams as key factors in FSM. The resulting flood-prone maps provide insights for disaster management adaptation strategies, demonstrating the broader applicability of the developed approach to enhance FSM accuracy and reliability.
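A stacking ensemble of heterogeneous base learners of the kind described above can be sketched with scikit-learn's built-in stacking support. This is a minimal illustration on synthetic data, not the study's pipeline: it uses only RF and SVM base models with a logistic-regression meta-learner (LGBM is omitted to keep the sketch dependency-light), and the synthetic "flood inventory" is an assumption for demonstration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for a flood inventory: 12 conditioning factors,
# binary flood / non-flood label (illustrative only).
X, y = make_classification(n_samples=1000, n_features=12, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)  # 70:30 split, as in the study

# Heterogeneous base learners; their cross-validated (out-of-fold)
# predictions feed a logistic-regression meta-learner ("stacking").
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    cv=5,  # out-of-fold predictions guard against meta-learner overfitting
)
stack.fit(X_train, y_train)
print(f"test accuracy: {stack.score(X_test, y_test):.3f}")
```

The `cv=5` argument is what distinguishes stacking from naive blending: the meta-learner never sees predictions made on data the base models were fitted on, which is one reason stacked HEL models tend to resist overfitting.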
LLMs are transforming software engineering by accelerating development, reducing complexity, and cutting costs. When fully integrated into the software lifecycle, they will drive design, development, and deployment while facilitating early bug detection, continuous improvement, and rapid resolution of critical issues. However, trustworthy LLM-driven software engineering requires addressing multiple challenges, such as accuracy, scalability, bias, and explainability.
Researchers have recently achieved significant advances in deep learning techniques, which in turn have substantially advanced other research disciplines, such as natural language processing, image processing, speech recognition, and software engineering. Various deep learning techniques have been successfully employed to facilitate software engineering tasks, including code generation, software refactoring, and fault localization. Many papers presented at top conferences and in journals demonstrate the applications of deep learning techniques to various software engineering tasks. However, although several surveys have provided overall pictures of the application of deep learning techniques in software engineering, they focus more on the learning techniques themselves, that is, what kinds of deep learning techniques are employed and how deep models are trained or fine-tuned for software engineering tasks. We still lack surveys explaining the advances of subareas in software engineering driven by deep learning techniques, as well as the challenges and opportunities in each subarea. To this end, in this paper we present the first task-oriented survey on deep learning-based software engineering. It covers twelve major software engineering subareas significantly impacted by deep learning techniques. These subareas span the whole lifecycle of software development and maintenance, including requirements engineering, software development, testing, maintenance, and developer collaboration. As we believe that deep learning may provide an opportunity to revolutionize the whole discipline of software engineering, a survey covering as many subareas as possible can help future research push forward the frontier of deep learning-based software engineering more systematically.
<p>The application of machine learning (ML) including deep learning models in hydrogeology to model and predict groundwater level in monitoring wells has gained some traction in recent years. Currently, the dominant model class is the so-called single-well model, where one model is trained for each well separately. However, recent developments in neighbouring disciplines including hydrology (rainfall–runoff modelling) have shown that global models, being able to incorporate data of several wells, may have advantages. These models are often called “entity-aware models“, as they usually rely on static data to differentiate the entities, i.e. groundwater wells in hydrogeology or catchments in surface hydrology. We test two kinds of static information to characterize the groundwater wells in a global, entity-aware deep learning model set-up: first, environmental features that are continuously available and thus theoretically enable spatial generalization (regionalization), and second, time-series features that are derived from the past time series at the respective well. Moreover, we test random integer features as entity information for comparison. We use a published dataset of 108 groundwater wells in Germany, and evaluate the performance of the models in terms of Nash–Sutcliffe efficiency (NSE) in an in-sample and an out-of-sample setting, representing temporal and spatial generalization. Our results show that entity-aware models work well with a mean performance of NSE <span class="inline-formula">>0.8</span> in an in-sample setting, thus being comparable to, or even outperforming, single-well models. However, they do not generalize well spatially in an out-of-sample setting (mean NSE <span class="inline-formula"><0.7</span>, i.e. lower than a global model without entity information). Strikingly, all model variants, regardless of the type of static features used, basically perform equally well both in- and out-of-sample. 
The conclusion is that the model in fact does not show entity awareness, but uses static features merely as unique identifiers, raising the research question of how to properly establish entity awareness in deep learning models. Potential future avenues lie in bigger datasets, as the relatively small number of wells in the dataset might not be enough to take full advantage of global models. Also, more research is needed to find meaningful static features for ML in hydrogeology.</p>
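How "entity-aware" inputs are typically assembled can be sketched in a few lines of numpy: the static attributes of a well are tiled along the time axis and concatenated with each window of dynamic forcings, so the network sees the same entity vector at every time step. The function name and shapes below are illustrative assumptions, not the paper's code:

```python
import numpy as np

def make_entity_aware_windows(dynamic, static, window):
    """Build model inputs for one well (entity).

    dynamic: (T, F_dyn) time series of meteorological forcings
    static:  (F_stat,) time-invariant well attributes (or random IDs)
    window:  input sequence length

    Returns (T - window + 1, window, F_dyn + F_stat): each input window
    carries the same static vector at every time step, which is how
    entity-aware sequence models commonly receive entity information.
    """
    T, _ = dynamic.shape
    tiled = np.tile(static, (window, 1))  # repeat static features per step
    return np.stack(
        [np.concatenate([dynamic[t:t + window], tiled], axis=1)
         for t in range(T - window + 1)]
    )

# Example: 10 time steps of 3 forcings, 4 static attributes, window of 4.
dyn = np.arange(30, dtype=float).reshape(10, 3)
stat = np.array([0.3, 0.7, 0.1, 0.5])
windows = make_entity_aware_windows(dyn, stat, window=4)
print(windows.shape)  # → (7, 4, 7)
```

Because the static vector enters every window unchanged, a model can in principle use it either to modulate its dynamics (true entity awareness) or merely to memorize each well (the unique-identifier behaviour the paper diagnoses); the input format itself cannot distinguish the two.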
<p>This paper describes a neural network cloud masking scheme from PARASOL (Polarization and Anisotropy of Reflectances for Atmospheric Science coupled with Observations from a Lidar) multi-angle polarimetric measurements. The algorithm has been trained on synthetic measurements and has been applied to the processing of 1 year of PARASOL data. Comparisons of the retrieved cloud fraction with MODIS (Moderate Resolution Imaging Spectroradiometer) products show overall agreement in spatial and temporal patterns, but the PARASOL neural network (PARASOL-NN) retrieves lower cloud fractions. Comparisons with a goodness-of-fit mask from aerosol retrievals suggest that the NN cloud mask flags fewer clear pixels as cloudy than MODIS (<span class="inline-formula">∼</span> 3 % of the clear pixels versus <span class="inline-formula">∼</span> 15 % by MODIS). On the other hand the NN classifies more pixels incorrectly as clear than MODIS (<span class="inline-formula">∼</span> 20 % by NN, versus <span class="inline-formula">∼</span> 15 % by MODIS). Additionally, the NN and MODIS cloud mask have been applied to the aerosol retrievals from PARASOL using the Remote Sensing of Trace Gas and Aerosol Products (RemoTAP) algorithm. Validation with AERONET shows that the NN cloud mask performs comparably with MODIS in screening residual cloud contamination in retrieved aerosol properties. Our study demonstrates that cloud masking from multi-angle polarimeter (MAP) aerosol retrievals can be performed based on the MAP measurements themselves, making the retrievals independent of the availability of a cloud imager.</p>
The East Asian region is typically characterized by warm and humid conditions from late spring to summer. However, in recent decades, this region has experienced increasingly severe drying conditions, deviating from historical climatological patterns. This study investigated the precipitation minus evaporation (P − E) trends across land and sea regions in East Asia (EA) during the extended summer season (April–September) from 1980 to 2022, and the key physical processes driving these trends, through moisture budget decomposition and numerical experiments. The results reveal pronounced drying trends in southeastern China, the Yellow Sea, and parts of the Korea Strait and Korean Peninsula over the past 43 years. The underlying physical processes driving these drying conditions differ between land and sea in EA. In southeastern China, the drying is driven by dynamic processes, particularly moisture divergence related to wind changes. This is linked to anomalous strengthening of descending motion due to Indo-Pacific warm pool warming induced by both anthropogenic global warming and natural Pacific Decadal Oscillation-like sea surface temperature (SST) patterns. Conversely, drying in the Yellow Sea and adjacent areas is influenced by thermodynamic moisture advection. The altered humidity distribution due to global warming-induced SST patterns, with higher humidity over the Northwest Pacific marginal seas and lower humidity over inland China, drives dry-air transport from inland China to the Yellow Sea via the background southwesterly wind. These findings enhance our understanding of the drying trends and their distinct driving processes in EA's land and sea areas during the extended summer.
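One standard form of the vertically integrated moisture budget decomposition used in such trend attributions separates the change in P − E into a thermodynamic term (humidity changes acting on the mean circulation) and a dynamic term (circulation changes acting on the mean humidity); the study's exact formulation may differ in sign conventions and residual terms:

```latex
% Climatological balance:
%   P - E \approx -\frac{1}{g\rho_w}\int_0^{p_s}\nabla\cdot(q\,\mathbf{u})\,dp
% Linearized trend decomposition (\delta = change, \bar{\cdot} = climatology):
\delta(P - E) \approx
  \underbrace{-\frac{1}{g\rho_w}\int_0^{p_s}
    \nabla\cdot\big(\delta q\,\bar{\mathbf{u}}\big)\,dp}_{\text{thermodynamic}}
  \underbrace{-\,\frac{1}{g\rho_w}\int_0^{p_s}
    \nabla\cdot\big(\bar{q}\,\delta\mathbf{u}\big)\,dp}_{\text{dynamic}}
  + \text{residual}
```

In this framing, the southeastern China drying corresponds to a dominant dynamic term (wind-change-driven moisture divergence), while the Yellow Sea drying corresponds to the thermodynamic term (changed humidity gradients advected by the background wind).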
Concrete made using geopolymer technology is environmentally friendly and can be considered part of sustainable development. Even though aggregate constitutes the major volume of geopolymer concrete, only limited studies related to this parameter have been reported. This paper presents a summary of a study carried out to understand the influence of aggregate content on the engineering properties of geopolymer concrete. The influence of other parameters on the engineering properties of geopolymer concrete, such as curing temperature, curing period, the ratio of sodium silicate to sodium hydroxide, the ratio of alkali to fly ash, and the molarity of sodium hydroxide, is also discussed. Based on the study carried out, it can be concluded that geopolymer concrete with proper proportioning of the total aggregate content and of the ratio of fine aggregate to total aggregate, along with optimum values of the other parameters, can have better engineering properties than the corresponding properties of ordinary cement concrete.