Software systems are widely observed to grow in size, complexity, and interdependence over time, yet many large-scale systems remain stable despite persistent structural burden. This apparent tension suggests a limitation in one-dimensional views of software evolution. This paper introduces a graph-based, discrete-time probabilistic framework that separates structural burden from uncertainty. Change effort is modeled as a stochastic variable determined by the dependency neighborhood of the changed entity and by residual variability. Within this framework, burden is defined as expected effort and uncertainty as variance of effort. We show that, under explicit assumptions on non-decreasing average structural load, structural regularization, process stabilization, and covariance control, there exists a regime in which uncertainty decreases while structural burden does not. This regime formalizes the phenomenon of stabilization without simplification. The proposed framework provides a minimal theoretical explanation for how software systems can become more predictable over time without necessarily becoming structurally simpler, and offers a foundation for further theoretical and empirical studies of software evolution.
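For concreteness, a minimal formal sketch of the two central quantities in assumed notation (the abstract fixes only the definitions, not the symbols): let $C_t$ denote the stochastic effort of a change made at discrete time $t$, determined by the dependency neighborhood of the changed entity plus residual variability. Then

\[
  B_t = \mathbb{E}[C_t] \quad \text{(structural burden)}, \qquad
  U_t = \operatorname{Var}(C_t) \quad \text{(uncertainty)},
\]

and "stabilization without simplification" is the regime in which $U_{t+1} \le U_t$ while $B_{t+1} \ge B_t$ over time.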
The conventional manufacturing of ophthalmic lenses is an inefficient subtractive process where up to 97% of the material is discarded through grinding, polishing, and edging. Fluidic Shaping has emerged as a powerful alternative, utilizing surface tension to form optical-quality surfaces. While the approach enabled the creation of ophthalmic lenses without grinding or polishing, it was limited to lenses with a circular or elliptical footprint and still required the wasteful edging process to fit the lenses into the eyewear rims. Here, the Cookie Cutter algorithm is introduced, generalizing the Fluidic Shaping approach to be applicable to arbitrary domains, thus eliminating all subtractive processes. This mathematical framework calculates the unique varying edge-height required for a boundary frame, allowing a liquid polymer to naturally settle into a target spherocylindrical prescription within an arbitrary rim footprint. By utilizing neutral buoyancy to negate gravity, the liquid polymer is shaped solely by surface tension and subsequently cured, resulting in a lens that fits directly into commercial eyewear rims without any mechanical post-processing. The method is validated experimentally, demonstrating the fabrication of lenses compatible with standard eyewear rims. This approach represents a complete additive manufacturing solution, enabling end-to-end zero-waste production of prescription eyeglasses.
LiDAR point clouds are widely used in autonomous driving and consist of large numbers of 3D points captured at high frequency to represent surrounding objects such as vehicles, pedestrians, and traffic signs. While this dense data enables accurate perception, it also increases computational cost and power consumption, which can limit real-time deployment. Existing point cloud sampling methods typically face a trade-off: very fast approaches tend to reduce accuracy, while more accurate methods are computationally expensive. To address this limitation, we propose an efficient learned point cloud simplification method for LiDAR data. The method combines a feature embedding module with an attention-based sampling module to prioritize task-relevant regions and is trained end-to-end. We evaluate the method against farthest point sampling (FPS) and random sampling (RS) on 3D object detection on the KITTI dataset and on object classification across four datasets. The method was consistently faster than FPS and achieved similar, and in some settings better, accuracy, with the largest gains under aggressive downsampling. It was slower than RS, but it typically preserved accuracy more reliably at high sampling ratios.
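As a point of reference for the baselines named above, a minimal NumPy sketch of farthest point sampling (FPS); the learned, attention-based sampler itself is not reproduced here, and the function below is only the standard greedy algorithm, not the paper's implementation.

```python
import numpy as np

def farthest_point_sampling(points: np.ndarray, k: int) -> np.ndarray:
    """Greedy FPS baseline: repeatedly pick the point farthest from the
    points selected so far. `points` has shape (N, 3); returns k indices."""
    n = points.shape[0]
    selected = np.zeros(k, dtype=np.int64)
    selected[0] = 0                      # arbitrary seed point
    dist = np.full(n, np.inf)            # squared distance to nearest selected point
    for i in range(1, k):
        diff = points - points[selected[i - 1]]
        dist = np.minimum(dist, np.einsum("ij,ij->i", diff, diff))
        selected[i] = int(np.argmax(dist))
    return selected

# Usage: idx = farthest_point_sampling(cloud, 1024); sampled = cloud[idx]
```

This greedy O(Nk) loop is the cost that the learned sampler is reported to avoid while preserving, and in some settings improving, task accuracy.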
Street intersection counts and densities are ubiquitous measures in transport geography and planning. However, typical street network data and typical street network analysis tools can substantially overcount them. This article explains the three main reasons why this happens and presents solutions to each. It contributes algorithms to automatically simplify spatial graphs of urban street networks---via edge simplification and node consolidation---resulting in faster parsimonious models and more accurate network measures like intersection counts and densities, street segment lengths, and node degrees. These algorithms' information compression improves downstream graph analytics' memory and runtime efficiency, boosting analytical tractability without loss of model fidelity. Finally, this article validates these algorithms and empirically assesses intersection count biases worldwide to demonstrate the problem's widespread prevalence. Without consolidation, traditional methods would overestimate the median urban area intersection count by 14\%. However, this bias varies drastically across regions, underscoring these algorithms' importance for consistent comparative empirical analyses.
Most languages lack sufficient data for large-scale monolingual pretraining, creating a "data wall." Multilingual pretraining helps but is limited by language imbalance and the "curse of multilinguality." An alternative is to translate high-resource text with machine translation (MT), which raises three questions: (1) How does MT-derived data scale with model capacity? (2) Can source-side transformations (e.g., simplifying English with an LLM) improve generalization to native text? (3) How well do models pretrained on MT-derived data adapt when continually trained on limited native text? We investigate these questions by translating English into Indonesian and Tamil--two typologically distant, lower-resource languages--and pretraining GPT-2 models (124M-774M) on native or MT-derived corpora from raw and LLM-simplified English. We evaluate cross-entropy loss on native text, along with accuracy on syntactic probes and downstream tasks. Our results show that (1) MT-pretrained models benefit from scaling; (2) source-side simplification harms generalization to native text; and (3) adapting MT-pretrained models on native text often yields better performance than native-only models, even with less native data. However, tasks requiring cultural nuance (e.g., toxicity detection) demand more exposure to native data.
Francesca Conserva, Fabio Busacca, Corrado Puligheddu
et al.
The transition towards 6G presents unique challenges and opportunities in mobile network design and standardization. Addressing these challenges requires a robust methodology for analyzing and selecting innovations that can be effectively translated into 3rd Generation Partnership Project (3GPP) contributions. This paper presents a systematic approach to bridging research and standardization, ensuring that cutting-edge advancements extend beyond academia and translate into concrete standardization efforts. The proposed methodology has been applied within the Italian RESTART framework to two ongoing research areas: Morphable Programmable Networks (MPNs) and Network Digital Twins (NDTs), both key enablers of next-generation networks. MPNs enhance dynamic adaptability and resource management, while NDTs enable real-time simulation, predictive analytics, and intelligent decision-making. Their integration into 3GPP Release 20 will be instrumental in shaping a flexible and future-proof mobile ecosystem. These innovations exemplify how research-driven solutions can align with 6G standardization objectives. By applying the proposed methodology, we aim to establish a systematic pathway for transitioning research into impactful 3GPP contributions, ultimately driving the evolution of next-generation networks.
This study presents a closed-loop biorefinery strategy that thermochemically upcycles fermentation residues (FRs) from photo-fermentative biohydrogen production (PFHP) into functional biochar catalysts, thereby enhancing the efficiency of the initial PFHP process. Four FRs derived from hydrothermal and ethylene glycol-pretreated corn stover were pyrolyzed at 700°C. Multi-model kinetic analyses revealed diffusion-controlled mechanisms with activation energies ranging from 157 to 278 kJ/mol, while thermodynamic profiling highlighted the influence of feedstock composition on reaction spontaneity and entropy. Pyrolysis effectively restored porosity compromised during fermentation, yielding biochar with tailored properties: microporous BC3 (185 m²/g) from oxygen-rich precursors and mesoporous BC4 (76.58 m²/g) from graphitized residues. When reintroduced into PFHP, BC3 maximized cumulative hydrogen yield (570 mL) via pH buffering, and BC4 achieved the highest production rate (14.91 mL/h) through electron shuttle mechanisms. The integrated process concurrently generated syngas, bio-oil, and catalytic biochar, enabling waste valorization, renewable energy output, and process enhancement within a circular bioeconomy framework.
Earthwork-related locations (ERLs), such as construction sites, earth dumping grounds, and concrete mixing stations, are major sources of urban dust pollution (particulate matter). Effective management of ERLs is crucial and requires timely, efficient tracking of these locations throughout the city. This work aims to identify and classify urban ERLs using GPS trajectory data from over 16,000 construction waste hauling trucks (CWHTs), together with 58 urban features encompassing geographic, land cover, POI, and transport dimensions. We compare several machine learning models and examine the impact of various spatial-temporal features on classification performance using real-world data from Chengdu, China. The results demonstrate that 77.8% classification accuracy can be achieved with a limited number of features. The classification framework was implemented in the Alpha MAPS system in Chengdu, which identified 724 construction sites/earth dumping grounds, 48 concrete mixing stations, and 80 truck parking locations in the city during December 2023, enabling the local authority to manage urban dust pollution effectively at low personnel cost.
Laura Fernández Díaz, Miriam Fernández Díaz, José Ramón Quevedo
et al.
This paper addresses the COGERSA waste collection process. Up to now, experts have manually designed the process using a trial-and-error mechanism. The resulting process is not globally optimized, since it has been built progressively and locally as council demands appeared. Planning optimization algorithms can solve the problem, but they need a fitness function to evaluate the quality of a route planning. The drawback is that even experts cannot propose one in a straightforward way due to the complexity of the process. Hence, the goal of this paper is to build a fitness function through a preference framework, taking advantage of the available expert knowledge and expertise. Several key performance indicators, together with preference judgments, are carefully established with the experts in order to learn a promising fitness function. In particular, the additivity of these indicators makes the task much more affordable, since it allows working with routes rather than with whole route plannings. In addition, a feature selection analysis is performed over the indicators, since the experts suspect a potential (but unknown) redundancy among them. The experimental results confirm this hypothesis, since the best $C$-index ($98\%$ against around $94\%$) is reached when 6 or 8 out of the 21 indicators are taken. In particular, truck load appears to be a highly promising key performance indicator, together with the distance travelled along non-main roads. A comparison with other existing approaches shows that the proposed method clearly outperforms them, with the $C$-index rising from $72\%$ or $90\%$ to $98\%$.
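To make the preference-learning idea above concrete, a generic sketch of the standard pairwise-to-classification reduction for learning an additive (linear) fitness function over KPIs; this is an illustration under assumed names and a plain linear SVM, not the authors' exact procedure.

```python
import numpy as np
from sklearn.svm import LinearSVC

def fit_preference_fitness(kpi_pref: np.ndarray, kpi_other: np.ndarray) -> np.ndarray:
    """Learn weights w so that w @ kpis scores expert-preferred routes higher.

    kpi_pref, kpi_other: (n_pairs, n_kpis) arrays of KPI vectors, where in
    each pair the first route was judged better by the experts."""
    # Pairwise-to-classification reduction: classify the sign of KPI differences;
    # the separating hyperplane gives the additive fitness weights.
    diffs = np.vstack([kpi_pref - kpi_other, kpi_other - kpi_pref])
    labels = np.hstack([np.ones(len(kpi_pref)), -np.ones(len(kpi_other))])
    model = LinearSVC(C=1.0, max_iter=10_000).fit(diffs, labels)
    return model.coef_.ravel()            # additive fitness: score = w @ kpis

def c_index(w: np.ndarray, kpi_pref: np.ndarray, kpi_other: np.ndarray) -> float:
    """Fraction of expert preference pairs ranked correctly."""
    return float(np.mean(kpi_pref @ w > kpi_other @ w))
```

Because the learned fitness is additive over KPIs, a route planning can be scored by summing the scores of its individual routes, which is the property that lets the approach work with routes rather than whole plannings.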
Photoreforming is a clean photocatalytic technology for simultaneous plastic waste degradation and hydrogen fuel production, but there are still limited active and stable catalysts for this process. This work introduces the brookite polymorph of TiO2 as an active photocatalyst for photoreforming with an activity higher than anatase and rutile polymorphs for both hydrogen production and plastic degradation. Commercial brookite successfully converts polyethylene terephthalate (PET) plastic to acetic acid under light. The high activity of brookite is attributed to good charge separation, slow decay and moderate electron trap energy, which lead to a higher generation of hydrogen and hydroxyl radicals and accordingly enhanced photo-oxidation of PET plastic. These results introduce brookite as a stable and active catalyst for the photoconversion of water contaminated with microplastics to value-added organic compounds and hydrogen.
Automatic text simplification (TS) aims to automate the process of rewriting text to make it easier for people to read. A prerequisite for TS to be useful is that it should convey information that is consistent with the meaning of the original text. However, current TS evaluation protocols assess system outputs for simplicity and meaning preservation without regard for the document context in which output sentences occur or for how people understand them. In this work, we introduce a human evaluation framework to assess whether simplified texts preserve meaning using reading comprehension questions. With this framework, we conduct a thorough human evaluation of texts simplified by humans and by nine automatic systems. Supervised systems that leverage pre-training knowledge achieve the highest scores on the reading comprehension (RC) tasks among the automatic controllable TS systems. However, even the best-performing supervised system struggles with at least 14% of the questions, marking them as ``unanswerable'' based on the simplified content. We further investigate how well existing TS evaluation metrics and automatic question-answering systems approximate the human judgments we obtained.
Dlab--Ringel's standardization method gives a realization of a standardly stratified algebra. In this paper, we construct mixed stratified algebras, which are a generalization of standardly stratified algebras, following Dlab--Ringel's standardization method. Moreover, we study a Ringel duality of mixed stratified algebras from the viewpoint of stratifying systems.
Exponential tail bounds for sums play an important role in statistics, but the example of the $t$-statistic shows that the exponential tail decay may be lost when population parameters need to be estimated from the data. However, it turns out that if Studentizing is accompanied by estimating the location parameter in a suitable way, then the $t$-statistic regains the exponential tail behavior. Motivated by this example, the paper analyzes other ways of empirically standardizing sums and establishes tail bounds that are sub-Gaussian or even closer to normal for the following settings: Standardization with Studentized contrasts for normal observations, standardization with the log likelihood ratio statistic for observations from an exponential family, and standardization via self-normalization for observations from a symmetric distribution with unknown center of symmetry. The latter standardization gives rise to a novel scan statistic for heteroscedastic data whose asymptotic power is analyzed in the case where the observations have a log-concave distribution.
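For reference, the motivating example in assumed standard notation: for i.i.d. observations $X_1, \dots, X_n$ the Studentized statistic is

\[
  T_n = \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{S_n}, \qquad
  \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
  S_n^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X}_n)^2,
\]

where dividing by $S_n$ is the Studentizing step and $\bar{X}_n$ is the estimated location referred to above.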
Giovambattista Ianni, Francesco Pacenza, Jessica Zangari
The repeated execution of reasoning tasks is desirable in many application scenarios, such as stream reasoning and event processing. When using answer set programming in such contexts, one can avoid the iterative generation of ground programs, thus achieving a significant payoff in terms of computing time. However, this may require additional memory and/or the manual addition of operational directives to the declarative knowledge base at hand. We introduce a new strategy for generating series of monotonically growing propositional programs. The proposed overgrounded programs with tailoring (OPTs) can be updated and reused in combination with consecutive inputs. With respect to earlier approaches, our tailored simplification technique reduces the size of instantiated programs. A maintained OPT grows slowly in size from one iteration to the next, while the update cost decreases, especially in later iterations. In this paper we formally introduce tailored embeddings, a family of equivalence-preserving ground programs that form the theoretical basis of OPTs, and we describe their properties. We then illustrate an OPT update algorithm and report on our implementation and its performance. This paper is under consideration in Theory and Practice of Logic Programming (TPLP).
Hamed Majidifard, Nader Tabatabaee, William Buttlar
The environmental and economic benefits of recycling asphalt pavements have received much attention in recent years. Because of the increase in the cost of raw materials and energy carriers, reusing large portions of reclaimed asphalt pavement (RAP) is critical to reducing both the cost and the environmental footprint of asphalt pavements. High-RAP mixtures are more prone to low-temperature cracking and poor workability because of the higher stiffness of the RAP binder. Recycling agents are among the additives used to address these deficiencies. However, there is some ambiguity about the optimum recycling agent content needed to ensure proper performance of recycled asphalt pavement during its service life. The current study used 60% and 100% fractionated RAP with waste cooking oil as a recycling agent and crumb rubber to alleviate the aforementioned problems. Laboratory evaluation showed that increasing the amount of recycling agent in the high-RAP mixtures improved their workability and low-temperature performance while decreasing moisture damage and rutting resistance. The long-term aging susceptibility of the recycled binder with the organically based recycling agent was also investigated. A procedure for obtaining the optimum percentage of recycling agent was devised to strike a balance between the performance characteristics of mixtures with a high RAP content.
A. Stephen McGough, Matthew Forshaw, John Brennan
et al.
High Throughput Computing (HTC) provides a convenient mechanism for running thousands of tasks. Many HTC systems exploit computers which are provisioned for other purposes by utilising their idle time - volunteer computing. This has great advantages, as it gives access to vast quantities of computational power for little or no cost. The downside is that running tasks are sacrificed if the computer is needed for its primary use, normally by terminating the task, which must then be restarted on a different computer - leading to wasted energy and an increase in task completion time. We demonstrate, through simulation, how we can reduce this wasted energy by targeting tasks at computers less likely to be needed for their primary use, predicting this idle time through machine learning. By combining two machine learning approaches, namely Random Forest and Multilayer Perceptron, we save 51.4% of the energy without significantly affecting the time to complete tasks.
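A minimal scikit-learn sketch of one way the two learners named above could be combined to predict idle time; the features, synthetic data, and simple prediction-averaging scheme are assumptions for illustration, not the authors' simulated pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: per-machine usage features (e.g. time of day,
# recent load) and the observed idle-period length in minutes.
rng = np.random.default_rng(0)
X = rng.random((2000, 5))
y = 60 * X[:, 0] + 30 * X[:, 1] + rng.normal(0, 5, 2000)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Combine Random Forest and Multilayer Perceptron by averaging their predictions.
model = VotingRegressor([
    ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
    ("mlp", MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)),
]).fit(X_tr, y_tr)

predicted_idle = model.predict(X_te)
# Tasks would then be targeted at the machines with the longest predicted
# idle time, reducing evictions and hence wasted energy.
```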