Symbolic regression (SR) aims to discover interpretable analytical expressions that accurately describe observed data. Amortized SR promises to be much more efficient than the predominant genetic programming SR methods, but currently struggles to scale to realistic scientific complexity. We find that a key obstacle is the lack of a fast reduction of equivalent expressions to a concise normalized form. Amortized SR has so far addressed this with general-purpose Computer Algebra Systems (CAS) like SymPy, but their high computational cost severely limits training and inference speed. We propose SimpliPy, a rule-based simplification engine achieving a 100-fold speed-up over SymPy at comparable quality. This enables substantial improvements in amortized SR, including scalability to much larger training sets, more efficient use of the per-expression token budget, and systematic training set decontamination with respect to equivalent test expressions. We demonstrate these advantages in our Flash-ANSR framework, which achieves much better accuracy than amortized baselines (NeSymReS, E2E) on the FastSRB benchmark. Moreover, it performs on par with state-of-the-art direct optimization (PySR) while recovering more concise, rather than more complex, expressions as the inference budget increases.
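For intuition, the following is a minimal, illustrative sketch of rule-based expression simplification in the spirit described above; it is not the SimpliPy rule set. Expressions are encoded as nested tuples, and a few local identities are applied bottom-up until a fixed point.

```python
# Minimal illustrative sketch of rule-based simplification (not SimpliPy itself):
# expressions are nested tuples such as ("add", "x", 0), and a handful of local
# identities are applied bottom-up. The rule set shown here is an assumption.

def simplify(expr):
    """Recursively simplify a tuple-encoded expression with local rewrite rules."""
    if not isinstance(expr, tuple):
        return expr                            # leaf: variable name or constant
    op, *args = expr
    args = [simplify(a) for a in args]         # simplify children first (bottom-up)
    if op == "add":
        args = [a for a in args if a != 0]     # x + 0 -> x
        if not args:
            return 0
        return args[0] if len(args) == 1 else ("add", *args)
    if op == "mul":
        if 0 in args:
            return 0                           # x * 0 -> 0
        args = [a for a in args if a != 1]     # x * 1 -> x
        if not args:
            return 1
        return args[0] if len(args) == 1 else ("mul", *args)
    if op == "pow" and args[1] == 1:
        return args[0]                         # x ** 1 -> x
    return (op, *args)

print(simplify(("mul", ("add", "x", 0), ("pow", "y", 1))))   # -> ("mul", "x", "y")
```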
Linear constraints are among the most fundamental constraints in fields such as computer science, operations research, and optimization. Many applications reduce to the task of model counting over integer linear constraints (MCILC). In this paper, we design an exact approach to MCILC based on an exhaustive DPLL architecture. To improve efficiency, we integrate several effective simplification techniques from mixed integer programming into the architecture. We compare our approach to state-of-the-art MCILC counters and propositional model counters on 2840 random and 4131 application benchmarks. Experimental results show that our approach significantly outperforms all exact methods on the random benchmarks, solving 1718 instances while the state-of-the-art approach solves only 1470. In addition, ours is the only approach that solves all 4131 application instances.
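As a rough illustration of the exhaustive DPLL-style counting architecture (without the MIP-based simplifications), the sketch below enumerates integer assignments over finite domains and prunes a branch as soon as some constraint can no longer be satisfied. The function name and the example constraint are hypothetical.

```python
# Naive DPLL-style counting of integer solutions to a conjunction of linear
# constraints sum_j a[j]*x[j] <= b over finite integer domains. This is only a
# conceptual sketch; the paper's counter adds MIP-style simplifications.

def count_solutions(constraints, domains, assignment=()):
    """constraints: list of (coeffs, bound); domains: list of (lo, hi) per variable."""
    i = len(assignment)
    for coeffs, bound in constraints:
        best = sum(c * v for c, v in zip(coeffs, assignment))
        for j in range(i, len(domains)):           # most favourable completion
            lo, hi = domains[j]
            best += coeffs[j] * (lo if coeffs[j] > 0 else hi)
        if best > bound:
            return 0                                # prune: constraint unsatisfiable
    if i == len(domains):
        return 1                                    # full assignment satisfies everything
    lo, hi = domains[i]
    return sum(count_solutions(constraints, domains, assignment + (v,))
               for v in range(lo, hi + 1))

# Count integer points with 0 <= x, y <= 10 and x + 2y <= 10  ->  prints 36
print(count_solutions([([1, 2], 10)], [(0, 10), (0, 10)]))
```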
Quantum computing is an emerging computational paradigm with the potential to outperform classical computers in solving a variety of problems. To achieve this, quantum programs are typically represented as quantum circuits, which must be optimized and adapted for target hardware through quantum circuit compilation. We introduce ZX-DB, a data-driven system that performs quantum circuit simplification and rewriting inside a graph database using ZX-calculus, a complete graphical formalism for quantum mechanics. ZX-DB encodes ZX-calculus rewrite rules as standard openCypher queries and executes them on a representative graph database engine, Memgraph, enabling efficient, database-native transformations of large-scale quantum circuits. ZX-DB integrates correctness validation via tensor and graph equivalence checks and is evaluated against the state-of-the-art PyZX framework. Experimental results show that ZX-DB achieves up to an order-of-magnitude speedup for independent rewrites, while exposing pattern-matching bottlenecks in current graph database engines. By uniting quantum compilation and graph data management, ZX-DB opens a new systems direction toward scalable, database-supported quantum computing pipelines.
Although North Korea's nuclear program has been the subject of extensive scrutiny, estimates of its fissile material stockpiles remain fraught with uncertainty. In potential future disarmament agreements, inspectors may need to use nuclear archaeology methods to verify or gain confidence in a North Korean fissile material declaration. This study explores the potential utility of a Bayesian inference-based analysis of the isotopic composition of reprocessing waste to reconstruct the operating history of the 5 MWe reactor and estimate its plutonium production history. We simulate several scenarios that reflect different assumptions and varying levels of prior knowledge about the reactor. The results show that correct prior assumptions can be confirmed and incorrect prior information (or a false declaration) can be detected. Model comparison techniques can distinguish between scenarios with different numbers of core discharges, a capability that could provide important insights into the early stages of operation of the 5 MWe reactor. Using these techniques, a weighted plutonium estimate can be calculated, even in cases where the number of core discharges is not known with certainty.
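To make the last step concrete, here is a small illustrative sketch of evidence-weighted model averaging of the kind described: posterior probabilities for scenarios assuming different numbers of core discharges are computed from their marginal likelihoods and used to weight per-scenario plutonium estimates. All numbers below are made-up placeholders, not results from the reactor simulations.

```python
import numpy as np

# Illustrative sketch of Bayesian model averaging across discharge scenarios.
# Evidences and per-scenario plutonium estimates are placeholders.

log_evidence   = np.array([-105.2, -101.8, -103.5])  # one value per assumed number of discharges
pu_estimate_kg = np.array([28.0, 32.5, 30.1])        # posterior-mean Pu for each scenario

w = np.exp(log_evidence - log_evidence.max())        # equal prior model probabilities assumed
w /= w.sum()                                         # posterior probability of each scenario

print("model weights:", w)
print("weighted Pu estimate: %.1f kg" % np.dot(w, pu_estimate_kg))
```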
Recent advancements in large language models (LLMs) have enabled a wide range of natural language processing (NLP) tasks to be performed through simple prompt-based interactions. Consequently, several approaches have been proposed to engineer prompts that most effectively enable LLMs to perform a given task (e.g., chain-of-thought prompting). In settings with a well-defined metric to optimize model performance, automatic prompt optimization (APO) methods have been developed to refine a seed prompt. Advancing this line of research, we propose APIO, a simple but effective prompt induction and optimization approach for the tasks of Grammatical Error Correction (GEC) and Text Simplification, without relying on manually specified seed prompts. APIO achieves a new state-of-the-art performance for purely LLM-based prompting methods on these tasks. We make our data, code, prompts, and outputs publicly available.
Daniel Errandonea, Robin Turnbull, Josu Sanchez-Martin
et al.
We present a comparative study of the high-pressure behaviours of the nuclear waste immobilisation materials zirconolite-2M, -4M, -3O, and -3T. The materials are studied under high-pressure conditions using synchrotron powder X-ray diffraction. For zirconolite-2M we also performed density-functional theory calculations. A new triclinic crystal structure (space group P-1), instead of the previously assigned monoclinic structure (space group C2/c), is proposed for zirconolite-2M. We name this triclinic structure zirconolite-2TR. We also find that zirconolite-2TR undergoes a phase transition at 14.7 GPa to a monoclinic structure described by space group C2/c, which differs from the high-pressure structure previously proposed in the literature. These results are discussed in comparison with previous studies on zirconolite-2M and the related compound calzirtite. For the other three zirconolite structures (4M, 3O, and 3T) this is the first high-pressure study, and we find no evidence for pressure-induced phase transitions in any of them. The linear compressibility of the studied compounds, as well as a room-temperature pressure-volume equation of state, are also presented and discussed.
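For context, a room-temperature pressure-volume equation of state is typically obtained by fitting an analytical P(V) form to the diffraction data. The sketch below assumes a third-order Birch-Murnaghan equation (the paper may use a different form) and synthetic data rather than the zirconolite measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

def birch_murnaghan_3(V, V0, B0, B0p):
    """Third-order Birch-Murnaghan P(V); B0 in GPa, V and V0 in the same units."""
    eta = (V0 / V) ** (1.0 / 3.0)
    return 1.5 * B0 * (eta**7 - eta**5) * (1.0 + 0.75 * (B0p - 4.0) * (eta**2 - 1.0))

# Synthetic pressure-volume data for illustration only (not measured values).
V = np.linspace(940.0, 1020.0, 15)
P = birch_murnaghan_3(V, 1020.0, 190.0, 4.5) + np.random.normal(0.0, 0.2, V.size)

(V0, B0, B0p), _ = curve_fit(birch_murnaghan_3, V, P, p0=(1020.0, 180.0, 4.0))
print(f"V0 = {V0:.1f}, B0 = {B0:.0f} GPa, B0' = {B0p:.2f}")
```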
Street intersection counts and densities are ubiquitous measures in transport geography and planning. However, typical street network data and typical street network analysis tools can substantially overcount them. This article explains the three main reasons why this happens and presents solutions to each. It contributes algorithms to automatically simplify spatial graphs of urban street networks -- via edge simplification and node consolidation -- resulting in faster parsimonious models and more accurate network measures like intersection counts and densities, street segment lengths, and node degrees. These algorithms' information compression improves downstream graph analytics' memory and runtime efficiency, boosting analytical tractability without loss of model fidelity. Finally, this article validates these algorithms and empirically assesses intersection count biases worldwide to demonstrate the problem's widespread prevalence. Without consolidation, traditional methods would overestimate the median urban area intersection count by 14\%. However, this bias varies drastically across regions, underscoring these algorithms' importance for consistent comparative empirical analyses.
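As a usage note, one widely used open-source implementation of this style of node consolidation is the OSMnx library; the snippet below is a sketch assuming that library, with an arbitrary place name and a 10 m consolidation tolerance.

```python
import osmnx as ox

# Sketch of topological node consolidation with OSMnx (place and tolerance
# are arbitrary choices for illustration).

G = ox.graph_from_place("Piedmont, California, USA", network_type="drive")
Gp = ox.project_graph(G)            # tolerance is in metres, so project the graph first

Gc = ox.simplification.consolidate_intersections(
    Gp, tolerance=10, rebuild_graph=True, dead_ends=False)

print(len(Gp.nodes), "nodes before consolidation;", len(Gc.nodes), "after")
```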
This paper presents a simplification of robotic system model analysis, achieved by transferring Robotic System Hierarchical Petri Net (RSHPN) meta-model properties onto the model of a designed system. Key contributions include: 1) analysis of RSHPN meta-model properties; 2) decomposition of RSHPN analysis into analyses of individual Petri nets, thereby reducing state-space explosion; and 3) transfer of RSHPN meta-model properties onto the produced models, eliminating the need for full re-analysis of the RSHPN model when creating new robotic systems. Only the task-dependent parts of the model need to be analysed. This approach streamlines the analysis and thus reduces design time. Moreover, it produces a specification that is a solid foundation for implementing the system. The obtained results highlight the potential of Petri nets as a valuable formal framework for analysing robotic system properties.
Let $ D=(V,E) $ be a (possibly infinite) digraph and $ A,B\subseteq V $. A hindrance consists of an $ AB $-separator $ S $ together with a set of disjoint $ AS $-paths linking a proper subset of $ A $ onto $ S $. Hindrances and configurations guaranteeing the existence of hindrances play an essential role in the proof of the infinite version of Menger's theorem and are important in the context of certain open problems as well. This motivates the investigation of circumstances under which hindrances appear. In this paper we show that if there is a ``wasteful partial linkage'', i.e. a set $ \mathcal{P} $ of disjoint $ AB $-paths with fewer unused vertices in $ B $ than in $ A $, then there exists a hindrance.
Construction waste hauling trucks (or `slag trucks') are among the most commonly seen heavy-duty diesel vehicles on urban streets; they not only produce significant carbon, NO$_{\textbf{x}}$ and PM$_{\textbf{2.5}}$ emissions but are also a major source of on-road and on-site fugitive dust. Slag trucks are subject to a series of spatial and temporal access restrictions imposed by local traffic and environmental policies. This paper addresses the practical problem of predicting levels of slag truck activity at a city scale during heavy pollution episodes, so that environmental law enforcement units can take timely and proactive measures against localized truck aggregation. A deep ensemble learning framework (coined AI-Truck) is designed, which employs a soft-vote integrator that utilizes Bi-LSTM, TCN, STGCN, and PDFormer as base classifiers. AI-Truck employs a combination of downsampling and weighted loss to address sample imbalance, and utilizes truck trajectories to extract more accurate and effective geographic features. The framework was deployed for truck activity prediction at a resolution of 1km$\times$1km$\times$0.5h in a 255 km$^{\textbf{2}}$ area in Chengdu, China. As a classifier, AI-Truck achieves a macro F1 of 0.747 in predicting levels of slag truck activity over a 0.5-h prediction horizon, and enables personnel to spot high-activity locations 1.5 hrs ahead with over 80\% accuracy.
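To illustrate the soft-vote integration step, the sketch below averages the base classifiers' predicted class-probability matrices and takes the argmax per grid cell. The probability arrays and equal weights are placeholders, not trained-model outputs.

```python
import numpy as np

# Sketch of soft-vote ensembling over four base classifiers. All arrays below
# are random placeholders standing in for per-cell class probabilities.

rng = np.random.default_rng(0)
n_cells, n_levels = 6, 3

def placeholder_probs():
    p = rng.random((n_cells, n_levels))
    return p / p.sum(axis=1, keepdims=True)

base_outputs = {"bilstm": placeholder_probs(), "tcn": placeholder_probs(),
                "stgcn": placeholder_probs(), "pdformer": placeholder_probs()}
weights = {name: 0.25 for name in base_outputs}     # equal weights, an assumption

ensemble = sum(w * base_outputs[name] for name, w in weights.items())
print(ensemble.argmax(axis=1))                      # predicted activity level per cell
```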
Construction waste hauling trucks (CWHTs), among the most commonly seen heavy-duty vehicles in major cities around the globe, are usually subject to a series of regulations and spatial-temporal access restrictions because they not only produce significant NOx and PM emissions but also cause on-road fugitive dust. The timely and accurate prediction of CWHTs' destinations and dwell times plays a key role in effective environmental management. To address this challenge, we propose a prediction method based on an interpretable activity-based model, the input-output hidden Markov model (IOHMM), and validate it on 300 CWHTs in Chengdu, China. Contextual factors are considered in the model to improve its predictive power. Results show that the IOHMM outperforms several baseline models, including Markov chains, linear regression, and long short-term memory. Factors influencing the predictability of CWHTs' transportation activities are also explored using linear regression models. The results suggest that the proposed model holds promise for assisting authorities by predicting the upcoming transportation activities of CWHTs and administering interventions in a timely and effective manner.
Caitlin Condon, Mark Abkowitz, Harish Gadey
et al.
START, the Stakeholder Tool for Assessing Radioactive Transportation, is a web-based decision-support tool developed by the U.S. Department of Energy (DOE) to support the Office of Integrated Waste Management (IWM). Its purpose is to provide visualization and analysis of geospatial data relevant to planning and operating large-scale spent nuclear fuel (SNF) and high-level radioactive waste transport to storage and/or disposal facilities. At present, the primary transport method for these shipments is expected to be via rail, operating predominantly on mainline track. For many shipment sites, however, access to this network will typically require initial use of a local/regional (short line) railroad or involve intermodal transport where the access leg is a movement performed by heavy-haul truck and/or barge. START has the ability to represent and analyze all of these transport options, with each transportation network segment containing site-specific physical and operational attributes. Of particular note are segment-specific accident rates and travel speeds, derived from recent data provided by the U.S. Department of Transportation, Bureau of Transportation Statistics, and other publicly available resources. DOE anticipates that START users will include federal, State, Tribal and local government officials; nuclear utilities; transportation carriers; support contractors; citizen scientists; and other stakeholders. For this reason, START is designed to enable the user to represent a wide range of operating scenarios and performance objectives, with an emphasis on providing flexibility. In doing so, the tool makes extensive use of geographic information systems (GIS) technology for performing spatial analysis and map creation.
The municipal solid waste system is a complex reverse logistics chain comprising several optimisation problems. Although these problems are interdependent, i.e., the solution to one restricts the solutions to the others, they are usually solved sequentially in the literature because each is computationally complex on its own. We address two of the tactical planning problems in this chain by means of a Benders decomposition approach: determining the location and/or capacity of garbage accumulation points, and designing and scheduling collection routes for vehicles. Our approach solves medium-sized real-world instances from the city of Bahía Blanca, Argentina, with smaller computing times than solving a full MIP model.
Retinal degenerative diseases cause profound visual impairment in more than 10 million people worldwide, and retinal prostheses are being developed to restore vision to these individuals. Analogous to cochlear implants, these devices electrically stimulate surviving retinal cells to evoke visual percepts (phosphenes). However, the quality of current prosthetic vision is still rudimentary. Rather than aiming to restore "natural" vision, there is potential merit in borrowing state-of-the-art computer vision algorithms as image processing techniques to maximize the usefulness of prosthetic vision. Here we combine deep learning-based scene simplification strategies with a psychophysically validated computational model of the retina to generate realistic predictions of simulated prosthetic vision, and measure their ability to support scene understanding of sighted subjects (virtual patients) in a variety of outdoor scenarios. We show that object segmentation may better support scene understanding than models based on visual saliency and monocular depth estimation. In addition, we highlight the importance of basing theoretical predictions on biologically realistic models of phosphene shape. Overall, this work has the potential to drastically improve the utility of prosthetic vision for people blinded by retinal degenerative diseases.
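As a toy illustration of this kind of pipeline (not the paper's validated retinal model), the sketch below treats a binary object-segmentation mask as the simplified scene, samples it at a coarse electrode grid, and renders each active electrode as a Gaussian phosphene. Grid size and blur width are arbitrary assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Toy simulation of prosthetic vision from a simplified scene: a segmentation
# mask is sampled at a 16x16 electrode grid and each active site is blurred
# into a Gaussian phosphene.

def simulate_percept(mask, grid=(16, 16), sigma=3.0):
    h, w = mask.shape
    ys = np.linspace(0, h - 1, grid[0]).astype(int)
    xs = np.linspace(0, w - 1, grid[1]).astype(int)
    percept = np.zeros_like(mask, dtype=float)
    percept[np.ix_(ys, xs)] = mask[np.ix_(ys, xs)]   # sample mask at electrode sites
    return gaussian_filter(percept, sigma)            # blur sites into phosphenes

mask = np.zeros((128, 128)); mask[40:90, 50:100] = 1.0   # placeholder object mask
print(simulate_percept(mask).shape)
```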
Annika Bonerath, Jan-Henrik Haunert, Joseph S. B. Mitchell
et al.
Let $P$ be a crossing-free polygon and $\mathcal C$ a set of shortcuts, where each shortcut is a directed straight-line segment connecting two vertices of $P$. A shortcut hull of $P$ is another crossing-free polygon that encloses $P$ and whose oriented boundary is composed of elements from $\mathcal C$. Shortcut hulls find application in geo-related problems such as the simplification of contour lines. We aim at a shortcut hull that linearly balances the enclosed area and the perimeter. If no holes in the shortcut hull are allowed, the problem admits a straightforward solution via shortest paths. For the more challenging case in which the shortcut hull may contain holes, we present a polynomial-time algorithm based on computing a constrained, weighted triangulation of the input polygon's exterior. We use this problem as a starting point for investigating further variants, e.g., restricting the number of edges or bends. We demonstrate that shortcut hulls can be used for drawing the rough extent of point sets as well as for the schematization of polygons.
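In notation we assume here for illustration (the authors' exact normalization may differ), the linear balance of the two criteria can be written with a user-chosen trade-off parameter $\lambda \in [0,1]$ as minimizing $c(H) = \lambda \cdot A(H) + (1-\lambda) \cdot P(H)$ over shortcut hulls $H$, where $A(H)$ denotes the enclosed area and $P(H)$ the perimeter.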
Background: Projected costs of cancer care are expected to reach $172.8 billion by 2020 [1]. Costs of oncology drugs have outpaced other areas of spending [2], with novel drugs priced at a median of $115,981/year between 2009 and 2013 [3], making cost control an important priority. Rounding drug doses to the nearest vial size, as long as the difference is smaller than an accepted percentage, has been shown to reduce the cost of care [4]. It also improves workflow efficiency by simplifying compounding and decreasing waste documentation [5], and reduces the potential for medication errors [6]. Dose rounding has been recommended by the Hematology/Oncology Pharmacy Association (HOPA) [5], and the HOPA position paper was endorsed by NCCN [7].

Methods: In accordance with the HOPA position statement, an integrated community-based healthcare system comprising 13 providers across 8 infusion centers implemented dose rounding of biological and cytotoxic agents to the nearest vial when within 10% of the actual calculated dose. Performed manually by oncology pharmacists at the time of initial introduction in January 2020, this was transitioned to automated dose rounding by the oncology chemotherapy software (Epic Beacon, Epic Systems, Verona, WI) in June 2020. To facilitate provider adoption and maximize cost savings, we compiled a list of 31 biological and chemotherapy agents that provided the greatest therapeutic margin and potential for cost savings. Dose rounding parameters for each agent were developed to standardize and facilitate automated dose rounding. Lower and upper bounds were set so that the vial size was within 10% of the actual calculated dose (e.g., doses between 91 mg and 111 mg round to a 100 mg vial; Table 1). Cost avoidance was calculated based on the acquisition price of the lowest vial size available and necessary to make the pre-rounded dose.

Results: Total cost avoidance between January 2020 and July 2020 from dose rounding 31 different chemotherapy and biological agents was $679,780.02. Automated dose rounding of high-value drugs, introduced in June 2020, resulted in cost savings of $112,994.12 over a seven-week period. Dose rounding of biological drugs accounted for 89.4% of the total cost savings; rounding of cytotoxic drugs resulted in a cost saving of $71,588.35. Trastuzumab dose rounding was associated with the greatest cost avoidance, $147,194.44, accounting for 21.65% of the total. Dose rounding up to the nearest vial ranged from 0.13% to 9.75% (median 3.52%); dose rounding down to the nearest vial ranged from -0.10% to -9.93% (median -3.36%).

Conclusions: We demonstrate the feasibility of implementing an automated EHR-based dose rounding protocol in an integrated delivery network, widespread adoption across multiple centers, and significant cost avoidance accrued from the intervention. Similar dose rounding protocols should prioritize biologic agents due to their high utilization and costs.

References:
1. Mariotto AB, K.Y.R., Shao Y, Feuer EJ, Brown ML. Projections of the Cost of Cancer Care in the United States: 2010-2020. Journal of the National Cancer Institute, 2011. 103(2): p. 117-128.
2. IQVIA. Medicines Use and Spending in the US: A Review of 2016 and Outlook to 2021. 2016. https://www.iqvia.com/institute/reports/medicines-use-and-spending-in-the-us-a-review-of-2016.
3. Mailankody S, P.V. Five Years of Cancer Drug Approvals: Innovation, Efficacy, and Costs. JAMA Oncology, 2015. 15(4): p. 539-540.
4. Vandyke TH, A.P., Ballmer CM, Kintzel PE. Cost avoidance from dose rounding biologic and cytotoxic antineoplastics. Journal of Oncology Pharmacy Practice, 2017. 23(5): p. 379-383.
5. Fahrenbruch R, Kintzel P, Bott AM, Gilmore S, Markham R. Dose Rounding of Biologic and Cytotoxic Anticancer Agents: A Position Statement of the Hematology/Oncology Pharmacy Association. Journal of Oncology Practice, 2018. 14(3): p. e130-e136.
6. Goldspiel B, Hoffman JM, Griffith NL, Goodin S, DeChristoforo R, Montello CM, Chase JL, Bartel S, Patel JT. ASHP guidelines on preventing medication errors with chemotherapy and biotherapy. American Journal of Health-System Pharmacy, 2015. 72(8): p. e6-e35.
7. National Comprehensive Cancer Network. NCCN Chemotherapy Order Templates (NCCN Templates®). https://www.nccn.org/professionals/OrderTemplates/PDF/HOPA.pdf. Accessed July 30, 2020. Plymouth Meeting, PA: NCCN.

No relevant conflicts of interest to declare.
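To make the rounding rule above concrete, here is a small illustrative sketch: a calculated dose is rounded to a whole number of vials only when the rounded dose stays within 10% of the calculated dose. The vial size and example doses are hypothetical, not taken from the protocol's table.

```python
# Illustrative sketch of dose rounding to the nearest vial within a 10% tolerance.

def round_to_vial(dose_mg, vial_mg, tolerance=0.10):
    """Return the rounded dose if within tolerance of the calculated dose, else the original dose."""
    n_vials = round(dose_mg / vial_mg)
    rounded = n_vials * vial_mg
    if n_vials > 0 and abs(rounded - dose_mg) / dose_mg <= tolerance:
        return rounded
    return dose_mg        # outside tolerance: keep the exact calculated dose

print(round_to_vial(93.0, 100.0))   # 100.0 (falls inside the 91-111 mg window above)
print(round_to_vial(88.0, 100.0))   # 88.0  (rounding to 100 mg would exceed 10%)
```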
Every year, millions of pounds of medicines remain unused in the U.S. and are subject to in-home disposal, i.e., kept in medicine cabinets, flushed down the toilet, or thrown in the regular trash. In-home disposal, however, can negatively impact the environment and public health. The drug take-back programs (drug take-backs) sponsored by the Drug Enforcement Administration (DEA) and its state and industry partners collect unused consumer medications and provide the best alternative to in-home disposal of medicines. However, the drug take-backs are expensive to operate and not widely available. In this paper, we show that artificial intelligence (AI) can be applied to drug take-backs to render them operationally more efficient. Since identification of any waste is crucial to proper disposal, we show that it is possible to accurately identify loose consumer medications based solely on their physical features and visual appearance. We have developed an automatic technique that uses deep neural networks and computer vision to identify and segregate solid medicines. We applied the technique to images of about one thousand loose pills and succeeded in correctly identifying the pills with an accuracy of 0.912 and a top-5 accuracy of 0.984. We also showed that hazardous pills could be distinguished from non-hazardous pills within the dataset with an accuracy of 0.984. We believe that the power of artificial intelligence could be harnessed in products that would facilitate the operation of the drug take-backs more efficiently and help them become widely available throughout the country.
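For reference, the reported metrics can be computed as in the sketch below; the score matrix, labels, and class count are random placeholders rather than outputs of the pill-identification model.

```python
import numpy as np

# Top-1 and top-5 accuracy from per-image class scores (placeholder data).
rng = np.random.default_rng(0)
scores = rng.random((1000, 50))            # 1000 images x 50 pill classes (assumed sizes)
labels = rng.integers(0, 50, size=1000)    # true class per image

top1 = (scores.argmax(axis=1) == labels).mean()
top5 = np.mean([labels[i] in np.argsort(scores[i])[-5:] for i in range(scores.shape[0])])
print(f"top-1 accuracy: {top1:.3f}, top-5 accuracy: {top5:.3f}")
```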
The study aims to develop standardized BIM methods for buildings that are part of the Moscow city renovation program. This problem is pressing, since there are no digitized sources for the standard designs of the last century, which complicates the calculation of construction and demolition waste generation and casts doubt on its accuracy. The dismantling phase (calculation of the weight of construction and demolition waste, scheduling of work, and transportation of waste) can be automated using information modeling software, Autodesk Revit 2019. This complements the existing information model by including one of the final stages of the building's life cycle, dismantling, with a level of transparency, accessibility, and completeness comparable to the design, examination, construction, and operation blocks. Albums of standard building designs provide much of the information needed to create the required project families and specifications, which will greatly simplify and speed up the development of future demolition projects, since the libraries of elements, materials, and families grow with each new project. The article presents the process of developing a project template for the building dismantling phase. A new project parameter was created and assigned to each material of the project template; the parametric and design characteristics needed for further calculations were assigned based on the modelled building design; and a specification was developed from which data on the weight of construction and demolition waste can be obtained quickly, with the relevant checks carried out for different types of developed and basic project families.
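As a minimal illustration of the waste-weight calculation such a specification automates, the sketch below sums modelled material volumes times assumed densities; the material names, volumes, and densities are placeholders, not values from the Revit template.

```python
# Estimate demolition waste mass as sum(volume_i * density_i) over modelled
# materials. All values are illustrative placeholders.

materials = {                        # material: (volume in m^3, density in kg/m^3)
    "brick masonry":       (120.0, 1800.0),
    "reinforced concrete":  (95.0, 2500.0),
    "timber":               (14.0,  500.0),
}

waste_tonnes = sum(v * rho for v, rho in materials.values()) / 1000.0
print(f"estimated construction and demolition waste: {waste_tonnes:.1f} t")
```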