Results for "Standardization. Simplification. Waste"

arXiv Open Access 2025

An Architectural Advantage of The Instruction-Tuned LLM in Containing The Readability-Accuracy Tension in Text Simplification

P. Bilha Githinji, Aikaterini Meilliou, Zeming Liang et al.

The increasing health-seeking behavior and digital consumption of biomedical information by the general public necessitate scalable solutions for automatically adapting complex scientific and technical documents into plain language. Automatic text simplification solutions, including advanced large language models (LLMs), however, continue to face challenges in reliably arbitrating the tension between optimizing readability performance and ensuring preservation of discourse fidelity. This report empirically assesses two major classes of general-purpose LLMs, demonstrating how they navigate the readability-accuracy tension compared to a human benchmark. Using a comparative analysis of the instruction-tuned Mistral-Small 3 24B and the reasoning-augmented QWen2.5 32B, we identify an architectural advantage in the instruction-tuned LLM. Mistral exhibits a tempered lexical simplification strategy that enhances readability across a suite of metrics while preserving human-level discourse with a BERTScore of 0.91. QWen also attains enhanced readability performance and a reasonable BERTScore of 0.89, but its operational strategy shows a disconnect in balancing between readability and accuracy. Additionally, a comprehensive correlation analysis of a suite of 21 metrics spanning readability, discourse fidelity, content safety, and underlying distributional measures for mechanistic insights, confirms strong functional redundancies, and informs metric selection and domain adaptation for text simplification.

en cs.CL

arXiv Open Access 2025

A Block-Based Heuristic Algorithm for the Three-Dimensional Nuclear Waste Packing Problem

Yajie Wen, Defu Zhang

In this study, we present a block-based heuristic search algorithm to address the nuclear waste container packing problem in the context of real-world nuclear power plants. Additionally, we provide a dataset comprising 1600 problem instances for future researchers to use. Experimental results on this dataset demonstrate that the proposed algorithm effectively enhances the disposal pool's space utilization while minimizing the radiation dose within the pool. The code and data employed in this study are publicly available to facilitate reproducibility and further investigation.

en math.OC, cs.AI

arXiv Open Access 2025

An AI-Driven Live Systematic Reviews in the Brain-Heart Interconnectome: Minimizing Research Waste and Advancing Evidence Synthesis

Arya Rahgozar, Pouria Mortezaagha, Jodi Edwards et al.

The Brain-Heart Interconnectome (BHI) combines neurology and cardiology but is hindered by inefficiencies in evidence synthesis, poor adherence to quality standards, and research waste. To address these challenges, we developed an AI-driven system to enhance systematic reviews in the BHI domain. The system integrates automated detection of Population, Intervention, Comparator, Outcome, and Study design (PICOS), semantic search using vector embeddings, graph-based querying, and topic modeling to identify redundancies and underexplored areas. Core components include a Bi-LSTM model achieving 87% accuracy for PICOS compliance, a study design classifier with 95.7% accuracy, and Retrieval-Augmented Generation (RAG) with GPT-3.5, which outperformed GPT-4 for graph-based and topic-driven queries. The system provides real-time updates, reducing research waste through a living database and offering an interactive interface with dashboards and conversational AI. While initially developed for BHI, the system's adaptable architecture enables its application across various biomedical fields, supporting rigorous evidence synthesis, efficient resource allocation, and informed clinical decision-making.

en cs.AI, cs.CL

arXiv Open Access 2025

UM_FHS at the CLEF 2025 SimpleText Track: Comparing No-Context and Fine-Tune Approaches for GPT-4.1 Models in Sentence and Document-Level Text Simplification

Primoz Kocbek, Gregor Stiglic

This work describes our submission to the CLEF 2025 SimpleText track Task 1, addressing both sentence- and document-level simplification of scientific texts. The methodology centered on using the gpt-4.1, gpt-4.1-mini, and gpt-4.1-nano models from OpenAI. Two distinct approaches were compared: a no-context method relying on prompt engineering and a fine-tuned (FT) method across models. The gpt-4.1-mini model with no-context demonstrated robust performance at both levels of simplification, while the fine-tuned models showed mixed results, highlighting the complexities of simplifying text at different granularities, where gpt-4.1-nano-ft performance stands out at document-level simplification in one case.

en cs.CL

arXiv Open Access 2024

Hydroelectric energy conversion of waste flows through hydroelectronic drag

Baptiste Coquinot, Lydéric Bocquet, Nikita Kavokine

Hydraulic energy is a key component of the global energy mix, yet there exists no practical way of harvesting it at small scales, from flows at low Reynolds number. This has triggered the search for alternative hydroelectric conversion methodologies, leading to unconventional proposals based on droplet triboelectricity, water evaporation, osmotic energy or flow-induced ionic Coulomb drag. Yet, these approaches systematically rely on ions as intermediate charge carriers, limiting the achievable power density. Here, we predict that the kinetic energy of small-scale "waste" flows can be directly and efficiently converted into electricity thanks to the hydro-electronic drag effect, by which an ion-free liquid induces an electronic current in the solid wall along which it flows. This effect originates in the fluctuation-induced coupling between fluid motion and electron transport. We develop a non-equilibrium thermodynamic formalism to assess the efficiency of such hydroelectric energy conversion, dubbed hydronic energy. We find that hydronic energy conversion is analogous to thermoelectricity, with the efficiency being controlled by a dimensionless figure of merit. However, in contrast to its thermoelectric analogue, this figure of merit combines independently tunable parameters of the solid and the liquid, and can thus significantly exceed unity. Our findings suggest new strategies for blue energy harvesting without electrochemistry, and for waste flow mitigation in membrane-based filtration processes.

en physics.flu-dyn, cond-mat.mes-hall

arXiv Open Access 2024

Low carbon optimal scheduling of integrated energy system considering waste heat utilization under the coordinated operation of incineration power plant and P2G

Limeng Wang, Shuo Wang, Na Wang et al.

To improve energy utilization and reduce carbon emissions, this paper presents an economic operation strategy for an integrated energy system (IES) that couples an incineration power plant with power-to-gas (P2G) and waste heat recovery. First, for the coordinated operation of the incineration power plant and P2G, a refined two-stage P2G process is introduced: hydrogen fuel cells are added to traditional P2G to reduce the stepwise (energy ladder) loss, and the heat of the methanation reaction is recovered. Second, to improve the energy utilization efficiency of incineration, a waste heat recovery device containing a water-source heat pump is installed to recover flue-gas waste heat while consuming some electricity sourced from wind power, and a CO2 separation device is added so that the recovered CO2 can be combined with P2G to synthesize CH4, achieving carbon recycling. Finally, within the framework of a tiered carbon trading mechanism, an electricity-heat IES optimization model that minimizes the system operating cost is constructed and solved with the GUROBI optimization engine. The results verify the effectiveness of the model.

en eess.SY

arXiv Open Access 2024

ARTiST: Automated Text Simplification for Task Guidance in Augmented Reality

Guande Wu, Jing Qian, Sonia Castelo et al.

Text presented in augmented reality provides in-situ, real-time information for users. However, this content can be challenging to apprehend quickly when engaging in cognitively demanding AR tasks, especially when it is presented on a head-mounted display. We propose ARTiST, an automatic text simplification system that uses a few-shot prompt and GPT-3 models to specifically optimize the text length and semantic content for augmented reality. Developed out of a formative study that included seven users and three experts, our system combines a customized error calibration model with a few-shot prompt to integrate the syntactic, lexical, elaborative, and content simplification techniques, and generate simplified AR text for head-worn displays. Results from a 16-user empirical study showed that ARTiST lightens the cognitive load and improves performance significantly over both unmodified text and text modified via traditional methods. Our work constitutes a step towards automating the optimization of batch text data for readability and performance in augmented reality.

en cs.HC, cs.CL

S2 Open Access 2022

Combustion of C1 and C2 PFAS: Kinetic modeling and experiments

J. Krug, P. Lemieux, Chun-Wai Lee et al.

A combustion model, originally developed to simulate the destruction of chemical warfare agents, was modified to include C1-C3 fluorinated organic reactions and kinetics compiled by the National Institute of Standards and Technology (NIST). A simplified plug flow reactor version of this model was used to predict the destruction efficiency (DE) and formation of products of incomplete combustion (PICs) for three C1 and C2 per- and poly-fluorinated alkyl substances (PFAS) (CF4, CHF3, and C2F6) and compare predicted values to Fourier Transform Infrared spectroscopy (FTIR)-based measurements made from a pilot-scale EPA research combustor (40–64 kW, natural gas-fired, 20% excess air). PFAS were introduced through the flame, and at post-flame locations along a time-temperature profile allowing for simulation of direct flame and non-flame injection, and examination of the sensitivity of PFAS destruction on temperature and free radical flame chemistry. Results indicate that CF4 is particularly difficult to destroy with DEs ranging from ~60 to 95% when introduced through the flame at increasing furnace loads. Due to the presence of lower energy C-H and C-C bonds to initiate molecular dissociation reactions, CHF3 and C2F6 were easier to destroy, exhibiting DEs >99% even when introduced post-flame. However, these lower bond energies may also lead to the formation of CF2 and CF3 radicals at thermal conditions unable to fully de-fluorinate these species and formation of fluorinated PICs. DEs determined by the model agreed well with the measurements for CHF3 and C2F6 but overpredicted DEs at high temperatures and underpredicted DEs at low temperatures for CF4. However, high DEs do not necessarily mean absence of PICs, with both model predictions and limited FTIR measurements indicating the presence of similar fluorinated PICs in the combustion emissions.
The FTIR was able to provide real-time emission measurements and additional model development may improve prediction of PFAS destruction and PIC formation. Implications: The widespread use of PFAS for over 70 years has led to their presence in multiple environmental matrixes including human tissues. While the chemical and thermal stability of PFAS are related to their desirable properties, this stability means that PFAS are very slow to degrade naturally and potentially difficult to destroy completely through thermal treatment processes often used for organic waste destruction. In this applied combustion study, model PFAS compounds were introduced to a pilot-scale EPA research furnace. Real-time FTIR measurements were performed of the injected compound and trace products of incomplete combustion (PICs) at operationally relevant conditions, and the results were successfully compared to kinetic model predictions of those same PFAS destruction efficiencies and trace gas-phase PIC constituents. This study represents a significant potential enhancement in available tools to support effective management of PFAS-containing wastes.
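
The destruction efficiency (DE) percentages quoted in this abstract can be illustrated with a short calculation. The sketch below assumes the commonly used definition DE = (1 - out/in) x 100; the function name and the choice of mass rates as inputs are hypothetical and are not taken from the paper, which may compute DE differently.

```python
def destruction_efficiency(rate_in: float, rate_out: float) -> float:
    """Percent of an injected compound destroyed in the combustor.

    rate_in  -- amount of the compound fed per unit time (assumed input)
    rate_out -- amount of the same compound measured in the exhaust
    Both must use the same units; the definition is an assumption.
    """
    if rate_in <= 0:
        raise ValueError("rate_in must be positive")
    if rate_out < 0:
        raise ValueError("rate_out must be non-negative")
    return (1.0 - rate_out / rate_in) * 100.0


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the study.
    print(round(destruction_efficiency(100.0, 5.0), 2))
```

Note that a high DE only describes the disappearance of the parent compound; as the abstract stresses, it says nothing about fluorinated PICs formed along the way.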

49 citations en Medicine

S2 Open Access 2014

Sustainable manufacturing-greening processes using specific Lean Production tools: an empirical observation from European motorcycle component manufacturers

A. Chiarini

305 citations en Engineering

S2 Open Access 2023

Designing NLP applications to support ICD coding: an impact analysis and guidelines to enhance baseline performance when processing patient discharge notes

Jessica Jha, Mario Almagro, Hegler C. Tissot

Financial costs are a major concern in the healthcare system, with medical billing and coding playing a key role in facilitating transactions and financing procedures. Billing involves filing claims with insurance companies and requires scrutiny of clinical summaries and electronic health records to correctly match diagnoses, prescriptions, and procedures to standardized codes. Accuracy in assigning International Classification of Diseases (ICD) codes is critical to proper reimbursement of care. Incorrect codes waste time and resources, and cause administrative and financial problems for hospitals, insurance companies and patients. Manual medical coding is a labor-intensive and error-prone process that creates additional administrative burden and inconvenience for hospitals, insurance companies, and patients. To simplify the process, clinical records are often processed to automatically identify and extract clinical concepts and corresponding ICD codes. Deep learning and natural language processing techniques have shown promise in a variety of tasks but applying them to medical coding has been challenging. Accurate coding requires a deep understanding of medical terminology, context, and guidelines that may be difficult to capture with traditional deep learning methods. Although deep learning shows promise in healthcare, its specific impact on ICD coding is not fully understood, and translating scalable deep learning methods into practical improvements in ICD coding remains a challenge. Evaluating deep learning models under the scenarios of real-world coding and comparing them to established practice is critical to determining their true effectiveness. In this work, we address the automation of ICD coding by highlighting pitfalls and contrasting different perspectives. 
We investigated automatic ICD coding using baseline machine learning models, with a focus on identifying ICD-9 codes in discharge notes from Medical Information Mart for Intensive Care (MIMIC) database. A thorough evaluation of different models and approaches is crucial to avoid over-reliance on any method. Our findings show that simpler methods can achieve comparable results to deep learning models while still requiring fewer computational resources.

3 citations en

arXiv Open Access 2023

Metric-Based In-context Learning: A Case Study in Text Simplification

Subha Vadlamannati, Gözde Gül Şahin

In-context learning (ICL) for large language models has proven to be a powerful approach for many natural language processing tasks. However, determining the best method to select examples for ICL is nontrivial as the results can vary greatly depending on the quality, quantity, and order of examples used. In this paper, we conduct a case study on text simplification (TS) to investigate how to select the best and most robust examples for ICL. We propose the Metric-Based in-context Learning (MBL) method, which utilizes commonly used TS metrics such as SARI, compression ratio, and BERT-Precision for selection. Through an extensive set of experiments with various-sized GPT models on standard TS benchmarks such as TurkCorpus and ASSET, we show that examples selected by the top SARI scores perform the best on larger models such as GPT-175B, while the compression ratio generally performs better on smaller models such as GPT-13B and GPT-6.7B. Furthermore, we demonstrate that MBL is generally robust to example orderings and out-of-domain test sets, and outperforms strong baselines and state-of-the-art finetuned language models. Finally, we show that the behaviour of large GPT models can be implicitly controlled by the chosen metric. Our research provides a new framework for selecting examples in ICL, and demonstrates its effectiveness in text simplification tasks, breaking new ground for more accurate and efficient NLG systems.
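
The idea of choosing in-context examples by a simplification metric can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (`compression_ratio`, `select_examples`, `build_prompt`), the character-based definition of compression ratio, the "highest score first" ranking, and the prompt layout are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (complex sentence, simplified reference)


def compression_ratio(pair: Pair) -> float:
    """Character length of the simplification divided by that of the source."""
    source, simplified = pair
    return len(simplified) / max(len(source), 1)


def select_examples(pool: List[Pair], metric: Callable[[Pair], float], k: int) -> List[Pair]:
    """Return the k pairs with the highest metric score (ties keep pool order)."""
    return sorted(pool, key=metric, reverse=True)[:k]


def build_prompt(examples: List[Pair], query: str) -> str:
    """Lay out the selected pairs as few-shot demonstrations (hypothetical format)."""
    shots = "".join(f"Complex: {src}\nSimple: {simp}\n\n" for src, simp in examples)
    return f"{shots}Complex: {query}\nSimple:"


if __name__ == "__main__":
    pool = [
        ("aaaa bbbb cccc", "aaaa"),
        ("abcdef", "abcde"),
        ("xxxxxxxxxx", "xxxxx"),
    ]
    chosen = select_examples(pool, compression_ratio, k=2)
    print(build_prompt(chosen, "A sentence to simplify."))
```

Any other scoring function with the same signature, for instance one that returns a precomputed SARI score per pair, could be passed as `metric` without changing the selection code.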

en cs.CL, cs.AI

arXiv Open Access 2023

A threshold model of plastic waste fragmentation: New insights into the distribution of microplastics in the ocean and its evolution over time

Matthieu George, Frédéric Nallet, Pascale Fabre

Plastic pollution in the aquatic environment has been assessed for many years by ocean waste collection expeditions around the globe or by river sampling. While the total amount of plastic produced worldwide is well documented, the amount of plastic found in the ocean, the distribution of particles on its surface and its evolution over time are still the subject of much debate. In this article, we propose a general fragmentation model, postulating the existence of a critical size below which particle fragmentation becomes extremely unlikely. In the frame of this model, an abundance peak appears for sizes around 1mm, in agreement with real environmental data. Using, in addition, a realistic exponential waste feed to the ocean, we discuss the relative impact of fragmentation and feed rates, and the temporal evolution of microplastics (MP) distribution. New conclusions on the temporal trend of MP pollution are drawn.
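
The effect of a critical size on fragmentation can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes a hard threshold (particles at or below the critical size never break), deterministic splitting into two equal halves, and a one-dimensional "size" that is conserved on splitting. All names and numbers are hypothetical.

```python
from typing import List


def fragment_step(sizes: List[float], critical_size: float) -> List[float]:
    """Split every particle larger than critical_size into two equal halves."""
    result: List[float] = []
    for size in sizes:
        if size > critical_size:
            result.extend([size / 2, size / 2])
        else:
            result.append(size)  # below the threshold: no further breakup
    return result


def fragment(sizes: List[float], critical_size: float, steps: int) -> List[float]:
    """Apply fragment_step repeatedly for a fixed number of steps."""
    for _ in range(steps):
        sizes = fragment_step(sizes, critical_size)
    return sizes


if __name__ == "__main__":
    # One 8-unit object, threshold of 1 unit, more steps than needed.
    final = fragment([8.0], critical_size=1.0, steps=10)
    print(len(final), set(final))
```

Even in this crude version, particles pile up just at or below the threshold instead of shrinking indefinitely, which is the qualitative mechanism behind the abundance peak described in the abstract.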

en cond-mat.soft, cond-mat.mtrl-sci

arXiv Open Access 2023

Unsupervised Lexical Simplification with Context Augmentation

Takashi Wada, Timothy Baldwin, Jey Han Lau

We propose a new unsupervised lexical simplification method that uses only monolingual data and pre-trained language models. Given a target word and its context, our method generates substitutes based on the target context and also additional contexts sampled from monolingual data. We conduct experiments in English, Portuguese, and Spanish on the TSAR-2022 shared task, and show that our model substantially outperforms other unsupervised systems across all languages. We also establish a new state-of-the-art by ensembling our model with GPT-3.5. Lastly, we evaluate our model on the SWORDS lexical substitution data set, achieving a state-of-the-art result.

en cs.CL, cs.AI

arXiv Open Access 2023

Language Models for German Text Simplification: Overcoming Parallel Data Scarcity through Style-specific Pre-training

Miriam Anschütz, Joshua Oehms, Thomas Wimmer et al.

Automatic text simplification systems help to reduce textual information barriers on the internet. However, for languages other than English, only little parallel data exists to train these systems. We propose a two-step approach to overcome this data scarcity issue. First, we fine-tuned language models on a corpus of German Easy Language, a specific style of German. Then, we used these models as decoders in a sequence-to-sequence simplification task. We show that the language models adapt to the style characteristics of Easy Language and output more accessible texts. Moreover, with the style-specific pre-training, we reduced the number of trainable parameters in text simplification models. Hence, less parallel data is sufficient for training. Our results indicate that pre-training on unaligned data can reduce the required parallel data while improving the performance on downstream tasks.

en cs.CL

arXiv Open Access 2023

Gaze-Driven Sentence Simplification for Language Learners: Enhancing Comprehension and Readability

Taichi Higasa, Keitaro Tanaka, Qi Feng et al.

Language learners should regularly engage in reading challenging materials as part of their study routine. Nevertheless, constantly referring to dictionaries is time-consuming and distracting. This paper presents a novel gaze-driven sentence simplification system designed to enhance reading comprehension while keeping learners focused on the content. Our system incorporates machine learning models tailored to individual learners, combining eye gaze features and linguistic features to assess sentence comprehension. When the system identifies comprehension difficulties, it provides simplified versions by replacing complex vocabulary and grammar with simpler alternatives via GPT-3.5. We conducted an experiment with 19 English learners, collecting data on their eye movements while reading English text. The results demonstrated that our system is capable of accurately estimating sentence-level comprehension. Additionally, we found that GPT-3.5 simplification improved readability in terms of traditional readability metrics and individual word difficulty, paraphrasing across different linguistic levels.

en cs.CL, cs.HC

S2 Open Access 2022

Effect of the mixing ratio on the composting of OFMSW digestate: assessment of compost quality

F. Nuñez, M. Pérez, L. F. Leon-Fernandez et al.

This study presents the results obtained in compostability tests of organic fraction of municipal solid waste (OFMSW) digestate. The final aim was to obtain mature compost without phytotoxic effects. For the evaluation of the composting process, a novel parameter describing the performance of the composting process, the relative heat generation standardized with the initial volatile solid content (RHGVS0), was defined and evaluated at laboratory scale. From these laboratory-scale tests, the optimum operational conditions were obtained: a mixing ratio (v/v) of 1:1:0 (bulking agent:digestate:co-substrate) with 15% of mature compost as inoculum. Subsequently, these optimum operational conditions were applied in the active phase of the pilot-scale composting reactor. The active composting stage took 7 days; subsequently, a curing phase of 60 days was carried out at ambient conditions. After 30 days of curing, the mature compost showed a specific oxygen uptake rate (SOUR) of 0.14 mg O2/g VS·h, a germination index (GI) of 99.63% and a low volatile fatty acids (VFA) concentration (41.3 AcH mg/kgdm), indicating good compost stability and maturity. The very good quality of the final compost obtained indicated that the RHGVS0 accurately describes the performance of the composting process.

21 citations en

S2 Open Access 2021

Novel electrochemically driven and internal circulation process for valuable metals recycling from spent lithium-ion batteries.

Shuzhen Li, Xin Wu, Youzhou Jiang et al.

The sustainable recycling of valuable metals from spent lithium-ion batteries (LIBs) is impeded by extensive chemical consumption, tedious separation processes and deficient selectivity. Here, a novel electrochemically driven and internal circulation strategy was developed for the direct and selective recycling of valuable metals from waste LiCoO2 of spent LIBs. Firstly, the waste LiCoO2 can be efficiently dissolved by the acid (H2SO4) generated during electro-deposition of Cu from CuSO4 electrolyte. Then, Co2+ ions in the lixivium can be electrodeposited and recovered as metallic Co with simultaneous regeneration of H2SO4. Circulating leaching results show that the regenerated acid can be reused as leachant without obvious loss of leaching capability. Over 92% Co and 97% Li can be leached, and 100% Cu and 93% Co are recovered in their metallic forms under the optimized experimental conditions. Results of leaching kinetics suggest that the leaching of Co and Li is controlled by internal diffusion with significantly reduced apparent activation energies (Ea) for Li and Co. Finally, Li2CO3 can be recovered from Li+ enriched lixivium after circulating leaching. This recycling process is a simplified route without any input of leachant and reductant, and valuable metals can be selectively recovered in a closed-loop way with high efficiency.

52 citations en Medicine

arXiv Open Access 2022

Automatic Text Simplification of News Articles in the Context of Public Broadcasting

Diego Maupomé, Fanny Rancourt, Thomas Soulas et al.

This report summarizes the work carried out by the authors during the Twelfth Montreal Industrial Problem Solving Workshop, held at Université de Montréal in August 2022. The team tackled a problem submitted by CBC/Radio-Canada on the theme of Automatic Text Simplification (ATS).

en cs.CL, cs.AI

S2 Open Access 2019

Environmental Toxicology and Chemistry

T. Gouin, R. Becker, A.-G. Collot et al.

80 citations en

S2 Open Access 2017

Nitrogen removal from digested slurries using a simplified ammonia stripping technique.

G. Provolo, F. Perazzolo, Gabriele Mattachini et al.

104 citations en Chemistry, Medicine
