Scientific idea generation (SIG) is critical to AI-driven autonomous research, yet existing approaches are often constrained by a static retrieval-then-generation paradigm, leading to homogeneous and insufficiently divergent ideas. In this work, we propose FlowPIE, a tightly coupled retrieval-generation framework that treats literature exploration and idea generation as a co-evolving process. FlowPIE expands literature trajectories via a flow-guided Monte Carlo Tree Search (MCTS) inspired by GFlowNets, using the quality of current ideas assessed by an LLM-based generative reward model (GRM) as a supervised signal to guide adaptive retrieval and construct a diverse, high-quality initial population. Based on this population, FlowPIE models idea generation as a test-time idea evolution process, applying selection, crossover, and mutation with the isolation island paradigm and GRM-based fitness computation to incorporate cross-domain knowledge. It effectively mitigates the information cocoons arising from over-reliance on parametric knowledge and static literature. Extensive evaluations demonstrate that FlowPIE consistently produces ideas with higher novelty, feasibility and diversity compared to strong LLM-based and agent-based frameworks, while enabling reward scaling during test time.
This paper explores how AI-powered tools could be leveraged to streamline the process of identifying, screening, and analyzing relevant literature in academic research. More specifically, we examine the documented relationship between environmental, social, and governance (ESG) factors and the cost of capital (CoC). By applying an AI-assisted workflow, we identified 36 published studies, synthesized their key findings, and highlighted relevant theories, moderators, and methodological challenges. Our analyses demonstrate the value of AI tools in enhancing business research processes and also contribute to the growing literature on the importance of ESG in the field of corporate finance.
Daniil Sherki, Daniil Merkulov, Alexandra Savina, et al.
We present PERELMAN (PipEline foR sciEntific Literature Meta-ANalysis), an agentic framework designed to extract specific information from a large corpus of scientific articles to support large-scale literature reviews and meta-analyses. Our central goal is to reliably transform heterogeneous article content into a unified, machine-readable representation. PERELMAN first elicits domain knowledge (including target variables, inclusion criteria, units, and normalization rules) through a structured dialogue with a subject-matter expert. This domain knowledge is then reused across multiple stages of the pipeline and guides coordinated agents in extracting evidence from narrative text, tables, and figures, enabling consistent aggregation across studies. To assess reproducibility and validate our implementation, we evaluate the system on the task of reproducing the meta-analysis of layered Li-ion cathode properties (NMC811 material). We describe our solution, which has the potential to reduce the time required to prepare meta-analyses from months to minutes.
Hein Htet, Amgad Ahmed Ali Ibrahim, Yutaka Sasaki, et al.
The oxygen reduction reaction (ORR) catalyst plays a critical role in enhancing fuel cell efficiency, making it a key focus in material science research. However, extracting structured information about ORR catalysts from vast scientific literature remains a significant challenge due to the complexity and diversity of textual data. In this study, we propose a named entity recognition (NER) and relation extraction (RE) approach using DyGIE++ with multiple pre-trained BERT variants, including MatSciBERT and PubMedBERT, to extract ORR catalyst-related information from the scientific literature, which is compiled into a fuel cell corpus for materials informatics (FC-CoMIcs). A comprehensive dataset was constructed manually by identifying 12 critical entities and two relationship types between pairs of the entities. Our methodology involves data annotation, integration, and fine-tuning of transformer-based models to enhance information extraction accuracy. We assess the impact of different BERT variants on extraction performance and investigate the effects of annotation consistency. Experimental evaluations demonstrate that the fine-tuned PubMedBERT model achieves the highest NER F1-score of 82.19% and the MatSciBERT model attains the best RE F1-score of 66.10%. Furthermore, the comparison with human annotators highlights the reliability of fine-tuned models for ORR catalyst extraction, demonstrating their potential for scalable and automated literature analysis. The results indicate that domain-specific BERT models outperform general scientific models like BlueBERT for ORR catalyst extraction.
Duarte Sampaio de Almeida, Fernando Brito e Abreu, Inês Boavida-Portugal
Purpose: This systematic literature review (SLR) characterizes the current state of the art on digital twinning (DT) technology in tourism-related applications. We aim to evaluate the types of DTs described in the literature, identifying their purposes, the areas of tourism where they have been proposed, their main components, and possible future directions based on current work. Design/methodology/approach: We conducted this SLR with bibliometric analysis based on an existing, validated methodology. Thirty-four peer-reviewed studies from three major scientific databases were selected for review. They were categorized using a taxonomy that included tourism type, purpose, spatial scale, data sources, data linkage, visualization, and application. Findings: The topic is at an early, evolving stage, as the oldest study found dates back to 2021. Most reviewed studies deal with cultural tourism, focusing on digitising cultural heritage. Destination management is the primary purpose of these DTs, with mainly site-level spatial scales. In many studies, the physical-digital data linkage is unilateral, lacking twin synchronization. In most DTs considered bilateral, the linkage is indirect. There are more applied than theoretical studies, suggesting progress in applying DTs in the field. Finally, there is an extensive research gap regarding DT technology in tourism, which is worth filling. Originality/Value: This paper presents a novel SLR with a bibliometric analysis of DTs' applied and theoretical application in tourism. Each reviewed publication is assessed and characterized, identifying the current state of the topic, possible research gaps, and future directions.
Christian Näther, Daniel Herzinger, Stefan-Lukas Gazdag, et al.
Networks such as the Internet are essential for our connected world. Quantum computing poses a threat to this heterogeneous infrastructure since it threatens fundamental security mechanisms. Therefore, a migration to post-quantum cryptography (PQC) is necessary for networks and their components. At the moment, there is little knowledge on how such migrations should be structured and implemented in practice. Our systematic literature review addresses migration approaches for IP networks towards PQC. It surveys papers about the migration process and exemplary real-world software system migrations. On the process side, we found that terminology, migration steps, and roles are not defined precisely or consistently across the literature. Still, we identified four major phases and appropriate substeps, which we matched with emerging archetypes of roles. In terms of real-world migrations, we see that reports used several different PQC implementations and hybrid solutions for migrations of systems belonging to a wide range of system types. Across all papers we noticed three major challenges for adopters: a lack of experience with PQC and a high realization effort, concerns about the security of the upcoming system, and finally, high complexity. Our findings indicate that recent standardization efforts already push quantum-safe networking forward. However, the literature has not yet reached consensus on definitions and best practices. Implementations are mostly experimental and not necessarily practical, leading to an overall chaotic situation. To better grasp this fast-moving field of (applied) research, our systematic literature review provides a comprehensive overview of its current state and serves as a starting point for delving into the matter of PQC migration.
This systematic literature review explores sustainability assessment frameworks (SAFs) across diverse industries. The review focuses on SAF design approaches including the methods used for Sustainability Indicator (SI) selection, relative importance assessment, and interdependency analysis. Various methods, including literature reviews, stakeholder interviews, questionnaires, Pareto analysis, SMART approach, and adherence to sustainability standards, contribute to the complex SI selection process. Fuzzy-AHP stands out as a robust technique for assessing relative SI importance. While dynamic sustainability and performance indices are essential, methods like DEMATEL, VIKOR, correlation analysis, and causal models for interdependency assessment exhibit static limitations. The review presents strengths and limitations of SAFs, addressing gaps in design approaches and contributing to a comprehensive understanding. The insights of this review aim to benefit policymakers, administrators, leaders, and researchers, fostering sustainability practices. Future research recommendations include exploring multi-criteria decision-making models and hybrid approaches, extending sustainability evaluation across organizational levels and supply chains. Emphasizing adaptability to industry specifics and dynamic global adjustments is proposed for holistic sustainability practices, further enhancing organizational sustainability.
Chemputation is the process of programming chemical robots to do experiments using a universal symbolic language, but the literature can be error-prone and hard to read due to ambiguities. Large Language Models (LLMs) have demonstrated remarkable capabilities in various domains, including natural language processing, robotic control, and more recently, chemistry. Despite significant advancements in standardizing the reporting and collection of synthetic chemistry data, the automatic reproduction of reported syntheses remains a labour-intensive task. In this work, we introduce an LLM-based chemical research agent workflow designed for the automatic validation of synthetic literature procedures. Our workflow can autonomously extract synthetic procedures and analytical data from extensive documents, translate these procedures into universal XDL code, simulate the execution of the procedure in a hardware-specific setup, and ultimately execute the procedure on an XDL-controlled robotic system for synthetic chemistry. This demonstrates the potential of LLM-based workflows for autonomous chemical synthesis with Chemputers. Due to the abstraction of XDL, this approach is safe, secure, and scalable, since hallucinations will not be chemputable and the XDL can be both verified and encrypted. Unlike previous efforts, which either addressed only a limited portion of the workflow, relied on inflexible hard-coded rules, or lacked validation in physical systems, our approach provides four realistic examples of syntheses directly executed from synthetic literature. We anticipate that our workflow will significantly enhance automation in robotically driven synthetic chemistry research, streamline data extraction, and improve the reproducibility, scalability, and safety of synthetic and experimental chemistry.
In recent years, reinforcement learning (RL) systems have shown impressive performance and remarkable achievements. Many achievements can be attributed to combining RL with deep learning. However, those systems lack explainability, which refers to our understanding of the system’s decision-making process. In response to this challenge, the new explainable RL (XRL) field has emerged and grown rapidly to help us understand RL systems. This systematic literature review aims to give a unified view of the field by reviewing ten existing XRL literature reviews and 189 XRL studies from the past five years. Furthermore, we seek to organize these studies into a new taxonomy, discuss each area in detail, and draw connections between methods and stakeholder questions (e.g., “how can I get the agent to do _?”). Finally, we look at the research trends in XRL, recommend XRL methods, and present some exciting research directions for future research. We hope stakeholders, such as RL researchers and practitioners, will utilize this literature review as a comprehensive resource to overview existing state-of-the-art XRL methods. Additionally, we strive to help find research gaps and quickly identify methods that answer stakeholder questions.
The main objective of this study is to conduct a bibliometric analysis of scholarly publications on authorship patterns. The study covers 1723 research papers published in the area of authorship patterns and indexed in the Scopus database from 2013 to 2022. These publications were analysed by year-wise growth, authorship pattern, citation counts, publication type, and most productive publication sources, countries, and institutions. The study shows positive growth of the literature, with a collaborative authorship pattern and good citation status. This original research paper will be helpful to researchers in library and information science, especially those working in the area of bibliometric studies.
Queries with similar information needs tend to have similar document clicks, especially in biomedical literature search engines where queries are generally short and top documents account for most of the total clicks. Motivated by this, we present a novel architecture for biomedical literature search, namely Log-Augmented DEnse Retrieval (LADER), which is a simple plug-in module that augments a dense retriever with the click logs retrieved from similar training queries. Specifically, LADER uses a dense retriever to find both documents and queries similar to the given query. Then, LADER scores relevant (clicked) documents of similar queries weighted by their similarity to the input query. The final document scores by LADER are the average of (1) the document similarity scores from the dense retriever and (2) the aggregated document scores from the click logs of similar queries. Despite its simplicity, LADER achieves new state-of-the-art (SOTA) performance on TripClick, a recently released benchmark for biomedical literature retrieval. On the frequent (HEAD) queries, LADER largely outperforms the best retrieval model by 39% relative NDCG@10 (0.338 vs. 0.243). LADER also achieves better performance on the less frequent (TORSO) queries with 11% relative NDCG@10 improvement over the previous SOTA (0.303 vs. 0.272). On the rare (TAIL) queries where similar queries are scarce, LADER still compares favorably to the previous SOTA method (NDCG@10: 0.310 vs. 0.295). On all queries, LADER can improve the performance of a dense retriever by 24%-37% relative NDCG@10 while not requiring additional training, and further performance improvement is expected from more logs. Our regression analysis has shown that queries that are more frequent and have higher entropy of query similarity and lower entropy of document similarity tend to benefit more from log augmentation.
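The scoring rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of a plain dot product as the dense similarity, the `top_k` cutoff for similar training queries, and the absence of any score normalization are all assumptions made for illustration. Only the overall shape (log scores weighted by query similarity, then averaged with dense scores) comes from the text. A small helper also reproduces the relative NDCG@10 gains quoted above.

```python
from typing import List, Sequence, Set


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product, used here as a stand-in for the dense retriever's similarity."""
    return sum(x * y for x, y in zip(a, b))


def lader_style_scores(
    query: Sequence[float],
    docs: List[Sequence[float]],
    train_queries: List[Sequence[float]],
    clicks: List[Set[int]],
    top_k: int = 1,
) -> List[float]:
    """Hypothetical sketch of log-augmented scoring.

    clicks[i] holds the indices of documents clicked for train_queries[i].
    """
    # (1) Document similarity scores from the dense retriever.
    dense = [dot(query, d) for d in docs]

    # Retrieve the top_k most similar training queries (assumed cutoff).
    q_sims = [dot(query, q) for q in train_queries]
    nearest = sorted(range(len(train_queries)), key=lambda i: q_sims[i], reverse=True)[:top_k]

    # (2) Aggregate clicked documents, weighted by query-to-query similarity.
    log = [0.0] * len(docs)
    for qi in nearest:
        for doc_id in clicks[qi]:
            log[doc_id] += q_sims[qi]

    # Final score: average of the dense score and the log-based score.
    return [(d + l) / 2 for d, l in zip(dense, log)]


def relative_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`."""
    return (new - old) / old


# Toy usage: document 2 is only moderately similar to the query, but it was
# clicked for the most similar training query, so it ends up ranked first.
scores = lader_style_scores(
    query=[1.0, 0.0],
    docs=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    train_queries=[[1.0, 0.0], [0.0, 1.0]],
    clicks=[{2}, {1}],
)
print(scores)
print(round(relative_gain(0.338, 0.243) * 100))  # HEAD queries
print(round(relative_gain(0.303, 0.272) * 100))  # TORSO queries
```

Because the log-based term is simply added in before averaging, the approach needs no extra training of the retriever, which matches the plug-in character the abstract emphasizes; a real system would likely normalize both score types before combining them.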
Open Learning Analytics (OLA) is an emerging research area that aims at improving learning efficiency and effectiveness in lifelong learning environments. OLA employs multiple methods to draw value from a wide range of educational data coming from various learning environments and contexts in order to gain insight into the learning processes of different stakeholders. As the research field is still relatively young, only a few technical platforms are available and a common understanding of requirements is lacking. This paper provides a systematic literature review of tools available in the learning analytics literature from 2011-2019 with an eye on their support for openness. 137 tools from nine academic databases are collected to form the base for this review. The analysis of selected tools is performed based on four dimensions, namely 'Data, Environments, Context (What?)', 'Stakeholders (Who?)', 'Objectives (Why?)', and 'Methods (How?)'. Moreover, five well-known OLA frameworks available in the community are systematically compared. The review concludes by eliciting the main requirements for an effective OLA platform and by identifying key challenges and future lines of work in this emerging field.
In Kristina Lugn's poetry collection Hej då, ha det så bra! (2003), as in the posthumous “Inte alls dåligt” (2022), her own poetics is driven to its extreme. The impropriety of metaphor and the absurdity of literalness are laid bare. According to Lars Elleström, Lugn's poems delineate “the concretely bodily, and thereby also mental, conditions of the human being” (Elleström 2006, 28). In my article, this reasoning applies not only to the bodily metaphors but to the entire literal linguistic guise.
In her dissertation ”Jag är baserad på verkliga personer”. Ironi och röstgivande i Kristina Lugns författarskap, Ann-Helén Andersson states that “in Lugn's texts, the comic can be seen as one element within a dominant ironic tendency” (Andersson 2010, 115). Drawing on Luigi Pirandello's essay L’umorismo (1908) and the concept of sentimento del contrario, I instead argue here that Lugn's literal linguistic guise, the object of my analysis, opens onto a humorous, rather than ironic, effect that can make reality appear slightly more comprehensible and tangible, in all its dreadful absurdity.
Yulia Rodina, Alexandra Bogoyavlenskaya, Natalia Mitrofanova, et al.
The present study aims at obtaining a comprehensive picture of language development in Russian heritage language (RHL) by bringing together evidence from previous investigations focusing on morphosyntax and global accent as well as from a newly conducted analysis of a less-studied domain: lexical development. Our investigation is based on a narrative sample of 143 pre- and primary-school bilinguals acquiring RHL in Norway, Germany, and the United Kingdom. We performed a multiple-way analysis of lexical production in RHL across the different national contexts, across both languages (heritage and societal), also comparing bilinguals and monolinguals. The results revealed a clear and steady increase with age in narrative length and lexical diversity for all bilingual groups in both of their languages. The variation in lexical productivity as well as the differences between the bilingual groups and between bilinguals and monolinguals were attributed to input factors with language exposure in the home and age of starting preschool as the major predictors. We conclude that, overall, the results from lexical, grammatical, and phonological acquisition in RHL support the view that having longer exclusive or uninterrupted exposure to a heritage language in early childhood is beneficial for its development across domains.
The Internet of Vehicles (IoV), commonly referred to as connected automobiles, is a vast network that connects various entities, including users, sensors, and vehicles. These entities connect across a network to lessen traffic accidents and improve both the security and safety of smart vehicles. The Internet of Vehicles is subject to a wide variety of threats, including spoofing attacks, recognition attacks, privacy attacks, and verification attacks. The primary concern when creating any new smart gadget is the user's safety, which will be improved by identifying solutions to the various cyber threats. Therefore, we cover the security of smart automobiles in this literature review, including their attacks and solutions.
This paper investigates cases of compounding in the heritage language American Norwegian (AmNo), where elements from Norwegian and English are mixed word-internally, e.g., hoste-candy ‘cough candy’, where the Norwegian item hoste ‘cough’ is combined with the English item candy. Norwegian and English create compounds in similar ways, but with certain important differences, e.g., the use of linking elements. Based on data from the Corpus of American Nordic Speech, we investigate the encounter of these two languages within one word and find that both Norwegian and English lexical items occur as both left-hand and right-hand members of mixed compounds. Moreover, these mixed compounds are generally accompanied by Norwegian functional items. Hence, we argue that the overall structure of mixed compounds in AmNo is Norwegian, and English lexical items may be inserted into specific positions. This is successfully analyzed in a DM/exoskeletal model of grammar. We show that our results are in line with what we expect based on previous accounts of AmNo language mixing and Norwegian compounds, and our specific focus on compound-internal mixing provides a novel perspective and new insights into both the structure of compounds and the nature of language mixing.
Background: The existing literature indicates that unemployment leads to deteriorated mental and somatic health, poorer self-assessed health, and higher mortality. However, it is not clear whether and to what extent the health consequences of unemployment differ between men and women. According to social role theory, women can alternate between several roles (mother, wife, friend, etc.) that make it easier to deal with unemployment, whereas the worker role is more important for men, and unemployment could therefore be more harmful to them. Thus, gender differences in the health consequences of unemployment should decrease as society grows more gender equal. Accordingly, this study examines changes over time in the gendered health consequences of unemployment in Norway. Methods: Linked Norwegian administrative register data, covering the period from 2000 to 2017, were analysed by means of linear probability models and logistic regression. Four health outcomes were investigated: hospitalisation, receiving sick pay, disability benefit utilisation, and the likelihood of mortality. Two statistical models were estimated: adjusted for (1) age, and (2) additional sociodemographic covariates. All analyses were run split by gender. Three different unemployment cohorts (2000, 2006, and 2011) that experienced similar economic conditions were followed longitudinally until 2017. Results: The empirical findings show, first, that hospital admission is somewhat more common among unemployed males than among unemployed females. Second, receiving sick pay is much more common post-unemployment for men than for women. Third, excess mortality is higher among unemployed males than among unemployed females. Fourth, there is no gender component in disability benefit utilisation. There is a remarkable pattern of similarity when comparing the results for the three different unemployment cohorts (2000; 2006; 2011). Thus, the gendered health consequences of unemployment have hardly changed since the turn of the century. Conclusion: This paper demonstrates that the health consequences of unemployment are serious, gendered, and enduring in Norway.
Email is widely regarded as a confidential channel for exchanging information among individuals and organisations. That confidentiality can no longer be assumed, as attackers send malicious emails to deceive users into disclosing private personal information such as usernames, passwords, and bank card details. In search of a solution to combat phishing attacks, different approaches have been developed. However, the traditional existing solutions have been limited in assisting email users to distinguish phishing emails from legitimate ones. This paper reviews the different email and website phishing solutions used in phishing attack detection. It first provides a literature analysis of different existing phishing mitigation approaches. It then provides a discussion on the limitations of the techniques, before concluding with an exploration into how phishing detection can be improved.
Fernando Kamei, Igor Wiese, Crescencio Lima, et al.
Context: The use of Grey Literature (GL) has recently grown in Software Engineering (SE) research with the increased use of online communication channels by software engineers. However, there is still a limited understanding of how SE research is taking advantage of GL. Objective: This research aimed to understand how SE researchers use GL in their secondary studies. Method: We conducted a tertiary study of studies published between 2011 and 2018 in high-quality software engineering conferences and journals. We then applied qualitative and quantitative analysis to investigate 446 potential studies. Results: From the 446 selected studies, 126 studies cited GL but only 95 of those used GL to answer a specific research question, representing almost 21% of all the 446 secondary studies. Interestingly, we identified that few studies employed specific search mechanisms and used additional criteria for assessing GL. Moreover, by the time we conducted this research, 49% of the GL URLs were no longer working. Based on our findings, we discuss some challenges in using GL and potential mitigation plans. Conclusion: In this paper, we summarized the last 10 years of software engineering research that uses GL, showing that GL has been essential for bringing practical new perspectives that are scarce in traditional literature. By drawing the current landscape of use, we also raise some awareness of related challenges (and strategies to deal with them).