Oral History in Motion (For 30 Years)
Igor Lemos Moreira
This review analyzes the collection O desafio do diálogo: reflexões sobre história oral nos 30 anos da ABHO, edited by Marieta de Moraes Ferreira and Ricardo Santhiago. The work brings together sixteen articles that map the trajectory of Oral History in Brazil, addressing its theoretical, methodological, and thematic foundations. The book highlights the centrality of dialogue and collaboration in historiographical practice, showing the role of the Associação Brasileira de História Oral (ABHO) as a structuring axis of the field. The collection presents itself as a critical assessment of the area, as well as an invitation to reflection and engagement for new researchers.
Epistemology. Theory of knowledge, History (General)
Comparative Analysis of Neural Retriever-Reranker Pipelines for Retrieval-Augmented Generation over Knowledge Graphs in E-commerce Applications
Teri Rumble, Zbyněk Gazdík, Javad Zarrin
et al.
Recent advancements in Large Language Models (LLMs) have transformed Natural Language Processing (NLP), enabling complex information retrieval and generation tasks. Retrieval-Augmented Generation (RAG) has emerged as a key innovation, enhancing factual accuracy and contextual grounding by integrating external knowledge sources with generative models. Although RAG demonstrates strong performance on unstructured text, its application to structured knowledge graphs presents challenges: scaling retrieval across connected graphs and preserving contextual relationships during response generation. Cross-encoders refine retrieval precision, yet their integration with structured data remains underexplored. Addressing these challenges is crucial for developing domain-specific assistants that operate in production environments. This study presents the design and comparative evaluation of multiple Retriever-Reranker pipelines for knowledge graph natural language queries in e-Commerce contexts. Using the STaRK Semi-structured Knowledge Base (SKB), a production-scale e-Commerce dataset, we evaluate multiple RAG pipeline configurations optimized for language queries. Experimental results demonstrate substantial improvements over published benchmarks, achieving 20.4% higher Hit@1 and 14.5% higher Mean Reciprocal Rank (MRR). These findings establish a practical framework for integrating domain-specific SKBs into generative systems. Our contributions provide actionable insights for the deployment of production-ready RAG systems, with implications that extend beyond e-Commerce to other domains that require information retrieval from structured knowledge bases.
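The two ranking metrics reported in this abstract, Hit@1 and Mean Reciprocal Rank (MRR), can be illustrated with a minimal sketch. The function names and the toy product identifiers below are hypothetical and are not taken from the paper or from the STaRK benchmark tooling; the sketch only shows how the metrics are conventionally computed over ranked retrieval results.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple


def hit_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int = 1) -> float:
    """Return 1.0 if any of the top-k ranked items is relevant, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/position of the first relevant item (1-indexed), or 0.0 if none."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def evaluate(runs: List[Tuple[Sequence[Hashable], Set[Hashable]]]) -> Dict[str, float]:
    """Average Hit@1 and reciprocal rank over (ranked list, relevant set) pairs."""
    n = len(runs)
    return {
        "hit@1": sum(hit_at_k(r, g, 1) for r, g in runs) / n,
        "mrr": sum(reciprocal_rank(r, g) for r, g in runs) / n,
    }


# Hypothetical toy queries: (ranked product ids, set of relevant ids).
toy_runs = [
    (["p3", "p1", "p7"], {"p3"}),  # relevant item at rank 1
    (["p2", "p9", "p4"], {"p4"}),  # relevant item at rank 3
    (["p5", "p6", "p8"], {"p0"}),  # relevant item not retrieved
]
scores = evaluate(toy_runs)
```

In a retriever-reranker setup, the same functions would typically be applied once to the retriever's ranking and once to the reranked list, so that the effect of the reranking stage can be read off directly.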
COVID-19 vaccines, sexual reproductive health and rights: Negotiating sensitive terrain in Zimbabwe
Molly Manyonganise
The COVID-19 period caused a great deal of suffering globally, as millions lost their lives while others went through the pain of being infected. The introduction of vaccines to minimise the chances of infection and death was a welcome development. However, it was also fraught with its own challenges in the area of the sexual health and rights of both women and men. Scholarship on gender and religion noted how women failed to access contraception in a period in which sexual activity had increased, as most couples were together for long periods of time. The introduction of vaccines was accompanied by a lot of misinformation. Lack of clarity on the effect of the vaccines on pregnant and lactating mothers caused considerable anxiety. This was exacerbated by information circulating on social media platforms that the vaccines would interfere with individuals’ reproductive capacity. Yet African religio-cultural beliefs and practices place great importance on both women’s and men’s ability to have children. In fact, one’s respectability in African indigenous societies is closely linked to one’s ability to have children. This article examines the fears some Zimbabweans had about accepting COVID-19 vaccines, establishing how these fears were tied to issues of sexual reproductive health and rights (SRHR). The article shows how the terrain of sexual health and rights is a sensitive one that called for caution in a COVID-19 context in Zimbabwe. Data for the article were gathered through informal interviews and social media platforms.
Contribution: The article makes a significant contribution to understanding how COVID-19 interfaced with SRHR issues in Zimbabwe.
Epistemology. Theory of knowledge
Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs
Alexander Scarlatos, Ryan S. Baker, Andrew Lan
Recent advances in large language models (LLMs) have led to the development of artificial intelligence (AI)-powered tutoring chatbots, showing promise in providing broad access to high-quality personalized education. Existing works have studied how to make LLMs follow tutoring principles, but have not studied broader uses of LLMs for supporting tutoring. Up until now, tracing student knowledge and analyzing misconceptions has been difficult and time-consuming to implement for open-ended dialogue tutoring. In this work, we investigate whether LLMs can be supportive of this task: we first use LLM prompting methods to identify the knowledge components/skills involved in each dialogue turn, i.e., a tutor utterance posing a task or a student utterance that responds to it. We also evaluate whether the student responds correctly to the tutor and verify the LLM's accuracy using human expert annotations. We then apply a range of knowledge tracing (KT) methods on the resulting labeled data to track student knowledge levels over an entire dialogue. We conduct experiments on two tutoring dialogue datasets, and show that a novel yet simple LLM-based method, LLMKT, significantly outperforms existing KT methods in predicting student response correctness in dialogues. We perform extensive qualitative analyses to highlight the challenges in dialogueKT and outline multiple avenues for future work.
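To make the idea of tracking student knowledge over labeled dialogue turns concrete, here is a minimal sketch of Bayesian Knowledge Tracing (BKT), a classic KT model. It is chosen purely for illustration: the abstract does not name the KT methods it applies, and this is not the LLMKT method. The parameter values and the skill names are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class BKTParams:
    # Illustrative values only; in practice these are fitted per skill.
    p_init: float = 0.2    # probability the skill is known before any turn
    p_learn: float = 0.15  # probability of learning the skill after a turn
    p_slip: float = 0.1    # probability of answering wrong despite knowing
    p_guess: float = 0.2   # probability of answering right without knowing


def predict_correct(p_known: float, params: BKTParams) -> float:
    """Probability that the next response on this skill is correct."""
    return p_known * (1 - params.p_slip) + (1 - p_known) * params.p_guess


def update(p_known: float, correct: bool, params: BKTParams) -> float:
    """Bayes update on the observed response, then apply the learning step."""
    if correct:
        num = p_known * (1 - params.p_slip)
        den = num + (1 - p_known) * params.p_guess
    else:
        num = p_known * params.p_slip
        den = num + (1 - p_known) * (1 - params.p_guess)
    posterior = num / den
    return posterior + (1 - posterior) * params.p_learn


def trace(turns: List[Tuple[str, bool]], params: BKTParams) -> Tuple[List[float], Dict[str, float]]:
    """turns: (skill label, response correct?) per dialogue turn.

    Returns the predicted correctness before each turn and the final
    per-skill mastery estimates.
    """
    mastery: Dict[str, float] = {}
    predictions: List[float] = []
    for skill, correct in turns:
        p = mastery.get(skill, params.p_init)
        predictions.append(predict_correct(p, params))
        mastery[skill] = update(p, correct, params)
    return predictions, mastery


# Hypothetical labeled turns, as an LLM annotator might produce them.
preds, final_mastery = trace(
    [("fractions", True), ("fractions", True), ("decimals", False)],
    BKTParams(),
)
```

The per-turn skill labels and correctness flags are exactly the kind of annotations the abstract describes obtaining through LLM prompting; any KT model that consumes such pairs could be substituted for BKT here.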
Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models
Tinghui Zhu, Qin Liu, Fei Wang
et al.
Large Vision-Language Models (LVLMs) have demonstrated impressive capabilities for capturing and reasoning over multimodal inputs. However, these models are prone to parametric knowledge conflicts, which arise from inconsistencies of represented knowledge between their vision and language components. In this paper, we formally define the problem of cross-modality parametric knowledge conflict and present a systematic approach to detect, interpret, and mitigate such conflicts. We introduce a pipeline that identifies conflicts between visual and textual answers, showing a persistently high conflict rate across modalities in recent LVLMs regardless of the model size. We further investigate how these conflicts interfere with the inference process and propose a contrastive metric to discern the conflicting samples from the others. Building on these insights, we develop a novel dynamic contrastive decoding method that removes undesirable logits inferred from the less confident modality components based on answer confidence. For models that do not provide logits, we also introduce two prompt-based strategies to mitigate the conflicts. Our methods achieve promising improvements in accuracy on both the ViQuAE and InfoSeek datasets. Specifically, using LLaVA-34B, our proposed dynamic contrastive decoding improves average accuracy by 2.24%.
Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Utilization
Miyoung Ko, Sue Hyun Park, Joonsuk Park
et al.
Despite the advances in large language models (LLMs), how they use their knowledge for reasoning is not yet well understood. In this study, we propose a method that deconstructs complex real-world questions into a graph, representing each question as a node with predecessors of background knowledge needed to solve the question. We develop the DepthQA dataset, deconstructing questions into three depths: (i) recalling conceptual knowledge, (ii) applying procedural knowledge, and (iii) analyzing strategic knowledge. Based on a hierarchical graph, we quantify forward discrepancy, a discrepancy in LLM performance on simpler sub-problems versus complex questions. We also measure backward discrepancy where LLMs answer complex questions but struggle with simpler ones. Our analysis shows that smaller models exhibit more discrepancies than larger models. Distinct patterns of discrepancies are observed across model capacity and possibility of training data memorization. Additionally, guiding models from simpler to complex questions through multi-turn interactions improves performance across model sizes, highlighting the importance of structured intermediate steps in knowledge reasoning. This work enhances our understanding of LLM reasoning and suggests ways to improve their problem-solving abilities.
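The forward and backward discrepancies described in this abstract can be sketched over a small question graph. This is a simplified reading under stated assumptions: correctness is treated as binary (the paper may use graded scores), and the node names and the `predecessors` mapping are hypothetical.

```python
from typing import Dict, List, Tuple


def discrepancies(
    predecessors: Dict[str, List[str]],
    correct: Dict[str, bool],
) -> Tuple[List[str], List[str]]:
    """Classify each non-leaf question by comparing it with its predecessors.

    forward:  every simpler predecessor was answered correctly,
              but the more complex question was not.
    backward: the complex question was answered correctly,
              but at least one simpler predecessor was not.
    """
    forward: List[str] = []
    backward: List[str] = []
    for node, preds in predecessors.items():
        if not preds:
            continue  # depth-1 questions have no simpler sub-problems
        all_preds_correct = all(correct[p] for p in preds)
        if all_preds_correct and not correct[node]:
            forward.append(node)
        elif correct[node] and not all_preds_correct:
            backward.append(node)
    return forward, backward


# Hypothetical three-depth graph: d1 = conceptual, d2 = procedural, d3 = strategic.
toy_predecessors = {
    "d1_a": [],
    "d1_b": [],
    "d2_a": ["d1_a", "d1_b"],
    "d3_a": ["d2_a"],
}
toy_correct = {"d1_a": True, "d1_b": True, "d2_a": False, "d3_a": True}
forward_nodes, backward_nodes = discrepancies(toy_predecessors, toy_correct)
```

In this toy case the depth-2 question fails although both of its conceptual predecessors succeed (a forward discrepancy), while the depth-3 question succeeds although its depth-2 predecessor fails (a backward discrepancy).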
Solving Decision Theory Problems with Probabilistic Answer Set Programming
Damiano Azzolini, Elena Bellodi, Rafael Kiesel
et al.
Solving a decision theory problem usually involves finding the actions, among a set of possible ones, which optimize the expected reward, possibly accounting for the uncertainty of the environment. In this paper, we introduce the possibility of encoding decision theory problems with Probabilistic Answer Set Programming under the credal semantics via decision atoms and utility attributes. To solve the task we propose an algorithm based on three layers of Algebraic Model Counting, which we test on several synthetic datasets against an algorithm that adopts answer set enumeration. Empirical results show that our algorithm can manage non-trivial instances of programs in a reasonable amount of time. Under consideration in Theory and Practice of Logic Programming (TPLP).
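The underlying task, choosing the actions that maximize expected reward under uncertainty, can be sketched with brute-force enumeration. This is only an illustration of the problem statement: it assumes independent probabilistic facts with point probabilities, so it does not implement the credal semantics, Probabilistic Answer Set Programming, or the Algebraic Model Counting algorithm from the paper. All names and the umbrella example are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Assignment = Dict[str, bool]


def best_strategy(
    decisions: List[str],
    prob_facts: Dict[str, float],
    utility: Callable[[Assignment, Assignment], float],
) -> Tuple[Assignment, float]:
    """Enumerate every truth assignment to the decision atoms and return the
    one with the highest expected utility over all worlds."""
    best: Tuple[Assignment, float] = ({}, float("-inf"))
    for choice in product([False, True], repeat=len(decisions)):
        strategy = dict(zip(decisions, choice))
        expected = 0.0
        # Sum utility over all worlds, weighted by each world's probability.
        for values in product([False, True], repeat=len(prob_facts)):
            world = dict(zip(prob_facts, values))
            weight = 1.0
            for fact, value in world.items():
                weight *= prob_facts[fact] if value else 1.0 - prob_facts[fact]
            expected += weight * utility(strategy, world)
        if expected > best[1]:
            best = (strategy, expected)
    return best


def toy_utility(strategy: Assignment, world: Assignment) -> float:
    """Hypothetical rewards: carrying an umbrella costs 2, getting wet costs 10."""
    if strategy["umbrella"]:
        return -2.0
    return -10.0 if world["rain"] else 0.0


chosen, value = best_strategy(["umbrella"], {"rain": 0.3}, toy_utility)
```

Enumeration is exponential in both the number of decision atoms and the number of probabilistic facts, which is the kind of cost that motivates counting-based algorithms such as the one the abstract proposes.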
KaPQA: Knowledge-Augmented Product Question-Answering
Swetha Eppalapally, Daksh Dangi, Chaithra Bhat
et al.
Question-answering for domain-specific applications has recently attracted much interest due to the latest advancements in large language models (LLMs). However, accurately assessing the performance of these applications remains a challenge, mainly due to the lack of suitable benchmarks that effectively simulate real-world scenarios. To address this challenge, we introduce two product question-answering (QA) datasets focused on Adobe Acrobat and Photoshop products to help evaluate the performance of existing models on domain-specific product QA tasks. Additionally, we propose a novel knowledge-driven RAG-QA framework to enhance the performance of the models in the product QA task. Our experiments demonstrated that inducing domain knowledge through query reformulation allowed for increased retrieval and generative performance when compared to standard RAG-QA methods. This improvement, however, is slight, and thus illustrates the challenge posed by the datasets introduced.
IERL: Interpretable Ensemble Representation Learning -- Combining CrowdSourced Knowledge and Distributed Semantic Representations
Yuxin Zi, Kaushik Roy, Vignesh Narayanan
et al.
Large Language Models (LLMs) encode meanings of words in the form of distributed semantics. Distributed semantics capture common statistical patterns among language tokens (words, phrases, and sentences) from large amounts of data. LLMs perform exceedingly well across General Language Understanding Evaluation (GLUE) tasks designed to test a model's understanding of the meanings of the input tokens. However, recent studies have shown that LLMs tend to generate unintended, inconsistent, or wrong texts as outputs when processing inputs that were seen rarely during training, or inputs that are associated with diverse contexts (e.g., the well-known hallucination phenomenon in language generation tasks). Crowdsourced and expert-curated knowledge graphs such as ConceptNet are designed to capture the meaning of words from a compact set of well-defined contexts. Thus, LLMs may benefit from leveraging such knowledge contexts to reduce inconsistencies in outputs. We propose a novel ensemble learning method, Interpretable Ensemble Representation Learning (IERL), that systematically combines LLM and crowdsourced knowledge representations of input tokens. IERL has the distinct advantage of being interpretable by design (when was the LLM context used vs. when was the knowledge context used?) over state-of-the-art (SOTA) methods, allowing scrutiny of the inputs in conjunction with the parameters of the model and facilitating the analysis of models' inconsistent or irrelevant outputs. Although IERL is agnostic to the choice of LLM and crowdsourced knowledge, we demonstrate our approach using BERT and ConceptNet. We report improved or competitive results with IERL across GLUE tasks over current SOTA methods and significantly enhanced model interpretability.
Building Open Knowledge Graph for Metal-Organic Frameworks (MOF-KG): Challenges and Case Studies
Yuan An, Jane Greenberg, Xintong Zhao
et al.
Metal-Organic Frameworks (MOFs) are a class of modular, porous crystalline materials that have great potential to revolutionize applications such as gas storage, molecular separations, chemical sensing, catalysis, and drug delivery. The Cambridge Structural Database (CSD) reports 10,636 synthesized MOF crystals and, in addition, contains ca. 114,373 MOF-like structures. The sheer number of synthesized (plus potentially synthesizable) MOF structures requires researchers to pursue computational techniques to screen and isolate MOF candidates. In this demo paper, we describe our effort to leverage knowledge graph methods to facilitate MOF prediction, discovery, and synthesis. We present challenges and case studies about (1) construction of a MOF knowledge graph (MOF-KG) from structured and unstructured sources and (2) leveraging the MOF-KG for discovery of new or missing knowledge.
Knowledge Informed Machine Learning using a Weibull-based Loss Function
Tim von Hahn, Chris K Mechefske
Machine learning can be enhanced through the integration of external knowledge. This method, called knowledge informed machine learning, is also applicable within the field of Prognostics and Health Management (PHM). In this paper, the various methods of knowledge informed machine learning, from a PHM context, are reviewed with the goal of helping the reader understand the domain. In addition, a knowledge informed machine learning technique is demonstrated, using the common IMS and PRONOSTIA bearing data sets, for remaining useful life (RUL) prediction. Specifically, knowledge is garnered from the field of reliability engineering and is represented through the Weibull distribution. The knowledge is then integrated into a neural network through a novel Weibull-based loss function. A thorough statistical analysis of the Weibull-based loss function is conducted, demonstrating the effectiveness of the method on the PRONOSTIA data set. However, the Weibull-based loss function is less effective on the IMS data set. The results, shortcomings, and benefits of the approach are discussed at length. Finally, all the code is publicly available for the benefit of other researchers.
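As a rough illustration of how the Weibull distribution could enter a loss function, the sketch below compares the Weibull cumulative distribution function (CDF) evaluated at the true and predicted RUL values. The abstract does not give the exact form of the loss, so `weibull_mse`, the clamping of negative predictions, and the parameter values are assumptions made here, not the authors' definition. Only the CDF formula, F(t) = 1 - exp(-(t/eta)^beta), is standard.

```python
import math
from typing import Sequence


def weibull_cdf(t: float, eta: float, beta: float) -> float:
    """Two-parameter Weibull CDF with scale eta > 0 and shape beta > 0, for t >= 0."""
    return 1.0 - math.exp(-((t / eta) ** beta))


def weibull_mse(
    true_rul: Sequence[float],
    pred_rul: Sequence[float],
    eta: float,
    beta: float,
) -> float:
    """Hypothetical loss: mean squared difference between the Weibull CDF at
    the true RUL and at the predicted RUL. Negative predictions are clamped
    to zero so that the CDF stays defined."""
    errors = [
        (weibull_cdf(y, eta, beta) - weibull_cdf(max(p, 0.0), eta, beta)) ** 2
        for y, p in zip(true_rul, pred_rul)
    ]
    return sum(errors) / len(errors)


# Illustrative values only: scale of 100 time units, shape of 2.
perfect = weibull_mse([10.0, 50.0], [10.0, 50.0], eta=100.0, beta=2.0)
mismatch = weibull_mse([10.0, 50.0], [40.0, 90.0], eta=100.0, beta=2.0)
```

Mapping both values through the CDF weights prediction errors by where they fall on the assumed failure distribution, which is one way domain knowledge from reliability engineering can shape training. In a neural network the same expression would be written with the framework's differentiable operations.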
World Knowledge in Multiple Choice Reading Comprehension
Adian Liusie, Vatsal Raina, Mark Gales
Recently it has been shown that without any access to the contextual passage, multiple choice reading comprehension (MCRC) systems are able to answer questions significantly better than random on average. These systems use their accumulated "world knowledge" to directly answer questions, rather than using information from the passage. This paper examines the possibility of exploiting this observation as a tool for test designers to ensure that the use of "world knowledge" is acceptable for a particular set of questions. We propose information-theoretic metrics that enable the level of "world knowledge" exploited by systems to be assessed. Two metrics are described: the expected number of options, which measures whether a passage-free system can identify the answer to a question using world knowledge; and the contextual mutual information, which measures the importance of context for a given question. We demonstrate that questions with a low expected number of options, and hence answerable by the shortcut system, are often similarly answerable by humans without context. This highlights that the general knowledge 'shortcuts' could be equally used by exam candidates, and that our proposed metrics may be helpful for future test designers to monitor the quality of questions.
Dynamic Relation Repairing for Knowledge Enhancement
Rui Kang, Hongzhi Wang
Dynamic relation repair aims to efficiently validate and repair the instances for knowledge graph enhancement (KGE), where KGE captures missing relations from unstructured data and introduces noisy facts into the knowledge graph. With the growth of unstructured data, an online approach is needed to clean the new RDF tuples before adding them to the knowledge base. To clean the noisy RDF tuples, graph constraint processing is a common but intractable approach. In addition, when new tuples are added to the knowledge graph, new graph patterns are created, and the explicit discovery of graph constraints is also intractable. Therefore, although dynamic relation repair is unfortunately hard, it is a necessary approach for enhancing knowledge graphs effectively under fast-growing unstructured data. Motivated by this, we establish a dynamic repairing and enhancing structure to analyze its hardness on basic operations. To ensure dynamic repair and validation, we introduce implicit graph constraints, approximate graph matching, and linkage prediction based on localized graph patterns. To validate and repair the RDF tuples efficiently, we further study the cold start problems for graph constraint processing. Experimental results on real datasets demonstrate that our proposed approach can capture and repair instances with wrong relation labels dynamically and effectively.
Query-Specific Knowledge Graphs for Complex Finance Topics
Iain Mackie, Jeffrey Dalton
Across the financial domain, researchers answer complex questions by extensively "searching" for relevant information to generate long-form reports. This workshop paper discusses automating the construction of query-specific document and entity knowledge graphs (KGs) for complex research topics. We focus on the CODEC dataset, where domain experts (1) create challenging questions, (2) construct long natural language narratives, and (3) iteratively search and assess the relevance of documents and entities. For the construction of query-specific KGs, we show that state-of-the-art ranking systems have headroom for improvement, with specific failings due to a lack of context or explicit knowledge representation. We demonstrate that entity and document relevance are positively correlated, and that entity-based query feedback improves document ranking effectiveness. Furthermore, we construct query-specific KGs using retrieval and evaluate using CODEC's "ground-truth graphs", showing the precision and recall trade-offs. Lastly, we point to future work, including adaptive KG retrieval algorithms and GNN-based weighting methods, while highlighting key challenges such as high-quality data, information extraction recall, and the size and sparsity of complex topic graphs.
Do it Like the Doctor: How We Can Design a Model That Uses Domain Knowledge to Diagnose Pneumothorax
Glen Smith, Qiao Zhang, Christopher MacLellan
Computer-aided diagnosis for medical imaging is a well-studied field that aims to provide real-time decision support systems for physicians. These systems attempt to detect and diagnose a plethora of medical conditions across a variety of image diagnostic technologies including ultrasound, x-ray, MRI, and CT. When designing AI models for these systems, we are often limited by little training data, and for rare medical conditions, positive examples are difficult to obtain. These issues often cause models to perform poorly, so we needed a way to design an AI model in light of these limitations. Thus, our approach was to incorporate expert domain knowledge into the design of an AI model. We conducted two qualitative think-aloud studies with doctors trained in the interpretation of lung ultrasound diagnosis to extract relevant domain knowledge for the condition Pneumothorax. We extracted knowledge of key features and procedures used to make a diagnosis. With this knowledge, we employed knowledge engineering concepts to make recommendations for an AI model design to automatically diagnose Pneumothorax.
Enhanced Knowledge Selection for Grounded Dialogues via Document Semantic Graphs
Sha Li, Mahdi Namazifar, Di Jin
et al.
Providing conversation models with background knowledge has been shown to make open-domain dialogues more informative and engaging. Existing models treat knowledge selection as a sentence ranking or classification problem where each sentence is handled individually, ignoring the internal semantic connection among sentences in the background document. In this work, we propose to automatically convert the background knowledge documents into document semantic graphs and then perform knowledge selection over such graphs. Our document semantic graphs preserve sentence-level information through the use of sentence nodes and provide concept connections between sentences. We jointly apply multi-task learning for sentence-level and concept-level knowledge selection and show that it improves sentence-level selection. Our experiments show that our semantic graph-based knowledge selection improves over sentence selection baselines for both the knowledge selection task and the end-to-end response generation task on HollE and improves generalization on unseen topics in WoW.
Vincenzo Consolo, the "prehistory" of a Sicilian writer
Antonio Catalfamo
Vincenzo Consolo died on 21 January 2012 in Milan, before his Opera completa saw the light of day in Mondadori's prestigious Meridiani series. As a result, in our view, it was published with a certain carelessness, concerning in particular some biographical data, with a few blunders too many; these did not remain isolated but spilled over into judgments on the Sicilian writer's work that are misleading, to say the least. As further confirmation of the basic error of the structuralist theses, which took hold in Italy in an "extremist" version according to which literary texts should be analyzed in their "self-sufficiency" and "self-referentiality", entirely disregarding the "contexts" (historical-political, socio-economic, ideological, cultural) in which they were conceived, Consolo's work is closely tied to Sicilian reality. That reality strongly conditioned his life, even after he moved away from his native island, as well as all of his literary activity, which is utterly incomprehensible if one disregards this "context", seen in all its facets: historical-political, socio-economic, ideological-cultural, and specifically literary.
Computational linguistics. Natural language processing, Epistemology. Theory of knowledge
Toward a marriage between neuro-cognitivism and computational criticism: the example of gender
Stefano Calabrese
Over the past twenty years, computational criticism has proved to be a model of distant reading well suited to the search for morphological constants and thematic archetypes: with Franco Moretti it has even nurtured the ambition, applying itself to Aby Warburg's Pathosformeln, of finding the algorithm of the representation of emotions. In the same years, in entirely separate laboratories of the scientific community, scholars of neuro-cognitivism and the neurosciences have sought out and photographed, through magnetic resonance imaging, perceptual constants and the way the brain processes reality, with the only variants being those introduced by historical and environmental contexts. This contribution proposes a methodological alliance between computational criticism and the neurosciences, in order to make research into how the hereditary transmission of information proceeds ever more refined and evidentially sound. In particular, it dwells on the studies conducted by both sides on the effects of gender in relation to the reading of reality and the production of fictional worlds, with examples concerning the neuro-cognitive evolution of Euro-North American populations between the nineteenth and twentieth centuries. Awareness of the role of the new computational tools offered by modern technology, capable of processing massive quantities of data at an exponential speed incomparable to the past, is joined here by the certainty that the neurosciences are making a fundamental contribution to the study of the humanities as well.
Computational linguistics. Natural language processing, Epistemology. Theory of knowledge
Semantic TrueLearn: Using Semantic Knowledge Graphs in Recommendation Systems
Sahan Bulathwela, María Pérez-Ortiz, Emine Yilmaz
et al.
In informational recommenders, many challenges arise from the need to handle the semantic and hierarchical structure between knowledge areas. This work aims to advance towards building a state-aware educational recommendation system that incorporates semantic relatedness between knowledge topics, propagating latent information across semantically related topics. We introduce a novel learner model that exploits this semantic relatedness between knowledge components in learning resources using the Wikipedia link graph, with the aim of better predicting learner engagement and latent knowledge in a lifelong learning scenario. In this sense, Semantic TrueLearn builds a humanly intuitive knowledge representation while leveraging Bayesian machine learning to improve the predictive performance of educational engagement. Our experiments with a large dataset demonstrate that this new semantic version of the TrueLearn algorithm achieves statistically significant improvements in predictive performance with a simple extension that adds semantic awareness to the model.
Features of Epidemic Process of Influenza and its Etiology in the Countries of the Northern and Southern Hemispheres in the Period of Circulation of Pandemic Virus A(H1N1)pdm09 (According to WHO)
L. S. Karpova, M. Yu. Pelikh, N. M. Popovtseva
et al.
Relevance. Influenza is characterized by global distribution and by differences in its seasonality between countries with temperate and tropical climates. The importance of studying the antigenic variation of influenza viruses is due to the fact that changes in antigenic structure are an evolutionary mechanism by which the virus adapts to ensure its survival and cause annual epidemics. Aims. The aim of this study was to identify the peculiarities of the geographical spread of seasonal influenza, its etiology, and the rate of antigenic variability of influenza A and B viruses. Materials and methods. Based on data from WHO reference research centers, information was collected on circulating influenza virus strains: A(H3N2) from 1975, A(H1N1)pdm09 from 1977, and type B of the Yamagata and Victoria lineages from 1987 to 2019, as well as data on the number of all identified influenza viruses and individual strains circulating in the Northern and Southern Hemispheres from 2008 to 2018. Results and discussion. Analysis of the global spread of influenza, its etiology, and the antigenic variability of viruses, according to WHO, showed that the influenza A(H1N1)pdm09 virus was the main causative agent of epidemics and regional outbreaks in seasons of high influenza activity in all countries except the United States and Canada, where influenza A(H3N2) and B viruses dominated. In countries with pronounced seasonality, the change of season led to a change in the etiology of influenza, whereas in tropical countries the A(H1N1)pdm09 virus more often remained dominant in all seasons of the year. Conclusions. The pronounced seasonality of influenza in northern countries and its absence in tropical countries, where regional outbreaks prevailed in all seasons of the year, were confirmed. Low antigenic variability was confirmed for influenza A(H1N1)pdm09 strains, and the highest for A(H3N2). Among influenza B strains, the Victoria lineage had less antigenic variability, because the duration of its circulation before the appearance of a new drift variant was longer than that of the Yamagata lineage. A tendency is shown toward an increase in the total duration of circulation of influenza B/Victoria, A(H1N1)pdm09, and B/Yamagata viruses, due to longer circulation before the emergence of new drift variants.
Epistemology. Theory of knowledge