Younghee Sheen
Results for "Language acquisition"
Showing 20 of ~5476888 results · from arXiv, DOAJ, Semantic Scholar, CrossRef
Bambi B. Schieffelin, Elinor Ochs
G. Marcus, S. Pinker, M. Ullman et al.
R. Hawkins, Cecilia Yuet Hung Chan
Suresh Canagarajah
Z. Dörnyei
Donna Lardiere
Aleksandra Krasnodębska, Karolina Seweryn, Szymon Łukasik et al.
Despite increasing efforts to ensure the safety of large language models (LLMs), most existing safety assessments and moderation tools remain heavily biased toward English and other high-resource languages, leaving the majority of the world's languages underexamined. To address this gap, we introduce a manually annotated benchmark dataset for language model safety classification in Polish. We also create adversarially perturbed variants of these samples designed to challenge model robustness. We conduct a series of experiments to evaluate LLM-based and classifier-based models of varying sizes and architectures. Specifically, we fine-tune three models: Llama-Guard-3-8B, a HerBERT-based classifier (a Polish BERT derivative), and PLLuM, a Polish-adapted Llama-8B model. We train these models using different combinations of annotated data and evaluate their performance, comparing it against publicly available guard models. Results demonstrate that the HerBERT-based classifier achieves the highest overall performance, particularly under adversarial conditions.
Michael Cooper, Rohan Wadhawan, John Michael Giorgi et al.
Decision-makers often possess insufficient information to render a confident decision. In such cases, the decision-maker can undertake actions to acquire the necessary information about the problem at hand, e.g., by consulting knowledgeable authorities or by conducting experiments. Importantly, different levers of information acquisition come with different costs, posing the challenge of selecting the actions that are both informative and cost-effective. In this work, we propose CuriosiTree, a heuristic-based, test-time policy for zero-shot information acquisition in large language models (LLMs). CuriosiTree employs a greedy tree search to estimate the expected information gain of each action and strategically chooses actions based on a balance of anticipated information gain and associated cost. Empirical validation in a clinical diagnosis simulation shows that CuriosiTree enables cost-effective integration of heterogeneous sources of information, and outperforms baseline action selection strategies in selecting action sequences that enable accurate diagnosis.
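The gain-versus-cost trade-off described in this abstract can be illustrated with a toy greedy step. The action names, costs, and posterior beliefs below are hypothetical stand-ins for the paper's LLM-estimated quantities; this is a minimal sketch of the idea, not the CuriosiTree algorithm itself.

```python
from math import log2

def entropy(probs):
    """Shannon entropy (bits) of a belief distribution over diagnoses."""
    return -sum(p * log2(p) for p in probs if p > 0)

def expected_info_gain(belief, action):
    """Entropy reduction if the action updates the belief to its posterior."""
    return entropy(belief) - entropy(action["posterior"])

def choose_action(belief, actions, cost_weight=0.1):
    """Greedy step: pick the action maximizing gain minus weighted cost."""
    return max(actions,
               key=lambda a: expected_info_gain(belief, a) - cost_weight * a["cost"])

belief = [0.5, 0.3, 0.2]  # current uncertainty over three candidate diagnoses
actions = [
    {"name": "ask_history", "cost": 1.0, "posterior": [0.6, 0.3, 0.1]},
    {"name": "blood_test", "cost": 5.0, "posterior": [0.9, 0.05, 0.05]},
]
best = choose_action(belief, actions)
print(best["name"])  # the costlier test wins here because its gain dominates
```

A full tree search would look several actions ahead instead of one, but the scoring rule (gain minus weighted cost) is the same shape.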
Uli Sauerland, Celia Matthaei, Felix Salfner
We argue that human language learning proceeds in a manner that is different in nature from current approaches to training LLMs, predicting a difference in learning biases. We then present evidence from German plural formation by LLMs that confirms our hypothesis: even very powerful implementations produce results that miss aspects of the logic inherent to language that humans have no problem with. We conclude that attention to the different structures of human language and artificial neural networks is likely to be an avenue to improve LLM performance.
John Pavlopoulos, Juli Bakagianni, Kanella Pouli et al.
Natural Language Processing (NLP) for lesser-resourced languages faces persistent challenges, including limited datasets, inherited biases from high-resource languages, and the need for domain-specific solutions. This study addresses these gaps for Modern Greek through three key contributions. First, we evaluate the performance of open-source (Llama-70b) and closed-source (GPT-4o mini) large language models (LLMs) on seven core NLP tasks with dataset availability, revealing task-specific strengths, weaknesses, and parity in their performance. Second, we expand the scope of Greek NLP by reframing Authorship Attribution as a tool to assess potential data usage by LLMs in pre-training, with high 0-shot accuracy suggesting ethical implications for data provenance. Third, we showcase a legal NLP case study, where a Summarize, Translate, and Embed (STE) methodology outperforms the traditional TF-IDF approach for clustering long legal texts. Together, these contributions provide a roadmap to advance NLP in lesser-resourced languages, bridging gaps in model evaluation, task innovation, and real-world impact.
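The TF-IDF baseline mentioned in this abstract can be sketched in a few lines. The toy corpus and whitespace tokenization below are illustrative assumptions, not the study's actual legal data or pipeline.

```python
from math import log
from collections import Counter

# three toy "documents", already tokenized
docs = [["court", "ruled", "appeal"],
        ["appeal", "denied", "court"],
        ["contract", "signed", "today"]]

def tf_idf(docs):
    """Return one sparse {term: weight} vector per document."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))  # document frequency
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append({w: (tf[w] / len(d)) * log(n / df[w]) for w in tf})
    return vectors

vecs = tf_idf(docs)
# "contract" occurs in only one document, so it gets a higher weight
# than "court", which occurs in two
```

Clustering would then run on these vectors; the paper's point is that embeddings of summarized-and-translated texts cluster long legal documents better than these sparse counts.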
Michael Levy, Glenn Stockwell
Niyati Bafna, Kenton Murray, David Yarowsky
While large language models exhibit certain cross-lingual generalization capabilities, they suffer from performance degradation (PD) on unseen closely-related languages (CRLs) and dialects relative to their high-resource language neighbour (HRLN). However, we currently lack a fundamental understanding of what kinds of linguistic distances contribute to PD, and to what extent. Furthermore, studies of cross-lingual generalization are confounded by unknown quantities of CRL language traces in the training data, and by the frequent lack of availability of evaluation data in lower-resource related languages and dialects. To address these issues, we model phonological, morphological, and lexical distance as Bayesian noise processes to synthesize artificial languages that are controllably distant from the HRLN. We analyse PD as a function of underlying noise parameters, offering insights on model robustness to isolated and composed linguistic phenomena, and the impact of task and HRL characteristics on PD. We calculate parameter posteriors on real CRL-HRLN pair data and show that they follow computed trends of artificial languages, demonstrating the viability of our noisers. Our framework offers a cheap solution for estimating task performance on an unseen CRL given HRLN performance using its posteriors, as well as for diagnosing observed PD on a CRL in terms of its linguistic distances from its HRLN, and opens doors to principled methods of mitigating performance degradation.
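A "noiser" of the kind this abstract describes can be sketched with a single lexical perturbation. The character-swap rule below is an illustrative stand-in for the paper's Bayesian noise processes; only the controllable noise probability mirrors the framework.

```python
import random

def lexical_noise(sentence, noise_prob, seed=0):
    """Perturb each word with probability noise_prob to simulate a
    closely-related language at a controllable distance from the
    high-resource neighbour (toy character-swap rule)."""
    rng = random.Random(seed)
    out = []
    for word in sentence.split():
        if rng.random() < noise_prob and len(word) > 2:
            # swap two interior characters as a toy perturbation
            chars = list(word)
            i = rng.randrange(1, len(chars) - 1)
            chars[i], chars[i - 1] = chars[i - 1], chars[i]
            word = "".join(chars)
        out.append(word)
    return " ".join(out)

print(lexical_noise("the cat sat on the mat", noise_prob=0.5))
```

Sweeping `noise_prob` and measuring task performance on the noised text yields the degradation-versus-distance curves the paper analyzes; the real framework composes phonological, morphological, and lexical processes rather than one character rule.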
Xiaonan Wang, Jinyoung Yeo, Joon-Ho Lim et al.
Large language models have exhibited significant enhancements in performance across various tasks. However, the complexity of their evaluation increases as these models generate more fluent and coherent content. Current multilingual benchmarks often use translated English versions, which may incorporate Western cultural biases that do not accurately assess other languages and cultures. To address this research gap, we introduce KULTURE Bench, an evaluation framework specifically designed for Korean culture that features datasets of cultural news, idioms, and poetry. It is designed to assess language models' cultural comprehension and reasoning capabilities at the word, sentence, and paragraph levels. Using the KULTURE Bench, we assessed the capabilities of models trained with different language corpora and analyzed the results comprehensively. The results show that there is still significant room for improvement in the models' understanding of texts related to the deeper aspects of Korean culture.
Niloufar Koleini, Tahereh Boroughani, Zohreh R. Eslami et al.
Technical vocabulary acquisition holds paramount significance within academic and professional contexts. This research delves into the transformative potential of Mobile-Assisted Language Learning (MALL) in facilitating university students' mastery of technical words. The study examined the efficacy of digital flashcards (DFs) compared to traditional paper-based flashcards (PFs) as pedagogical tools for vocabulary instruction. In doing so, 80 Iranian psychology students with intermediate English proficiency participated in this research with an intervention that lasted for ten weeks. Their vocabulary knowledge was assessed using the Vocabulary Knowledge Scale (VKS) pre- and post-intervention, with an additional delayed post-test administered six weeks later. The results of a mixed between-within analysis of variance (ANOVA) revealed that those using DFs outperformed the control learning condition (PFs) in both post-test (p < 0.001, partial η² = 0.240) and delayed post-test assessments (p < 0.001, partial η² = 0.407), signifying the effectiveness of mobile-assisted learning in developing receptive and productive written knowledge of technical vocabulary. Additionally, the findings demonstrated that students who engaged with DFs exhibited substantially greater retention and recall of technical vocabulary over time (i.e., a significant effect for time), indicating the long-term benefits of MALL. The practical implications of these findings extend to English language teachers and learners in specialized fields, emphasizing the pedagogical value of mobile technology in optimizing technical vocabulary instruction.
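The effect size reported in this abstract, partial η², is the ratio of the effect's sum of squares to the effect-plus-error sum of squares. The sums of squares below are made-up illustrative values, not the study's actual ANOVA output.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared: SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)

# hypothetical sums of squares for illustration only
print(round(partial_eta_squared(20.0, 30.0), 3))  # → 0.4
```

By conventional benchmarks, values around 0.14 and above count as large effects, which is why both reported results (0.240 and 0.407) are read as substantial.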
Gbenga Michael Adeyeye
This study investigated the influence of language obstacles on social isolation in varied communities, with a specific focus on the Ogbomoso community in Oyo State, Nigeria. It analyzed the impact of language barriers on communication effectiveness, resulting in misunderstandings, conflicts, and limited access to crucial services such as healthcare, education, and employment. These obstacles also contribute to social and economic inequalities, impeding social advancement and intensifying feelings of seclusion, unease, and apprehension. The study used a qualitative research approach, utilizing semi-structured interviews and focus group discussions, to investigate the experiences of residents and their solutions for surmounting language barriers. The findings emphasized the importance of culturally responsive teaching, community-based learning initiatives, and the utilization of technology in supporting language acquisition and integration. The study presents exemplary programs, such as Canada's LINC and Australia's AMEP, as examples of effectively addressing these difficulties. The study underscored the importance of continuous policy assistance, fair access to educational resources, and active community involvement in establishing more inclusive and unified societies. This study aims to promote social inclusion and reduce isolation by encouraging the use of multiple languages and implementing effective language learning methods. Its goal is to empower people from all backgrounds to succeed in the academic, economic, and social aspects of life.
Niels Dickson
An ability that underlies human syntactic knowledge is determining which words can appear in similar structures (i.e., grouping words by their syntactic categories). These groupings enable humans to combine structures in order to communicate complex meanings. A foundational question is how children acquire this ability underlying syntactic knowledge. In exploring this process, we review various engineering approaches whose goal is similar to a child's: without prior syntactic knowledge, correctly identify the parts of speech (POS) of the words in a sample of text. In reviewing these unsupervised tagging efforts, we discuss common themes that support the advances in the models and their relevance for language acquisition. For example, we discuss how each model judges success (evaluation metrics), the "additional information" that constrains POS learning (such as orthographic information), and the context used to determine POS (only the previous word; the words before and after the target; etc.). The identified themes pave the way for future investigations into the cognitive processes that underpin the acquisition of syntactic categories and provide a useful overview of current state-of-the-art unsupervised POS tagging models.
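The distributional cue this review discusses, that words of the same category share neighbouring-word contexts, can be sketched with a crude context-overlap similarity. The toy corpus and the shared-feature count are illustrative assumptions, not any specific model from the review.

```python
from collections import defaultdict

corpus = "the cat sat the dog sat the cat ran the dog ran".split()

# collect left- and right-neighbour counts for each word
contexts = defaultdict(lambda: defaultdict(int))
for i, w in enumerate(corpus):
    if i > 0:
        contexts[w][("L", corpus[i - 1])] += 1
    if i < len(corpus) - 1:
        contexts[w][("R", corpus[i + 1])] += 1

def similarity(w1, w2):
    """Count of shared context features (a crude distributional similarity)."""
    return sum(min(contexts[w1][c], contexts[w2][c]) for c in contexts[w1])

# nouns share contexts with nouns, not with verbs
print(similarity("cat", "dog"), similarity("cat", "sat"))
```

An unsupervised tagger would cluster words by such context vectors; the review's themes (evaluation metrics, extra cues like orthography, context window) are all choices layered on top of this basic signal.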
Jan Štěpánek
Just as the accuracy of scientific theories is best tested in extreme physical conditions, it is advisable to verify the accuracy of a recognized conception of language on its extreme parts. Mathematical statements fill this role, thanks to the notions of truth and proof. Michael Dummett's anti-realism is an enterprise that has attempted on this basis to question the conception of the functioning of language based primarily on the principle of bivalence, the truth-condition theory of meaning, and the notion that the speaker must be able to demonstrate his knowledge of meaning publicly. In common language practice, one can observe assertions that we can neither verify nor refute in principle. On these so-called undecidable statements, Dummett tried to show that if we apply the traditional description to them, we inevitably reach paradoxical conclusions. Mathematical statements referring to an infinite number may be examples of such assertions. In the submitted paper, I will present Dummett's position, resulting primarily in the manifestation and acquisition arguments, according to which it should not be possible to understand undecidable statements at all. In conclusion, however, I will show that his intention, despite many valuable comments, fails, i.e. that there is a way to avoid both arguments while preserving the realistic description of language in general. Key words: anti-realism, mathematical statements, meaning, Michael Dummett, truth, truth-condition theory of meaning
Weronika Urbanik-Pęk
This chapter aims to describe the features of neutral declarative intonation in the interlanguage Spanish spoken by Poles. It follows the Melodic Analysis of Speech (Análisis Melódico del Habla) methodology presented in Cantero (2002) and later set out as a protocol in Cantero and Font-Rotchés (2009, 2020). The corpus consists of 100 spontaneous declarative utterances produced by nearly 30 informants. All utterances were analyzed with respect to their three fundamental parts: the first peak, the body, and the final inflection. The results can be summarized as follows: most contours lack a first peak, and those that have one tend to shift it to the following unstressed syllable. The body of most utterances is flat or shows a steady decline. The final inflection is usually flat and, to a lesser extent, rising or falling.
Afra Feyza Akyürek, Muhammed Yusuf Kocyigit, Sejin Paik et al.
Researchers have devised numerous ways to quantify social biases vested in pretrained language models. As some language models are capable of generating coherent completions given a set of textual prompts, several prompting datasets have been proposed to measure biases between social groups -- posing language generation as a way of identifying biases. In this opinion paper, we analyze how specific choices of prompt sets, metrics, automatic tools and sampling strategies affect bias results. We find that the practice of measuring biases through text completion is prone to yielding contradictory results under different experimental settings. We additionally provide recommendations for reporting biases in open-ended language generation for a more complete picture of the biases exhibited by a given language model. Code to reproduce the results is released at https://github.com/feyzaakyurek/bias-textgen.
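The text-completion bias protocol this paper critiques can be sketched as a score gap between completions for two group prompts. The completions, group labels, and word-list scorer below are hypothetical stand-ins for a real generator and classifier; the paper's point is that such gaps shift with exactly these implementation choices.

```python
# toy lexicon-based scorer standing in for a sentiment/toxicity model
NEGATIVE = {"angry", "lazy", "rude"}
POSITIVE = {"kind", "smart", "friendly"}

def score(completion):
    """Positive-minus-negative word count (a crude stand-in classifier)."""
    words = set(completion.lower().split())
    return len(words & POSITIVE) - len(words & NEGATIVE)

# hypothetical completions from some generator, one list per group prompt
completions = {
    "group_a": ["they are kind and friendly", "they are smart"],
    "group_b": ["they are rude", "they are kind"],
}

# mean score per group; the gap is the reported "bias"
bias_gap = (sum(map(score, completions["group_a"])) / 2
            - sum(map(score, completions["group_b"])) / 2)
print(bias_gap)
```

Swapping the scorer, the prompt set, or the sampling temperature changes `bias_gap`, which is precisely the fragility the paper documents.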
Page 7 of 273845