Julia Kristeva, L. S. Roudiez, Thomas Gora et al.
Results for "Language and Literature"
Showing 20 of ~3,358,758 results · from DOAJ, CrossRef, Semantic Scholar, arXiv
J. Law, James Martin Edward Boyle, F. Harris et al.
The prevalence and the natural history of primary speech and language delays were two of four domains covered in a systematic review of the literature related to screening for speech and language delay carried out for the NHS in the UK. The structure and process of the full literature review is introduced and criteria for inclusion in the two domains are specified. The resulting data set gave 16 prevalence estimates generated from 21 publications and 12 natural history studies generated from 18 publications. Results are summarized for six subdivisions of primary speech and language delays: (1) speech and/or language, (2) language only, (3) speech only, (4) expression with comprehension, (5) expression only and (6) comprehension only. Combination of the data suggests that both concurrent and predictive case definition can be problematic. Prediction improves if language is taken independently of speech and if expressive and receptive language are taken together. The results are discussed in terms of the need to develop a model of prevalence based on risk of subsequent difficulties.
Ismael El Bahraoui Pérez
This study undertakes a linguistic-literary analysis of the twenty-first letter of the first book of Alciphron's epistolary corpus, which is structured around an ekphrasis through which the letter writer, as a πεπαιδευμένος, draws on the varied resources offered by the rhetorical handbooks (Robb, 1994, pp. 10-25), with special attention to the use of narrative and the commonplace as distinctive elements. This epistle, whose sender is Éuploo and whose recipient is Taláseros, describes how the latter squandered all his possessions after being enchanted by a female lyre player; the narrative frame is accordingly characterized by the strong presence of the topos of the shipwreck of love, specifically the variant of shipwreck on dry land, which is very frequent in books V and XII of the Palatine Anthology. This examination will thus allow us to offer an overview of those passages, belonging to erotic epigrammatic poetry, that relate to our letter from the standpoint of amatory topoi.
Qiao Wang, Adnan Labib, Robert Swier et al.
GenQuest is a generative text adventure game that leverages Large Language Models (LLMs) to facilitate second language learning through immersive, interactive storytelling. The system engages English as a Foreign Language (EFL) learners in a collaborative "choose-your-own-adventure" style narrative, dynamically generated in response to learner choices. Game mechanics such as branching decision points and story milestones are incorporated to maintain narrative coherence while allowing learner-driven plot development. Key pedagogical features include content generation tailored to each learner's proficiency level, and a vocabulary assistant that provides in-context explanations of learner-queried text strings, ranging from words and phrases to sentences. Findings from a pilot study with university EFL students in China indicate promising vocabulary gains and positive user perceptions. Also discussed are suggestions from participants regarding the narrative length and quality, and the request for multi-modal content such as illustrations.
Janusz Antoni Strużyna
Objectives: (1) to identify the core of the title process and compare it with known patterns of management improvement; (2) to elaborate proposals and topics that enrich knowledge about processes of HRM improvement, including the problem of institutional isomorphism. Material and methods: Literature studies allowed us to identify patterns of organizational improvement change. The subject of empirical research was the content of Polish-language offers posted on the websites of consulting companies that improve Human Resources Management. The analysis followed content-analysis guidelines, and its results were compared with theoretical patterns of change; in this way, the features of the examined core were identified. Results: The identified core of dissemination consists of four phases and three processes with different logics. The interaction of these processes can create inconsistency in the overall solution. Behaviors securing and facilitating HRM operations were found to dominate over other types of managerial behavior, and the offers contain no methods for dealing with this asymmetry. Conclusions: The inconsistencies between the core processes of pattern dissemination should encourage practitioners to determine their level of acceptance of the heterogeneity of patterns; this requires developing practical competencies in managing contradictions. Future research should provide a fuller understanding of the specificity of different sources of patterns, the nature of operations performed on patterns, and the identification of inconsistencies in the processes of pattern dissemination. It is also valuable to draw attention to the specific forces of institutional isomorphism that shape the similarity of patterns and the ways of disseminating them.
Eren Dogan, M. Egemen Uzun, Atahan Uz et al.
The developments that language models have brought to almost every kind of task have attracted the attention not only of researchers but also of society at large, and have enabled these models to become products. Commercially successful language models are available; however, users may prefer open-source language models for reasons of cost, data privacy, or regulation. Yet, despite the growing number of such models, there is no comprehensive comparison of their performance for Turkish. This study aims to fill this gap in the literature. Seven selected language models are compared on their in-context learning and question-answering abilities. Turkish datasets for in-context learning and question answering were prepared, and both automatic and human evaluations were conducted. The results show that for question answering, continuing pretraining before fine-tuning with instructional datasets is more successful at adapting multilingual models to Turkish, and that in-context learning performance is not strongly related to question-answering performance.
Manuel de Buenaga, Francisco Javier Bueno
The GPT (Generative Pre-trained Transformer) language models are an artificial intelligence and natural language processing technology that enables automatic text generation. There is growing interest in applying GPT language models to university teaching along several dimensions. From the perspective of innovation in student and teacher activities, they can support the understanding and generation of content, problem-solving, personalization, and test correction, among others. From the dimension of internationalization, the misuse of these models is a global problem that requires common measures across universities in different geographical areas. In several countries, assessment tools have been reviewed to ensure that work is done by students and not by AI. To this end, we have conducted a detailed experiment in a representative Computer Science subject, Software Engineering, focused on evaluating the use of ChatGPT as an assistant in theory activities, exercises, and laboratory practices, and assessing its potential as a support tool for both students and teachers.
Zofia Malisz, Jan Foremski, Małgorzata Kul
We present a speech database and a phoneme-level language model of Polish. The database and model are designed for the analysis of prosodic and discourse factors and their impact on acoustic parameters in interaction with predictability effects. The database is also the first large, publicly available Polish speech corpus of excellent acoustic quality that can be used for phonetic analysis and training of multi-speaker speech technology systems. The speech in the database is processed in a pipeline that achieves a 90% degree of automation. It incorporates state-of-the-art, freely available tools enabling database expansion or adaptation to additional languages.
Renxi Wang, Haonan Li, Xudong Han et al.
Large language models (LLMs) have achieved success in acting as agents that interact with environments through tools such as search engines. However, LLMs are optimized for language generation rather than tool use during training or alignment, limiting their effectiveness as agents. To address this problem, previous work first collected interaction trajectories between LLMs and environments, then used only the trajectories that successfully finished the task to fine-tune smaller models, which makes fine-tuning data scarce and costly to acquire. Discarding failed trajectories also wastes significant data and resources and limits the possible optimization paths during fine-tuning. In this paper, we argue that unsuccessful trajectories offer valuable insights, and that LLMs can learn from them through appropriate quality control and fine-tuning strategies. By simply adding a prefix or suffix that tells the model whether to generate a successful trajectory during training, we improve model performance by a large margin on mathematical reasoning, multi-hop question answering, and strategic question answering tasks. We further analyze the inference results and find that our method provides a better trade-off between valuable information and errors in unsuccessful trajectories. To our knowledge, we are the first to demonstrate the value of negative trajectories and their application in agent-tuning scenarios. Our findings offer guidance for developing better agent-tuning methods and low-resource data-usage techniques.
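The prefix-conditioning idea in this abstract can be sketched in a few lines: label each trajectory with a control token marking success or failure before fine-tuning, so failed runs become training signal instead of waste. This is an illustrative sketch, not the authors' code; the function and token names are hypothetical.

```python
# Illustrative sketch (hypothetical names, not the paper's implementation):
# keep failed agent trajectories and mark them with a control prefix,
# so a model fine-tuned on this data learns from both outcomes.

def build_training_example(task: str, trajectory: list[str], succeeded: bool) -> dict:
    """Prepend a control token telling the model which kind of
    trajectory follows; failed runs are kept, not discarded."""
    prefix = "[SUCCESS]" if succeeded else "[FAILURE]"
    prompt = f"{prefix} Task: {task}"
    completion = "\n".join(trajectory)
    return {"prompt": prompt, "completion": completion}

examples = [
    build_training_example("Solve 12*7", ["thought: multiply", "answer: 84"], True),
    build_training_example("Solve 12*7", ["thought: add", "answer: 19"], False),
]
# At inference time, conditioning the prompt on "[SUCCESS]" steers
# generation toward successful-style trajectories.
```

The design choice mirrors the abstract: the control token is the only change to the data pipeline, which is why the method is described as "simply adding a prefix or suffix."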
Jiahao Huo, Yibo Yan, Boren Hu et al.
Projecting visual features into the word embedding space has become a significant fusion strategy adopted by Multimodal Large Language Models (MLLMs). However, its internal mechanisms have yet to be explored. Inspired by multilingual research, we identify domain-specific neurons in multimodal large language models. Specifically, we investigate the distribution of domain-specific neurons and the mechanism by which MLLMs process features from diverse domains. Furthermore, we propose a three-stage mechanism for language model modules in MLLMs when handling projected image features, and verify this hypothesis using the logit lens. Extensive experiments indicate that while current MLLMs exhibit Visual Question Answering (VQA) capability, they may not fully utilize domain-specific information. Properly manipulating domain-specific neurons changes accuracy by at most 10%, shedding light on the future development of cross-domain, all-encompassing MLLMs. The source code is available at https://github.com/Z1zs/MMNeuron.
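The logit-lens technique mentioned in this abstract reads out an intermediate hidden state by projecting it through the model's unembedding matrix, revealing which tokens a layer is "leaning toward". The following is a minimal toy sketch with random arrays, not the paper's code; all dimensions and names are assumptions for illustration.

```python
# Toy logit-lens sketch (illustrative only): project a hidden state
# through the unembedding matrix to get an early token distribution.
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab = 8, 5
W_U = rng.normal(size=(d_model, vocab))   # unembedding matrix (toy values)
hidden = rng.normal(size=(d_model,))      # hidden state at some layer

def logit_lens(hidden_state, unembed):
    logits = hidden_state @ unembed       # early "readout" of the residual stream
    probs = np.exp(logits - logits.max()) # numerically stable softmax
    return probs / probs.sum()

probs = logit_lens(hidden, W_U)
top_token = int(np.argmax(probs))         # token id this layer currently favors
```

Applied at successive layers, this kind of readout is what lets the authors check where in the network projected image features start behaving like word embeddings.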
E. Hill
France Lafleur
In this article on the conceptualization of the digital space in the teaching-learning of languages, our contribution to the "Architecture of the processes of production and reception" of language (François & Nespoulous, 2014) consists in identifying the cross-linguistic didactic constants of foreign-language teaching-learning and integrating them into a three-dimensional pedagogical model that incorporates the stratified structural components of their deep learning. The model is experimental but immediately applicable to the teaching-learning-assessment of languages, and it presents one of the results of our current action research in distance education (FAD). In the introduction, we survey the many parameters of the place of digital technology in the teaching-learning-assessment of languages. Our methodology is analytical and conceptual research on the teaching-learning-assessment of languages, and our analyses are based on the founding documents of the European Community (EC), in particular the Common European Framework of Reference for Languages, CEFR (Conseil de l'Europe, 2001, 2018, 2021). Our objective is to cover as many as possible of the language components and skills required to operationalize them within the action-oriented approach advocated by the EC. Our discussion focuses on the technological conditions for applying this model, and our conclusion on the prospects, already at our doorstep, of the organic integration of the artificial intelligence of languages into humans.
Hernina Hernina, Yenny Karlina, Devi Ambarwati Puspitasari
Indonesian terms for disease names are unique. Although they differ from their medical counterparts, Indonesian disease names contain elements of figurative language. This study aims to analyze their stylistic naming. The data were corpora drawn from articles, social media forums, and online health news for 2013-2023, comprising 1,206,281,985 tokens from the Indonesian-Leipzig Corpora Collection (ILC) and 39,294 tokens collected for the 2023 Health Forum Corpus (HF). The analysis used the wordlist and collocation features to examine frequency, trends, and patterns, and the concordance feature to examine the language style of the names. The study finds no evidence of change in health terms over the past decade, such as "penyakit jantung" (heart disease), "headache," and "hospital." However, it does uncover interesting findings regarding the formation of disease names. Affixation and compounding are the primary word-formation processes. The stylistic elements of disease names were hyperbolic figures, such as "gagal ginjal" (chronic kidney disease), and symbolic figures, such as "kaki gajah" (filariasis) and "mata ikan" (clavus). In conclusion, disease names follow a particular pattern, but the specific terminology used may vary with linguistic factors and cultural understanding.
Shangshang Zheng, He Bai, Yizhe Zhang et al.
Large Language Models (LLMs) may hallucinate facts, while curated Knowledge Graphs (KGs) are typically factually reliable, especially for domain-specific knowledge. Measuring the alignment between KGs and LLMs can effectively probe their factuality and identify the knowledge blind spots of LLMs. However, verifying LLMs over extensive KGs can be expensive. In this paper, we present KGLens, a Thompson-sampling-inspired framework for effectively and efficiently measuring the alignment between KGs and LLMs. KGLens features a graph-guided question generator for converting KGs into natural language, along with a carefully designed importance sampling strategy based on a parameterized KG structure to expedite KG traversal. Our simulation experiment compares the brute-force method with KGLens under six different sampling methods, demonstrating that our approach achieves superior probing efficiency. Leveraging KGLens, we conducted in-depth analyses of the factual accuracy of ten LLMs across three large domain-specific KGs from Wikidata, comprising over 19K edges, 700 relations, and 21K entities. Human evaluation results indicate that KGLens can assess LLMs with a level of accuracy nearly equivalent to that of human annotators, achieving 95.7% of their accuracy rate.
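The Thompson-sampling flavor of this framework can be illustrated with a small sketch: keep a Beta posterior over each KG edge's chance of exposing an LLM error, sample from the posteriors, and probe the edge most likely to reveal a mistake. This is a hedged, generic sketch of Thompson sampling over edges, with hypothetical names; it is not the KGLens implementation.

```python
# Generic Thompson-sampling sketch for choosing which KG edge to probe next
# (illustrative only; class and method names are hypothetical).
import random

class EdgeSampler:
    def __init__(self, edges):
        # Beta(alpha, beta) posterior over each edge's error probability
        self.stats = {e: [1.0, 1.0] for e in edges}

    def pick(self):
        # Draw from each posterior; probe the edge most likely to expose an error
        draws = {e: random.betavariate(a, b) for e, (a, b) in self.stats.items()}
        return max(draws, key=draws.get)

    def update(self, edge, llm_was_wrong):
        a, b = self.stats[edge]
        self.stats[edge] = [a + 1.0, b] if llm_was_wrong else [a, b + 1.0]

sampler = EdgeSampler([("Paris", "capital_of", "France"),
                       ("H2O", "formula_of", "water")])
edge = sampler.pick()                       # edge to turn into a question
sampler.update(edge, llm_was_wrong=False)   # record the LLM's answer
```

Concentrating probes on edges with high estimated error probability is what makes this kind of sampling cheaper than brute-force verification of every edge.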
Zheng-Xin Yong, Cristina Menghini, Stephen H. Bach
AI safety training and red-teaming of large language models (LLMs) are measures to mitigate the generation of unsafe content. Our work exposes the inherent cross-lingual vulnerability of these safety mechanisms, resulting from the linguistic inequality of safety training data, by successfully circumventing GPT-4's safeguards through translating unsafe English inputs into low-resource languages. On the AdvBench benchmark, GPT-4 engages with the unsafe translated inputs and provides actionable items that can move users toward their harmful goals 79% of the time, on par with or even surpassing state-of-the-art jailbreaking attacks. Other high- and mid-resource languages have significantly lower attack success rates, which suggests that the cross-lingual vulnerability mainly applies to low-resource languages. Previously, limited training on low-resource languages primarily affected speakers of those languages, causing technological disparities. However, our work highlights a crucial shift: this deficiency now poses a risk to all LLM users, since publicly available translation APIs enable anyone to exploit LLMs' safety vulnerabilities. Our work therefore calls for more holistic red-teaming efforts to develop robust multilingual safeguards with wide language coverage.
Ibrahim Oteir, A. Al-Otaibi
Research in foreign language learning has shown that foreign language anxiety is a crucial area of applied linguistics. This study therefore gives a comprehensive review of the literature on foreign language anxiety and adds further explanation to earlier studies of the issue. It clarifies the concept of foreign language anxiety and how it differs from other related types of anxiety. Finally, it shows the main causes and effects of foreign language anxiety that influence language learners.
I. Piller, Livia Gerber
ABSTRACT In contemporary Western societies, parenting has become the subject of a substantial body of advice and self-help literature. Within this literature, questions of bilingual parenting have begun to add yet another dimension to parental anxieties. Against this background, we examine how parents in a general Australian online parenting forum discuss the desires they have for their children's bilingualism and the challenges they experience to their bilingual parenting. We first demonstrate that individual bilingualism in the abstract is discussed in highly favourable terms and is widely conceptualised as a ‘gift’ from parents to children. However, posters’ belief in the bilingual advantage does not easily translate into effective bilingual parenting practices. First, many posters are concerned that bilingualism in the early years might be jeopardising their child's English language proficiency and hence school success. Second, a very narrow definition of ‘true’ bilingualism is connected with a relatively dogmatic belief in the ‘one parent, one language’ parenting strategy. As a result, consecutive bilinguals, particularly migrant fathers, come to be perceived as both problematic bilinguals and problematic parents. We close with implications for family language policy and advocacy in the face of entrenched institutional English monolingualism.
Kshitij Gupta
Large pre-trained language models have brought remarkable progress in NLP. Pre-training and fine-tuning have given state-of-the-art performance across text-processing tasks, and data augmentation techniques have helped build state-of-the-art models on low- or zero-resource tasks. Many past works have attempted to learn a single massively multilingual machine translation model for zero-shot translation. Although such models produce otherwise correct translations, their main challenge is producing output in the wrong language for zero-shot translation. This work and its results indicate that prompt-conditioned large models do not suffer from off-target language errors, i.e., errors arising from translation into the wrong language. We empirically demonstrate the effectiveness of self-supervised pre-training and data augmentation for zero-shot multilingual machine translation.
Edwin Zhang, Yujie Lu, Shinda Huang et al.
Training generalist agents is difficult across several axes, requiring us to deal with high-dimensional inputs (space), long horizons (time), and generalization to novel tasks. Recent architectural advances have allowed improved scaling along one or two of these axes, but remain computationally prohibitive to use. In this paper, we propose to address all three axes by leveraging Language to Control Diffusion models as a hierarchical planner conditioned on language (LCD). We effectively and efficiently scale diffusion models for planning in extended temporal, state, and task dimensions to tackle long-horizon control problems conditioned on natural language instructions, as a step towards generalist agents. Comparing LCD with other state-of-the-art models on the CALVIN language robotics benchmark shows that LCD outperforms other SOTA methods in multi-task success rates, while improving inference speed over other comparable diffusion models by 3.3x to 15x. We show that LCD can successfully leverage the unique strength of diffusion models to produce coherent long-range plans while addressing their weakness in generating low-level details and control.
Elijah Rippeth, Sweta Agrawal, Marine Carpuat
This paper describes the University of Maryland's submission to the Special Task on Formality Control for Spoken Language Translation at IWSLT, which evaluates translation from English into six languages with diverse grammatical formality markers. We investigate to what extent this problem can be addressed with a single multilingual model that simultaneously controls its output for target language and formality. Results show that this strategy can approach the translation quality and formality control achieved by dedicated translation models. However, the nature of the underlying pre-trained language model and of the fine-tuning samples greatly impacts results.
Page 12 of 167,938