Results for "Language and Literature"

Showing 20 of ~3,358,499 results · from DOAJ, arXiv, Semantic Scholar, CrossRef

S2 Open Access 2019
Use of Smartphone Applications in English Language Learning—A Challenge for Foreign Language Education

Jaroslav Kacetl, B. Klimova

At present, hardly any younger person can imagine life without mobile technologies. They use them on a daily basis, including in language learning. Such learning supported with mobile devices is called mobile learning, which seems beneficial especially thanks to the unique features of mobile applications (e.g., interactivity, ubiquity, and portability) and teachers’ encouragement and feedback. The purpose of this review study is to explore original, peer-reviewed English studies from 2015 to April 2019 and to determine whether mobile applications used in the learning of English as a foreign language are beneficial and/or effective. The methods are based on a literature review of available sources found on the research topic in two acknowledged databases: Web of Science and Scopus. Altogether, 16 original journal studies on the research topic were detected. The results reveal that mobile learning is becoming a salient feature of education as it is a great opportunity for foreign language learning. Its key benefits are as follows: the enhancement of the learner’s cognitive capacity, the learner’s motivation to study in both formal and informal settings, the learner’s autonomy and confidence, as well as the promotion of personalized learning, helping low-achieving students to reach their study goals. Although mobile learning seems to be effective overall, it is desirable to design, plan and implement it with caution, according to students’ needs, and to deliver multiple language skills in authentic learning environments.

223 citations en Psychology
S2 Open Access 2017
Natural language processing in mental health applications using non-clinical texts

Rafael A. Calvo, D. Milne, M. Hussain et al.

Natural language processing (NLP) techniques can be used to make inferences about people's mental states from what they write on Facebook, Twitter and other social media. These inferences can then be used to create online pathways to direct people to health information and assistance and also to generate personalized interventions. Regrettably, the computational methods used to collect, process and utilize online writing data, as well as the evaluations of these techniques, are still dispersed in the literature. This paper provides a taxonomy of data sources and techniques that have been used for mental health support and intervention. Specifically, we review how social media and other data sources have been used to detect emotions and identify people who may be in need of psychological assistance; the computational techniques used in labeling and diagnosis; and finally, we discuss ways to generate and personalize mental health interventions. The overarching aim of this scoping review is to highlight areas of research where NLP has been applied in the mental health literature and to help develop a common language that draws together the fields of mental health, human-computer interaction and NLP.

288 citations en Computer Science
S2 Open Access 2017
Natural Language Does Not Emerge ‘Naturally’ in Multi-Agent Dialog

Satwik Kottur, José M. F. Moura, Stefan Lee et al.

A number of recent works have proposed techniques for end-to-end learning of communication protocols among cooperative multi-agent populations, and have simultaneously found the emergence of grounded human-interpretable language in the protocols developed by the agents, learned without any human supervision! In this paper, using a Task & Talk reference game between two agents as a testbed, we present a sequence of ‘negative’ results culminating in a ‘positive’ one – showing that while most agent-invented languages are effective (i.e. achieve near-perfect task rewards), they are decidedly not interpretable or compositional. In essence, we find that natural language does not emerge ‘naturally’, despite the semblance of ease of natural-language-emergence that one may gather from recent literature. We discuss how it is possible to coax the invented languages to become more and more human-like and compositional by increasing restrictions on how two agents may communicate.

234 citations en Computer Science
DOAJ Open Access 2024
Polish research on publishing in Poland between 1945 and 2015: Themes, legacy and implications for further research

Maria Juda

The history of publishing in Poland encompasses many issues associated with the emergence and dissemination of printed books. Of fundamental significance to the study of these issues are the records of the publishing output: while we have nearly complete, though still underexamined, records of this output for the period from the 15th to the 18th century, documented in bibliographies and catalogues, the situation is worse when it comes to the 19th and 20th centuries, until the outbreak of the Second World War. In this respect, what we need is not only a continuation, but a radical intensification of bibliographic work. This concerns works published in the Latin, Cyrillic, Hebrew and Greek scripts, as well as musical notation. Polish book scholars have devoted a lot of attention to the beginnings of printing in Poland. The historiography concerning various typographic workshops located in the former Polish-Lithuanian Commonwealth is rich; however, it still requires further extensive studies. Scholars have also been interested in phenomena influencing the content structure of printed publications, such as publishing privileges (in the former Polish-Lithuanian Commonwealth), censorship and restrictions imposed by the partitioning powers and later by Poland’s communist authorities, as a result of which Polish publications had to be printed abroad and an independent publishing movement emerged. The scholars’ research interests have also focused on books as products of printers and publishers and on the publication of written works. Scholars have examined both the various components of the book (title page, printer’s signet, stemmata, etc.) and its editorial composition as a whole.
Their undoubted achievements in the studies of the history of publishing in Poland are significant, yet in many areas they need to be continued and expanded (one important task is the edition of sources for the study of the history of Polish publishing) and to investigate the phenomena that stem from developmental tendencies in modern book studies.

Bibliography. Library science. Information resources, Communication. Mass media
DOAJ Open Access 2024
Конвергентті журналистикадағы жанрлар үдерісі [Genre Processes in Convergent Journalism]

А.Т. Бельдибекова

In journalism there is a concept called transmedia. In general, transmedia involves narrating information across different mass media. Unlike multimedia, all of the media involved are interconnected. For example, only transmedia technology makes it possible to adapt the plot of a novel into a film with minor changes, or to tell the first part of a story through a television show, the second part through a film, and the third part through a video game. Around the world, various ways of distributing the integral elements of fiction have taken shape. Transmedia information is a system of statements that gives the world valuable information about a country. The information in it is grounded in the story and in many texts. The narrative structure is rich in historical data, and texts, audio, video, fonts, images, infographics and so on are important components. Each text can tell a story on its own and narrates the history of the world more fully. When all of the mutually independent elements are combined, each element contributes to the main story. That is why the effectiveness of transmedia is directly linked to the storytelling method. The aim and main focus of the article was to study how new multimedia genres take shape in online media and to consider how present-day genres are transformed and presented in other forms. The scientific and practical significance of the article lies in discussing the new trends emerging in Kazakh journalism and in attempting to monitor Kazakh-language media for newly introduced genres such as the longread, storytelling, and short video.
Using comparative analysis, the main results and analysis consisted of examining the distinctive characteristics of these genres and the trends in the dynamics of their development. By analysing materials published on sites such as Malim.kz, Массагет.kz, and Balbal.kz, we tried to reveal the nature of the new trend in genres such as storytelling and the longread, which meet current demand. Through content analysis of materials in these genres, we were able to highlight their specific features. The current relevance of the research lies in its taking as its basis the competitiveness criteria and trends of new formats for presenting multimedia journalism materials.

Journalism. The periodical press, etc.
DOAJ Open Access 2024
Mastering Digital Media Literacy of Muslim Woman's Activists in Preventing Online Gender-Based Violence

Prima Ayu Rizqy Mahanani, Fatma Dian Pratiwi, Fartika Ifriqia et al.

This article tends to analyze how Muslim women's activists, who are the members of Nasyiatul Aisyiyah and Fatayat NU in Kediri and Yogyakarta, build their digital literacy to prevent violence in social media. Mainly due to the digital divide between men and women, which causes imbalance and injustice when they access digital media. The method used in data collection was semi-ethnographic, in which the researcher participated in observing research objects when carrying out activities using digital technology, interviews, and documentation on 3 members of Nasyiatul Aisyiyah and Fatayat NU, both in Jogja and Kediri. The research findings show that what has been stigmatized to women so far is that they are powerless to master information and communication technology, which does not apply to members of Fatayat NU and Nasyiatul Aisyiyah. This research shows that women are also reliable in accessing the internet for the benefit of empowering women, especially KBGO issues. This research has provided a different understanding of the phenomenon of the massive use of internet-based technology by female activists

Communication. Mass media, Islam
arXiv Open Access 2024
A Comprehensive Evaluation of Semantic Relation Knowledge of Pretrained Language Models and Humans

Zhihan Cao, Hiroaki Yamada, Simone Teufel et al.

Recently, much work has concerned itself with the enigma of what exactly pretrained language models (PLMs) learn about different aspects of language, and how they learn it. One stream of this type of research investigates the knowledge that PLMs have about semantic relations. However, many aspects of semantic relations were left unexplored. Generally, only one relation has been considered, namely hypernymy. Furthermore, previous work did not measure humans' performance on the same task as that performed by the PLMs. This means that at this point in time, there is only an incomplete view of the extent of these models' semantic relation knowledge. To address this gap, we introduce a comprehensive evaluation framework covering five relations beyond hypernymy, namely hyponymy, holonymy, meronymy, antonymy, and synonymy. We use five metrics (two newly introduced here) for recently untreated aspects of semantic relation knowledge, namely soundness, completeness, symmetry, prototypicality, and distinguishability. Using these, we can fairly compare humans and models on the same task. Our extensive experiments involve six PLMs, four masked and two causal language models. The results reveal a significant knowledge gap between humans and models for all semantic relations. In general, causal language models, despite their wide use, do not always perform significantly better than masked language models. Antonymy is the outlier relation where all models perform reasonably well. The evaluation materials can be found at https://github.com/hancules/ProbeResponses.

arXiv Open Access 2024
Grounding Toxicity in Real-World Events across Languages

Wondimagegnhue Tsegaye Tufa, Ilia Markov, Piek Vossen

Social media conversations frequently suffer from toxicity, creating significant issues for users, moderators, and entire communities. Events in the real world, like elections or conflicts, can initiate and escalate toxic behavior online. Our study investigates how real-world events influence the origin and spread of toxicity in online discussions across various languages and regions. We gathered Reddit data comprising 4.5 million comments from 31 thousand posts in six different languages (Dutch, English, German, Arabic, Turkish and Spanish). We target fifteen major social and political world events that occurred between 2020 and 2023. We observe significant variations in toxicity, negative sentiment, and emotion expressions across different events and language communities, showing that toxicity is a complex phenomenon in which many different factors interact and still need to be investigated. We will release the data for further research along with our code.

en cs.CL
arXiv Open Access 2024
L3Cube-IndicNews: News-based Short Text and Long Document Classification Datasets in Indic Languages

Aishwarya Mirashi, Srushti Sonavane, Purva Lingayat et al.

In this work, we introduce L3Cube-IndicNews, a multilingual text classification corpus aimed at curating a high-quality dataset for Indian regional languages, with a specific focus on news headlines and articles. We have centered our work on 10 prominent Indic languages, including Hindi, Bengali, Marathi, Telugu, Tamil, Gujarati, Kannada, Odia, Malayalam, and Punjabi. Each of these news datasets comprises 10 or more classes of news articles. L3Cube-IndicNews offers 3 distinct datasets tailored to handle different document lengths that are classified as: Short Headlines Classification (SHC) dataset containing the news headline and news category, Long Document Classification (LDC) dataset containing the whole news article and the news category, and Long Paragraph Classification (LPC) containing sub-articles of the news and the news category. We maintain consistent labeling across all 3 datasets for in-depth length-based analysis. We evaluate each of these Indic language datasets using 4 different models including monolingual BERT, multilingual Indic Sentence BERT (IndicSBERT), and IndicBERT. This research contributes significantly to expanding the pool of available text classification datasets and also makes it possible to develop topic classification models for Indian regional languages. This also serves as an excellent resource for cross-lingual analysis owing to the high overlap of labels among languages. The datasets and models are shared publicly at https://github.com/l3cube-pune/indic-nlp

en cs.CL, cs.LG
arXiv Open Access 2024
Fotheidil: an Automatic Transcription System for the Irish Language

Liam Lonergan, Ibon Saratxaga, John Sloan et al.

This paper sets out the first web-based transcription system for the Irish language - Fotheidil, a system that utilises speech-related AI technologies as part of the ABAIR initiative. The system includes both off-the-shelf pre-trained voice activity detection and speaker diarisation models and models trained specifically for Irish automatic speech recognition and capitalisation and punctuation restoration. Semi-supervised learning is explored to improve the acoustic model of a modular TDNN-HMM ASR system, yielding substantial improvements for out-of-domain test sets and dialects that are underrepresented in the supervised training set. A novel approach to capitalisation and punctuation restoration involving sequence-to-sequence models is compared with the conventional approach using a classification model. Experimental results show here also substantial improvements in performance. The system will be made freely available for public use, and represents an important resource to researchers and others who transcribe Irish language materials. Human-corrected transcriptions will be collected and included in the training dataset as the system is used, which should lead to incremental improvements to the ASR model in a cyclical, community-driven fashion.

en cs.CL, cs.SD
arXiv Open Access 2024
Beyond Data Quantity: Key Factors Driving Performance in Multilingual Language Models

Sina Bagheri Nezhad, Ameeta Agrawal, Rhitabrat Pokharel

Multilingual language models (MLLMs) are crucial for handling text across various languages, yet they often show performance disparities due to differences in resource availability and linguistic characteristics. While the impact of pre-train data percentage and model size on performance is well-known, our study reveals additional critical factors that significantly influence MLLM effectiveness. Analyzing a wide range of features, including geographical, linguistic, and resource-related aspects, we focus on the SIB-200 dataset for classification and the Flores-200 dataset for machine translation, using regression models and SHAP values across 204 languages. Our findings identify token similarity and country similarity as pivotal factors, alongside pre-train data and model size, in enhancing model performance. Token similarity facilitates cross-lingual transfer, while country similarity highlights the importance of shared cultural and linguistic contexts. These insights offer valuable guidance for developing more equitable and effective multilingual language models, particularly for underrepresented languages.

en cs.CL, cs.AI
arXiv Open Access 2024
Soft Language Prompts for Language Transfer

Ivan Vykopal, Simon Ostermann, Marián Šimko

Cross-lingual knowledge transfer, especially between high- and low-resource languages, remains challenging in natural language processing (NLP). This study offers insights for improving cross-lingual NLP applications through the combination of parameter-efficient fine-tuning methods. We systematically explore strategies for enhancing cross-lingual transfer through the incorporation of language-specific and task-specific adapters and soft prompts. We present a detailed investigation of various combinations of these methods, exploring their efficiency across 16 languages, focusing on 10 mid- and low-resource languages. We further present to our knowledge the first use of soft prompts for language transfer, a technique we call soft language prompts. Our findings demonstrate that in contrast to claims of previous work, a combination of language and task adapters does not always work best; instead, combining a soft language prompt with a task adapter outperforms most configurations in many cases.

S2 Open Access 2021
BioM-Transformers: Building Large Biomedical Language Models with BERT, ALBERT and ELECTRA

Sultan Alrowili, Vijay K. Shanker

The impact of design choices on the performance of biomedical language models recently has been a subject for investigation. In this paper, we empirically study biomedical domain adaptation with large transformer models using different design choices. We evaluate the performance of our pretrained models against other existing biomedical language models in the literature. Our results show that we achieve state-of-the-art results on several biomedical domain tasks despite using similar or less computational cost compared to other models in the literature. Our findings highlight the significant effect of design choices on improving the performance of biomedical language models.

91 citations en Computer Science
