Results for "Computational linguistics. Natural language processing"

S2 Open Access 2019

Computational Linguistics: 16th International Conference of the Pacific Association for Computational Linguistics, PACLING 2019, Hanoi, Vietnam, October 11–13, 2019, Revised Selected Papers

Frank Rudzicz, Graeme Hirst

627 citations en

S2 Open Access 2019

SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization

Haoming Jiang, Pengcheng He, Weizhu Chen et al.

Transfer learning has fundamentally changed the landscape of natural language processing (NLP). Many state-of-the-art models are first pre-trained on a large text corpus and then fine-tuned on downstream tasks. However, due to limited data resources from downstream tasks and the extremely high complexity of pre-trained models, aggressive fine-tuning often causes the fine-tuned model to overfit the training data of downstream tasks and fail to generalize to unseen data. To address such an issue in a principled manner, we propose a new learning framework for robust and efficient fine-tuning for pre-trained models to attain better generalization performance. The proposed framework contains two important ingredients: 1. Smoothness-inducing regularization, which effectively manages the complexity of the model; 2. Bregman proximal point optimization, which is an instance of trust-region methods and can prevent aggressive updating. Our experiments show that the proposed framework achieves new state-of-the-art performance on a number of NLP tasks including GLUE, SNLI, SciTail and ANLI. Moreover, it also outperforms the state-of-the-art T5 model, which is the largest pre-trained model containing 11 billion parameters, on GLUE.

597 citations en Computer Science, Mathematics

S2 Open Access 2018

Analysis Methods in Neural Language Processing: A Survey

Yonatan Belinkov, James R. Glass

The field of natural language processing has seen impressive progress in recent years, with neural network models replacing many of the traditional systems. A plethora of new models have been proposed, many of which are thought to be opaque compared to their feature-rich counterparts. This has led researchers to analyze, interpret, and evaluate neural networks in novel and more fine-grained ways. In this survey paper, we review analysis methods in neural language processing, categorize them according to prominent research trends, highlight existing limitations, and point to potential directions for future work.

619 citations en Computer Science

S2 Open Access 2006

NLTK: The Natural Language Toolkit

Steven Bird

The Natural Language Toolkit is a suite of program modules, data sets and tutorials supporting research and teaching in computational linguistics and natural language processing. NLTK is written in Python and distributed under the GPL open source license. Over the past year the toolkit has been rewritten, simplifying many linguistic data structures and taking advantage of recent enhancements in the Python language. This paper reports on the simplified toolkit and explains how it is used in teaching NLP.

5132 citations en Computer Science

DOAJ Open Access 2025

HOMO NUMERICUS : QUELS ENJEUX ET IMAGE POUR LA SOCIETE CONGOLAISE

Godfrey BENTH NGOYI, Blady KASONGO MUNZILA and Séraphine KUBOLA NGOMA

Abstract: This article explores the figure of Homo numericus, an individual whose life is mediated and reshaped by digital technology. It analyzes the social and identity transformations brought about by social networks, where identity is constructed through constant self-staging. Economically, Homo numericus is both a producer and a consumer of data, embedded in surveillance capitalism. Politically, this figure takes part in new forms of mobilization but remains vulnerable to disinformation and manipulation. Culturally, it contributes to a participatory culture, though one held back by the digital divide. The study concludes that this figure is ambivalent: a promise of emancipation, but also a risk of domination and dehumanization. Keywords: Homo numericus, digital identity, connected society, surveillance capitalism, participatory culture

Arts in general, Computational linguistics. Natural language processing

DOAJ Open Access 2025

XAIHO: explainable AI leveraging hybrid optimized framework for liver cirrhosis detection

Prashant Kumar Mishra, Brijesh Kumar Chaurasia, Man Mohan Shukla

Abstract: This study introduces an explainable AI leveraging hybrid optimized framework for liver cirrhosis detection (XAIHO) with deep learning (DL) to address the critical challenges of low interpretability and diagnostic inefficiency in traditional and AI-driven liver cirrhosis detection systems. Conventional approaches often rely on invasive procedures and lack transparency, while existing DL models, although accurate, function as black boxes and limit clinical trust. To bridge this gap, the research initially explored machine learning (ML) models and then integrated XAI techniques to improve model explainability. Subsequently, DL approaches were employed using fine-tuned pre-trained models such as VGG16, VGG19, ResNet50, ResNet101, Xception, Inception-V3, EfficientNetB1, EfficientNetB2, Vision Transformer (ViT), and InceptionResNetV2. While these models showed strong classification performance, their limited interpretability remained a barrier for clinical deployment. To address this, the proposed XAIHO framework was developed in two phases: first implementing XAI without optimizers, and then enhancing it with advanced optimizers (Adam, NAdam, RMSProp) to improve both predictive accuracy and interpretability. The proposed XAIHO framework achieves a peak accuracy of 92.35%, representing a 4% improvement over standard DL models and an 8% increase compared to traditional ML baselines. Additionally, transparency and interpretability are significantly improved using SHAP values and attention-based visualizations, providing meaningful insights into critical features such as bilirubin, albumin, and age. Empirical results, validated through multiple performance metrics, confirm the framework’s potential for accurate, transparent, and clinically applicable liver cirrhosis diagnosis.

Computational linguistics. Natural language processing, Electronic computers. Computer science

DOAJ Open Access 2024

L’afflux sur le marché Congolais des meubles en bois importés ; Bilan et perspectives

MAKONGA BAENAKE Zozo

Abstract: This work, “L’afflux sur le marché Congolais des meubles en bois importés ; Bilan et perspectives”, opens with the paradox of a country that has 1,280,000 km² of forest, or 54% of its territory, yet whose market is flooded with imported wooden furniture. The article invokes the theories of absolute and comparative advantage, set out by Adam Smith and David Ricardo respectively, which advise each nation to specialize in producing the goods it makes more efficiently than others and to trade the surplus for other goods it needs. Logically, the DRC should exploit both its raw-material potential and its population growth, which is a sure source of labor and of market outlets. Unfortunately, this is not the case. To try to resolve this puzzle, we address the topic through the following questions: Why is imported furniture flooding the Congolese market? What is the impact of this influx on the economy and on local craftsmanship? What are the future prospects for this sector? In essence, it emerges that the influx of imported furniture is caused in part by consumers rejecting local furniture, which they fault for poor finishing and other defects, in favor of imported furniture coveted for its design. This market both causes capital flight, which harms the economy, and seriously threatens local craftsmanship, which is tending to disappear as consumers abandon its products for imported furniture. As for future prospects, suggestions are made along several lines which, once implemented, will secure a bright future for the sector. Keywords: Congolese market; wooden furniture; imported; local potential; DRC

Arts in general, Computational linguistics. Natural language processing

DOAJ Open Access 2024

Recent advancements in automatic disordered speech recognition: A survey paper

Nada Gohider, Otman A. Basir

Automatic Speech Recognition (ASR) technology has recently witnessed a paradigm shift with respect to performance accuracy. Nevertheless, impaired speech remains a significant challenge, evidenced by the inadequate accuracy of existing ASR solutions. This shortcoming is documented in various research reports. While it has motivated new directions in Automatic Disordered Speech Recognition (ADSR), the gap between ASR performance accuracy and that of ADSR remains significant. In this paper, we report a consolidated account of research work conducted to date to address this gap, highlighting the root causes of such performance discrepancy and discussing prominent research directions in this area. The paper raises some fundamental issues and challenges that ADSR research faces today. Firstly, we discuss the adequacy of impaired speech representation in existing datasets, in terms of the diversity of speech impairments, speech continuity, speech style, vocabulary, age group, and the environments of the data collection process. We argue that disordered speech is poorly represented in the existing datasets; thus, it is expected that several fundamental components needed for training ADSR models are absent. Most of the open-access databases of impaired speech focus on adult dysarthric speakers, ignoring a wide spectrum of speech disorders and age groups. Furthermore, the paper reviews prominent research directions adopted by the ADSR research community in its effort to advance speech recognition technology for impaired speakers. We categorize this research effort into directions such as personalized models, model adaptation, data augmentation, and multi-modal learning. Although these research directions have advanced the performance of ADSR models, we believe there is still potential for further advancement since current efforts, in essence, make the false assumption that there is a limited distribution shift between the source and target data. Finally, we stress the need to investigate performance measures other than Word Error Rate (WER): measures that can reliably encode the contribution of erroneous output tokens in the final uttered message.

Computational linguistics. Natural language processing

DOAJ Open Access 2023

La situación de la religión en La cena secreta de Javier Sierra Albert

Bourfané HAMAN ARMAND

Abstract: Our research concerns the state of religion in La cena secreta, published in 2004. Javier Sierra Albert's work presents the situation in which the Catholic faith finds itself. According to the work, the Catholic religion is in a state of decline, or rather of waning belief, owing to many factors. Among the elements that contributed to the decline of the Catholic Church, the author highlights the loss of the ability to interpret images, the advent of rationalism, Plato's Greece, Cleopatra's Egypt, the extravagances coming from the East, the presence of Turks in the Mediterranean, the appearance of a horde of pagans, and Pope Alexander, who upheld paganism. According to the author of the work, these factors left the Catholic Church and Christendom in general in a worse position. Keywords: the state of religion, the Catholic faith, the decline of biblical values.

Arts in general, Computational linguistics. Natural language processing

DOAJ Open Access 2023

Análisis multimodal de actos de habla en español del Paraguay y de Argentina

Natalia dos Santos Figueiredo

This article presents the results of a study describing the intonational features of two speech acts, questions and requests, produced by speakers from Asunción, Paraguay, and Buenos Aires, Argentina. For this study we used audio and video data of experimentally elicited acted speech, obtained by recording utterances produced by 2 men and 2 women, aged between 20 and 35, native to the 2 cities studied. For the multimodal analysis (Moraes et al., 2014), we described the acoustic and visual elements of the utterances obtained and examined the contrast in the politeness strategies used by participants from each location, the variation in F0 contours and syllable duration, and facial movements (Ekman et al., 2002). The results show that acoustic analysis alone is not sufficient to define the specific features of directive speech acts in these varieties of Spanish, because of the different nuclear configurations (tonemas) found in the utterances analyzed. It was necessary to combine it with visual description in order to describe the action units that accompany the production of the directive speech act.

Philology. Linguistics, Computational linguistics. Natural language processing

DOAJ Open Access 2023

La thématique de recherche en communication environnementale dans les universités de la ville de Kinshasa (UNIKIN, IFASIC et UCC)

Marc KILENGE TOKO PUKU

Abstract: This article classifies the recurring themes in environmental communication research at the three universities in the city of Kinshasa that offer teaching in Information and Communication Sciences (SIC). Our aim is to answer the question of how these universities approach environmental communication research at the thematic level. In response, we can say that environmental communication research at the three Kinshasa universities is recent, dating back to the 2010s. Today, environmental communication research, with diversified themes, is beginning to gain ground in the three universities. Interest is greatest in media and environmental communication. To address these questions, we use qualitative content analysis, supported by documentary research and direct observation. The period covered runs from the founding of these institutions to 2019. Keywords: environmental communication, CE, SIC

Arts in general, Computational linguistics. Natural language processing

S2 Open Access 2022

Computational Linguistics Based Emotion Detection and Classification Model on Social Networking Data

Heyam H. Al-Baity, Hala J. Alshahrani, Mohamed K. Nour et al.

Computational linguistics (CL) is the application of computer science to analysing and comprehending written and spoken languages. Emotion classification and sentiment analysis (SA) have recently become two of the most widely used techniques in the Natural Language Processing (NLP) field. Emotion analysis refers to the task of recognizing the attitude towards a topic or target. The attitude may be a polarity (negative or positive) or an emotional state such as sadness, joy, or anger. Classifying posts and mining opinions manually is a difficult task, and data subjectivity has made this an open problem in the domain. Therefore, this article develops a computational linguistics-based emotion detection and classification model on social networking data (CLBEDC-SND). The presented CLBEDC-SND technique investigates the recognition and classification of emotions in social networking data. To attain this, the CLBEDC-SND model performs several stages of data pre-processing to make the data compatible with further processing. In addition, the CLBEDC-SND model performs vectorization and sentiment scoring using a fuzzy approach. For emotion classification, the CLBEDC-SND model employs an extreme learning machine (ELM). Finally, the parameters of the ELM model are tuned using the shuffled frog leaping optimization (SFLO) algorithm. The performance of the CLBEDC-SND model is validated on benchmark datasets. The experimental results demonstrate the better performance of the CLBEDC-SND model over other models.

4 citations en

DOAJ Open Access 2022

Discourse Prominence and Antecedent Mis-Retrieval during Native and Non-Native Pronoun Resolution

Cecilia Puebla, Claudia Felser

Previous studies on non-native (L2) anaphor resolution suggest that L2 comprehenders are guided more strongly by discourse-level cues compared to native (L1) comprehenders. Here we examine whether and how a grammatically inappropriate antecedent’s discourse status affects the likelihood of it being considered during L1 and L2 pronoun resolution. We used an interference paradigm to examine how the extrasentential discourse impacts the resolution of German object pronouns. In an eye-tracking-during-reading experiment we examined whether an elaborated local antecedent ruled out by binding Condition B would be mis-retrieved during pronoun resolution, and whether initially introducing this antecedent as the discourse topic would affect the chances of it being mis-retrieved. While both participant groups rejected the inappropriate antecedent in an offline questionnaire irrespective of its discourse prominence, their real-time processing patterns differed. L1 speakers initially mis-retrieved the inappropriate antecedent regardless of its contextual prominence. L1 Russian/L2 German speakers, in contrast, were affected by the antecedent’s discourse status, considering it only when it was discourse-new but not when it had previously been introduced as the discourse topic. Our findings show that L2 comprehenders are highly sensitive to discourse dynamics such as topic shifts, supporting the claim that discourse-level cues are more strongly weighted during L2 compared to L1 processing.

Philology. Linguistics, Computational linguistics. Natural language processing

DOAJ Open Access 2022

HABITUS MASYARAKAT KRAPYAK KIDUL KOTA PEKALONGAN TERKAIT TRADISI LOPIS RAKSASA

Divani Majidullah Syarief, Ufairoh Shoofii Abiyyi, Umu Hana Amini et al.

This study was conducted to identify the habitus of the Krapyak Kidul community in relation to the lopis raksasa (giant lopis) tradition. The tradition is held every year on 8 Syawal, seven days after Idulfitri (Eid al-Fitr), in Krapyak Kidul, Pekalongan City. The study is descriptive and qualitative and uses Pierre Bourdieu's genetic structuralism approach. Data were collected through interviews with informants, recorded on a mobile device, from which the data needed for the study were later extracted using listening and note-taking techniques. The collected data were analyzed using the concept of habitus in Pierre Bourdieu's genetic structuralism. The findings include: (1) a habitus of fellowship: the giant lopis tradition carries a spirit of fellowship that binds the community together; (2) a habitus of unity: the tradition is a means of uniting a heterogeneous community; (3) a religious habitus: the tradition cannot be separated from religious values; (4) a habitus of sharing: the tradition teaches sharing with others; (5) a habitus of mutual cooperation (gotong royong): making the giant lopis is a long process carried out together; (6) a habitus of hard work: the tradition's survival depends on the community's effort and hard work in preserving it; and (7) a habitus of trading: the tradition boosts the local economy by creating opportunities to trade. Keywords: tradition, giant lopis, syawalan

Language. Linguistic theory. Comparative grammar, Computational linguistics. Natural language processing

DOAJ Open Access 2022

LA CULTURE DE LA LECTURE ET SES CONTRAINTES EN REPUBLIQUE DEMOCRATIQUE DU CONGO

KATANGA KALONJI Michel

Abstract: This reflection addresses the problem of public reading in the DRC. The leading French-speaking nation apart from France in terms of number of French speakers does not appear to have a strong reading culture, owing to multiple constraints. The reading culture is tied to constraints in the daily lives of the Congolese. Structures aimed at promoting a cultural policy of reading are still lacking.

Arts in general, Computational linguistics. Natural language processing

CrossRef Open Access 2021

Arabic Computational Linguistics: Potential, Pitfalls and Challenges

Elie Wardini

1 citation en

DOAJ Open Access 2021

Building and Using a Lexical Knowledge Base of Near-Synonym Differences

Diana Inkpen, Graeme Hirst

Computational linguistics. Natural language processing

DOAJ Open Access 2021

Knowledge Modelling for Establishment of Common Ground in Dialogue Systems

Lina Varonina, Stefan Kopp

The establishment and maintenance of common ground, i.e. mutual knowledge, beliefs and assumptions, is important for dialogue systems in order to be seen as valid interlocutors in both task-oriented and open-domain dialogue. It is therefore important to provide these systems with knowledge models, so that their conversations can be grounded in knowledge about the relevant domain. Additionally, in order to facilitate understanding, dialogue systems should be able to track knowledge about the beliefs of the user and the level of their knowledgeability, e.g., the assumptions that they hold or the extent to which a piece of knowledge has been accepted by the user and can now be considered shared. This article provides a basic overview of current research on knowledge modelling for the establishment of common ground in dialogue systems. The presented body of research is structured along three types of knowledge that can be integrated into the system: (1) factual knowledge about the world, (2) personalised knowledge about the user, (3) knowledge about the user’s knowledge and beliefs. Additionally, this article discusses the presented body of research with regard to its relevance for current state-of-the-art dialogue systems and several ideal application scenarios that future research on knowledge modelling for common ground establishment could aim for.

Social Sciences, Computational linguistics. Natural language processing

DOAJ Open Access 2021

Embeddings in Natural Language Processing: Theory and Advances in Vector Representations of Meaning

Marcos Garcia

Computational linguistics. Natural language processing

DOAJ Open Access 2021

Introduction to Linguistic Annotation and Text Analytics. Graham Wilcock (University of Helsinki). Princeton, NJ: Morgan & Claypool (Synthesis Lectures on Human Language Technologies, edited by Graeme Hirst, volume 2, No. 1), 2009, x+149 pp; paperbound, ISBN 978-1-59829-738-6, $40.00; ebook, ISBN 978-1-59829-739-3, $30.00 or by subscription

Udo Hahn

Computational linguistics. Natural language processing
