Results for "Computational linguistics. Natural language processing"

DOAJ Open Access 2026

Application of genetic algorithms in digital financial resource allocation

Muqiao Cai

Abstract In the wave of digitalization, the financial resource allocation of enterprises faces severe challenges. More than 80% of enterprises have inefficiency problems, and about 60% of them waste resources and suffer huge losses due to traditional methods. Against this background, this paper studies the application of genetic algorithms in digital financial resource allocation. The proposed model adopts an encoding scheme that combines dynamic-scaling real-number coding with hierarchical associative symbol coding, integrates personalized fitness functions covering factors such as income, risk, and flexibility, and uses genetic operators based on competitive selection with elite retention, crossover by resource category and correlation, and adaptive mutation. Experiments use 5 years of financial data from 50 listed companies to compare the model with financial resource allocation models based on linear programming, traditional genetic algorithms, support vector machines, and decision trees. The results show that the proposed model's average income indicator is 15.6 million yuan, significantly higher than the other models; its average risk coefficient is 0.35, lower than the comparison models; and its average resource utilization rate of 85.6% and average strategy-fit score of 8.2 points both lead the comparison. This study provides a more efficient and accurate resource allocation method for corporate financial decision-making and enhances corporate competitiveness.

Computational linguistics. Natural language processing, Electronic computers. Computer science
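
The operators named in the abstract above (competitive selection with elite retention, crossover, adaptive mutation) can be illustrated with a minimal sketch. This is not the paper's model: the fitness form, the arithmetic crossover, the linearly shrinking mutation step, and every name and parameter below (`fitness`, `evolve`, `risk_weight`, the sample returns and risks) are assumptions chosen only to make the idea concrete. The paper's encoding scheme and its category-aware crossover are not reproduced here.

```python
import random


def fitness(alloc, returns, risks, risk_weight=0.5):
    """Toy fitness (assumed form): expected income minus a quadratic risk penalty."""
    income = sum(a * r for a, r in zip(alloc, returns))
    risk = sum(a * a * s for a, s in zip(alloc, risks))
    return income - risk_weight * risk


def normalize(alloc):
    """Rescale non-negative genes so the budget shares sum to 1."""
    total = sum(alloc)
    if total == 0:
        return [1.0 / len(alloc)] * len(alloc)
    return [a / total for a in alloc]


def tournament(pop, scores, rng, k=3):
    """Competitive selection: return the fittest of k randomly drawn individuals."""
    picks = rng.sample(range(len(pop)), k)
    return pop[max(picks, key=lambda i: scores[i])]


def evolve(returns, risks, pop_size=30, generations=50, elite=2, seed=0):
    """Run the toy GA; return the best allocation and the best score per generation."""
    rng = random.Random(seed)
    n = len(returns)
    pop = [normalize([rng.random() for _ in range(n)]) for _ in range(pop_size)]
    history = []
    for gen in range(generations):
        scores = [fitness(ind, returns, risks) for ind in pop]
        ranked = sorted(range(pop_size), key=lambda i: scores[i], reverse=True)
        history.append(scores[ranked[0]])
        # Elite retention: the best individuals survive unchanged.
        next_pop = [pop[i][:] for i in ranked[:elite]]
        # Adaptive mutation (one simple reading): the step size shrinks over time.
        sigma = 0.2 * (1 - gen / generations)
        while len(next_pop) < pop_size:
            p1 = tournament(pop, scores, rng)
            p2 = tournament(pop, scores, rng)
            w = rng.random()
            # Arithmetic crossover: a random blend of the two parents.
            child = [w * a + (1 - w) * b for a, b in zip(p1, p2)]
            # Gaussian mutation, clipped so that shares stay non-negative.
            child = [max(0.0, g + rng.gauss(0, sigma)) for g in child]
            next_pop.append(normalize(child))
        pop = next_pop
    scores = [fitness(ind, returns, risks) for ind in pop]
    best = max(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best])
    return pop[best], history


if __name__ == "__main__":
    # Hypothetical per-category returns and risk factors, for illustration only.
    best, history = evolve([0.08, 0.12, 0.05], [0.1, 0.4, 0.02])
    print([round(share, 3) for share in best], round(history[-1], 4))
```

Because the elite individuals are copied unchanged, the best score recorded in `history` never decreases from one generation to the next, which is the practical point of elite retention.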


DOAJ Open Access 2025

Design and application of art creation education system based on generative adversarial network

Xiaoxiao Shi, Yang Yu

Abstract In response to the limitations of traditional art education, this study proposes a generative adversarial network-based system for supporting artistic creation and teaching. A novel Hierarchical Attention GAN (HAGAN) model was designed and evaluated on a curated dataset of 5000 artworks across various styles. Quantitative metrics (SSIM 0.8325, PSNR 31) and human feedback from 60 participants confirmed HAGAN’s superior quality and diversity over DCGAN and Variational Autoencoder (VAE). Students rated generated works highly in creative inspiration (avg. 4.125), while teachers affirmed their value in instructional support (avg. 4.225). Surveys and Likert-scale protocols ensured reliable subjective evaluation. Although promising, real-world deployment challenges such as device compatibility, computational load, and network access are discussed as key limitations. Future work will address scalability and integration into diverse classroom environments.

Computational linguistics. Natural language processing, Electronic computers. Computer science
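
For readers unfamiliar with the PSNR figure quoted in the abstract above, the standard definition is PSNR = 10 · log10(MAX² / MSE), reported in decibels. The sketch below computes it for two equally sized sequences of pixel intensities. It is not the authors' evaluation code; the function name `psnr`, the flat-list input, and the 8-bit default `max_value=255.0` are assumptions for illustration.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel positions.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Hypothetical 4-pixel example, for illustration only.
    print(psnr([10, 20, 30, 40], [12, 18, 33, 40]))
```

Higher values mean the generated image is closer to the reference pixel by pixel. SSIM, the other metric cited, measures structural similarity rather than raw pixel error, so the two are usually reported together.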


DOAJ Open Access 2025

EMIR ABDELKADER: ALGERIAN MILITARY RESISTANCE AND STATESMAN (1832-1847)

Mohamed REHAI

Abstract: During the nineteenth century, Algeria saw military resistance to the French occupation that began in 1830. It started with the official resistance of the Algerian regime led by Dey Hussein, which did not last long and ended with the fall of the Algerian capital on 5 July 1830. It continued with the resistance of Haj Ahmed Bey in eastern Algeria, who considered himself the successor of Dey Hussein. Popular resistance then began, the most important being that of Emir Abdelkader bin Muhyiddin, which started in western Algeria in 1832, expanded to the national level, and continued until 1847. This article highlights the personality of Emir Abdelkader and the facts of this resistance, and assesses the extent to which it was able to counter French colonisation. Keywords: Military resistance, Emir Abdelkader, French colonisation, Desmichels Treaty, Treaty of Tafna, General Valée, General Bugeaud.

Arts in general, Computational linguistics. Natural language processing


DOAJ Open Access 2025

The application research of ancient calligraphy recognition model based on improved filtering and AI technology in calligraphy education in colleges and universities

Xing You

Abstract With China's growing emphasis on calligraphy education, more and more college students are choosing calligraphy courses. However, current calligraphy teaching offers too little content on appreciating and learning the stele characters of ancient calligraphers. To make stele calligraphy easier to teach in colleges and universities, this study constructed a recognition model for ancient Chinese characters on steles. The model preprocesses the input image through filtering and denoising, morphological opening and closing, binarization, data augmentation, and skeletonization, and then feeds the result into two purpose-designed convolutional neural networks to extract graphic features. On the test set, the SCF, Gabor, Fast R-CNN, and S-ICNN models achieved recognition accuracies of 79.13%, 74.52%, 82.66%, and 93.87% and recall rates of 76.94%, 72.41%, 88.25%, and 94.09%, respectively. These results show that the recognition accuracy of the model designed in this study is significantly higher than that of the common methods on the market. The model has application potential, but further research is needed before it can be applied in college calligraphy education.

Computational linguistics. Natural language processing, Electronic computers. Computer science


DOAJ Open Access 2025

Can “consciousness” be observed from large language model (LLM) internal states? Dissecting LLM representations obtained from Theory of Mind test with Integrated Information Theory and Span Representation analysis

Jingkai Li

Integrated Information Theory (IIT) provides a quantitative framework for explaining consciousness phenomenon, positing that conscious systems comprise elements integrated through causal properties. We apply IIT 3.0 and 4.0 — the latest iterations of this framework — to sequences of Large Language Model (LLM) representations, analyzing data derived from existing Theory of Mind (ToM) test results. Our study systematically investigates whether the differences of ToM test performances, when presented in the LLM representations, can be revealed by IIT estimates, i.e., Φmax(IIT 3.0), Φ (IIT 4.0), Conceptual Information (IIT 3.0), and Φ-structure (IIT 4.0). Furthermore, we compare these metrics with the Span Representations independent of any estimate for consciousness. This additional effort aims to differentiate between potential “consciousness” phenomena and inherent separations within LLM representational space. We conduct comprehensive experiments examining variations across LLM transformer layers and linguistic spans from stimuli. Our results suggest that sequences of contemporary Transformer-based LLM representations lack statistically significant indicators of observed “consciousness” phenomena but exhibit intriguing patterns under spatio-permutational analyses.

Computational linguistics. Natural language processing


DOAJ Open Access 2025

RACHNA: Racial hoax code mixed Hindi–English with novel language augmentation

Shanu SidharthKumar Dhawale, Rahul Ponnusamy, Prasanna Kumar Kumaresan et al.

Warning: This paper contains derogatory language that may be offensive to some readers. As a type of misinformation, hoaxes seek to propagate incorrect information in order to gain popularity on social media. Racial hoaxes are a particularly harmful kind of hoax because they falsely link individuals or groups to crimes or incidents. Detecting them involves the nuanced challenges of identifying false accusations, fabrications, and stereotypes that falsely implicate other social, ethnic, or out-groups in negative actions. Moreover, social media comments frequently mix several languages, are written in scripts that are not native to the user, and rarely adhere to strict grammar norms. The lack of racial hoax-annotated data for low-resource settings such as code-mixed Hindi–English makes this issue more challenging. To address this, we collected 210,768 sentences and created a racial hoax-annotated, code-mixed corpus of 5,105 YouTube comment postings in Hindi–English, the HoaxMixPlus corpus. We outline the method of building the corpus and assigning the binary labels indicating the presence of a racial hoax, which fills a critical gap in understanding and combating racialized misinformation, and we report inter-annotator agreement. We present the results of analysis and of training with this corpus as a benchmark, together with new methodologies, including a dictionary-based approach that correctly identifies code-mixed words and novel language augmentation strategies such as transliteration and language tags. We evaluate several models on this dataset and demonstrate that our augmentation strategies lead to consistent performance gains.

Computational linguistics. Natural language processing


DOAJ Open Access 2025

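Прикладні й теоретичні аспекти дослідження усного мовлення крізь призму лінгвістичної експертизи [Applied and theoretical aspects of the study of oral speech through the prism of forensic linguistic examination]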

Зоя Дудник

Introduction. The paper considers how to update the methodological approach to the forensic linguistic examination of oral speech. The potential for rethinking the problem lies in the experimental and applied achievements of modern phonetic science. The author's experience of working with experts in specialized forensic institutions makes it possible to draw the attention of methodologists and lawyers to strengthening the professional training of experts, in particular to ways of adapting specialized knowledge of general and applied phonetics to expert tasks in specialty 7.3, "Linguistic study of oral speech". Methods. The method of assessing phonetic content in expert studies was used with regard to adapting specialized knowledge to specific questions of person identification. The object, subject, and material (data) were compared across applied fields united by a common need for phonetic knowledge. The phonetic grounds for deepening methodological approaches in phonoscopic examination are described. Results. The author supplemented the definition of specialized expert knowledge and gave examples, drawn from observations in the specialty under analysis, of linguistic tasks handled incorrectly by an insufficiently informed expert. The author proposed developing a model for the correct use of the necessary phonetic knowledge to determine the degree of reliability of expert conclusions. Conclusions. It is concluded that phonetic knowledge indirectly affects an expert's ability to adapt specialized knowledge to the tasks of expert examination. The paper shows the need to change methodological approaches, form a conceptual base, and master instrumental phonetic analysis of the acoustic parameters of speech. Information about the author: Zoia Vasylivna Dudnyk (Дудник Зоя Василівна), Candidate of Philological Sciences, Head of the Laboratory of Experimental Phonetics, Educational and Scientific Institute of Philology, Taras Shevchenko National University of Kyiv. Email: z.dudnyk@knu.ua

Language. Linguistic theory. Comparative grammar, Computational linguistics. Natural language processing


CrossRef Open Access 2024

Semantic Accuracy and Cultural Adaptability in the English-Chinese Translation of Jane Eyre Based on Computational Linguistics and Natural Language Processing Techniques

Chunbo Ye

Abstract This paper takes Jane Eyre as an example to study the role of computational linguistics and natural language processing technology on the semantic accuracy and cultural adaptability of English-Chinese translation. We construct an interactive translation system of Jane Eyre, based on the interactive translation model design of computational linguistics and natural language processing technology, to support the application of these technologies in the English-Chinese translation of Jane Eyre. Context-association mapping, semantic retrieval, and semantic ontology structural feature construction methods are employed to evaluate semantic accuracy and cultural adaptability. We empirically analyze the semantic accuracy and cultural adaptability of the English-Chinese translation of Jane Eyre using text data from a Python web crawler. The results show that the semantic accuracy of the English-Chinese translation of Jane Eyre in this paper’s model is the highest compared to SAN (self-attention network) and RNN (recurrent neural network) translation models. Among the 200 students sampled, the translation result achieved a satisfaction rate of 71.5%. The work’s translation of literary sentences is more in line with the effect of Chinese expression, indicating excellent cultural adaptability.

1 citation en


DOAJ Open Access 2024

The Impact of Word Splitting on the Semantic Content of Contextualized Word Representations

Aina Garí Soler, Matthieu Labeau, Chloé Clavel

Computational linguistics. Natural language processing


DOAJ Open Access 2024

Thinking Ethics in an African Space

Peter ONI

Abstract: This paper explores major issues in ethics. It considers relevant concepts related to the core branches and theories of ethics. The paper looks at notions such as bad and good as well as character as they apply to communal societies such as African traditional societies. African traditional societies are intrinsically communal in nature. Their communitarian and altruistic approaches to life often under-estimated need genuine attention. As a result, this paper argues that a synoptic look at ethics not only treats ethics from a global and cosmopolitan perspective, it also brings to bear the fundamentals of humanness, love, peace, integrity and character that ground most African Traditional Societies. Keywords : African Traditional Societies, Character, Communitarian, Ethics, Humanness.

Arts in general, Computational linguistics. Natural language processing


DOAJ Open Access 2024

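Les échos de Batouala dans Les Bouts de bois de Dieu d’Ousmane Sembène et Crépuscule des temps anciens de Nazi Boni [Echoes of Batouala in Ousmane Sembène's Les Bouts de bois de Dieu and Nazi Boni's Crépuscule des temps anciens]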

Assane NDIAYE

Abstract: In both substance and form, Batouala, René Maran's "true Black novel" (véritable roman nègre), has had an impact, whether conscious or unintentional, on certain Black African novelists. A rereading of Les Bouts de bois de Dieu and Crépuscule des temps anciens thus revealed thematic and stylistic similarities with the Guianese writer's novel. The aim of this study is precisely to highlight the echoes of Batouala in these two novels, by Sembène Ousmane and Nazi Boni respectively. Drawing on a stylistic-thematic method and a comparative approach, the article sought to show clear resemblances between Maran's first novel and the other two. Keywords: Batouala, echoes, style, Black African novel, theme

Arts in general, Computational linguistics. Natural language processing


CrossRef Open Access 2023

Review of Dunn (2022): Natural Language Processing for Corpus Linguistics

Hanna Schmück

1 citation en


DOAJ Open Access 2023

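Les catégories de la philosophie bantu rwandaise face aux catégories aristotéliciennes : de la logique à l’ontologie [The categories of Bantu-Rwandan philosophy compared with the Aristotelian categories: from logic to ontology]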

Anna LIAKI EKPALE

Abstract: Our reflection establishes the difference between Aristotle and Bantu-Rwandan thought concerning the categories. This difference is not only logical; it is also metaphysical. It also bears on the notion of being, which does not apply in the same way in Aristotle as in Bantu-Rwandan philosophy. For Aristotle, substance and being apply to everything that exists, even to God, the being par excellence, whereas for the Bantu, God is outside being and substance since he is without principle (non principié), though he shares existence with the others. In European philosophy, the ends of being are those of God, the supreme being, whereas among the Rwandans, the ends of beings, intelligent and non-intelligent, are confined to man and his happiness. God is outside human ends. The differences are such that there are grounds for affirming that the Bantu-Rwandan categories are first of all categories of separate beings, before Alexis Kagame finds the unifying principle "Ntu". Aristotle's categories are those of being, whose different modalities they determine. Keywords: categories, philosophy, logic, ontology

Arts in general, Computational linguistics. Natural language processing


CrossRef Open Access 2023

Intelligent Natural Language Processing for Epidemic Intelligence

Danilo Croce, Federico Borazio, Giorgio Gambosi et al.

Epidemic Intelligence activities depend significantly on analysts’ ability to locate and aggregate heterogeneous and complex information promptly. The level of novelty of the targeted information is a challenge. The earlier events of interest are located the larger the benefit: more accurate and timely warnings can be made available by the analysts. In this work, the role of Natural Language Processing technologies is investigated. In particular, transformer-based encoding of Web documents (such as newspaper articles as well as epidemic bulletins) for the automatic recognition of events and relevant epidemic information is adopted and evaluated. The resulting framework is configured as a domain-specific meta-search methodology and as a possible basis for a novel generation of Web search environments supporting the Epidemic Intelligence analyst.

en


DOAJ Open Access 2022

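Die Integration prosodischen und syntaktischen Wissens bei der Ermittlung der Textkohärenz im schriftlichen Textverstehen [The integration of prosodic and syntactic knowledge in establishing text coherence in written text comprehension]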

Gianluca Cosentino

Reading is a highly complex cognitive process. From a neurobiological point of view, it involves at least six linguistic sub-competences including orthographic, semantic, syntactic, phonetic and prosodic competence. Each of these skills is required of the reader to extract different types of information from the text, ranging from the perceptual and the syntactic to the lexical and pragmasemantic one. Based on the main results of cognitive-oriented research on text and reading, this paper illustrates how prosodic competence, combined with formal coherence patterns, can be considered and employed as a reading strategy, enabling the reader to progressively grasp the meanings of a text. In the first part particular attention will be paid to the description of the syntax-prosody interface as well as of its function as a means of encoding of information structure in German. With regard to the teaching of German as a foreign language, the second part of the paper will present a method to train intonational reading. This approach aims to raise foreign language readers’ awareness of prosodic features of written texts and to introduce them to melodic patterns which can be encountered while reading. Such a practical training may prove to be advantageous, as the most typical and apparently “uncorrectable” errors committed by foreign language learners when speaking and reading German texts are almost always related to the domains of phonology and prosody.

Computational linguistics. Natural language processing, Language. Linguistic theory. Comparative grammar


DOAJ Open Access 2021

Luigi Antonelli e il rapporto con Luigi Pirandello: Il Maestro (1934) [Luigi Antonelli and his relationship with Luigi Pirandello: Il Maestro (1934)]

Ilaria Torrieri

What most aroused my interest in the figure of Luigi Antonelli and his special relationship with Luigi Pirandello was consulting the Abruzzese playwright's manuscripts and typescripts in the family archive in Rome. In particular, I consulted the typescripts of the three-act comedies La nascita dell’uomo, La casa a tre piani, Il convegno, La bottega dei sogni, and Fior di valle, and of the one-act plays Adamo ed Eva, Non perdere il treno, Incontro sentimentale, an untitled text about a quarrel between an engaged couple over sport (which later became Amore sportivo), C’è qualcuno al cancello, I diavoli nella foresta, and Storia di burattini. The manuscripts, meanwhile, include the story Sulle ali della scapolamina. La mia operazione chirurgica, the story Aligi senza gregge, and Pinocchio, avventura fantastica di Collodi nella realizzazione scenica di Luigi Antonelli. On the basis of a study of these works, of Antonelli's thought, of his biography, and of his important collaboration with Pirandello, I decided to analyse in depth the mutual influences, the various correspondences, and the similar and differing points of view.

Computational linguistics. Natural language processing, Epistemology. Theory of knowledge


DOAJ Open Access 2020

PAAD: POLITICAL ARABIC ARTICLES DATASET FOR AUTOMATIC TEXT CATEGORIZATION

Dhafar Hamed Abd, Ahmed T. Sadiq, Ayad R. Abbas

Nowadays, text classification and sentiment analysis are among the most popular Natural Language Processing (NLP) tasks. These techniques play a significant role in human activities and affect daily behaviour. Articles in fields such as politics and business express different opinions according to the writer's tendency, and a large amount of data can be gathered from these differences, which motivates determining the political orientation of an online article automatically. However, no corpus for political categorization has been directed at this task in Arabic, owing to the lack of rich, representative resources for training an Arabic text classifier. We therefore introduce the Political Arabic Articles Dataset (PAAD), textual data collected from newspapers, social networks, general forums, and ideology websites. The dataset contains 206 articles in three categories (Reform, Conservative, and Revolutionary), which we offer to the research community on Arabic computational linguistics. We anticipate that this dataset will be a great aid for a variety of NLP tasks on Modern Standard Arabic, particularly political text classification. We present the data in raw form and as Excel files. The Excel files come in four versions: V1 raw data, V2 preprocessed, V3 root stemming, and V4 light stemming.

Technology


DOAJ Open Access 2018

A Manuscript of Masnavi Ramzul Ishq

Rafaqat Ali Shahid

Masnavi Ramzul Ish'q is a unique literary masterpiece of classical Urdu literature. It also has great importance as an early Urdu writing in Punjab and is an important Urdu poetic composition on mysticism. It was written by Ghulam Qadir Shah, a famous Noshahi Sufi (mystic leader) of Punjab. Because Masnavi Ramzul Ish'q is famous among lovers of mysticism and in Urdu literary circles, several manuscripts of it were prepared by various scribes and have been found so far. Some of them are considered important for editing and studying this Masnavi. This article deals with another, unfamiliar manuscript copy of Masnavi Ramzul Ish'q. The writer provides essential information about this manuscript and covers various aspects of its detailed study.

Language. Linguistic theory. Comparative grammar, Computational linguistics. Natural language processing


DOAJ Open Access 2018

Reproducibility in Computational Linguistics: Are We Willing to Share?

Martijn Wieling, Josine Rawee, Gertjan van Noord

This study focuses on an essential precondition for reproducibility in computational linguistics: the willingness of authors to share relevant source code and data. Ten years after Ted Pedersen’s influential “Last Words” contribution in Computational Linguistics, we investigate to what extent researchers in computational linguistics are willing and able to share their data and code. We surveyed all 395 full papers presented at the 2011 and 2016 ACL Annual Meetings, and identified whether links to data and code were provided. If working links were not provided, authors were requested to provide this information. Although data were often available, code was shared less often. When working links to code or data were not provided in the paper, authors provided the code in about one third of cases. For a selection of ten papers, we attempted to reproduce the results using the provided data and code. We were able to reproduce the results approximately for six papers. For only a single paper did we obtain the exact same results. Our findings show that even though the situation appears to have improved comparing 2016 to 2011, empiricism in computational linguistics still largely remains a matter of faith. Nevertheless, we are somewhat optimistic about the future. Ensuring reproducibility is not only important for the field as a whole, but also seems worthwhile for individual researchers: The median citation count for studies with working links to the source code is higher.

Computational linguistics. Natural language processing


DOAJ Open Access 2018

Guest Editorial

Cheng-Lin Liu, Jian Yang

Computational linguistics. Natural language processing, Computer software

