Results for "Languages and literature of Eastern Asia, Africa, Oceania"

Showing 20 of ~2976373 results · from CrossRef, DOAJ, arXiv

arXiv Open Access 2026
Building a Strong Instruction Language Model for a Less-Resourced Language

Domen Vreš, Tjaša Arčon, Timotej Petrič et al.

Large language models (LLMs) have become an essential tool for natural language processing and artificial intelligence in general. Current open-source models are primarily trained on English texts, resulting in poorer performance on less-resourced languages and cultures. We present a set of methodological approaches necessary for the successful adaptation of an LLM to a less-resourced language, and demonstrate them using the Slovene language. We present GaMS3-12B, a generative model for Slovene with 12 billion parameters, and demonstrate that it is the best-performing open-source model for Slovene within its parameter range. We adapted the model to the Slovene language using three-stage continual pre-training of the Gemma 3 model, followed by two-stage supervised fine-tuning (SFT). We trained the model on a combination of 140B Slovene, English, Bosnian, Serbian, and Croatian pretraining tokens, and over 200 thousand English and Slovene SFT examples. We evaluate GaMS3-12B on the Slovenian-LLM-Eval datasets, English-to-Slovene translation, and the Slovene LLM arena. We show that the described model outperforms 12B Gemma 3 across all three scenarios and performs comparably to much larger commercial GPT-4o in the Slovene LLM arena, achieving a win rate of over 60 %.

en cs.CL, cs.LG
DOAJ Open Access 2025
PEMBELAJARAN MENULIS KARYA ILMIAH MENGGUNAKAN DIMENSI BERNALAR KRITIS DENGAN MODEL PROJECT BASED LEARNING

Annisa Utrina, Rustam, Lusia Oktri Wini

This study explains how students at SMAN 2 Sungai Penuh learned to write scientific papers along critical reasoning dimensions with a project-based learning model. The study used a qualitative case study approach: data were collected through interviews, documentation studies, and observation, and then described descriptively. The findings show that the teaching module, the critical reasoning indicators, and the phases of the project-based learning model are all used to teach students effectively how to write scientific articles using critical reasoning dimensions. The findings are expected to support the development of the field by offering new theories for solving the problems faced by Indonesian language teachers.

Theory and practice of education, Languages and literature of Eastern Asia, Africa, Oceania
DOAJ Open Access 2025
“Records of Things Heard on Vladivostok”: A forgotten source on the history of Vladivostok in the late 19th century

V. A. Bushmakin, V. P. Zaytsev

This article examines and analyzes the report of the Tokyo Geographical Society correspondent Kambe Ōichi, entitled “Records of Things Heard on Vladivostok” (Kaisan’i kibun 海參威記聞 = 海參崴 紀聞). It was published in 1882–1883. For the first time, detailed historiographical and bibliographic information about it is provided, the history of its publication is traced, and its contents are revealed. The report consists of three parts – records for the period from May 1881 to April 1882, an appendix to them in the form of an “annual report” of the Vladivostok city government for 1882, and a continuation of records covering the period from April 1882 to May 1883. The report was written at a time when representative offices of Japanese companies were just beginning to open in Vladivostok, which naturally led to the expansion of the Japanese presence and diaspora in the region, and therefore it provides invaluable historical information about this early stage of the penetration of Japanese business into the Russian Far East. Despite its importance, this source is now largely forgotten. This publication is an attempt to point out its importance and reintroduce it into scientific circulation. We believe that this will make a significant contribution to research on the history of Vladivostok and Primorsky Krai in all its aspects and will help to supplement the information available in Russian-language sources. This article is only the first step in studying Kambe’s report. Much work remains to be done to decipher the names found there, as well as to identify the primary sources – the Russian and English documents that Kambe had at his disposal and that formed the basis of his translations.

Japanese language and literature
DOAJ Open Access 2025
The Four-Class System (sideng renzhi 四等人制) of Administration During the Yuan Dynasty (1271–1368) in China

Tatiana Frank

Mongol rule in China stands as a remarkable example of the amalgamation of two distinct cultures, one nomadic and one sedentary and agricultural. The progression of the Mongol conquest in both northern and southern China warrants special attention. Initially, the Mongol campaigns in northern China (1211–1234) were marked by excessive cruelty, city destruction, the conversion of lands into pastures, and the displacement of the conquered population. However, this strategy proved to be unproductive, yielding minimal benefits for the Mongols. The strategic proposal presented by Yelü Chucai 耶律楚材 (1189–1243), an adviser to Genghis Khan (Mong. Činggis qaγan, Temüǰin 1162–1227) and Ögedei Khan (Mong. Ögedei qaγan, 1186–1241), compelled the conquerors to reassess their subsequent plans. During the reign of the Mongols in China, the population was divided into four groups: the Mongols, the semu 色目, the northern Chinese, and the southern Chinese. The ethnic hierarchy during the Yuan Dynasty was a structured system that categorised the population into distinct classes, primarily to facilitate governance and maintain social order within the diverse and vast empire. This hierarchy had significant implications for the social, political, and economic life of the people under Mongol rule. Moreover, the Mongols created their own centralised administrative system, which mostly excluded the Chinese from key government positions. The Chinese were often assigned to minor positions or given fewer opportunities for promotion. This study delves into the traits of the four-class system and the Mongol administrative system in China. The ethnic policy implemented by the Mongols towards the conquered people during the Yuan dynasty had a significant impact on social relations, economic activity, and political stability in China, which partially contributed to the dynasty’s later downfall.

Chinese language and literature
arXiv Open Access 2025
Meta-Pretraining for Zero-Shot Cross-Lingual Named Entity Recognition in Low-Resource Philippine Languages

David Demitri Africa, Suchir Salhan, Yuval Weiss et al.

Named-entity recognition (NER) in low-resource languages is usually tackled by finetuning very large multilingual LMs, an option that is often infeasible in memory- or latency-constrained settings. We ask whether small decoder LMs can be pretrained so that they adapt quickly and transfer zero-shot to languages unseen during pretraining. To this end we replace part of the autoregressive objective with first-order model-agnostic meta-learning (MAML). Tagalog and Cebuano are typologically similar yet structurally different in their actor/non-actor voice systems, and hence serve as a challenging test-bed. Across four model sizes (11 M - 570 M) MAML lifts zero-shot micro-F1 by 2-6 pp under head-only tuning and 1-3 pp after full tuning, while cutting convergence time by up to 8%. Gains are largest for single-token person entities that co-occur with Tagalog case particles si/ni, highlighting the importance of surface anchors.

en cs.CL, cs.AI
arXiv Open Access 2025
Trust and Trustworthiness from Human-Centered Perspective in HRI -- A Systematic Literature Review

Debora Firmino de Souza, Sonia Sousa, Kadri Kristjuhan-Ling et al.

The Industry 5.0 transition highlights EU efforts to design intelligent devices that can work alongside humans to enhance human capabilities, a vision in which user preferences and the need to feel safe while collaborating with such systems take priority. This demands a human-centric research vision and requires a societal and educational shift in how we perceive technological advancements. To better understand this perspective, we conducted a systematic literature review focusing on how trust and trustworthiness can be key aspects of supporting this move towards Industry 5.0. This review aims to give an overview of the most common methodologies and measurements and to collect insights about barriers and facilitators for fostering trustworthy HRI. After a rigorous quality assessment following the Systematic Reviews and Meta-Analyses guidelines, using rigorous inclusion criteria and screening by at least two reviewers, 34 articles were included in the review. The findings underscore the significance of trust and safety as foundational elements for promoting secure and trustworthy human-machine cooperation. They confirm that almost 30% of the reviewed articles do not present a definition of trust, which can be problematic, as this lack of conceptual clarity can undermine research efforts in addressing this problem from a central perspective. The review highlights that the choice of domain and area of application should influence the choice of methods and approaches to fostering trust in HRI, as those choices can significantly affect user preferences and their perceptions and assessment of robot capabilities. Additionally, this lack of conceptual clarity can be a potential barrier to fostering trust in HRI and explains the sometimes contradictory findings or choice of methods and instruments used to investigate trust in robots and other autonomous systems in the literature.

DOAJ Open Access 2024
Ritual: Violence and Non-violence

Ganesh U. Thite

The current paper looks at the vicissitudes of thought on violence and non-violence in India, from the Vedic period to the present. The early Vedic people lived a nomadic life and practiced customary animal sacrifice. Gradually, however, they started using euphemisms in connection with ritualistic violence and switched subsequently to non-violent rituals. Possibly because there was a lot of opposition to ritualistic violence, mainly from the Buddhist and the Jaina thinkers, even later Hinduism ultimately accepted the principle of ahiṃsā (non-violence). Although at present most followers of Vedic rituals do not practice violence when performing Vedic rituals, some others still partly accept it and act accordingly. Also, there is some ritualistic violence outside the Vedic ritual, but there is definitely a change in outlook.

Indo-Iranian languages and literature, Languages and literature of Eastern Asia, Africa, Oceania
DOAJ Open Access 2024
Interseksionele feminisme in Afrikaanse poësie: Lynthia Julius se Uit die kroes

Hennely Nel

In the current transnational discourse on fourth-wave feminism, “intersectional feminism” is a fundamental concept. The representation of marginalised voices of especially Black women from underrepresented contexts, such as the Global South, is emphasised in an attempt to decolonise the formal domains of literature, academia and the media. Historically, there is a gap in the representation of diverse Black female voices in South African literatures. However, there has recently been an increase in the publication of the literary texts by previously marginalised voices, especially in Afrikaans poetry. Diverse perspectives are shared regarding the complexities of the intersection of identity categories including race, gender, culture, identity, class, language and socioeconomic status in South African society, and how it affects the previously marginalised. A voice that represents intersectional feminist issues in the South African and Afrikaans contexts can be found in Lynthia Julius’s debut poetry book, Uit die kroes (From the kroes, 2020). In this article, the significance of Julius’s unique, intersectional feminist viewpoint, with stories and perspectives from the Northern Cape, is investigated. The focus is specifically on how Julius represents a ‘triple marginalised’ voice in the South African and Afrikaans contexts with regard to her gender, race and language. Furthermore, I will discuss how the uniqueness of her collection of poems and Northern Cape Afrikaans, that have rarely been provided with a platform in the Afrikaans literary canon, contribute to giving a voice to the historic ‘voiceless’. The importance of Julius’s voice and how it highlights the heterogeneity of previously marginalised groups in South Africa, are also explored. 
In conclusion, it is argued that the publication of poets with diverse intersectional feminist perspectives, such as Julius, can be deemed a positive step towards the decolonisation of Afrikaans literature and feminism.

African languages and literature
DOAJ Open Access 2024
Development of Competency-Based Arabic Language Curriculum in Traditional Islamic Boarding Schools

Nur Kholis, M. Arif Mustofa

This study examines the development of a maharoh-based (competency-based) Arabic language curriculum in traditional Islamic boarding schools. Given the rapid pace of change today, people expect that a student in a pesantren should still develop Arabic language skills, namely the ability to use Arabic both passively and actively. The reform of the Arabic learning system and curriculum in Islamic boarding schools is therefore an urgent need, so that these schools can keep pace with the rapid development of science, technology, and information. This research uses a qualitative approach in the form of a case study. Its main findings concern how each curriculum component is developed. The objectives component is developed in the objectives of learning Arabic, which cover the language skills students must possess: maharoh (competence) istima, maharoh (competence) kalam, maharoh (competence) qiroah and maharoh (competence) kitabah. The content/material component is developed through materials related to daily activities. The process component is developed through the use of various strategies and methods adapted to the material being studied. The evaluation component is developed by conducting varied evaluations adjusted to the character of the material being taught.

Language and Literature, Languages and literature of Eastern Asia, Africa, Oceania
arXiv Open Access 2024
Biophysics in Africa: challenges, priorities, and hopes

Tjaart P. J. Krüger

This report is a serious call to scientists, innovators, investors, and policymakers to invest in the development of biophysics in Africa. The complex problems of our day demand multidisciplinary approaches, and biophysics offers training in much-needed multi- and cross-disciplinary thinking. Biophysics is a research field at the forefront of modern science because it provides a powerful scientific platform that addresses many of the critical challenges humanity faces today and in the future. It is a vital source of innovation for any country interested in developing a high-tech economy. However, there is woefully little biophysics educational and research activity in Africa, representing a critical gap that must be addressed with urgency. This report suggests key research areas that African biophysicists should focus on, identifies major challenges to growing biophysics in Africa, and underscores the high-priority needs that must be addressed.

en physics.soc-ph, physics.bio-ph
arXiv Open Access 2024
Justice in Healthcare Artificial Intelligence in Africa

Aloysius Ochasi, Abdoul Jalil Djiberou Mahamadou, Russ B. Altman

There is an ongoing debate on balancing the benefits and risks of artificial intelligence (AI) as AI is becoming critical to improving healthcare delivery and patient outcomes. Such improvements are essential in resource-constrained settings where millions lack access to adequate healthcare services, such as in Africa. AI in such a context can potentially improve the effectiveness, efficiency, and accessibility of healthcare services. Nevertheless, the development and use of AI-driven healthcare systems raise numerous ethical, legal, and socio-economic issues. Justice is a major concern in AI that has implications for amplifying social inequities. This paper discusses these implications and related justice concepts such as solidarity, Common Good, sustainability, AI bias, and fairness. For Africa to effectively benefit from AI, these principles should align with the local context while balancing the risks. Compared to mainstream ethical debates on justice, this perspective offers context-specific considerations for equitable healthcare AI development in Africa.

en cs.CY, cs.AI
arXiv Open Access 2023
Parallel Corpus for Indigenous Language Translation: Spanish-Mazatec and Spanish-Mixtec

Atnafu Lambebo Tonja, Christian Maldonado-Sifuentes, David Alejandro Mendoza Castillo et al.

In this paper, we present a parallel Spanish-Mazatec and Spanish-Mixtec corpus for machine translation (MT) tasks, where Mazatec and Mixtec are two indigenous Mexican languages. We evaluated the usability of the collected corpus using three different approaches: transformer, transfer learning, and fine-tuning pre-trained multilingual MT models. Fine-tuning the Facebook M2M100-48 model outperformed the other approaches, with BLEU scores of 12.09 and 22.25 for Mazatec-Spanish and Spanish-Mazatec translations, respectively, and 16.75 and 22.15 for Mixtec-Spanish and Spanish-Mixtec translations, respectively. The findings show that the dataset size (9,799 sentences in Mazatec and 13,235 sentences in Mixtec) affects translation performance and that indigenous languages work better when used as target languages. The findings emphasize the importance of creating parallel corpora for indigenous languages and fine-tuning models for low-resource translation tasks. Future research will investigate zero-shot and few-shot learning approaches to further improve translation performance in low-resource settings. The dataset and scripts are available at https://github.com/atnafuatx/Machine-Translation-Resources.

en cs.CL
arXiv Open Access 2023
Evolution of ESG-focused DLT Research: An NLP Analysis of the Literature

Walter Hernandez Cruz, Kamil Tylinski, Alastair Moore et al.

Distributed Ledger Technology (DLT) faces increasing environmental scrutiny, particularly concerning the energy consumption of the Proof of Work (PoW) consensus mechanism and broader Environmental, Social, and Governance (ESG) issues. However, existing systematic literature reviews of DLT rely on limited analyses of citations, abstracts, and keywords, failing to fully capture the field's complexity and ESG concerns. We address these challenges by analyzing the full text of 24,539 publications using Natural Language Processing (NLP) with our manually labeled Named Entity Recognition (NER) dataset of 39,427 entities for DLT. This methodology identified 505 key publications at the DLT/ESG intersection, enabling comprehensive domain analysis. Our combined NLP and temporal graph analysis reveals critical trends in DLT evolution and ESG impacts, including cryptography and peer-to-peer networks research's foundational influence, Bitcoin's persistent impact on research and environmental concerns (a "Lindy effect"), Ethereum's catalytic role on Proof of Stake (PoS) and smart contract adoption, and the industry's progressive shift toward energy-efficient consensus mechanisms. Our contributions include the first DLT-specific NER dataset addressing the scarcity of high-quality labeled NLP data in blockchain research, a methodology integrating NLP and temporal graph analysis for large-scale interdisciplinary literature reviews, and the first NLP-driven literature review focusing on DLT's ESG aspects.

en cs.IR, cs.CL
DOAJ Open Access 2022
Ancient Coins of Japan

Marianna Lázár

This paper aims to investigate the origins of ancient coins of Japan (until the 10th century CE), introduce the characteristics of their design and patterns, and examine their role in early Japanese culture and public administration, while briefly introducing the ancient Chinese coins that served as inspiration. Japan adopted numerous ancient Chinese cultural practices during the Asuka and Nara periods (538–794 CE). Especially from the second half of the 7th century to the 8th century CE, Japan introduced various social systems from the Tang dynasty in order to build a centralised government. Japanese nobles recognised the importance of metallic currency, leading to some silver and bronze coin production in the second half of the 7th century CE, including that of Mumon Ginsen and Fuhonsen coins. Scholars believe that they were modelled after ancient Chinese coins. Minting was regarded as an essential tool for the Japanese government to display the independence and the authority of the nation, both inside and outside the country. The system of the first official imperial currency (Kōchōsen) was introduced to Japan in the early 8th century CE and was inspired by the Kāiyuán Tōngbǎo cash coins of the Tang dynasty. The oldest known official Japanese imperial coinage is the Wadō Kaichin. In the second half of the 8th century CE, the national currency was reformed, and silver and gold cash coins were introduced. However, by the end of the 10th century CE, Japan had suspended the minting and circulation of coins.

Chinese language and literature
arXiv Open Access 2022
@C -- augmented version of C programming language

Iosif Iulian Petrila

An augmented version of the C programming language is presented. The language is extended with a series of low-level and high-level facilities to broaden its range of use across various computing systems, operations, and users. Ambiguities and inconsistencies are resolved by handling problematic and undefined language elements through an interpretation and management similar to that used in other C-syntax-based languages. The proposed augmentations, through the @C approach, preserve the spirit of the C language and its basic characteristics through compatibility with the standard version, while also rejuvenating C and bringing it up to the state of the art of present-day programming languages.

en cs.PL, cs.FL
arXiv Open Access 2022
The stochastic nature of power-grid frequency in South Africa

Leonardo Rydin Gorjão, Jacques Maritz

In this work, we explore two mechanisms that explain non-Gaussian behaviour of power-grid frequency recordings in the South African grid. We make use of a Fokker-Planck approach to power-grid frequency that yields a direct relation between common model parameters such as inertia, damping, and noise amplitude and non-parametric estimations of the same directly from power-grid frequency recordings. We propose two explanations for the non-Gaussian leptokurtic distributions in South Africa: the first is based on multiplicative noise in power-grid frequency recordings, which we observe in South Africa; the second is based on the well-known scheduled and unscheduled load shedding and rolling blackouts that beset South Africa. For the first, we derive an analytic expression of the effects of multiplicative noise that permits the estimation of all statistical moments, and discuss drawbacks in comparison with the data; for the second, we employ a simple numerical analysis with a modular power grid of South Africa. Both options help understand the statistics of power-grid frequency in South Africa, particularly the presence of heavy tails.

en stat.AP, eess.SY
arXiv Open Access 2022
Conjugacy languages in virtual graph products

Gemma Crowe

In this paper we study the behaviour of conjugacy languages in virtual graph products, extending results by Ciobanu, Hermiller, Holt and Rees. We focus primarily on virtual graph products in the form of a semi-direct product. First, we study the behaviour of twisted conjugacy representatives in right-angled Artin and Coxeter groups. We prove regularity of the conjugacy geodesic language for virtual graph products in certain cases, and highlight properties of the spherical conjugacy language, depending on the automorphism and ordering on the generating set. Finally, we give a criterion for when the spherical conjugacy language is not unambiguous context-free for virtual graph products. We can extend this further in the case of virtual RAAGs, to show the spherical conjugacy language is not context-free.

DOAJ Open Access 2021
A mirror of Japanese studies in Russia. Review of the book “Russian-Japanese Reflections: History, Literature, Arts” by Liudmila M. Ermakova

V. V. Shchepkin

The article reviews the book by Ludmila M. Ermakova “Russian-Japanese Reflections: History, Literature, Arts” (Moscow: Vostochnaya Literatura, 2020. 327 pp. ISBN 978-5-02-039851-1). The book is a collection of the author’s recent articles which are devoted to a variety of subjects covering the history of Russian-Japanese cultural interaction and the history of Japanese studies in Russia. The review notes the breadth of the author’s interests and the depth of elaboration of each topic, the integrity of the collection and its importance for the history of Japanese studies in Russia.

Japanese language and literature
arXiv Open Access 2021
FooDI-ML: a large multi-language dataset of food, drinks and groceries images and descriptions

David Amat Olóndriz, Ponç Palau Puigdevall, Adrià Salvador Palau

In this paper we introduce the FooDI-ML dataset. This dataset contains over 1.5M unique images and over 9.5M store names, product names, descriptions, and collection sections gathered from the Glovo application. The data made available corresponds to food, drinks and groceries products from 37 countries in Europe, the Middle East, Africa and Latin America. The dataset comprises 33 languages, including 870K samples in languages of countries from Eastern Europe and Western Asia, such as Ukrainian and Kazakh, which have so far been underrepresented in publicly available visio-linguistic datasets. The dataset also includes widely spoken languages such as Spanish and English. To assist further research, we include benchmarks over two tasks: text-image retrieval and conditional image generation.

en cs.CV, cs.CL

Page 8 of 148819