Results for "African languages and literature"

Showing 20 of ~14473 results · from arXiv, DOAJ, Semantic Scholar

S2 Open Access 2010
The Biodiversity of the Mediterranean Sea: Estimates, Patterns, and Threats

M. Coll, C. Piroddi, J. Steenbeek et al.

The Mediterranean Sea is a marine biodiversity hot spot. Here we combined an extensive literature analysis with expert opinions to update publicly available estimates of major taxa in this marine ecosystem and to revise and update several species lists. We also assessed overall spatial and temporal patterns of species diversity and identified major changes and threats. Our results listed approximately 17,000 marine species occurring in the Mediterranean Sea. However, our estimates of marine diversity are still incomplete, as yet-undescribed species will be added in the future. Diversity for microbes is substantially underestimated, and the deep-sea areas and portions of the southern and eastern region are still poorly known. In addition, the invasion of alien species is a crucial factor that will continue to change the biodiversity of the Mediterranean, mainly in its eastern basin that can spread rapidly northwards and westwards due to the warming of the Mediterranean Sea. Spatial patterns showed a general decrease in biodiversity from northwestern to southeastern regions following a gradient of production, with some exceptions and caution due to gaps in our knowledge of the biota along the southern and eastern rims. Biodiversity was also generally higher in coastal areas and continental shelves, and decreased with depth. Temporal trends indicated that overexploitation and habitat loss have been the main human drivers of historical changes in biodiversity. At present, habitat loss and degradation, followed by fishing impacts, pollution, climate change, eutrophication, and the establishment of alien species are the most important threats and affect the greatest number of taxonomic groups. All these impacts are expected to grow in importance in the future, especially climate change and habitat degradation.
The spatial identification of hot spots highlighted the ecological importance of most of the western Mediterranean shelves (and in particular, the Strait of Gibraltar and the adjacent Alboran Sea), western African coast, the Adriatic, and the Aegean Sea, which show high concentrations of endangered, threatened, or vulnerable species. The Levantine Basin, severely impacted by the invasion of species, is endangered as well. This abstract has been translated to other languages (File S1).

2011 citations en Geography, Medicine
arXiv Open Access 2025
Measure-Theoretic Aspects of Star-Free and Group Languages

Ryoma Sin'ya, Takao Yuyama

A language $L$ is said to be ${\cal C}$-measurable, where ${\cal C}$ is a class of languages, if there is an infinite sequence of languages in ${\cal C}$ that ``converges'' to $L$. We investigate the properties of ${\cal C}$-measurability in the cases where ${\cal C}$ is SF, the class of all star-free languages, and G, the class of all group languages. It is shown that a language $L$ is SF-measurable if and only if $L$ is GD-measurable, where GD is the class of all generalised definite languages (a more restricted subclass of star-free languages). This means that GD and SF have the same ``measuring power'', whereas GD is a very restricted proper subclass of SF. Moreover, we give a purely algebraic characterisation of SF-measurable regular languages, which is a natural extension of Schutzenberger's theorem stating the correspondence between star-free languages and aperiodic monoids. We also show the probabilistic independence of star-free and group languages, which is an important application of the former result. Finally, while the measuring power of star-free and generalised definite languages are equal, we show that the situation is rather opposite for subclasses of group languages as follows. For any two local subvarieties ${\cal C} \subsetneq {\cal D}$ of group languages, we have $\{L \mid L \text{ is } {\cal C}\text{-measurable}\} \subsetneq \{ L \mid L \text{ is } {\cal D}\text{-measurable}\}$.

en cs.FL
arXiv Open Access 2025
Bridging Gaps in Natural Language Processing for Yorùbá: A Systematic Review of a Decade of Progress and Prospects

Toheeb Aduramomi Jimoh, Tabea De Wille, Nikola S. Nikolov

Natural Language Processing (NLP) is becoming a dominant subset of artificial intelligence as the need to help machines understand human language looks indispensable. Several NLP applications are ubiquitous, partly due to the myriad of datasets being churned out daily through mediums like social networking sites. However, the growing development has not been evident in most African languages due to the persisting resource limitations, among other issues. Yorùbá language, a tonal and morphologically rich African language, suffers a similar fate, resulting in limited NLP usage. To encourage further research towards improving this situation, this systematic literature review aims to comprehensively analyse studies addressing NLP development for Yorùbá, identifying challenges, resources, techniques, and applications. A well-defined search string from a structured protocol was employed to search, select, and analyse 105 primary studies between 2014 and 2024 from reputable databases. The review highlights the scarcity of annotated corpora, the limited availability of pre-trained language models, and linguistic challenges like tonal complexity and diacritic dependency as significant obstacles. It also revealed the prominent techniques, including rule-based methods, among others. The findings reveal a growing body of multilingual and monolingual resources, even though the field is constrained by socio-cultural factors such as code-switching and the desertion of language for digital usage. This review synthesises existing research, providing a foundation for advancing NLP for Yorùbá and in African languages generally. It aims to guide future research by identifying gaps and opportunities, thereby contributing to the broader inclusion of Yorùbá and other under-resourced African languages in global NLP advancements.

en cs.CL, cs.AI
arXiv Open Access 2025
A Multilingual Python Programming Language

Saad Ahmed Bazaz, Mirza Omer Beg

All widely used and useful programming languages have a common problem. They restrict entry on the basis of knowledge of the English language. The lack of knowledge of English poses a major hurdle to many newcomers who do not have the resources, in terms of time and money, to learn the English language. Studies show that people learn better in their own language. Therefore, we propose a language transpiler built on top of the Python programming language, called UniversalPython, which allows one to write Python in their own human language. We demonstrate the ability to create an "Urdu Python" with this transpiler. In the future, we aim to scale the language to encapsulate more human languages to increase the availability of programming. The source code for this transpiler is open-source, and available at https://github.com/universalpython/universalpython

en cs.PL
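
The keyword-translation idea described in the abstract above can be illustrated with a minimal sketch. This is not UniversalPython's actual implementation: the `KEYWORD_MAP` table (Spanish-like words chosen only for readability) and the `transpile` function are hypothetical. The sketch rewrites only name tokens, so string literals and comments are left untouched.

```python
import io
import tokenize

# Hypothetical mapping from localized keywords to Python keywords.
KEYWORD_MAP = {
    "definir": "def",
    "si": "if",
    "sino": "else",
    "devolver": "return",
}


def transpile(source, mapping=KEYWORD_MAP):
    """Replace localized keywords with Python keywords, token by token."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        # Only NAME tokens are candidates; strings and comments pass through.
        if tok.type == tokenize.NAME and text in mapping:
            text = mapping[text]
        tokens.append((tok.type, text))
    # With (type, string) pairs, untokenize regenerates the spacing itself,
    # so the output is valid Python but not formatted like the input.
    return tokenize.untokenize(tokens)


localized = (
    "definir signo(x):\n"
    "    si x < 0:\n"
    "        devolver -1\n"
    "    sino:\n"
    "        devolver 1\n"
)
namespace = {}
exec(transpile(localized), namespace)
```

Working on the token stream rather than with plain string replacement avoids rewriting a localized keyword that happens to appear inside a string literal.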
DOAJ Open Access 2025
Strategies of Environmental Advocacy in the Novels Nakuruto and Bustani ya Edeni

Erick Maina, Lina Akaka, Rose Mavisi

This article aims to evaluate environmental advocacy in the novels Nakuruto (2009) by Clara Momanyi and Bustani ya Edeni (2001) by Emmanuel Mbogo. Following environmental destruction around the world, environmental advocacy has been taking place on various platforms, including international conferences, activist organizations, and the media. In recent years, literary writers have also begun to use literature as a platform for environmental advocacy in their novels, plays, short stories, and poetry. This is what prompted the researcher to examine how the issue of environmental advocacy is addressed in the selected novels. The study was guided by Ecocriticism Theory according to Glotfelty (1996). The research design is descriptive. The sample consists of the novels Nakuruto (2009) and Bustani ya Edeni (2001), which were purposively selected for examination. The findings showed that various strategies, such as religion, law, the media, posters, songs, and demonstrations, have been used to defend the environment. This article will encourage environmental activists to use various platforms to defend the environment. In addition, the study will boost the efforts of the Ministry of Environment in sensitizing the public about environmental destruction.

African languages and literature
S2 Open Access 2023
Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review

A. Ezugwu, O. N. Oyelade, A. M. Ikotun et al.

The machine learning (ML) paradigm has gained much popularity today. Its algorithmic models are employed in every field, such as natural language processing, pattern recognition, object detection, image recognition, earth observation and many other research areas. In fact, machine learning technologies and their inevitable impact suffice in many technological transformation agendas currently being propagated by many nations, for which the already yielded benefits are outstanding. From a regional perspective, several studies have shown that machine learning technology can help address some of Africa’s most pervasive problems, such as poverty alleviation, improving education, delivering quality healthcare services, and addressing sustainability challenges like food security and climate change. In this state-of-the-art paper, a critical bibliometric analysis study is conducted, coupled with an extensive literature survey on recent developments and associated applications in machine learning research with a perspective on Africa. The presented bibliometric analysis study consists of 2761 machine learning-related documents, of which 89% were articles with at least 482 citations published in 903 journals during the past three decades. Furthermore, the collated documents were retrieved from the Science Citation Index EXPANDED, comprising research publications from 54 African countries between 1993 and 2021. The bibliometric study shows the visualization of the current landscape and future trends in machine learning research and its application to facilitate future collaborative research and knowledge exchange among authors from different research institutions scattered across the African continent.

34 citations en Computer Science, Medicine
arXiv Open Access 2024
Directed Regular and Context-Free Languages

Moses Ganardi, Irmak Saglam, Georg Zetzsche

We study the problem of deciding whether a given language is directed. A language $L$ is \emph{directed} if every pair of words in $L$ have a common (scattered) superword in $L$. Deciding directedness is a fundamental problem in connection with ideal decompositions of downward closed sets. Another motivation is that deciding whether two \emph{directed} context-free languages have the same downward closures can be decided in polynomial time, whereas for general context-free languages, this problem is known to be coNEXP-complete. We show that the directedness problem for regular languages, given as NFAs, belongs to $AC^1$, and thus polynomial time. Moreover, it is NL-complete for fixed alphabet sizes. Furthermore, we show that for context-free languages, the directedness problem is PSPACE-complete.

en cs.FL, cs.CL
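
The definition of directedness in the abstract above can be made concrete for the special case of a finite language. The sketch below is a brute-force check, not the paper's algorithm (which works on NFAs and context-free grammars); the function names `is_scattered_subword` and `is_directed` are hypothetical.

```python
from itertools import combinations


def is_scattered_subword(u, w):
    """True if u can be obtained from w by deleting letters."""
    remaining = iter(w)
    # Each membership test consumes the iterator up to the match,
    # so letters of u must appear in w in the same order.
    return all(ch in remaining for ch in u)


def is_directed(language):
    """Check that every pair of words has a common superword in the language."""
    words = set(language)
    return all(
        any(
            is_scattered_subword(u, w) and is_scattered_subword(v, w)
            for w in words
        )
        for u, v in combinations(words, 2)
    )
```

For example, `{"a", "b", "ab"}` is directed because `"ab"` is a superword of every word in the set, while `{"a", "b"}` is not, since neither word contains the other.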
arXiv Open Access 2024
Instruct Large Language Models to Generate Scientific Literature Survey Step by Step

Yuxuan Lai, Yupeng Wu, Yidan Wang et al.

Automatically generating scientific literature surveys is a valuable task that can significantly enhance research efficiency. However, the diverse and complex nature of information within a literature survey poses substantial challenges for generative models. In this paper, we design a series of prompts to systematically leverage large language models (LLMs), enabling the creation of comprehensive literature surveys through a step-by-step approach. Specifically, we design prompts to guide LLMs to sequentially generate the title, abstract, hierarchical headings, and the main content of the literature survey. We argue that this design enables the generation of the headings from a high-level perspective. During the content generation process, this design effectively harnesses relevant information while minimizing costs by restricting the length of both input and output content in LLM queries. Our implementation with Qwen-long achieved third place in the NLPCC 2024 Scientific Literature Survey Generation evaluation task, with an overall score only 0.03% lower than the second-place team. Additionally, our soft heading recall is 95.84%, the second best among the submissions. Thanks to the efficient prompt design and the low cost of the Qwen-long API, our method reduces the expense for generating each literature survey to 0.1 RMB, enhancing the practical value of our method.

en cs.CL
arXiv Open Access 2023
Cutting the Cake: A Language for Fair Division

Noah Bertram, Alex Levinson, Justin Hsu

The fair division literature in economics considers how to divide resources between multiple agents such that the allocation is envy-free: each agent receives their favorite piece. Researchers have developed a variety of fair division protocols for the most standard setting, where the agents want to split a single item; however, the protocols are highly intricate and the proofs of envy-freeness involve tedious case analysis. We propose Slice, a domain specific language for fair-division. Programs in our language can be converted to logical formulas encoding envy-freeness and other target properties. Then, the constraints can be dispatched to automated solvers. We prove that our constraint generation procedure is sound and complete. We also report on a prototype implementation of Slice, which we have used to automatically check envy-freeness for several protocols from the fair division literature.
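
The envy-freeness property mentioned in the abstract above can be illustrated with a small check. This sketch is not Slice and does not use a solver: it assumes the item has already been cut into discrete pieces and that each agent's value for each piece is known. The `is_envy_free` function and its argument layout are hypothetical.

```python
def is_envy_free(valuations, allocation):
    """Return True if no agent values another agent's piece above their own.

    valuations[i][p] is agent i's value for piece p (assumed layout).
    allocation[i] is the index of the piece given to agent i.
    """
    for agent, own_piece in enumerate(allocation):
        own_value = valuations[agent][own_piece]
        # The agent envies if any allocated piece is worth strictly more to them.
        if any(valuations[agent][other] > own_value for other in allocation):
            return False
    return True
```

With valuations `[[3, 1], [1, 3]]`, giving piece 0 to the first agent and piece 1 to the second is envy-free, while the swapped allocation is not.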

arXiv Open Access 2023
A Modular Approach to Metatheoretic Reasoning for Extensible Languages

Dawn Michaelson, Gopalan Nadathur, Eric Van Wyk

This paper concerns the development of metatheory for extensible languages. It uses as its starting point a view that programming languages tailored to specific application domains are to be constructed by composing components from an open library of independently-developed extensions to a host language. In the elaboration of this perspective, static analyses (such as typing) and dynamic semantics (such as evaluation) are described via relations whose specifications are distributed across the host language and extensions and are given in a rule-based fashion. Metatheoretic properties, which ensure that static analyses accurately gauge runtime behavior, are represented in this context by formulas over such relations. These properties may be fundamental to the language, introduced by the host language, or they may pertain to analyses introduced by individual extensions. We expose the problem of modular metatheory, i.e., the notion that proofs of relevant properties can be constructed by reasoning independently within each component in the library. To solve this problem, we propose the twin ideas of decomposing proofs around language fragments and of reasoning generically about extensions based on broad, a priori constraints imposed on their behavior. We establish the soundness of these styles of reasoning by showing how complete proofs of the properties can be automatically constructed for any language obtained by composing the independent parts. Mathematical precision is given to our discussions by framing them within a logic that encodes inductive rule-based specifications via least fixed-point definitions. We also sketch the structure of a practical system for metatheoretic reasoning for extensible languages based on the ideas developed.

en cs.PL, cs.LO
arXiv Open Access 2023
On the Impact of Language Selection for Training and Evaluating Programming Language Models

Jonathan Katzy, Maliheh Izadi, Arie van Deursen

The recent advancements in Transformer-based Language Models have demonstrated significant potential in enhancing the multilingual capabilities of these models. The remarkable progress made in this domain not only applies to natural language tasks but also extends to the domain of programming languages. Despite the ability of these models to learn from multiple languages, evaluations typically focus on particular combinations of the same languages. In this study, we evaluate the similarity of programming languages by analyzing their representations using a CodeBERT-based model. Our experiments reveal that token representation in languages such as C++, Python, and Java exhibit proximity to one another, whereas the same tokens in languages such as Mathematica and R display significant dissimilarity. Our findings suggest that this phenomenon can potentially result in performance challenges when dealing with diverse languages. Thus, we recommend using our similarity measure to select a diverse set of programming languages when training and evaluating future models.

en cs.SE, cs.AI
arXiv Open Access 2023
Simulating H.P. Lovecraft horror literature with the ChatGPT large language model

Eduardo C. Garrido-Merchán, José Luis Arroyo-Barrigüete, Roberto Gozalo-Brizuela

In this paper, we present a novel approach to simulating H.P. Lovecraft's horror literature using the ChatGPT large language model, specifically the GPT-4 architecture. Our study aims to generate text that emulates Lovecraft's unique writing style and themes, while also examining the effectiveness of prompt engineering techniques in guiding the model's output. To achieve this, we curated a prompt containing several specialized literature references and employed advanced prompt engineering methods. We conducted an empirical evaluation of the generated text by administering a survey to a sample of undergraduate students. Utilizing statistical hypothesis testing, we assessed the students' ability to distinguish between genuine Lovecraft works and those generated by our model. Our findings demonstrate that the participants were unable to reliably differentiate between the two, indicating the effectiveness of the GPT-4 model and our prompt engineering techniques in emulating Lovecraft's literary style. In addition to presenting the GPT model's capabilities, this paper provides a comprehensive description of its underlying architecture and offers a comparative analysis with related work that simulates other notable authors and philosophers, such as Dennett. By exploring the potential of large language models in the context of literary emulation, our study contributes to the body of research on the applications and limitations of these models in various creative domains.

en cs.CL
DOAJ Open Access 2023
African Vernacular-rooted Imagery in Yemi Ijisakin’s Stone Sculptures

Sule Ameh James

This article presents a critical analysis of the African vernacular-rooted imagery represented in Yemi Ijisakin’s stone sculptures produced between the years 2006 and 2016. The focus on this period is to study the kinds of imagery he represents when there is a global artistic shift to installation and conceptual art. In doing this, I argue that even though Ijisakin’s stone sculptures are deemed vernacular art, they are not indigenous or historical African art, but a rethinking that references indigenous African cultural registers. The article also focuses on the ideas and meanings the interpretations of the works communicate to the audience. Thus, this article presents his artworks to a mainstream journal given that they have not received any critical analysis on the grounds that his works are regressive and outside the normative standards for referencing African/Nigerian/Yoruba contexts. But his works are important for demonstrating the interdependence of art and culture in Nigeria and producing knowledge on cultural practices.

History of Africa, African languages and literature
S2 Open Access 2022
PEARL: A Guide for Developing Community-Engaging and Culturally-Sensitive Education Materials

David Haynes, Kelly D. Hughes, Annette Okafor

Community outreach and engagement has been a regular activity of the National Cancer Institute at its designated Cancer Centers. However, in 2016, community outreach and engagement became a required activity for all cancer centers. Yet there is a gap in the literature that provides guidelines for developing materials that resonate with communities. We developed the PEARL rubric to fill that gap from our work developing culturally sensitive breast cancer education materials for African American and Immigrant African women. We conducted a targeted literature review to understand the approaches that have been used for developing education materials for communities. We reviewed the literature and distilled key elements into our PEARL guide for creating culturally appropriate education materials. PEARL consists of five elements: Plain language and understandability, Explicit data, statistics, and graphs, Affirmative framing, Representative content, and Local connection. PEARL is a modern comprehensive guide that researchers can use for creating culturally sensitive materials. It is designed to guide researchers who have little to no experience in community engagement in developing educational materials.

15 citations en Medicine
arXiv Open Access 2022
Text normalization for low-resource languages: the case of Ligurian

Stefano Lusito, Edoardo Ferrante, Jean Maillard

Text normalization is a crucial technology for low-resource languages which lack rigid spelling conventions or that have undergone multiple spelling reforms. Low-resource text normalization has so far relied upon hand-crafted rules, which are perceived to be more data efficient than neural methods. In this paper we examine the case of text normalization for Ligurian, an endangered Romance language. We collect 4,394 Ligurian sentences paired with their normalized versions, as well as the first open source monolingual corpus for Ligurian. We show that, in spite of the small amounts of data available, a compact transformer-based model can be trained to achieve very low error rates by the use of backtranslation and appropriate tokenization.

en cs.CL
S2 Open Access 2021
Health Chatbots in Africa: Scoping Review

M. Phiri, Allen Munoriyarwa

Background: This scoping review explores and summarizes the existing literature on the use of chatbots to support and promote health in Africa. Objective: The primary aim was to learn where, and under what circumstances, chatbots have been used effectively for health in Africa; how chatbots have been developed to the best effect; and how they have been evaluated by looking at literature published between 2017 and 2022. A secondary aim was to identify potential lessons and best practices for other chatbots. The review also aimed to highlight directions for future research on the use of chatbots for health in Africa. Methods: Using the 2005 Arksey and O’Malley framework, we used a Boolean search to broadly search literature published between January 2017 and July 2022. Literature between June 2021 and July 2022 was identified using Google Scholar, EBSCO information services (which includes the African HealthLine, PubMed, MEDLINE, PsycInfo, Cochrane, Embase, Scopus, and Web of Science databases), and other internet sources (including gray literature). The inclusion criteria were literature about health chatbots in Africa published in journals, conference papers, opinion, or white papers. Results: In all, 212 records were screened, and 12 articles met the inclusion criteria. Results were analyzed according to the themes they covered. The themes identified included the purpose of the chatbot as either providing an educational or information-sharing service or providing a counselling service. Accessibility as a result of either technical restrictions or language restrictions was also noted. Other themes that were identified included the need for the consideration of trust, privacy and ethics, and evaluation. Conclusions: The findings demonstrate that current data are insufficient to show whether chatbots are effectively supporting health in the region. However, the review does reveal insights into popular chatbots and the need to make them accessible through language considerations, platform choice, and user trust, as well as the importance of robust evaluation frameworks to assess their impact. The review also provides recommendations on the direction of future research.

27 citations en Medicine
