In South African higher education, the dominance of English and Afrikaans continues to raise concerns about the marginalization and development of indigenous African languages. Consequently, strategies are being sought to revitalize indigenous languages in the higher education sector. This study argues that critical lessons for addressing these challenges can be drawn from African literature. It therefore examines how African literature contributes to the development of indigenous African languages and explores strategies from African literature that can effectively revitalize and promote these languages in South African higher education. The study is qualitative and is based on a review of African literary texts. Thematic analysis was conducted to identify how African literature contributes to the development of indigenous African languages, and practical lessons and strategies that can inform the revitalization of these languages within higher education were identified. The strategies adopted by African literary writers in promoting indigenous languages include documentation, codification and standardization, vocabulary expansion and modernization, cross-linguistic translation, hybridization, and code-switching. All of these would require strong institutional commitment and intellectual investment. By drawing on lessons from African literature, universities can cultivate a linguistic landscape that preserves indigenous languages and empowers them as vehicles for knowledge creation, innovation, and cultural expression.
We now live in a world marked by an increasingly severe environmental crisis. While all of us are affected by advancing climate change, its impact is not evenly distributed. The most vulnerable regions of the planet are already experiencing what Westerners fear for the future, a future that itself promises to be inequitable. However, far from confining itself to miserabilist discourses and dysphoric visions, this article explores alternative narratives. More specifically, it analyzes Léonora Miano's novel Rouge impératrice as an example of a counter-discourse on our global future in the Anthropocene. As such, it stands in opposition to dominant discourses on climate change and climate migration, which tend to produce mainly apocalyptic visions and to stigmatize displaced people. Afrofuturism, an intellectual and artistic movement centered on subaltern populations (and of which the novel under analysis is a perfect example), is particularly concerned with the future of the African continent and, unlike Western climate fiction, proposes a utopian vision of that future. The analysis is thus organized around three main axes: dominant discourses on the climate crisis and climate refugees, the alternative vision offered by Afrofuturism, and Miano's novel as an example of a counter-discourse that carries a project of ecological utopia.
This study argues that the Banyamulenge—a Tutsi-descended minority in South Kivu, DRC—have had their ethnic identity historically shaped by migration, contested citizenship, and regional wars. Based on fieldwork in Rwanda, Burundi, and the DRC, as well as secondary sources, the article shows how nationality laws, the 1964 Simba Rebellion, the 1994 Vangu Report, and the creation of the Minembwe Commune framed them alternately as insiders and outsiders. While Banyamulenge leaders emphasized territorial belonging to claim Congolese identity, elites and neighboring states politicized it for their own purposes. Interviews reveal that community members, including younger generations, now stress civic belonging and interethnic cooperation, countering depictions of them as outsiders. The study argues that lasting peace in the Great Lakes Region requires inclusive citizenship and accountable institutions rather than ethnic exclusion.
History of Africa, African languages and literature
This paper maps Africa's distinctive AI risk profile, from deepfake-fuelled electoral interference and data-colonial dependency to compute scarcity, labour disruption and disproportionate exposure to climate-driven environmental costs. While major benefits are promised to accrue, the availability, development and adoption of AI also mean that African people and countries face particular AI safety risks, from large-scale labour market disruptions to the nefarious use of AI to manipulate public opinion. To date, African perspectives have not been meaningfully integrated into global debates and processes regarding AI safety, leaving African stakeholders with limited influence over the emerging global AI safety governance agenda. While there are Computer Incident Response Teams on the continent, none hosts a dedicated AI Safety Institute or office. We propose a five-point action plan centred on (i) a policy approach that foregrounds the protection of the human rights of those most vulnerable to experiencing the harmful socio-economic effects of AI; (ii) the establishment of an African AI Safety Institute; (iii) the promotion of public AI literacy and awareness; (iv) the development of an early warning system with inclusive benchmark suites for 25+ African languages; and (v) an annual AU-level AI Safety & Security Forum.
Jesujoba O. Alabi, Michael A. Hedderich, David Ifeoluwa Adelani
et al.
With over 2,000 languages and potentially millions of speakers, Africa represents one of the richest linguistic regions in the world. Yet, this diversity is scarcely reflected in state-of-the-art natural language processing (NLP) systems and large language models (LLMs), which predominantly support a narrow set of high-resource languages. This exclusion not only limits the reach and utility of modern NLP technologies but also risks widening the digital divide across linguistic communities. Nevertheless, NLP research on African languages is active and growing. In recent years, there has been a surge of interest in this area, driven by several factors, including the creation of multilingual language resources, the rise of community-led initiatives, and increased support through funding programs. In this survey, we analyze 884 research papers on NLP for African languages published over the past five years, offering a comprehensive overview of recent progress across core tasks. We identify key trends shaping the field and conclude by outlining promising directions to foster more inclusive and sustainable NLP research for African languages.
Albert L. Oyeleye, Adesina B. Sunday, Ibijoke Omole
Existing linguistic studies in Nigeria have focused on investigators' communicative acts in coercive investigative discourse, with little attention given to non-coercive investigative discourse involving accused rapists (ARs) in correctional centres. This study addresses this gap by analysing the pragmatic strategies ARs employ in crime narratives within Agodi Custodial Centre, Ibadan, Nigeria, to offer insights into evidential cues that could impact justice administration. Using Jacob Mey's Pragmatic Acts Theory as its framework, the study adopts a descriptive design and purposive sampling to select thirty-nine ARs for interviews. Findings revealed that the ARs deployed identity-framing, identity-reframing, attention-seeking, information-controlling, crime-relabelling and attention-diversion strategies to influence the investigator's interpretation. The involvement of minors in the rape cases underscores the severity of the crime and the need for effective justice mechanisms. Additionally, cultural assumptions about intimacy and relationships, driven by patriarchal norms and misconceptions about consent, significantly influence crime narratives. Recognising these contexts is crucial to preventing justice perversion and enhancing forensic discourse in Nigeria.
Ethnology. Social and cultural anthropology, Philology. Linguistics
The Artificial Intelligence (AI) revolution has become a reality in today's world, and its importance for linguistics was recognized very early. Despite its unprecedented surge and integration into various academic fields, including language teaching and translation, surprisingly little work has been done by scholars to advance discussion of the profound impact of AI on the diversity of widely available languages in both the developed and the developing world.
Africa is a linguistically diverse continent with about one third of the world's languages, which are vastly underrepresented in the global digital data pool. AI machine translation supports only 25 of the more than 2,000 languages on the continent. The paper deploys the homomorphism model of AI theory to interrogate natural language data drawn from African languages in order to present the current and future challenges, opportunities and potential for developing AI algorithms that could fit neatly into the translation of African languages. Most of the discussion in the paper focuses on the seven patterns of AI and on the usage and implementation of AI algorithms in translation science. The research findings show some of the complexities of African languages, in which syntactic categories have multiple corresponding semantic objects. The findings also reveal that, unlike in English, a syntactic operation in African languages does not always have one corresponding semantic operation, as postulated by the homomorphism model of AI theory. The study contributes to the scholarly literature by stressing the limits and opportunities of using AI in translation science and by supplying input from practitioners of NLP algorithms to expand the applicability of AI in translation science.
Language and Literature, History of scholarship and learning. The humanities
Many multilingual communities, including numerous in Africa, frequently engage in code-switching during conversations. This behaviour stresses the need for natural language processing technologies adept at processing code-switched text. However, data scarcity, particularly in African languages, poses a significant challenge, as many are low-resourced and under-represented. In this study, we prompted GPT 3.5 to generate Afrikaans-English and Yoruba-English code-switched sentences, enhancing diversity using topic-keyword pairs, linguistic guidelines, and few-shot examples. Our findings indicate that the quality of generated sentences for languages using non-Latin scripts, like Yoruba, is considerably lower when compared with the high Afrikaans-English success rate. There is therefore a notable opportunity to refine prompting guidelines to yield sentences suitable for the fine-tuning of language models. We propose a framework for augmenting the diversity of synthetically generated code-switched data using GPT and propose leveraging this technology to mitigate data scarcity in low-resourced languages, underscoring the essential role of native speakers in this process.
Nurshat Fateh Ali, Md. Mahdi Mohtasim, Shakil Mosharrof
et al.
This research presents and compares multiple approaches to automating the generation of literature reviews using several Natural Language Processing (NLP) techniques and retrieval-augmented generation (RAG) with a Large Language Model (LLM). The ever-increasing number of research articles poses a major challenge for manual literature review and has increased the demand for automation. The primary objective of this work is to develop a system capable of automatically generating literature reviews from only PDF files as input. To meet this objective, the effectiveness of several NLP strategies is evaluated: a frequency-based method (spaCy), a transformer model (Simple T5), and retrieval-augmented generation (RAG) with a Large Language Model (GPT-3.5-turbo). The SciTLDR dataset is chosen for the experiment, and the three techniques are used to implement three different systems for auto-generating literature reviews. ROUGE scores are used to evaluate all three systems. Based on the evaluation, the Large Language Model GPT-3.5-turbo achieved the highest ROUGE-1 score, 0.364. The transformer model comes second and spaCy last. Finally, a graphical user interface is created for the best system, the one based on the large language model.
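To make the evaluation metric concrete, the following is a minimal sketch of ROUGE-1 as unigram overlap between a generated text and a reference. It assumes whitespace tokenization and lowercasing, which is a simplification; the function name `rouge1` is hypothetical, and the abstract does not say which ROUGE implementation or variant (precision, recall or F1) produced the reported score.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Unigram-overlap ROUGE-1 (illustrative; whitespace tokens, lowercased)."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the cat sat", "the cat sat on the mat")
print(scores)
```

In this toy example all three candidate words occur in the six-word reference, so precision is 1.0 and recall is 0.5.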
The African continent has witnessed a notable surge in entrepreneurial activity, with the number of startups and investments made in the ecosystem growing significantly in recent years. Against this backdrop, this paper presents an in-depth analysis of the key factors influencing funding amounts in African startup deals. A comprehensive analysis of 2,521 startup investment deals, spanning from January 2019 to March 2023, was conducted using a combination of statistical methods and several machine learning techniques. The results of this study highlight a significant gender diversity gap, the importance of professional experience, and the impact of founders' academic backgrounds. The study reveals that human capital, a diversified sector approach, and cross-border collaboration strategies are crucial for a robust startup ecosystem. Additionally, we identified the potential positive impact of 'Y combinators' for African startups, the implications of exit strategies on deal amounts, and the heterogeneity as well as the incongruity of investment rounds across the continent. In light of these findings, we propose a set of policy recommendations aimed at fostering a favourable environment for African entrepreneurial ventures, promoting equitable investment distribution, and enhancing cross-border collaboration. By providing a rigorous empirical analysis, this study not only contributes to the existing body of literature but also lays the foundation for future research aimed at promoting investment and catalyzing socio-economic development throughout the African continent.
This paper surveys the empirical literature on inflation targeting (IT). The main findings from our review are the following: there is robust empirical evidence that larger and more developed countries are more likely to adopt the IT regime; the introduction of this regime is conditional on previous disinflation, greater exchange rate flexibility, central bank independence, and a higher level of financial development; the literature has failed to provide convincing evidence that IT itself may serve as an effective tool for stabilizing inflation expectations and for reducing inflation persistence; the empirical research focused on advanced economies has failed to provide convincing evidence on the beneficial effects of IT on inflation performance, while there is some evidence that the gains from the IT regime may have been more prevalent in the emerging market economies (EMEs); there is no convincing evidence that IT is associated with either higher output growth or lower output variability; the empirical research suggests that IT may have differential effects on exchange-rate volatility in advanced economies versus EMEs; although the empirical evidence on the impact of IT on fiscal policy is quite limited, it supports the idea that IT indeed improves fiscal discipline; the empirical support for the proposition that IT is associated with lower disinflation costs seems to be rather weak. Therefore, the accumulated empirical literature implies that IT does not produce superior macroeconomic benefits in comparison with the alternative monetary strategies or that, at most, these benefits are quite modest.
Atnafu Lambebo Tonja, Hellina Hailu Nigatu, Olga Kolesnikova
et al.
This paper describes CIC NLP's submission to the AmericasNLP 2023 Shared Task on machine translation systems for indigenous languages of the Americas. We present the system descriptions for three methods. We used two multilingual models, namely M2M-100 and mBART50, and one bilingual (one-to-one) model, the Helsinki NLP Spanish-English translation model, and experimented with different transfer learning setups. We experimented with 11 languages from the Americas and report the setups we used as well as the results we achieved. Overall, the mBART setup was able to improve upon the baseline for three out of the eleven languages.
Daniel Lundén, Gizem Çaylak, Fredrik Ronquist
et al.
Probabilistic Programming Languages (PPLs) allow users to encode statistical inference problems and automatically apply an inference algorithm to solve them. Popular inference algorithms for PPLs, such as sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC), are built around checkpoints, that is, relevant events for the inference algorithm during the execution of a probabilistic program. Deciding the location of checkpoints is, in current PPLs, not done optimally. To solve this problem, we present a static analysis technique that automatically determines checkpoints in programs, relieving PPL users of this task. The analysis identifies a set of checkpoints that execute in the same order in every program run; that is, they are aligned. We formalize alignment, prove the correctness of the analysis, and implement the analysis as part of the higher-order functional PPL Miking CorePPL. By utilizing the alignment analysis, we design two novel inference algorithm variants: aligned SMC and aligned lightweight MCMC. We show, through real-world experiments, that they significantly improve inference execution time and accuracy compared to standard PPL versions of SMC and MCMC.
Ebbie Awino, Lilian Wanzare, Lawrence Muchemi
et al.
Building automatic speech recognition (ASR) systems is a challenging task, especially for under-resourced languages that need to construct corpora nearly from scratch and lack sufficient training data. Several African indigenous languages, including Kiswahili, are technologically under-resourced. ASR systems are crucial, particularly for hearing-impaired persons, who can benefit from having transcripts in their native languages. However, the absence of transcribed speech datasets has complicated efforts to develop ASR models for these indigenous languages. This paper explores the transcription process and the development of a Kiswahili speech corpus, which includes both read-out texts and spontaneous speech data from native Kiswahili speakers. The study also discusses the vowels and consonants in Kiswahili and provides an updated Kiswahili phoneme dictionary for the ASR model, which was created using the CMU Sphinx speech recognition toolbox, an open-source speech recognition toolkit. The ASR model was trained using an extended phonetic set and yielded a word error rate (WER) of 18.87% and a sentence error rate (SER) of 49.5%, an improvement over previous similar research for under-resourced languages.
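As an illustration of the two reported metrics, the sketch below computes WER as the word-level edit distance divided by the number of reference words, and SER as the fraction of utterances with at least one error. This follows the usual definitions and is not the toolkit's own scoring code; the function names and the two toy Kiswahili transcripts are assumptions made for the example.

```python
def edit_distance(ref: list, hyp: list) -> int:
    """Levenshtein distance between two token lists (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_and_ser(references: list, hypotheses: list) -> tuple:
    """Return (WER, SER) over paired reference/hypothesis transcripts."""
    if not references or len(references) != len(hypotheses):
        raise ValueError("need equally sized, non-empty transcript lists")
    total_errors = total_words = wrong_sentences = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words, hyp_words = ref.split(), hyp.split()
        errors = edit_distance(ref_words, hyp_words)
        total_errors += errors
        total_words += len(ref_words)
        wrong_sentences += 1 if errors > 0 else 0
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_errors / total_words, wrong_sentences / len(references)


# Toy transcripts (hypothetical): one substitution in the first utterance.
refs = ["habari ya asubuhi", "asante sana"]
hyps = ["habari za asubuhi", "asante sana"]
wer, ser = wer_and_ser(refs, hyps)
print(f"WER={wer:.2%} SER={ser:.2%}")
```

Because a single wrong word makes the whole utterance count as an error, SER is typically much higher than WER, which is consistent with the gap between the two figures reported above.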
In its quest to restore land to millions of its citizens dispossessed under colonial and apartheid regimes, South Africa adopted a Restitution of Land Rights Act and set up a Land Claims Court in 1994 and 1996, respectively. This article uses select judgments of the Land Claims Court to critique the interpretative mindset of judges and the ideological neutrality of certain definitions in the Restitution Act. It argues that the colonial legacy of legal positivism and 20th century anthropological imagery inhibits the access to justice of dispossessed Africans living on the periphery of land rights. It uses the word ‘chained’ to describe communities whose restitution of land rights depends on their ability to (re)imagine themselves through a judicial prism of fossilized colonial ideas of traditional structures, lineage, and unbroken practices. The article recommends measures for promoting a South African legal culture that is sensitive to legal pluralism, mindful of indigenous law’s flexibility, and distrustful of undue standardization that stifles people’s access to justice.
History of Africa, African languages and literature
Building effective neural machine translation (NMT) models for very low-resourced and morphologically rich African indigenous languages is an open challenge. Besides the issue of finding available resources for them, a lot of work is put into preprocessing and tokenization. Recent studies have shown that standard tokenization methods do not always adequately deal with the grammatical, diacritical, and tonal properties of some African languages. That, coupled with the extremely low availability of training samples, hinders the production of reliable NMT models. In this paper, using the Fon language as a case study, we revisit standard tokenization methods and introduce Word-Expressions-Based (WEB) tokenization, a human-involved super-words tokenization strategy to create a better representative vocabulary for training. Furthermore, we compare our tokenization strategy to others on the Fon-French and French-Fon translation tasks.
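The idea of super-word tokens can be sketched as follows. This is not the authors' WEB procedure, whose details the abstract does not give; it is an assumed greedy longest-match pass that merges multi-word expressions from a human-curated list into single tokens. The function name `merge_expressions`, the `_` joiner and the English example phrases are all hypothetical.

```python
def merge_expressions(words: list, expressions: set, joiner: str = "_") -> list:
    """Greedily merge the longest curated multi-word expressions into one token."""
    expr_tuples = {tuple(e.split()) for e in expressions}
    max_len = max((len(e) for e in expr_tuples), default=1)

    tokens = []
    i = 0
    while i < len(words):
        # Try the longest possible span first, down to two words.
        for n in range(min(max_len, len(words) - i), 1, -1):
            if tuple(words[i:i + n]) in expr_tuples:
                tokens.append(joiner.join(words[i:i + n]))
                i += n
                break
        else:
            # No curated expression starts here: keep the single word.
            tokens.append(words[i])
            i += 1
    return tokens


# Hypothetical curated list; in practice native speakers would supply it.
curated = {"thank you", "thank you very much"}
print(merge_expressions("thank you very much my friend".split(), curated))
```

Preferring the longest match keeps a full expression intact instead of splitting it into a shorter expression plus leftover words.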
Jama Hussein Mohamud, Lloyd Acquaye Thompson, Aissatou Ndoye
et al.
This paper describes the results of an informal collaboration launched during the African Master of Machine Intelligence (AMMI) in June 2020. After a series of lectures and labs on speech data collection using mobile applications and on self-supervised representation learning from speech, a small group of students and the lecturer continued working on an automatic speech recognition (ASR) project for three languages: Wolof, Ga, and Somali. This paper describes how data was collected and how ASR systems were developed with a small amount (1h) of transcribed speech as training data. In these low-resource conditions, pre-training a model on large amounts of raw speech was fundamental for the efficiency of the ASR systems developed.
Aymen Ben Elhaj Mabrouk, Moez Ben Haj Hmida, Chayma Fourati
et al.
Searching for available, reliable, official, and understandable information is not a trivial task, because information is scattered across the internet and governmental communication channels in African dialects and languages are lacking. In this paper, we introduce an artificial-intelligence-powered chatbot for crisis communication that is omnichannel, multilingual and multi-dialectal. We present our work on a modified StarSpace embedding tailored to African dialects for the question-answering task, along with the architecture of the proposed chatbot system and a description of its different layers. English, French, Arabic, Tunisian, Igbo, Yorùbá, and Hausa are used as languages and dialects. Quantitative and qualitative evaluation results are obtained for our deployed Covid-19 chatbot. Results show that users are satisfied and that the conversation with the chatbot meets customer needs.