Results for "Indo-Iranian languages and literature"

arXiv Open Access 2025

Dual-Language General-Purpose Self-Hosted Visual Language and new Textual Programming Language for Applications

Mahmoud Samir Fayed

Most visual programming languages (VPLs) are domain-specific, with few general-purpose VPLs like Programming Without Coding Technology (PWCT). These general-purpose VPLs are developed using textual programming languages, and improving them requires textual programming. In this thesis, we designed and developed PWCT2, a dual-language (Arabic/English), general-purpose, self-hosting visual programming language. Before doing so, we specifically designed a textual programming language called Ring for its development. Ring is a dynamically typed language with a lightweight implementation, offering syntax customization features. It permits the creation of domain-specific languages through new features that extend object-oriented programming, allowing for specialized languages resembling Cascading Style Sheets (CSS) or the Supernova language. The Ring Compiler and Virtual Machine are designed using the PWCT visual programming language, where the visual implementation is composed of 18,945 components that generate 24,743 lines of C code, which increases the abstraction level and hides unnecessary details. Using PWCT to develop Ring allowed us to realize several issues in PWCT, which led to the development of the PWCT2 visual programming language using the Ring textual programming language. PWCT2 provides approximately 36 times faster code generation and requires 20 times less storage for visual source files. It also allows for the conversion of Ring code into visual code, enabling the creation of a self-hosting VPL that can be developed using itself. PWCT2 consists of approximately 92,000 lines of Ring code and comes with 394 visual components. PWCT2 is distributed to many users through the Steam platform and has received positive feedback. On Steam, 1,772 users have launched the software, and the total recorded usage time exceeds 17,000 hours, encouraging further research and development.

en cs.PL, cs.SE

arXiv Open Access 2025

Automatic Speech Recognition (ASR) for African Low-Resource Languages: A Systematic Literature Review

Sukairaj Hafiz Imam, Tadesse Destaw Belay, Kedir Yassin Husse et al.

ASR has achieved remarkable global progress, yet African low-resource languages remain severely underrepresented, producing barriers to digital inclusion across a continent with more than 2,000 languages. This systematic literature review (SLR) explores research on ASR for African languages with a focus on datasets, models and training methods, evaluation techniques, and challenges, and recommends future directions. We employ the PRISMA 2020 procedures and search DBLP, ACM Digital Library, Google Scholar, Semantic Scholar, and arXiv for studies published between January 2020 and July 2025. We include studies related to ASR datasets, models or metrics for African languages, while excluding non-African, duplicate, and low-quality studies (score <3/5). We screen 71 out of 2,062 records and we record a total of 74 datasets across 111 languages, encompassing approximately 11,206 hours of speech. Fewer than 15% of studies provided reproducible materials, and dataset licensing is not clear. Self-supervised and transfer learning techniques are promising, but are hindered by limited pre-training data, inadequate coverage of dialects, and the availability of resources. Most researchers use Word Error Rate (WER), with very minimal use of linguistically informed scores such as Character Error Rate (CER) or Diacritic Error Rate (DER), which limits the applicability of their evaluations to tonal and morphologically rich languages. The existing evidence on ASR systems is inconsistent, hindered by issues like dataset availability, poor annotations, licensing uncertainties, and limited benchmarking. Nevertheless, the rise of community-driven initiatives and methodological advancements indicates a pathway for improvement. Sustainable development for this area will also include stakeholder partnership, creation of ethically well-balanced datasets, use of lightweight modelling techniques, and active benchmarking.
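The contrast the abstract draws between WER and character-level scores can be illustrated with a minimal sketch. The function names (`edit_distance`, `wer`, `cer`) and the whitespace tokenization are assumptions made for illustration; they are not taken from the review or from any of the studies it covers.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edits divided by reference word count."""
    ref_words = reference.split()  # assumption: whitespace tokenization
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    ref, hyp = "the cat sat", "the cat sit"
    print(f"WER = {wer(ref, hyp):.3f}")  # WER = 0.333
    print(f"CER = {cer(ref, hyp):.3f}")  # CER = 0.091
```

A single wrong character costs a whole word under WER but only one character under CER, which is one reason character-level scores can be more informative when words carry a lot of morphology.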

en cs.CL

arXiv Open Access 2025

HySemRAG: A Hybrid Semantic Retrieval-Augmented Generation Framework for Automated Literature Synthesis and Methodological Gap Analysis

Alejandro Godinez

We present HySemRAG, a framework that combines Extract, Transform, Load (ETL) pipelines with Retrieval-Augmented Generation (RAG) to automate large-scale literature synthesis and identify methodological research gaps. The system addresses limitations in existing RAG architectures through a multi-layered approach: hybrid retrieval combining semantic search, keyword filtering, and knowledge graph traversal; an agentic self-correction framework with iterative quality assurance; and post-hoc citation verification ensuring complete traceability. Our implementation processes scholarly literature through eight integrated stages: multi-source metadata acquisition, asynchronous PDF retrieval, custom document layout analysis using modified Docling architecture, bibliographic management, LLM-based field extraction, topic modeling, semantic unification, and knowledge graph construction. The system creates dual data products - a Neo4j knowledge graph enabling complex relationship queries and Qdrant vector collections supporting semantic search - serving as foundational infrastructure for verifiable information synthesis. Evaluation across 643 observations from 60 testing sessions demonstrates structured field extraction achieving 35.1% higher semantic similarity scores (0.655 $\pm$ 0.178) compared to PDF chunking approaches (0.485 $\pm$ 0.204, p < 0.000001). The agentic quality assurance mechanism achieves 68.3% single-pass success rates with 99.0% citation accuracy in validated responses. Applied to geospatial epidemiology literature on ozone exposure and cardiovascular disease, the system identifies methodological trends and research gaps, demonstrating broad applicability across scientific domains for accelerating evidence synthesis and discovery.
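The "35.1% higher" figure reads as a relative gain of the reported mean similarity scores, which can be checked with a short sketch. The helper name `relative_gain` is hypothetical; only the two means (0.655 and 0.485) come from the abstract.

```python
def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`, as a fraction."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline


if __name__ == "__main__":
    structured = 0.655  # mean similarity, structured field extraction (from the abstract)
    chunking = 0.485    # mean similarity, PDF chunking (from the abstract)
    print(f"absolute difference: {structured - chunking:.3f}")          # 0.170
    print(f"relative gain: {relative_gain(structured, chunking):.1%}")  # 35.1%
```

The absolute difference is 0.170 similarity points; the 35.1% is that difference expressed relative to the chunking baseline.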

en cs.IR, cs.AI

arXiv Open Access 2025

Synthetic Voice Data for Automatic Speech Recognition in African Languages

Brian DeRenzi, Anna Dixon, Mohamed Aymane Farhi et al.

Speech technology remains out of reach for most of the over 2300 languages in Africa. We present the first systematic assessment of large-scale synthetic voice corpora for African ASR. We apply a three-step process: LLM-driven text creation, TTS voice synthesis, and ASR fine-tuning. Eight out of ten languages for which we create synthetic text achieved readability scores above 5 out of 7. We evaluated ASR improvement for three (Hausa, Dholuo, Chichewa) and created more than 2,500 hours of synthetic voice data at below 1% of the cost of real data. Fine-tuned Wav2Vec-BERT-2.0 models trained on 250h real and 250h synthetic Hausa matched a 500h real-data-only baseline, while 579h real and 450h to 993h synthetic data created the best performance. We also present gender-disaggregated ASR performance evaluation. For very low-resource languages, gains varied: Chichewa WER improved about 6.5% relative with a 1:2 real-to-synthetic ratio; a 1:1 ratio for Dholuo showed similar improvements on some evaluation data, but not on others. Investigating intercoder reliability, ASR errors and evaluation datasets revealed the need for more robust reviewer protocols and more accurate evaluation data. All data and models are publicly released to invite further work to improve synthetic data for African languages.

en cs.CL

arXiv Open Access 2024

vitaLITy 2: Reviewing Academic Literature Using Large Language Models

Hongye An, Arpit Narechania, Emily Wall et al.

Academic literature reviews have traditionally relied on techniques such as keyword searches and accumulation of relevant back-references, using databases like Google Scholar or IEEEXplore. However, both the precision and accuracy of these search techniques are limited by the presence or absence of specific keywords, making literature review akin to searching for needles in a haystack. We present vitaLITy 2, a solution that uses a Large Language Model (LLM)-based approach to identify semantically relevant literature in a textual embedding space. We include a corpus of 66,692 papers from 1970-2023 which are searchable through text embeddings created by three language models. vitaLITy 2 contributes a novel Retrieval Augmented Generation (RAG) architecture and can be interacted with through an LLM with augmented prompts, including summarization of a collection of papers. vitaLITy 2 also provides a chat interface that allows users to perform complex queries without learning any new programming language. This also enables users to take advantage of the knowledge captured in the LLM from its enormous training corpus. Finally, we demonstrate the applicability of vitaLITy 2 through two usage scenarios. vitaLITy 2 is available as open-source software at https://vitality-vis.github.io.

en cs.HC

arXiv Open Access 2024

Cocobo: Exploring Large Language Models as the Engine for End-User Robot Programming

Yate Ge, Yi Dai, Run Shan et al.

End-user development allows everyday users to tailor service robots or applications to their needs. One user-friendly approach is natural language programming. However, it encounters challenges such as an expansive user expression space and limited support for debugging and editing, which restrict its application in end-user programming. The emergence of large language models (LLMs) offers promising avenues for the translation and interpretation between human language instructions and the code executed by robots, but their application in end-user programming systems requires further study. We introduce Cocobo, a natural language programming system with interactive diagrams powered by LLMs. Cocobo employs LLMs to understand users' authoring intentions, generate and explain robot programs, and facilitate the conversion between executable code and flowchart representations. Our user study shows that Cocobo has a low learning curve, enabling even users with zero coding experience to customize robot programs successfully.

en cs.HC, cs.AI

arXiv Open Access 2024

LLAssist: Simple Tools for Automating Literature Review Using Large Language Models

Christoforus Yoga Haryanto

This paper introduces LLAssist, an open-source tool designed to streamline literature reviews in academic research. In an era of exponential growth in scientific publications, researchers face mounting challenges in efficiently processing vast volumes of literature. LLAssist addresses this issue by leveraging Large Language Models (LLMs) and Natural Language Processing (NLP) techniques to automate key aspects of the review process. Specifically, it extracts important information from research articles and evaluates their relevance to user-defined research questions. The goal of LLAssist is to significantly reduce the time and effort required for comprehensive literature reviews, allowing researchers to focus more on analyzing and synthesizing information rather than on initial screening tasks. By automating parts of the literature review workflow, LLAssist aims to help researchers manage the growing volume of academic publications more efficiently.

en cs.DL, cs.AI

arXiv Open Access 2024

Making Hybrid Languages: A Recipe

Leif Andersen, Cameron Moy, Stephen Chang et al.

The dominant programming languages support only linear text to express ideas. Visual languages offer graphical representations for entire programs, when viewed with special tools. Hybrid languages, with support from existing tools, allow developers to express their ideas with a mix of textual and graphical syntax tailored to an application domain. This mix puts both kinds of syntax on equal footing and, importantly, the enriched language does not disrupt a programmer's typical workflow. This paper presents a recipe for equipping existing textual programming languages as well as accompanying IDEs with a mechanism for creating and using graphical interactive syntax. It also presents the first hybrid language and IDE created using the recipe.

en cs.PL, cs.HC

DOAJ Open Access 2023

Sacred Groves, the Brahmanical Hermit, and Some Remarks on ahiṃsā and Vegetarianism

Cinzia Pieruccini

The term 'sacred grove' is used to denote an area of vegetation that is afforded special protection on religious grounds. In India, where sacred groves are known by a wide repertoire of local names, such places may be found right from the Himalayas up to the far South. Sacred groves host veneration of natural phenomena or elements of landscape, but also ancestral, local, folk or tribal gods and Sanskritised deities; the use of their resources is strictly regulated. Research studies on sacred groves in India often consider them to be a legacy of archaic economic forms, possibly harking back to the stage of hunter-gatherers, and an expression of a religiosity dating back to a remote, non-Aryan, pre-Vedic antiquity. However, the main sources for our knowledge of Indian antiquity, namely the literary sources, provide no direct record of the voices of such archaic societies. Nonetheless, the same sources allow us to highlight some important aspects of the sacredness anciently ascribed to vegetation, forest, and specific places therein. The present paper proposes to focus on the Brahmanical hermit's distinct relationship with the forest and examine some aspects related to food.

Indo-Iranian languages and literature, Languages and literature of Eastern Asia, Africa, Oceania

DOAJ Open Access 2023

Back Matter

Indo-Iranian languages and literature, Languages and literature of Eastern Asia, Africa, Oceania

arXiv Open Access 2023

LitSumm: Large language models for literature summarisation of non-coding RNAs

Andrew Green, Carlos Ribas, Nancy Ontiveros-Palacios et al.

Curation of literature in life sciences is a growing challenge. The continued increase in the rate of publication, coupled with the relatively fixed number of curators worldwide presents a major challenge to developers of biomedical knowledgebases. Very few knowledgebases have resources to scale to the whole relevant literature and all have to prioritise their efforts. In this work, we take a first step to alleviating the lack of curator time in RNA science by generating summaries of literature for non-coding RNAs using large language models (LLMs). We demonstrate that high-quality, factually accurate summaries with accurate references can be automatically generated from the literature using a commercial LLM and a chain of prompts and checks. Manual assessment was carried out for a subset of summaries, with the majority being rated extremely high quality. We apply our tool to a selection of over 4,600 ncRNAs and make the generated summaries available via the RNAcentral resource. We conclude that automated literature summarization is feasible with the current generation of LLMs, provided careful prompting and automated checking are applied.

en q-bio.GN, cs.AI

DOAJ Open Access 2022

A Critique on the Book Fusus al-Hakam and The School of Ibn Arabi

Parisa Goudarzi, Zahra Jebraeilzade

Fusus al-Hakam was written by Muhyiddin Ibn Arabi. During the nearly eight centuries since this work was written, many writers have tried to unravel its knots with their descriptions and interpretations. The book Fusus al-Hakam and The School of Ibn Arabi by Abu al-'Ala Afifi is one of the most recent commentaries on Fusus al-Hakam, which was scrutinized in the form of commentaries. In this article, we try to introduce and critique this book, translated by Mohammad Javad Gohari. The present book has been written as a "commentary" on Fusus al-Hakam; therefore, not all expressions of Fusus have been studied by the author. In the present book, Afifi has tried to explain the mystical and philosophical issues of Fusus in understandable language and to resolve some of Ibn Arabi's linguistic and intellectual difficulties for Western readers who are less familiar with the issues of unity. Therefore, this book can be helpful in understanding the complex issues of Fusus al-Hakam. Another prominent point of this book is the author's lack of prejudice against Ibn Arabi, which has made him a fair critic and impartial judge in the evaluation of Fusus. The author also helps to establish the key topics of Ibn Arabi's view in the mind of the reader by emphasizing and repeating them throughout the book, including the unity of existence, the relationship between the universe and God, and related topics.

Indo-Iranian languages and literature, General Works

DOAJ Open Access 2022

Introduction

Anna A. Ślączka

Indo-Iranian languages and literature, Languages and literature of Eastern Asia, Africa, Oceania

DOAJ Open Access 2021

An Analytical Evaluation of Research in the Field of Figurative Language in the Literary Arts Journal

Shirzad Tayefi, Samaneh Mansouri Alhashem

Scientific journals are considered a key axis in the transfer and dissemination of knowledge in various fields of study, and the promotion of research depends on the activities carried out in these journals. In the Literary Arts Journal, significant and applied studies are not few. Given the scientific character that such studies lend to literary studies, they help greatly in refuting the view that literature is a matter of taste and unscientific. Therefore, analytical and pathological approaches that explain the positive and negative points of the articles could be beneficial. The extent of research in the field of rhetorical sciences, especially the field of expression studies, as the most prominent form of transmission and spread of literary knowledge, highlights the importance of content and methodological analysis of such research. Therefore, this research evaluates the content and structure of the articles with a descriptive-analytical approach and concludes that some of the articles have weaknesses in terms of structure, content, and violation of rules in using sources. The results could help provide a basis for scientific journals to include rules in their guidelines that could raise the quality of studies on the subject of rhetoric and increase references to them.

Introduction

Rhetorical issues are considered one of the main research approaches in the field of Persian language and literature, in which the application of literary techniques and the ways of expressing meaning are discussed. Although such an approach to literary works has a long history and is considered an ancient science, as long as the expression of emotions and actions of the human soul is endless and takes a different color in each age and time, such sciences will inevitably need to progress and evolve accordingly. This depends on scientific research that provides this platform with a new structure and content.

Discussion

In facing a text, the researcher is confronted with the outer and superficial layers on the one hand and the hidden layer of meaning and content on the other hand. As the effect of structure on the content and vice versa is inevitable, in designing the frameworks and general principles of literary research, paying attention to the text content is especially important. Accordingly, what forms the general principles of evaluation and analysis of research works can be examined in two areas.

2-1. Structural factors and research methodology

If we consider the structure and methodology of research as a set of strategies of the research path, from the choice of title to conclusion, it is natural that each science has its own structure according to its subject and principles, so that thinkers in that field, using those principles, organize the information based on the initial hypothesis or question and form new findings and ideas. In general, regarding the structural elements suitable for scientific research, several rules have been proposed by experts in this field, according to which specific structural elements can be determined for literary research. Elements such as title, abstract, keywords, introduction, research question, research hypothesis, research method, research background, necessity and importance of research, theoretical foundations, analysis, conclusion, and references can be considered a fixed structural model in literary research.

2-2. Important components in content analysis and specialized concepts of the text

Regarding the research content, the main focus is on the achievement or role of the research work in scientific progress; in other words, in this part of the work, the researcher needs to clearly and unambiguously state the benefits and achievements of the research. If he/she cannot convey it to the audience with a certain verbal skill, the value of the achievement cannot be understood by the audience.

Conclusion

Existing research in the field of rhetoric in the Literary Arts Journal was evaluated using the components of quality assessment of articles. The results show the following:

Title: Despite the observance of structural principles in choosing the title, little effort has been made to innovate and take initiative in this area.

Abstract: The abstract structure of most articles in the statistical community is not in line with the structural principles of the research.

Keywords: In most articles, keywords are in accordance with the structural and content criteria of an original research.

Introduction: The text of the introduction in the articles of the statistical community has a relatively high clarity and eloquence.

Problem statement: In case studies, the problem statement is not very important and up-to-date.

Research objectives: Only a small percentage of articles have stated the purpose of the research well.

Research questions and hypotheses: In the research, little attention has been paid to the purpose and research questions and hypotheses.

Research background: Despite the emphasis of the Literary Arts Journal method on mentioning the research background, some articles in the statistical community have not mentioned it.

Research method: A small percentage of articles have a research method.

The content of the reviewed articles is of high quality, suggesting sufficient mastery and knowledge of the researchers in the field of research.

Language and Literature, Indo-Iranian languages and literature

DOAJ Open Access 2020

Review of The Oxford Handbook of Nietzsche

Maryam Arab

Nietzsche, as one of the most popular philosophers and elite thinkers that formed modern thought, has been studied from many dimensions. There are many books about him, and many scholars are still doing research on his life, works, and thoughts. The plurality of the works and the differences in the representation of Nietzsche’s image have led to doubt and confusion in the determination of Nietzsche’s position and understanding of his thought. The present study aims at introducing and evaluating a major, recent, and important work in the field of Nietzsche studies, called “The Oxford Handbook of Nietzsche”. This scientific and standard work has thirty-two essays from world-renowned scholars. Essays have been organized in six discrete sections such as biography, historical relations, principal works and fundamental issues such as values, epistemology and metaphysics, and developments of will to power. These essays contain striking and precise points that can influence our understanding of Nietzsche’s philosophy and provide us with a broad knowledge of the main elements of his philosophy, such as superman, will to power, and eternal recurrence. Therefore, this paper, after an overview of the book and description of its general features, summarizes the content of each essay to arouse the audience’s interest in pursuing a thorough study.

Indo-Iranian languages and literature, General Works

DOAJ Open Access 2020

Monika Browarczyk, Narrating Lives, Narrating Selves: Women’s Autobiographies in Hindi

Sanjukta Das Gupta

Indo-Iranian languages and literature, Languages and literature of Eastern Asia, Africa, Oceania

DOAJ Open Access 2020

The Ottoman-Kurdish Bedirhani Family Between Imperial and Post-Imperial Contexts: Navigating Change and Narrating Experiences of Transition. PhD Dissertation by Barbara Henning

Indo-Iranian languages and literature, Literature (General)

DOAJ Open Access 2020

A Look at the Sacred Paradigm in the Philosophy of Contemporary Iranian Literature and Art

یاسر فراشاهی نژاد

A paradigm means the models and exemplars that govern the minds and language of scientists in different periods, and, with some latitude, dominant intellectual systems, that is, paradigms, can also be identified in the humanities and the arts. Many artists see themselves as belonging to these dominant models and find a deep bond with the ruling ideas of their time. In Iran, from the 1320s (Solar Hijri calendar), most artists and writers devoted themselves to propagating the dominant paradigm, namely Marxism, and in the 1340s it was self-referential "Kantian" modern art that attracted numerous artists and writers. A decade later, Islamic nativist thought, which had emerged out of various earlier encounters with tradition, prepared the ground for the growth of a "sacred" aesthetics. In other words, religious intellectuals and reformist thinkers, drawing on the views of Western philosophers and on religious traditions, founded a new kind of philosophy of art and literature in Iran. These thinkers, who included figures such as Ali Shariati, Morteza Motahhari, Abdolkarim Soroush, Davari Ardakani, and others, have disagreed and still disagree on many points, yet in aesthetics they believe in the presence of a "sacred" element in Islamic art. Therefore, belief in the sacred, alongside social and political conditions, especially after the Islamic Revolution, has prepared the ground for the growth of Islamic-revolutionary art.

Language and Literature, Indo-Iranian languages and literature

DOAJ Open Access 2020

A Review of Rhythm in Persian Translations of the Works of Christian Bobin: Mahvash Ghavimi’s Translation of Geai and Isabelle Bruges

Saber Mohseni

The purpose of this article is to study the question of rhythm in the Persian translations of the work of Christian Bobin, presented by Mahvash Ghavimi. In Bobin’s writings, we meet people who seek happiness, but instead of fighting problems and changing conditions, they accept life as it is. After presenting the thought that Bobin confides to his readers in a simple and fluid style, we approach the question of rhythm as it is treated in the writings of Henri Meschonnic, a French translator. According to him, every text has its own rhythm which plays a crucial role in its significance. The translator must discover the factors involved in the significance of the text and recreate them in the translated text. We are then interested in studying the translation of two books of Bobin, entitled Geai and Isabelle Bruges, to see how the translator overcame the problems of recreating the rhythm.

Indo-Iranian languages and literature, General Works

arXiv Open Access 2020

Contextual Linear Types for Differential Privacy

Matías Toro, David Darais, Chike Abuah et al.

Language support for differentially-private programming is both crucial and delicate. While elaborate program logics can be very expressive, type-system based approaches using linear types tend to be more lightweight and amenable to automatic checking and inference, in particular in the presence of higher-order programming. Since the seminal design of Fuzz, which is restricted to $ε$-differential privacy in its original design, significant progress has been made to support more advanced variants of differential privacy, like ($ε$, $δ$)-differential privacy. However, supporting these advanced privacy variants while also supporting higher-order programming in full has proven to be challenging. We present Jazz, a language and type system which uses linear types and latent contextual effects to support both advanced variants of differential privacy and higher-order programming. Latent contextual effects allow delaying the payment of effects for connectives such as products, sums and functions, yielding advantages in terms of precision of the analysis and annotation burden upon elimination, as well as modularity. We formalize the core of Jazz, prove it sound for privacy via a logical relation for metric preservation, and illustrate its expressive power through a number of case studies drawn from the recent differential privacy literature.

en cs.PL, cs.LO
