Results for "Romanic languages"

arXiv Open Access 2026

Turn Complexity of Context-free Languages, Pushdown Automata and One-Counter Automata

Giovanni Pighizzini

A turn in a computation of a pushdown automaton is a switch from a phase in which the height of the pushdown store increases to a phase in which it decreases. Given a pushdown or one-counter automaton, we consider, for each string in its language, the minimum number of turns made in accepting computations. We prove that it cannot be decided whether this number is bounded by a constant. Furthermore, we obtain a non-recursive trade-off between pushdown and one-counter automata accepting in a finite number of turns and finite-turn pushdown automata, which are defined by requiring that the constant bound be satisfied by every accepting computation. We prove that there are languages accepted in a sublinear but not constant number of turns, with respect to the input length. Furthermore, there exists an infinite proper hierarchy of complexity classes, with the number of turns bounded by different sublinear functions. In addition, there is a language requiring a number of turns which is not constant but grows slower than each of the functions defining the above hierarchy.
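To make the definition of a turn concrete, the following is a minimal Python sketch that counts the turns in a single computation, given the pushdown height after each step. The function name `count_turns` and the height-trace representation are assumptions made for illustration and are not taken from the paper; steps that leave the height unchanged are treated here as not ending a phase.

```python
def count_turns(heights):
    """Count switches from an increasing phase to a decreasing phase.

    `heights` is a hypothetical trace of pushdown heights, one per step.
    Steps where the height does not change are ignored.
    """
    turns = 0
    rising = False  # True while the most recent height change was an increase
    for prev, cur in zip(heights, heights[1:]):
        if cur > prev:
            rising = True
        elif cur < prev:
            if rising:
                turns += 1  # an increase followed by a decrease is one turn
            rising = False
    return turns
```

Under these assumptions, a trace such as `[0, 1, 2, 1, 0]` has one turn, while `[0, 1, 0, 1, 0]` has two.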

en cs.FL

DOAJ Open Access 2025

Crònica parlamentària de Catalunya. Segon semestre de 2024. “La legislatura arrenca amb un impuls decidit pel català, però el Pacte Nacional per la Llengua es fa esperar”

Roser Serra i Albert

Annotated summary of the parliamentary initiatives of the second half of 2024. The period coincides with the start of a new legislature that places special emphasis on strengthening the social use and vitality of the Catalan language, a legislature that has been politically designated “the legislature of Catalan”, in a context of linguistic emergency. The compilation includes the first interventions by members of the Government, who have shown their commitment to the language, even though the Pacte Nacional per la Llengua (National Pact for the Language) has not yet been approved. The paper analyzes the parliamentary procedures related to multilingualism and language policy, as well as references to the Catalan language, Aranese, and Catalan Sign Language.

Language and Literature, Romanic languages

DOAJ Open Access 2025

Non è un linguaggio per donne: le parole del diritto tra resistenze e opportunità

Barbara Malaisi

Legal language is a complex specialist language anchored in an established linguistic tradition and resistant to change, even when change would be more than appropriate, as in the case of adapting to an increasingly widespread social sensitivity towards gender-inclusive language. Italian law is strongly oriented towards the overextended masculine, and the rules on gendered language, besides not covering all levels of normative production, are soft-law rules whose transgression has no legal consequences. The essay reflects on the need to lay down in positive law some minimum gender-respectful ways of writing law, from a democratic and truly inclusive perspective.

Romanic languages

arXiv Open Access 2025

CrossTL: A Universal Programming Language Translator with Unified Intermediate Representation

Nripesh Niketan, Vaatsalya Shrivastva

We present CrossTL, a universal programming language translator enabling bidirectional translation between multiple languages through a unified intermediate representation called CrossGL. Traditional approaches require separate translators for each language pair, leading to exponential complexity growth. CrossTL uses a single universal IR to facilitate translations between CUDA, HIP, Metal, DirectX HLSL, OpenGL GLSL, Vulkan SPIR-V, Rust, and Mojo, with Slang support in development. Our system consists of: language-specific lexers/parsers converting source code to ASTs, bidirectional CrossGL translation modules implementing ToCrossGLConverter classes for importing code and CodeGen classes for target generation, and comprehensive backend implementations handling full translation pipelines. We demonstrate effectiveness through comprehensive evaluation across programming domains, achieving successful compilation and execution across all supported backends. The universal IR design enables adding new languages with minimal effort, requiring only language-specific frontend/backend components. Our contributions include: (1) a unified IR capturing semantics of multiple programming paradigms, (2) a modular architecture enabling extensibility, (3) a comprehensive framework supporting GPU compute, graphics programming, and systems languages, and (4) empirical validation demonstrating practical viability of universal code translation. CrossTL represents a significant step toward language-agnostic programming, enabling write-once, deploy-everywhere development.

en cs.PL, cs.CL

arXiv Open Access 2025

Graph Rewriting Language as a Platform for Quantum Diagrammatic Calculi

Kayo Tei, Haruto Mishina, Naoki Yamamoto et al.

Systematic discovery of optimization paths in quantum circuit simplification remains a challenge. Today, ZX-calculus, a computing model for quantum circuit transformation, is attracting attention for its highly abstract graph-based approach. Whereas existing tools such as PyZX and Quantomatic offer domain-specific support for quantum circuit optimization, visualization and theorem-proving, we present a complementary approach using LMNtal, a general-purpose hierarchical graph rewriting language, to establish a diagrammatic transformation and verification platform with model checking. Our methodology shows three advantages: (1) manipulation of ZX-diagrams through native graph transformation rules, enabling direct implementation of basic rules; (2) quantified pattern matching via QLMNtal extensions, greatly simplifying rule specification; and (3) interactive visualization and validation of optimization paths through state space exploration. Through case studies, we demonstrate how our framework helps understand optimization paths and design new algorithms and strategies. This suggests that the declarative language LMNtal and its toolchain could serve as a new platform to investigate quantum circuit transformation from a different perspective.

en cs.PL

arXiv Open Access 2025

IndoNLP 2025: Shared Task on Real-Time Reverse Transliteration for Romanized Indo-Aryan languages

Deshan Sumanathilaka, Isuri Anuradha, Ruvan Weerasinghe et al.

The paper overviews the shared task on Real-Time Reverse Transliteration for Romanized Indo-Aryan languages. It focuses on the reverse transliteration of low-resourced languages in the Indo-Aryan family to their native scripts. Typing Romanized Indo-Aryan languages using ad-hoc transliterals and achieving accurate native scripts are complex and often inaccurate processes with the current keyboard systems. This task aims to introduce and evaluate a real-time reverse transliterator that converts Romanized Indo-Aryan languages to their native scripts, improving the typing experience for users. Out of 11 registered teams, four teams participated in the final evaluation phase with transliteration models for Sinhala, Hindi and Malayalam. These proposed solutions not only solve the issue of ad-hoc transliteration but also empower low-resource language usability in the digital arena.

en cs.CL

DOAJ Open Access 2024

L’émergence d’un nouveau modèle de traduction – théorie et pratique de la traduction multimodale sur l’exemple de la traduction en français de l’album Kłopot d’Iwona Chmielewska

Anna Kochanowska

Translation activity is traditionally perceived as monomodal, focused on the written or spoken word. However, the evolution of contemporary societies, increasingly visual, often pushes the translator to work on a multimodal product, where language is only one of the modes of communication, hence the emergence of models of multimodal translation. The present contribution studies the literary multimodal translation model based on the album Kłopot by Iwona Chmielewska, world-renowned author and illustrator. The contribution hopes to demonstrate the challenges and opportunities of the multimodal translation workshop, seen as a growing professional activity in today’s multimodal societies.

Romanic languages, Philology. Linguistics

arXiv Open Access 2024

RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization

Jaavid Aktar Husain, Raj Dabre, Aswanth Kumar et al.

This study addresses the challenge of extending Large Language Models (LLMs) to non-English languages that use non-Roman scripts. We propose an approach that utilizes the romanized form of text as an interface for LLMs, hypothesizing that its frequent informal use and shared tokens with English enhance cross-lingual alignment. Our approach involves the continual pretraining of an English LLM like Llama 2 on romanized text of non-English, non-Roman script languages, followed by instruction tuning on romanized data. The results indicate that romanized text not only reduces token fertility by 2x-4x but also matches or outperforms native script representation across various NLU, NLG, and MT tasks. Moreover, the embeddings computed on romanized text exhibit closer alignment with their English translations than those from the native script. Our approach presents a promising direction for leveraging the power of English LLMs in languages traditionally underrepresented in NLP. Our code is available on https://github.com/AI4Bharat/romansetu.

en cs.CL, cs.AI

arXiv Open Access 2024

Polymorphic Records for Dynamic Languages

Giuseppe Castagna, Loïc Peyrot

We define and study "row polymorphism" for a type system with set-theoretic types, specifically union, intersection, and negation types. We consider record types that embed row variables and define a subtyping relation by interpreting types into sets of record values and by defining subtyping as the containment of interpretations. We define a functional calculus equipped with operations for field extension, selection, and deletion, its operational semantics, and a type system that we prove to be sound. We provide algorithms for deciding the typing and subtyping relations. This research is motivated by the current trend of defining static type systems for dynamic languages and, in our case, by an ongoing effort of endowing the Elixir programming language with a gradual type system.

en cs.PL

arXiv Open Access 2024

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

Ashmal Vayani, Dinura Dissanayake, Hasindri Watawana et al.

Existing Large Multimodal Models (LMMs) generally focus on only a few regions and languages. As LMMs continue to improve, it is increasingly important to ensure they understand cultural contexts, respect local sensitivities, and support low-resource languages, all while effectively integrating corresponding visual cues. In pursuit of culturally diverse global multimodal models, our proposed All Languages Matter Benchmark (ALM-bench) represents the largest and most comprehensive effort to date for evaluating LMMs across 100 languages. ALM-bench challenges existing models by testing their ability to understand and reason about culturally diverse images paired with text in various languages, including many low-resource languages traditionally underrepresented in LMM research. The benchmark offers a robust and nuanced evaluation framework featuring various question formats, including true/false, multiple choice, and open-ended questions, which are further divided into short and long-answer categories. ALM-bench design ensures a comprehensive assessment of a model's ability to handle varied levels of difficulty in visual and linguistic reasoning. To capture the rich tapestry of global cultures, ALM-bench carefully curates content from 13 distinct cultural aspects, ranging from traditions and rituals to famous personalities and celebrations. Through this, ALM-bench not only provides a rigorous testing ground for state-of-the-art open and closed-source LMMs but also highlights the importance of cultural and linguistic inclusivity, encouraging the development of models that can serve diverse global populations effectively. Our benchmark is publicly available.

en cs.CV, cs.CL

arXiv Open Access 2024

Overview of the First Workshop on Language Models for Low-Resource Languages (LoResLM 2025)

Hansi Hettiarachchi, Tharindu Ranasinghe, Paul Rayson et al.

The first Workshop on Language Models for Low-Resource Languages (LoResLM 2025) was held in conjunction with the 31st International Conference on Computational Linguistics (COLING 2025) in Abu Dhabi, United Arab Emirates. This workshop mainly aimed to provide a forum for researchers to share and discuss their ongoing work on language models (LMs) focusing on low-resource languages, following the recent advancements in neural language models and their linguistic biases towards high-resource languages. LoResLM 2025 attracted notable interest from the natural language processing (NLP) community, resulting in 35 accepted papers from 52 submissions. These contributions cover a broad range of low-resource languages from eight language families and 13 diverse research areas, paving the way for future possibilities and promoting linguistic inclusivity in NLP.

en cs.CL, cs.AI

arXiv Open Access 2024

Statically Contextualizing Large Language Models with Typed Holes

Andrew Blinn, Xiang Li, June Hyung Kim et al.

Large language models (LLMs) have reshaped the landscape of program synthesis. However, contemporary LLM-based code completion systems often hallucinate broken code because they lack appropriate context, particularly when working with definitions not in the training data nor near the cursor. This paper demonstrates that tight integration with the type and binding structure of a language, as exposed by its language server, can address this contextualization problem in a token-efficient manner. In short, we contend that AIs need IDEs, too! In particular, we integrate LLM code generation into the Hazel live program sketching environment. The Hazel Language Server identifies the type and typing context of the hole being filled, even in the presence of errors, ensuring that a meaningful program sketch is always available. This allows prompting with codebase-wide contextual information not lexically local to the cursor, nor necessarily in the same file, but that is likely to be semantically local to the developer's goal. Completions synthesized by the LLM are then iteratively refined via further dialog with the language server. To evaluate these techniques, we introduce MVUBench, a dataset of model-view-update (MVU) web applications. These applications serve as challenge problems due to their reliance on application-specific data structures. We find that contextualization with type definitions is particularly impactful. After introducing our ideas in the context of Hazel we duplicate our techniques and port MVUBench to TypeScript in order to validate the applicability of these methods to higher-resource languages. Finally, we outline ChatLSP, a conservative extension to the Language Server Protocol (LSP) that language servers can implement to expose capabilities that AI code completion systems of various designs can use to incorporate static context when generating prompts for an LLM.

en cs.PL, cs.AI

arXiv Open Access 2023

Transport via Partial Galois Connections and Equivalences

Kevin Kappelmann

Multiple types can represent the same concept. For example, lists and trees can both represent sets. Unfortunately, this easily leads to incomplete libraries: some set-operations may only be available on lists, others only on trees. Similarly, subtypes and quotients are commonly used to construct new type abstractions in formal verification. In such cases, one often wishes to reuse operations on the representation type for the new type abstraction, but to no avail: the types are not the same. To address these problems, we present a new framework that transports programs via equivalences. Existing transport frameworks are either designed for dependently typed, constructive proof assistants, use univalence, or are restricted to partial quotient types. Our framework (1) is designed for simple type theory, (2) generalises previous approaches working on partial quotient types, and (3) is based on standard mathematical concepts, particularly Galois connections and equivalences. We introduce the notion of partial Galois connections and equivalences and prove their closure properties under (dependent) function relators, (co)datatypes, and compositions. We formalised the framework in Isabelle/HOL and provide a prototype. This is the extended version of "Transport via Partial Galois Connections and Equivalences", 21st Asian Symposium on Programming Languages and Systems, 2023.

en cs.PL, cs.LO

arXiv Open Access 2023

Proceedings of the 18th International Workshop on Logical Frameworks and Meta-Languages: Theory and Practice

Alberto Ciaffaglione, Carlos Olarte

Logical frameworks and meta-languages form a common substrate for representing, implementing and reasoning about a wide variety of deductive systems of interest in logic and computer science. Their design, implementation and their use in reasoning tasks, ranging from the correctness of software to the properties of formal systems, have been the focus of considerable research over the last two decades. This workshop brings together designers, implementors and practitioners to discuss various aspects impinging on the structure and utility of logical frameworks, including the treatment of variable binding, inductive and co-inductive reasoning techniques and the expressiveness and lucidity of the reasoning process.

en cs.LO, cs.PL

arXiv Open Access 2023

Polymorphic Type Inference for Dynamic Languages

Giuseppe Castagna, Mickaël Laurent, Kim Nguyen

We present a type system that combines, in a controlled way, first-order polymorphism with intersection types, union types, and subtyping, and prove its safety. We then define a type reconstruction algorithm that is sound and terminating. This yields a system in which unannotated functions are given polymorphic types (thanks to Hindley-Milner) that can express the overloaded behavior of the functions they type (thanks to the intersection introduction rule) and that are deduced by applying advanced techniques of type narrowing (thanks to the union elimination rule). This makes the system a prime candidate to type dynamic languages.

en cs.PL

DOAJ Open Access 2022

Dimension linguistique, discursive et intellectuelle de la rédaction de texte académique en langue étrangère. Exemple des résumés de mémoires de licence et de master à la philologie française

Monika Grabowska, Witold Ucherek

The article focuses on the French summaries of BA & MA theses written by students of French philology at the University of Wrocław between 2015 and 2020. The objective is to determine to what extent the linguistic, discursive and intellectual dimensions of this short academic text constitute, for the students, a source of challenges during the writing process. The general conclusion is that very often the summary of a diploma thesis looks like a report detailing the activities of the student, instead of summarizing an intellectual trajectory and informing of the results of the research.

Romanic languages, Philology. Linguistics

S2 Open Access 2021

Development of scientific, technological and innovation space in Ukraine and EU countries

The joint monograph presents the current research of scientific innovation field in Ukraine and EU countries. General questions of comparative-historical, typological linguistics, Romanic and Germanic languages, history of pedagogy, theory and methods of teaching, pedagogical and developmental psychology, psychology of activity in special conditions, etc. are considered. The publication is intended for scientists, educators, graduate and undergraduate students, as well as a general audience.

en

S2 Open Access 2020

Kritische Bemerkungen zur Lage der Romanistik im Zeichen von ›English only‹

Hans Goebl

The 180-year-old discipline of Romance Studies hinges upon the broadest possible multilingualism of those who undertake it. This means that every researcher working in the field of Romance Studies must have comprehensive knowledge of a wide array of Romanic languages, along with a basic sympathy towards the Romanic countries, languages and cultures. All of this has been fundamentally questioned in the past 30 years by the spreading misconception that even in the humanities it is only the use of English that guarantees the quality of research and scientific publications. This article tackles this issue using the example of the author’s biography and analyzes some distinctive ›case studies‹ without reaching a compromise.

1 citation en

DOAJ Open Access 2020

Producción del fonema /s/ en una muestra de niños hablantes del español de Chile: adquisición de los aspectos dialectales

Pilar Vivar Vivar, Eduardo Arteaga Viveros et al.

The acquisition of the alveolar fricative consonant /s/ in syllable-final (coda) position, both word-medial and word-final, was investigated in a sample of 161 informants, speakers of Chilean Spanish, divided into four age groups from 2;0 to 3;11 years. The main objective of the study was to approach this segment from a dialectal perspective, in order to determine at what point in acquisition boys and girls begin to use the variants of this sound, namely [h] and [Ø]. The results were analyzed according to the variables of age and socioeconomic level.

Language and Literature, Romanic languages

arXiv Open Access 2020

The Commutative Closure of Shuffle Expressions over Group Languages is Regular

Stefan Hoffmann

We show that the commutative closure combined with the iterated shuffle is a regularity-preserving operation on group languages. In particular, for commutative group languages, the iterated shuffle is a regularity-preserving operation. We also give bounds for the size of minimal recognizing automata. Then, we use these results to deduce that the commutative closure of any shuffle expression over group languages, i.e., expressions involving shuffle, iterated shuffle, concatenation, Kleene star and union in any order, starting with the group languages, always yields a regular language.
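As a small illustration of the central notion, the commutative closure of a language contains every word that is a permutation of some word in the language. The Python sketch below checks membership for a finite language only, by comparing letter counts; the name `in_commutative_closure` is hypothetical, and the sketch does not reproduce the paper's constructions for (infinite) group languages or shuffle expressions.

```python
from collections import Counter


def in_commutative_closure(word, language):
    """Return True if `word` is a permutation of some word in `language`.

    `language` is assumed to be a finite collection of strings; two words
    are permutations of each other exactly when their letter counts match.
    """
    target = Counter(word)
    return any(Counter(w) == target for w in language)
```

For example, with the finite language `{"ab"}`, the word `"ba"` belongs to the commutative closure, whereas `"abb"` does not.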

en cs.FL
