Perish or Flourish? A Holistic Evaluation of Large Language Models for Code Generation in Functional Programming
Nguyet-Anh H. Lang, Eric Lang, Thanh Le-Cong
et al.
Functional programming provides strong foundations for developing reliable and secure software systems, yet its adoption remains limited due to its steep learning curve. Recent advances in Large Language Models (LLMs) for code generation present new opportunities to lower these barriers. However, extensive evaluations of LLMs largely focus on imperative programming languages, and their capabilities in functional programming (FP) languages remain underexplored. To address this gap, we introduce FPEval, a holistic evaluation framework built on FPBench, a new benchmark of 721 programming tasks across three difficulty levels in three mainstream FP languages: Haskell, OCaml and Scala. FPEval provides comprehensive evaluation infrastructure that combines test validation against comprehensive test suites with static analysis tools to assess both functional correctness and code style and maintainability. Using this framework, we evaluate state-of-the-art LLMs, including GPT-3.5, GPT-4o, and GPT-5, for code generation in functional programming languages and Java as an imperative baseline. Our results demonstrate that LLM performance in functional programming improves substantially with model advancement; however, error rates remain significantly higher in purely functional languages (Haskell and OCaml) than in hybrid (Scala) or imperative (Java) languages. Moreover, LLMs frequently generate non-idiomatic functional code that follows imperative patterns, raising concerns about code style and long-term maintainability. Finally, we show that LLMs can partially self-repair both correctness and quality issues when provided with static analysis feedback and hand-crafted instructions for common types of issues.
A Layered Implementation Framework for Regular Languages
Baudouin Le Charlier
I present the most fundamental features of an implemented system designed to manipulate representations of regular languages. The system is structured into two layers, allowing regular languages to be represented in an increasingly compact, efficient, and integrated way. Both layers are first presented at a high level, adequate to design and prove the correctness of abstract algorithms. Then, their low-level implementations are described meticulously. At the high level, the first layer offers a notion of normalized regular expressions ensuring that the set of all syntactic derivatives of an expression is finite. At the low level, normalized expressions are uniquely represented by identifiers, i.e. by standard integers. The second layer, called the background, introduces additional notions to record, integrate, and simplify things computed within the first layer. At the high level, normalized expressions denoting the same regular language can be unified by grouping them into equivalence classes. One shortest expression is chosen in each class as its representative, which can be used to form equations relating expressions to their derivatives. This paper also presents extensive experimental results to demonstrate the usefulness of the proposed framework and, in particular, the fact that it makes it possible to represent large sets of regular languages in a unified way where distinct identifiers designate different languages, represented by both a small expression and a minimal deterministic automaton.
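To make the idea of normalized expressions with finitely many syntactic derivatives concrete, here is a minimal sketch. It is not the system described in the abstract: it assumes the classical Brzozowski derivative, with smart constructors that normalize unions up to associativity, commutativity and idempotence. All names (`union`, `cat`, `star`, `deriv`, `ident`, `all_derivatives`) are hypothetical.

```python
# Expressions are nested tuples; the smart constructors keep them normalized.
EMPTY = ("empty",)   # the empty language
EPS = ("eps",)       # the language containing only the empty word

def sym(c):
    return ("sym", c)

def union(a, b):
    # Flatten, deduplicate and sort the alternatives.
    parts = set()
    for e in (a, b):
        if e[0] == "union":
            parts.update(e[1])
        elif e != EMPTY:
            parts.add(e)
    if not parts:
        return EMPTY
    if len(parts) == 1:
        return next(iter(parts))
    return ("union", tuple(sorted(parts, key=repr)))

def cat(a, b):
    if a == EMPTY or b == EMPTY:
        return EMPTY
    if a == EPS:
        return b
    if b == EPS:
        return a
    return ("cat", a, b)

def star(a):
    if a in (EMPTY, EPS):
        return EPS
    return a if a[0] == "star" else ("star", a)

def nullable(e):
    tag = e[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "union":
        return any(nullable(x) for x in e[1])
    return nullable(e[1]) and nullable(e[2])  # cat

def deriv(e, c):
    tag = e[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if e[1] == c else EMPTY
    if tag == "union":
        out = EMPTY
        for x in e[1]:
            out = union(out, deriv(x, c))
        return out
    if tag == "star":
        return cat(deriv(e[1], c), e)
    left = cat(deriv(e[1], c), e[2])  # cat
    return union(left, deriv(e[2], c)) if nullable(e[1]) else left

_ids = {}

def ident(e):
    # Assumed low-level view: each normalized expression gets one integer identifier.
    return _ids.setdefault(e, len(_ids))

def all_derivatives(e, alphabet):
    seen, todo = {e}, [e]
    while todo:
        cur = todo.pop()
        for c in alphabet:
            d = deriv(cur, c)
            if d not in seen:
                seen.add(d)
                todo.append(d)
    return seen

ab_star = star(cat(sym("a"), sym("b")))
states = all_derivatives(ab_star, "ab")  # the expression, b(ab)*, and the empty language
```

Because equal normalized expressions are equal tuples, `ident` hands out the same integer for both, and the set of derivatives doubles as the state set of a deterministic automaton.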
An Efficient Implementation of Guard-Based Synchronization for an Object-Oriented Programming Language
Shucai Yao, Emil Sekerinski
In the shared variable model of concurrency, guarded atomic actions restrict the possible interference between processes by regions of atomic execution. The guard specifies the condition for entering an atomic region. This is a convenient model for the specification and verification of concurrent programs, but it has eluded efficient execution so far. This article shows how guarded atomic actions, when attached to objects, can be implemented highly efficiently using a combination of coroutines, operating-system worker threads, and dedicated management of object queues and stacks. The efficiency of an experimental language, Lime, is shown to compare favourably with that of C/Pthreads, Go, Erlang, Java, and Haskell on synthetic benchmarks.
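As an illustration of the programming model only, the sketch below attaches guarded atomic actions to an object using a lock and a condition variable. This is the straightforward textbook encoding, not the coroutine-based implementation strategy the abstract describes for Lime. The class `BoundedBuffer` and its methods are hypothetical.

```python
import threading

class BoundedBuffer:
    """Each method is an atomic action that runs only once its guard holds."""

    def __init__(self, capacity):
        self._items = []
        self._capacity = capacity
        self._cond = threading.Condition()

    def put(self, x):
        with self._cond:
            # Guard: the buffer is not full.
            self._cond.wait_for(lambda: len(self._items) < self._capacity)
            self._items.append(x)
            self._cond.notify_all()  # guards of waiting actions may now hold

    def get(self):
        with self._cond:
            # Guard: the buffer is not empty.
            self._cond.wait_for(lambda: len(self._items) > 0)
            x = self._items.pop(0)
            self._cond.notify_all()
            return x

def demo(n=20):
    # One producer thread and the calling thread as consumer.
    buf = BoundedBuffer(2)
    producer = threading.Thread(target=lambda: [buf.put(i) for i in range(n)])
    producer.start()
    received = [buf.get() for _ in range(n)]
    producer.join()
    return received
```

Every state change wakes all waiting threads so that they re-evaluate their guards, which is simple but costly; that overhead is the kind of inefficiency a dedicated implementation aims to avoid.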
Dictionary of P. S. Pallas: Archaic and Innovative Features
Inna B. Mandzhieva
Introduction. The paper examines and describes some archaic and innovative features identified in a word list of P. S. Pallas from the treatise titled ‘Comparative Dictionaries of All Languages and Dialects Collected by the Order of Her Imperial Majesty’ and printed by I. K. Schnor (St. Petersburg) in 1787–1789. Special attention is paid to specific traits once inherent to various Kalmyk dialects. Goals. The study attempts a description of archaic and innovative features traced in the Kalmyk word list contained in the mentioned work of P. S. Pallas. Materials. The analysis focuses on the Kalmyk word list (including word combinations) from the specified dictionary that comprises a total of 531 entries. The former is supplemented with Proto-Mongolic reconstructions by H. Nugteren, seventeenth- and eighteenth-century Kalmyk dictionaries by G. F. Müller, B. Bergmann and J. Klaproth published in G. Doerfer’s Ältere Westeuropäische Quellen zur Kalmückischen Sprachengeschichte, and The Dictionary of Kalmyk edited by B. Muniev. Results. The LingvoDoc-based survey has yielded certain correspondence rows of vowel graphemes compiled from ones in the dictionary of P. S. Pallas, that of modern Kalmyk, and reconstruction works of H. Nugteren. There are some archaisms and innovations that may have been recorded from an unidentified dialect, and the traces can be observed in other seventeenth- and eighteenth-century dictionaries too. The examined materials contain features of the Dorbet, Torghut, Buzav and Orenburg dialects.
History (General), Oriental languages and literatures
Semidirect Product Decompositions for Periodic Regular Languages
Yusuke Inoue, Kenji Hashimoto, Hiroyuki Seki
The definition of period in finite-state Markov chains can be extended to regular languages by considering the transitions of DFAs accepting them. For example, the language $(ΣΣ)^*$ has period two because the length of a recursion (cycle) in its DFA must be even. This paper shows that the period of a regular language appears as a cyclic group within its syntactic monoid. Specifically, we show that a regular language has period $P$ if and only if its syntactic monoid is isomorphic to a submonoid of a semidirect product between a specific finite monoid and the cyclic group of order $P$. Moreover, we explore the relation between the structure of Markov chains and our result, and apply this relation to the theory of probabilities of languages. We also discuss the Krohn-Rhodes decomposition of finite semigroups, which is strongly linked to our methods.
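The $(ΣΣ)^*$ example can be checked mechanically. The sketch below assumes a strongly connected DFA and takes the period to be the greatest common divisor of its cycle lengths, as for an irreducible Markov chain; the paper's precise definition for general regular languages may differ. The function name `period` and the dictionary encoding of transitions are hypothetical.

```python
from collections import deque
from math import gcd

def period(transitions, start):
    """gcd of all cycle lengths of a strongly connected transition graph.

    transitions maps each state to a dict from symbols to successor states.
    """
    level = {start: 0}
    queue = deque([start])
    g = 0
    while queue:
        u = queue.popleft()
        for v in transitions[u].values():
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
            # Every edge u -> v contributes level[u] + 1 - level[v] to the gcd.
            g = gcd(g, level[u] + 1 - level[v])
    return g

# DFA for (ΣΣ)* over Σ = {a, b}: state 0 means even length, state 1 means odd length.
even_length = {0: {"a": 1, "b": 1}, 1: {"a": 0, "b": 0}}
# DFA for Σ*: a single state with self-loops.
everything = {0: {"a": 0, "b": 0}}
```

Here `period(even_length, 0)` evaluates to 2, matching the abstract's example, while `period(everything, 0)` evaluates to 1.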
A Second Soul: Celebrating the Many Languages of Programming -- Festschrift in Honor of Peter Thiemann's Sixtieth Birthday
Annette Bieniusa, Markus Degen, Stefan Wehr
This Festschrift is dedicated to Peter Thiemann on the occasion of his sixtieth birthday, celebrating his significant contributions to the field of programming languages. Over the span of more than three decades, Peter has worked on a wide array of topics. This collection of five articles reflects the diversity of his work. The articles cover areas such as partial evaluation and reversible programming, proof assistants and dependent types, discrete mathematics and dynamic programming, functional and object-oriented programming.
A Comparison of The Content Of The Azerbaijan Press in The Reza Shah Period and The First Four Year Rule of Mohammad Reza Shah (1925-1945)
Zühre Nur Celep
The quality of the newspapers published in Azerbaijan during the reign of Reza Shah had changed radically compared to previous periods. The pages of the publications of the period were devoted to praising Reza Shah and contained various internal news and advertisements about Iran and Azerbaijan. No criticisms were found regarding the current situation of the country or the senior officials of the Pahlavi administration. Newspapers had a short lifespan and didn’t feature political or social articles addressing issues such as human rights and freedoms. However, the newspapers published between 1941-1945 started to include criticisms of the complicated situations of Iran and Azerbaijan during the administration of Reza Shah and his son. In addition, the publication of Turkish articles in newspapers, as well as the increase in quality and quantity indicated that the newspapers had removed the police state atmosphere that had been dominating the community. The most important reason for the oppressive situation during the initial period of 1925-1940 had been the absolute nature of the political power, its overwhelming dominance, and the lack of freedom of expression. However, the prevailing environment of political freedom during the second period of 1941-1945 led to the strengthening of the activities of parties and structures with Soviet tendencies. As a result, newspaper publications were realized in integration with these aforementioned movements.
Oriental languages and literatures
Decolonizing imperialist discourse in Jane Austen’s Persuasion: A Saidian perspective
Muna Abd-Rabbo, Ghadir Zalloum, Ziad Nemrawi
In his highly influential work Culture and Imperialism, Edward Said unravels the imperialist undertones in Jane Austen’s Mansfield Park. Throughout the chapter entitled “Jane Austen and the Empire,” Said demonstrates how this seemingly domestic novel of manners, not normally associated with imperialism, is actually densely saturated with colonialist discourse. For Said, the marginalized representation of the colonized territory of Antigua as simply a “colonial garden” for the British imperial patriarch further accentuates the superior sense of colonialist entitlement. Thus, Said’s approach in decolonizing the imperialist discourse in Mansfield Park may be extended to other canonical works not generally considered imperialist in nature. In this article, the researchers utilize Said’s strategies involved in his reading of Mansfield Park to probe the imperialist nuances in Austen’s Persuasion, a novel usually categorized as a romance/novel of manners which depicts two lovers’ second chance at happiness despite all the social obstacles in their way. The researchers attempt to foreground the imperialist rhetoric in this novel, specifically Austen’s tendency to romanticize and glorify the rising British naval society as the champions of the Empire. Furthermore, this article investigates the absent, peripheral representation of colonial terrains as opposed to the privileged, central position of the British Empire in the narrative.
Oriental languages and literatures
Unifying Static and Dynamic Intermediate Languages for Accelerator Generators
Caleb Kim, Pai Li, Anshuman Mohan
et al.
Compilers for accelerator design languages (ADLs) translate high-level languages into application-specific hardware. ADL compilers rely on a hardware control interface to compose hardware units. There are two choices: static control, which relies on cycle-level timing; or dynamic control, which uses explicit signalling to avoid depending on timing details. Static control is efficient but brittle; dynamic control incurs hardware costs to support compositional reasoning. Piezo is an ADL compiler that unifies static and dynamic control in a single intermediate language (IL). Its key insight is that the IL's static fragment is a refinement of its dynamic fragment: static code admits a subset of the run-time behaviors of the dynamic equivalent. Piezo can optimize code by combining facts from static and dynamic submodules, and it opportunistically converts code from dynamic to static control styles. We implement Piezo as an extension to an existing dynamic ADL compiler, Calyx. We use Piezo to implement an MLIR frontend, a systolic array generator, and a packet-scheduling hardware generator to demonstrate its optimizations and the static-dynamic interactions it enables.
Language-Integrated Query for Temporal Data (Extended version)
Simon Fowler, Vashti Galpin, James Cheney
Modern applications often manage time-varying data. Despite decades of research on temporal databases, which culminated in the addition of temporal data operations into the SQL:2011 standard, temporal data query and manipulation operations are unavailable in most mainstream database management systems, leaving developers with the unenviable task of implementing such functionality from scratch. In this paper, we extend \emph{language-integrated query} to support writing temporal queries and updates in a uniform host language, with the language performing the required rewriting to emulate temporal capabilities automatically on any standard relational database. We introduce two core languages, $λ_{\mathsf{TLINQ}}$ and $λ_{\mathsf{VLINQ}}$, for manipulating transaction time and valid time data respectively, and formalise existing implementation strategies by giving provably correct semantics-preserving translations into a non-temporal core language, $λ_{\mathsf{LINQ}}$. We show how existing work on query normalisation supports a surprisingly simple implementation strategy for \emph{sequenced joins}. We implement our approach in the Links programming language, and describe a non-trivial case study based on curating COVID-19 statistics.
Removing Qualified Names in Modular Languages
Keehang Kwon, Daeseong Kang
Although the notion of qualified names is popular in module systems, it causes severe complications. In this paper, we propose an alternative to qualified names. The key idea is to import the declarations in other modules to the current module before they are used. In this way, all the declarations can be accessed locally. However, this approach is not efficient in memory usage. Our contribution is the {\it module weakening} scheme which allows us to import the minimal parts. As an example of this approach, we propose a module system for functional languages.
Symmetries in Reversible Programming: From Symmetric Rig Groupoids to Reversible Programming Languages
Vikraman Choudhury, Jacek Karwowski, Amr Sabry
The $\Pi$ family of reversible programming languages for boolean circuits is presented as a syntax of combinators witnessing type isomorphisms of algebraic datatypes. In this paper, we give a denotational semantics for this language, using the language of weak groupoids à la Homotopy Type Theory, and show how to derive an equational theory for it, presented by 2-combinators witnessing equivalences of reversible circuits. We establish a correspondence between the syntactic groupoid of the language and a formally presented univalent subuniverse of finite types. The correspondence relates 1-combinators to 1-paths, and 2-combinators to 2-paths in the universe, which is shown to be sound and complete for both levels, establishing full abstraction and adequacy. We extend the already established Curry-Howard correspondence for $\Pi$ to a Curry-Howard-Lambek correspondence between Reversible Logic, Reversible Programming Languages, and Symmetric Rig Groupoids, by showing that the syntax of $\Pi$ is presented by the free symmetric rig groupoid, given by finite sets and permutations. Our proof uses techniques from the theory of group presentations and rewriting systems to solve the word problem for symmetric groups. Using the formalisation of our results, we show how to perform normalisation-by-evaluation, verification, and synthesis of reversible logic gates, motivated by examples from quantum computing.
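The phrase "combinators witnessing type isomorphisms" can be illustrated with a small untyped sketch in which a combinator is a function packaged with its inverse. This is only an informal model, not the paper's syntax or semantics. The names `Iso`, `swap_plus`, `swap_times` and the tagged-tuple encoding of sums are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Iso:
    """A reversible combinator: a forward function together with its inverse."""
    fwd: Callable[[Any], Any]
    bwd: Callable[[Any], Any]

    def inverse(self) -> "Iso":
        return Iso(self.bwd, self.fwd)

    def then(self, other: "Iso") -> "Iso":
        # Sequential composition; the inverse undoes the steps in reverse order.
        return Iso(lambda x: other.fwd(self.fwd(x)),
                   lambda y: self.bwd(other.bwd(y)))

def _flip(tagged):
    tag, value = tagged
    return ("inr" if tag == "inl" else "inl", value)

# a + b <-> b + a, with sums encoded as ("inl", v) or ("inr", v).
swap_plus = Iso(_flip, _flip)
# a * b <-> b * a, with products encoded as pairs.
swap_times = Iso(lambda p: (p[1], p[0]), lambda p: (p[1], p[0]))

# Booleans encoded as 1 + 1, so negation is just swap_plus.
FALSE, TRUE = ("inl", ()), ("inr", ())
NOT = swap_plus
```

Composing any combinator with its inverse behaves as the identity, for example `swap_times.then(swap_times.inverse())`; equations of this kind between circuits are what the abstract's 2-combinators are said to witness.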
German wh-copying: a top-down analysis
Giuseppe Rugna
German wh-copying is often taken to represent clear evidence for successive cyclicity and for the Copy Theory of Movement. The generative literature has focused on a particular type of wh-copying displaying morphophonological identity among the overtly realized members of the A’-chain. The present article discusses the case of two additional types of wh-copying found in German, i.e. ‘imperfect’ and ‘complex’ wh-copying. It will be argued that standard bottom-up analyses run into a few complications when extended to account for the latter types of wh-copying. A novel analysis embedded in a Top-Down derivational model of grammar is then proposed, which is argued to be conceptually as well as empirically superior to more traditional alternatives. The analysis of complex wh-copying in German is further extended to the case of Afrikaans and dialectal Dutch.
Language. Linguistic theory. Comparative grammar, Oriental languages and literatures
Metadiscourse Markers in Scientific Journal Articles
Veronica Esti Nugrahani, Barli Bram
This paper aimed to investigate the use of metadiscourse markers in scientific journal articles. Data of this qualitative research consisted of metadiscourse markers collected from eight journal articles of a special edition published by LLT Journal: A Journal on Language and Language Teaching. The collected metadiscourse markers used in the journal articles were analyzed using discourse analysis based on ten metadiscourse marker categories. Results showed that the analysed journal articles contained 708 metadiscourse markers, with more interactive metadiscourse markers, reaching 529 occurrences, than interactional metadiscourse markers, occurring 179 times. Transitions, such as “but” and “thus”, with 249 occurrences, were the most frequently-used metadiscourse marker and boosters, such as “in fact” and “definitely”, with 24 occurrences, were the least productive marker. Thus, readers can gain a better understanding of the use of metadiscourse markers when using English. It is expected that English language learners and instructors can benefit from the results of this study, particularly concerning the use of metadiscourse markers in academic writing.
Language and Literature, English language
TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue
Chien-Sheng Wu, Steven Hoi, Richard Socher
et al.
The underlying difference of linguistic patterns between general text and task-oriented dialogue makes existing pre-trained language models less useful in practice. In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling. To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling. We propose a contrastive objective function to simulate the response selection task. Our pre-trained task-oriented dialogue BERT (TOD-BERT) outperforms strong baselines like BERT on four downstream task-oriented dialogue applications, including intention recognition, dialogue state tracking, dialogue act prediction, and response selection. We also show that TOD-BERT has a stronger few-shot ability that can mitigate the data scarcity problem for task-oriented dialogue.
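The step of incorporating user and system tokens can be pictured as flattening a dialogue into one token sequence before masked language modeling. The sketch below is a minimal illustration under assumptions: the token strings `[USR]` and `[SYS]`, the function name `flatten_dialogue`, and the `(speaker, utterance)` input format are chosen for illustration and are not taken from the abstract.

```python
def flatten_dialogue(turns, usr_token="[USR]", sys_token="[SYS]"):
    """Prefix each utterance with a speaker token and join the turns.

    turns is a list of (speaker, utterance) pairs, speaker being "user" or "system".
    """
    pieces = []
    for speaker, utterance in turns:
        token = usr_token if speaker == "user" else sys_token
        pieces.append(f"{token} {utterance.strip()}")
    return " ".join(pieces)

dialogue = [
    ("user", "I need a cheap hotel."),
    ("system", "Which area do you prefer?"),
    ("user", "The city centre."),
]
flat = flatten_dialogue(dialogue)
```

The flattened string would then be tokenized and masked as in ordinary BERT-style pre-training, with the speaker tokens added to the vocabulary.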
A Simple Language Model for Task-Oriented Dialogue
Ehsan Hosseini-Asl, Bryan McCann, Chien-Sheng Wu
et al.
Task-oriented dialogue is often decomposed into three tasks: understanding user input, deciding actions, and generating a response. While such decomposition might suggest a dedicated model for each sub-task, we find a simple, unified approach leads to state-of-the-art performance on the MultiWOZ dataset. SimpleTOD is a simple approach to task-oriented dialogue that uses a single, causal language model trained on all sub-tasks recast as a single sequence prediction problem. This allows SimpleTOD to fully leverage transfer learning from pre-trained, open domain, causal language models such as GPT-2. SimpleTOD improves over the prior state-of-the-art in joint goal accuracy for dialogue state tracking, and our analysis reveals robustness to noisy annotations in this setting. SimpleTOD also improves the main metrics used to evaluate action decisions and response generation in an end-to-end setting: inform rate by 8.1 points, success rate by 9.7 points, and combined score by 7.2 points.
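Recasting all sub-tasks as a single sequence prediction problem amounts to serializing the dialogue context, belief state, actions and response into one string that a causal language model can be trained on. The sketch below shows one way to do this. The delimiter strings, the function name `serialize_turn` and the triple format for belief states and actions are assumptions made for illustration, not details given in the abstract.

```python
def serialize_turn(context, belief, actions, response):
    """Concatenate all sub-task targets into one training sequence.

    belief and actions are lists of (domain, slot_or_act, value) triples.
    """
    belief_str = ", ".join(" ".join(triple) for triple in belief)
    action_str = ", ".join(" ".join(triple) for triple in actions)
    return (
        f"<|context|> {context} <|endofcontext|> "
        f"<|belief|> {belief_str} <|endofbelief|> "
        f"<|action|> {action_str} <|endofaction|> "
        f"<|response|> {response} <|endofresponse|>"
    )

example = serialize_turn(
    context="user: i need a cheap hotel",
    belief=[("hotel", "pricerange", "cheap")],
    actions=[("hotel", "request", "area")],
    response="which area do you prefer?",
)
```

At inference time, a model trained on such sequences would be given only the context segment and asked to generate the remaining segments left to right.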
Implementation of the Qiyasiyah Method and Its Effect on Santri's Ability to Understand the Kitab Al-Jurumiyah
Mochamad Mu’izzuddin
This study aims to determine how the qiyȃsiy method is implemented at Pesantren Ath-Thahiriyah, to assess the santri's ability to understand Al-Jurumiyah, to determine the relationship between the implementation of the qiyȃsiy method and the santri's ability to understand the kitab Al-Jurumiyah, and to determine the effect of implementing the qiyȃsiy method on the santri's ability to understand the kitab Jurumiyah. The study uses a survey with a correlational approach and a quasi-experiment with a Nonequivalent Control Group Pretest-Posttest Design. The population and sample are all santri of Pesantren Ath-Thahiriyah Lontar Baru, Serang City, Banten Province, 30 people in total. Data were collected through questionnaires, interviews, and tests, and were processed with the aid of SPSS version 16.0. The results show that the qiyȃsiy method at Pesantren Ath-Thahiriyah Lontar Baru is implemented in every Al-Jurumiyah study session, rated in the category often/good with a mean percentage of 50.7%; that the santri's ability to understand the kitab Al-Jurumiyah shows a mean of 86.83, a median of 90.75, and a mode of 98.59, categorized as very good; that there is no positive and significant relationship between the implementation of the qiyȃsiy method and the ability to understand the kitab Jurumiyah (0.119); and that the implementation of the qiyȃsiy method has a positive and significant effect on the ability to understand the kitab Jurumiyah (8.20), with variable X contributing 67.24% to variable Y and the remaining roughly 32.76% influenced by other factors not examined.
Education, Education (General)
Book review
Oriental languages and literatures