We study grammar-constrained decoding (GCD) as a coupling between an autoregressive next-token distribution and a reachability oracle over a pushdown system compiled from a context-free grammar (CFG). We prove an oracle invariance theorem: language-equivalent grammars induce identical admissible next-token sets for every prefix, hence identical logit masks, yet can yield provably different compiled state spaces and online ambiguity costs. We give exact control-state blowup counts for the canonical $a^n b^n$ language under redundant nonterminal delegation, and introduce a left-to-right structural ambiguity cost (SAC) measuring incremental packed-parse-forest growth per token. For two equivalent grammars over all finite strings, SAC is $O(1)$ per token under right-recursion but $\Theta(t^2)$ per token and $\Theta(n^3)$ cumulatively under concatenation. We establish engine-independent lower bounds: any sound, retrieval-efficient, parse-preserving online masking engine must incur $\Omega(t^2)$ work per token on a specific constant-size CFG family, unconditionally within this model. We define decoding-cost equivalence classes of grammars and prove existence of minimal-SAC representatives within bounded rewrite families. Finally, we characterize the true conditional sampler via a Doob $h$-transform and derive sharp one-step KL and total-variation distortion bounds for hard-masked decoding in terms of survival-probability spread among admissible next tokens. We integrate these results with Transformer and Mixture-of-Experts architectures, derive latency envelopes in terms of vocabulary size, active state sets, and beam width, and connect SAC to instrumentation-based predictive performance models and automated grammar optimization.
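To make the notions of an admissible next-token set and a hard logit mask concrete, the following is a minimal sketch for the $a^n b^n$ language. It rests on several assumptions that are not from the abstract: $n \ge 1$, a hypothetical three-token vocabulary `VOCAB`, and a hand-written counter standing in for the compiled pushdown system. The names `admissible` and `hard_mask` are illustrative only. The sketch renormalizes locally over admissible tokens, which is plain hard masking and not the Doob $h$-transform sampler.

```python
import math

# Hypothetical vocabulary; "<eos>" marks the end of the string.
VOCAB = ["a", "b", "<eos>"]


def admissible(prefix):
    """Tokens that keep `prefix` extendable to some a^n b^n with n >= 1.

    Returns the empty set for a prefix that can no longer be completed.
    """
    n_a = n_b = 0
    for tok in prefix:
        if tok == "a" and n_b == 0:
            n_a += 1          # still in the leading block of a's
        elif tok == "b" and n_b < n_a:
            n_b += 1          # closing one pending 'a'
        else:
            return set()      # dead prefix (e.g. 'a' after 'b', or too many b's)
    if n_b == 0:
        return {"a", "b"} if n_a > 0 else {"a"}
    return {"b"} if n_b < n_a else {"<eos>"}


def hard_mask(logits, prefix):
    """Zero out inadmissible tokens and renormalize the softmax weights."""
    allowed = admissible(prefix)
    if not allowed:
        raise ValueError("prefix cannot be extended to a string of the language")
    weights = [math.exp(l) if tok in allowed else 0.0
               for tok, l in zip(VOCAB, logits)]
    total = sum(weights)
    return [w / total for w in weights]
```

For example, `hard_mask([0.0, 0.0, 0.0], ["a"])` returns `[0.5, 0.5, 0.0]`, since only `a` and `b` are admissible after the prefix `a`. Because the mask depends only on the language, any equivalent grammar for $a^n b^n$ would yield the same output here, whatever its compiled state space looks like.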
The aim of this article is to discuss the most significant references in Andrzej Sapkowski's Witcher saga to mythology, culture, and history, and to illustrate the linguistic aspects of the novels. The article sets out to show that even popular literature (fantasy) contains many interesting elements that merit in-depth analysis. Drawing on existing research on the Witcher novels and computer games, it discusses the most important allusions in the Witcher cycle, gives brief descriptions of the non-human beings, and analyzes the language of the characters created by Sapkowski as a new linguistic system. The research theses are supported by quotations from individual volumes of the saga and from the games. An attempt is made to interpret the meanings of selected linguistic forms of the Elder Speech on the basis of available sources. These studies show that it is not without reason that the Witcher saga has become a phenomenon in global pop culture.
What counts as evidence for syntactic structure? In traditional generative grammar, systematic contrasts in grammaticality such as subject-auxiliary inversion and the licensing of parasitic gaps are taken as evidence for an internal, hierarchical grammar. In this paper, we test whether large language models (LLMs), trained only on surface forms, reproduce these contrasts in ways that imply an underlying structural representation. We focus on two classic constructions: subject-auxiliary inversion (testing recognition of the subject boundary) and parasitic gap licensing (testing abstract dependency structure). We evaluate models including GPT-4 and LLaMA-3 using prompts eliciting acceptability ratings. Results show that LLMs reliably distinguish between grammatical and ungrammatical variants in both constructions, which supports the view that they are sensitive to structure and not just linear order. Structural generalizations, distinct from cognitive knowledge, emerge from predictive training on surface forms, suggesting functional sensitivity to syntax without explicit encoding.
The article deals with contemporary approaches to teaching English grammar in U.S. universities in the context of ongoing transformations in higher education and the growing demand for communicative and professionally oriented foreign language training. Grammar instruction is viewed not as an isolated component of language learning, but as an integral part of communicative competence development that supports students’ ability to use language accurately, fluently, and appropriately in academic and professional settings. The study focuses on four widely implemented and theoretically grounded approaches to grammar teaching in U.S. higher education institutions: Communicative Grammar Teaching (CGT), Task-Based Language Teaching (TBLT), inductive and deductive approaches to grammar instruction, and Technology-Enhanced Grammar Teaching. The article provides a detailed analysis of the theoretical foundations of each approach. Special attention is paid to the pedagogical principles underlying communicative and task-based grammar instruction, which emphasize meaningful interaction, contextualized language use, and learner engagement in problem-solving activities. The inductive and deductive approaches are examined in terms of their cognitive and methodological value, highlighting their relevance for different learning styles, proficiency levels, and instructional goals. The study also explores the role of digital technologies in grammar teaching, including online corpora, mobile applications, and adaptive learning platforms, which contribute to individualized instruction, increased learner autonomy, and formative assessment. The article argues that effective grammar teaching in U.S. universities is characterized by methodological flexibility, integration of form and meaning, and the purposeful combination of traditional and innovative instructional practices. 
It is concluded that the balanced use of the analyzed approaches enhances students’ grammatical accuracy, communicative competence, and motivation for learning, and contributes to the overall quality of foreign language education in higher education institutions.
Evarista Onokpasa, Sebastian Wild, Prudence W. H. Wong
In past work (Onokpasa, Wild, Wong, DCC 2023), we showed that (a) for joint compression of RNA sequence and structure, stochastic context-free grammars are the best known compressors and (b) grammars with better compression ability also perform better in ab initio structure prediction. Previous grammars were manually curated by human experts. In this work, we develop a framework for automatic and systematic search algorithms for stochastic grammars with better compression (and prediction) ability for RNA. We perform an exhaustive search of small grammars and identify grammars that surpass the performance of human-expert grammars.
Surgical procedures are inherently complex and dynamic, with intricate dependencies and various execution paths. Accurate identification of the intentions behind critical actions, referred to as Primary Intentions (PIs), is crucial to understanding and planning the procedure. This paper presents a novel framework that advances PI recognition in instructional videos by combining top-down grammatical structure with bottom-up visual cues. The grammatical structure is based on a rich corpus of surgical procedures, offering a hierarchical perspective on surgical activities. A grammar parser, utilizing the surgical activity grammar, processes visual data obtained from laparoscopic images through surgical action detectors, ensuring a more precise interpretation of the visual information. Experimental results on the benchmark dataset demonstrate that our method outperforms existing surgical activity detectors that rely solely on visual features. Our research provides a promising foundation for developing advanced robotic surgical systems with enhanced planning and automation capabilities.
This study examines the development of Arabic grammar, focusing on a comparison between the traditional grammatical schools and modern approaches. It reviews the emergence of Arabic grammar at the hands of Abu al-Aswad al-Du'ali and highlights the contributions of the Basran and Kufan schools during the Abbasid era: the Basran school is characterized by strict adherence to normative rules, whereas the Kufan school is characterized by flexibility and acceptance of dialectal variation. The study also discusses contemporary challenges in teaching grammar, such as the complexity of the rules and low interest in studying them. It analyzes modern attempts to simplify grammar, including the efforts of Ibrahim Mustafa in his book Ihya' al-Nahw (The Revival of Grammar) and modern school curricula. The study further highlights the role of modern technology, such as artificial intelligence applications and interactive software, in teaching grammar and correcting texts. The findings show the importance of devising new methods to simplify grammar and make it easier to learn, in order to ensure its continued role in safeguarding the Arabic language and preserving its identity.
The current study uses a pseudo-longitudinal design to investigate the trajectory of Foreign Language Enjoyment (FLE), Foreign Language Peace of Mind (FLPOM), Foreign Language Classroom Anxiety (FLCA), and Foreign Language Boredom (FLB) among a total of 502 Beginner, Intermediate and Advanced English as a Foreign Language (EFL) learners in Morocco who filled out a single online questionnaire. Statistical results showed that motivation remained unchanged across skill levels but that positive emotions increased significantly and negative emotions dropped significantly, with the transition from Beginner to Intermediate skill levels showing the biggest change. The direction of relationships between the dependent variables remained similar although their strengths varied slightly across skill levels, reflecting the dynamic nature of FL learners’ emotions and motivation.
Special aspects of education, Language acquisition
Neural QCFG is a grammar-based sequence-to-sequence (seq2seq) model with strong inductive biases on hierarchical structures. It excels in interpretability and generalization but suffers from expensive inference. In this paper, we study two low-rank variants of Neural QCFG for faster inference with different trade-offs between efficiency and expressiveness. Furthermore, utilizing the symbolic interface provided by the grammar, we introduce two soft constraints over tree hierarchy and source coverage. We experiment with various datasets and find that our models outperform vanilla Neural QCFG in most settings.
Natural Language Generation (NLG) refers to expressing the computational results of a system in human language. Because the quality of sentences generated by an NLG model cannot be fully captured by quantitative evaluation alone, they are also evaluated qualitatively by humans, who score the meaning or grammar of a sentence according to subjective criteria. However, existing evaluation methods suffer from large score deviations that depend on each evaluator's criteria. In this paper, we propose Grammar Accuracy Evaluation (GAE), which provides specific evaluation criteria. An analysis of machine translation quality using BLEU and GAE confirmed that the BLEU score does not represent the absolute performance of machine translation models and that GAE compensates for the shortcomings of BLEU through flexible evaluation of alternative synonyms and changes in sentence structure.
We study the relationship between first-order multiplicative linear logic (MLL1), which is known to provide representations of various categorial grammars, and the recently introduced extended tensor type calculus (ETTC). We identify a fragment of MLL1 that seems sufficient for many grammar representations and establish a correspondence between ETTC and this fragment. The system ETTC can thus be seen as an alternative syntax and intrinsic deductive system, together with a geometric representation, for the latter. We also give a natural deduction formulation of ETTC, which might be convenient.
Alla Mykhailivna Bogush, Tetiana Mykhailivna Korolova, Popova Oleksandra Volodymyrivna
The article covers issues related to the development of reading skills of students majoring or minoring in English and Chinese (as non-native languages). Against the backdrop of linguistic differences between English and Chinese, this action research was conducted to investigate the components of the reading skills that are to be developed within the Bachelor programs. The primary purpose of the article is to analyze the methodological background for teaching Ukrainian students to perceive information from authentic texts. The methods of induction and deduction enabled us to analyze and generalize the theoretical bases for the investigated topic and to systematize the results of the study (the reading tactics and strategies, classification of reading activities). The study was based on focused observation, using the register as a data collection tool, over two semesters in each of three groups of third-year students at Ushynsky University. The total sample size was 54. The article presents an analysis of difficulties in reading English and Chinese texts: 1) phonological level – differences in sound pronunciation (English: /T/, /D/, /w/, /N/, /x/, etc.; Chinese: the alveolo-palatal consonants j, q, x; affricates zh, z; consonant r, etc.), the phonetic phenomena (English: nasal plosion, lateral plosion, loss of plosion, assimilation, reduction/elision, etc.; Chinese: tone, erization); 2) lexical level – conversion (in English) and transposition (in Chinese), homonymy, polysemy; 3) grammatical level – the division of the lexicon into parts of speech, different word order in English and Chinese sentences, (non)segmentation of English and Chinese syntagms/clauses/compound sentences, use of tenses, etc. The article contains some recommendations for English and Chinese reading classrooms.
The heterogeneity of tools that support temporal logic formulae poses several challenges in terms of interoperability. In particular, a standard syntax for temporal logic on finite traces, despite being similar to the one for infinite traces, is currently missing. This document proposes a standard grammar for several temporal logic formalisms interpreted over finite traces, like Linear Temporal Logic (LTLf), Linear Dynamic Logic (LDLf), Pure-Past Linear Temporal Logic (PLTLf) and Pure-Past Linear Dynamic Logic (PLDLf).
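As a rough illustration of what interpreting a formula over a finite trace involves, here is a minimal sketch of the usual LTLf semantics for a few operators. The tuple-based formula encoding and the function name `holds` are hypothetical choices made for this example; they are not the concrete syntax proposed by the document, and the sketch assumes a non-empty trace given as a list of sets of atomic propositions.

```python
def holds(formula, trace, i=0):
    """Check whether `formula` holds at position i of a non-empty finite trace.

    Formulas are nested tuples such as ("until", ("atom", "a"), ("atom", "b")).
    This encoding is an assumption made for illustration only.
    """
    op = formula[0]
    if op == "atom":
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "next":
        # Strong next: a successor position must exist in the finite trace.
        return i + 1 < len(trace) and holds(formula[1], trace, i + 1)
    if op == "eventually":
        return any(holds(formula[1], trace, k) for k in range(i, len(trace)))
    if op == "always":
        return all(holds(formula[1], trace, k) for k in range(i, len(trace)))
    if op == "until":
        # Some position k satisfies the right operand, and the left operand
        # holds at every position from i up to (but excluding) k.
        return any(
            holds(formula[2], trace, k)
            and all(holds(formula[1], trace, j) for j in range(i, k))
            for k in range(i, len(trace))
        )
    raise ValueError(f"unknown operator: {op}")
```

On the trace `[{"a"}, {"a"}]`, the formula `("always", ("next", ("atom", "a")))` evaluates to `False`, because the last position has no successor. This is one place where finite-trace semantics departs from the infinite-trace reading despite the similar syntax.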
Oskar van der Wal, Silvan de Boer, Elia Bruni et al.
In this paper, we consider the syntactic properties of languages that emerge in referential games, using unsupervised grammar induction (UGI) techniques originally designed to analyse natural language. We show that the considered UGI techniques are appropriate for analysing emergent languages, and we then study whether the languages that emerge in a typical referential game setup exhibit syntactic structure, and to what extent this depends on the maximum message length and number of symbols that the agents are allowed to use. Our experiments demonstrate that a certain message length and vocabulary size are required for structure to emerge, but they also illustrate that more sophisticated game scenarios are required to obtain syntactic properties more akin to those observed in human language. We argue that UGI techniques should be part of the standard toolkit for analysing emergent languages and release a comprehensive library to facilitate such analysis for future researchers.
In the history of human creativity, the act of imagining the impossible has always been at the core of the physical and metaphysical perception of the unknown. The scholarly debate regarding the nature of the impossible gained particular relevance in the context of the British Enlightenment, when the expanding sciences, along with literature, attempted to provide empirical validation for inexplicable and supernatural phenomena. In this way, the discrepancies between the overlapping ontologies of the Age of Faith and the Age of Reason became apparent as the ancestral literary practice of the fantastic merged with the rising genre of the novel. The assimilation of the conventional tropes of supernatural literature within the narrative frame of formal realism led to the development of two successful sub-genres: the Gothic and Science Fiction. The former evolved around the mutual disruption of the empirically-based conception of reality and the transgression of the moral code implied in the construction of civic order. The latter derived from the relocation of specific gothic features into a larger dimension of social anxiety concerning the abuses of reason concealed as a path towards common good and future progress.
By exploring the evolution of the gothic imagery and its dissolution into the narrative horizon of Science Fiction, this article will trace the early modern roots of the dialogue between science and literature in the human quest for the impossible. The thesis that Gothic and Science Fiction are historically interdependent will be reviewed in light of the common matrix of fear and desire which characterises their ideological function.
Geography. Anthropology. Recreation, Language. Linguistic theory. Comparative grammar
This paper describes continuing work on semantic frame slot filling for a command and control task using a weakly-supervised approach. We investigate the advantages of using retraining techniques that take the output of a hierarchical hidden Markov model as input to two inductive approaches: (1) discriminative sequence labelers based on conditional random fields and memory-based learning and (2) probabilistic context-free grammar induction. Experimental results show that this setup can significantly improve F-scores without the need for additional information sources. Furthermore, qualitative analysis shows that the weakly supervised technique is able to automatically induce an easily interpretable and syntactically appropriate grammar for the domain and task at hand.
Peter beim Graben, Ronald Römer, Werner Meyer et al.
Speech-controlled user interfaces make it easier for laypeople to operate devices and household functions. State-of-the-art language technology scans the acoustically analyzed speech signal for relevant keywords that are subsequently inserted into semantic slots to interpret the user's intent. In order to develop proper cognitive information and communication technologies, simple slot-filling should be replaced by utterance meaning transducers (UMT) that are based on semantic parsers and a mental lexicon, comprising syntactic, phonetic and semantic features of the language under consideration. This lexicon must be acquired by a cognitive agent during interaction with its users. We outline a reinforcement learning algorithm for the acquisition of the syntactic morphology and arithmetic semantics of English numerals, based on minimalist grammar (MG), a recent computational implementation of generative linguistics. Number words are presented to the agent by a teacher in the form of utterance meaning pairs (UMP), where the meanings are encoded as arithmetic terms from a suitable term algebra. Since MG encodes universal linguistic competence through inference rules, thereby separating innate linguistic knowledge from the contingently acquired lexicon, our approach unifies generative grammar and reinforcement learning, hence potentially resolving the still pending Chomsky-Skinner controversy.