Communicative behaviour studies require numerous methodological approaches, depending on the goal and tasks of the research. Linguopolitical personology conceives of political communication as an institutionalized phenomenon aimed at holding power or winning the race for power, which allows researchers to employ a particular toolset to explore a politician’s communicative behaviour. This article seeks to provide effective methods for crafting communicative types of political personality. The typology hinges on cross-disciplinary criteria and comprises seven types (The Defender, The Statist, The Servant, The Warrior, The Blame Maker, The Ruler, The Idealist), each commensurate with an overarching communicative goal and associated lexical sets. It is tested by scrutinizing the British parliamentary debates of 2010-2022 and determining the strength of each type’s correlation by noun and verb frequency; to that end, the research employs the Sketch Engine content-analysis tool. The proposed methodological algorithm, if supplemented by an analysis of strategies and tactics, can be regarded as a universal tool for analyzing a politician’s communicative behaviour holistically.
Language. Linguistic theory. Comparative grammar, Semantics
Juan-José Guzmán-Landa, Juan-Manuel Torres-Moreno, Miguel Figueroa-Saavedra
et al.
In this article we introduce a context-free grammar (CFG) for the Nawatl language. Nawatl (or Nahuatl) is an Amerindian language of the π-language type, i.e. a language with few digital resources, for which the corpora available for machine learning are virtually non-existent. The objective here is to generate a significant number of grammatically correct artificial sentences in order to increase the corpora available for language-model training. We want to show that a grammar enables us to significantly expand a corpus in Nawatl, which we call π-YALLI. The corpus, thus enriched, enables us to train algorithms such as FastText and to evaluate them on sentence-level semantic tasks. Preliminary results show that by using the grammar, comparative improvements are achieved over some LLMs. However, it is observed that to achieve more significant improvements, grammars that model the Nawatl language even more effectively are required.
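The core idea of corpus expansion from a CFG can be sketched as follows. The toy grammar below is an illustrative English placeholder, not the authors' Nawatl grammar: every random expansion of the start symbol yields a grammatically correct artificial sentence.

```python
import random

# Toy CFG: non-terminals map to lists of alternative right-hand sides.
# These rules are illustrative placeholders, NOT the paper's Nawatl grammar.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"], ["N"]],
    "VP":  [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "N":   [["bird"], ["river"]],
    "V":   [["sees"], ["crosses"]],
}

def generate(symbol="S", rng=random):
    """Expand a symbol by recursively choosing productions at random."""
    if symbol not in GRAMMAR:          # terminal: emit the word itself
        return [symbol]
    production = rng.choice(GRAMMAR[symbol])
    words = []
    for sym in production:
        words.extend(generate(sym, rng))
    return words

print(" ".join(generate()))
```

Sampling this generator repeatedly produces an arbitrarily large set of syntactically well-formed sentences for training.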
Software systems that process structured inputs often lack complete and up-to-date specifications of the input syntax and the semantics of input processing. While grammar-mining techniques have focused on recovering syntactic structure, the semantics of input processing remains largely unexplored. In this work, we introduce a novel approach for inferring attributed grammars from parser implementations. Given an input grammar, our technique dynamically analyzes the implementation of a recursive descent parser to reconstruct the semantic aspects of input handling, resulting in specifications in the form of attributed grammars. By observing program executions and mapping the program's runtime behavior to the grammar, we systematically extract and embed semantic actions into the grammar rules, enabling comprehensive specification recovery. We demonstrate the feasibility of our approach on an initial set of programs, showing that the generated attributed grammars accurately reproduce program behavior.
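To make the target of such inference concrete, here is a hand-written recursive-descent parser whose rules carry semantic actions, i.e. the kind of attributed-grammar specification the approach aims to recover. The grammar and actions are an illustration, not taken from the paper:

```python
# Attributed grammar being implemented (actions in braces):
#   Expr -> Term ('+' Term)*   { Expr.val = sum of the Term.val attributes }
#   Term -> DIGIT              { Term.val = int(DIGIT) }
class Parser:
    def __init__(self, text):
        self.text, self.pos = text, 0

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else None

    def term(self):
        ch = self.peek()
        assert ch is not None and ch.isdigit(), f"digit expected at {self.pos}"
        self.pos += 1
        return int(ch)               # semantic action: synthesize a value

    def expr(self):
        val = self.term()
        while self.peek() == "+":
            self.pos += 1
            val += self.term()       # semantic action: accumulate the sum
        return val

print(Parser("1+2+3").expr())  # → 6
```

Observing the runtime values returned at each rule (here, the running sum) is what allows semantic actions to be mapped back onto the grammar rules.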
There have been many Kurdish principalities that held power for a long time. One of them is the Principality of Bitlîs, which ruled for 760 years. The principalities of Cizîr and Hekarî were also renowned, and there were likewise beyliks and emirates such as Miks, Hîzan, Mehmûdî and others. The Principality of Kêsan, however, has remained unknown until now, and no studies have been carried out on it. We therefore conducted a study of this subject both in the field and in the written sources. With the help of local guides we travelled to Sunban, identified a member of the princely family, Cemîl Beg, and obtained the necessary information from him. Because no detailed studies have been made of the geography of this princely house, and few scholarly works exist on its culture, art, history and archaeology, many places are still attributed to another region, remain unidentified, or are known among the people under a different name; the region and the branches of the princely family have not been fully established. In earlier times, during the medieval period, the Kurdistan region was administered by Kurdish aghas, mîrs and begs. One of these polities was the Principality/Beylik of Kêsan. It was founded after the war between Ulama and Şeref Xan of Bitlîs and lasted until the time of Murad Beg of Kêsan, when Ottoman rule destroyed it. Although the mîrs of Kêsan governed the region for some 250 to 300 years, the history of its medrese reaches back 383 years. The area under study, the centre of the Principality of Kêsan and the fortress of Sunban, encompasses some 60-70 villages. We discuss the Fortress of the Mîrs (Keleha Mîran), the Medrese of ‘Ebdulah Beg, and the local churches and mosques. The study also examines how the Christians and Muslims of the time dealt with one another justly. It traces the stories and history of this principality, which was governed as a sancak, and examines the medrese manuscripts and gravestones connected with the Begs of Kêsan.
Indo-Iranian languages and literature, Language. Linguistic theory. Comparative grammar
This issue of Revista Tradumàtica explores how technology, including machine translation, AI, and accessibility tools, transforms professional translation. Articles address psychological impacts, productivity, quality, and usability. Highlights include autonomy’s link to job satisfaction, stress from concurrent workflows, and challenges with large language models and remote interpreting platforms. Accessibility studies emphasize user involvement in design. While technology boosts productivity, it introduces stress and uncertainty, underscoring the importance of user-driven development to enhance satisfaction, autonomy, and translation quality.
Speaking constitutes one of the main goals of learning a second language (L2). Despite the increasing attention on the role of planning and language transfer in L2 learning, the combined effect of using different languages and pre-task planning on language production remains unclear. This study investigated whether the use of different languages in planning affects speaking performance and whether the effect differs by language proficiency. A total of 84 students in Chinese universities learning English as a foreign language participated in several speaking tasks after planning using their first language (L1) Chinese or L2 English. Findings showed that using L1 in planning results in significantly higher syntactic complexity, accuracy, and fluency in speaking performance than using L2 in planning, while the difference in lexical diversity was not statistically significant. Further analysis shows that for speech accuracy, the facilitative effect of L1 was stronger among low-proficiency than high-proficiency learners. Findings from this study support the use of L2 learners’ entire linguistic repertoire in speaking activities and provide implications for speech production theories as well as translanguaging pedagogies.
Special aspects of education, Language acquisition
Pedro Carvalho, Jessica Mégane, Nuno Lourenço
et al.
This work proposes Adaptive Facilitated Mutation, a self-adaptive mutation method for Structured Grammatical Evolution (SGE), biologically inspired by the theory of facilitated variation. In SGE, the genotype of individuals contains a list for each non-terminal of the grammar that defines the search space. In our proposed mutation, each individual contains an array with a different, self-adaptive mutation rate for each non-terminal. We also propose Function Grouped Grammars, a grammar design procedure, to enhance the benefits of the proposed mutation. Experiments were conducted on three symbolic regression benchmarks using Probabilistic Structured Grammatical Evolution (PSGE), a variant of SGE. Results show that our approach performs similarly to or better than the standard grammar and mutation.
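The genotype layout and the per-non-terminal self-adaptive rates can be sketched as below. This is a simplified illustration of the idea, not the authors' exact operator or parameter settings:

```python
import random

# Hypothetical non-terminals of some grammar defining the search space.
NON_TERMINALS = ["expr", "op", "var"]

def new_individual(rng, length=5):
    """SGE-style genotype: one codon list per non-terminal, plus one
    self-adaptive mutation rate per non-terminal."""
    return {
        "genotype": {nt: [rng.random() for _ in range(length)]
                     for nt in NON_TERMINALS},
        "rates": {nt: 0.1 for nt in NON_TERMINALS},
    }

def mutate(ind, rng, sigma=0.02):
    for nt in NON_TERMINALS:
        # Self-adaptation: perturb the rate itself first, clamped to (0, 1].
        rate = ind["rates"][nt] + rng.gauss(0.0, sigma)
        ind["rates"][nt] = min(max(rate, 0.001), 1.0)
        # Then mutate each codon of this non-terminal with its own rate.
        ind["genotype"][nt] = [
            rng.random() if rng.random() < ind["rates"][nt] else codon
            for codon in ind["genotype"][nt]
        ]
    return ind

rng = random.Random(42)
child = mutate(new_individual(rng), rng)
```

Because each non-terminal evolves its own rate, mutation pressure can concentrate on the parts of the grammar where variation is most useful.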
We continue our study of ordered context-free grammars, a grammar formalism that places an order on the parse trees produced by the corresponding context-free grammar. In particular, we simplify our previous definition of a derivation of a string for a given ordered context-free grammar, and present a parsing algorithm, using shared packed parse forests, with time complexity O(n^4), where n is the length of the input string being parsed.
We study how to relate well-known hypergraph grammars based on the double pushout (DPO) approach and grammars over the hypergraph Lambek calculus HL (called HL-grammars). It turns out that DPO rules can be naturally encoded by types of HL using methods similar to those used by Kanazawa for multiplicative-exponential linear logic. In order to generalize his reasoning we extend the hypergraph Lambek calculus by adding the exponential modality, which results in a new calculus HMEL0; then we prove that any DPO grammar can be converted into an equivalent HMEL0-grammar. We also define the conjunctive Kleene star, which behaves similarly to this exponential modality, and establish a similar result. If we add neither the exponential modality nor the conjunctive Kleene star to HL, then we can still use the same encoding and show that any DPO grammar with a linear restriction on the length of derivations can be converted into an equivalent HL-grammar.
Rosario Rodríguez del Busto, Emanuel Exequiel Guanco
Communication constitutes one of humanity's greatest challenges, since even when speakers share the same system, communication is not always possible. Often, the way speakers use language itself reproduces this problem. For this reason, our topic is social discourses about language, brought into dialogue with two theories: Saussurean theory and Peircean theory. As a starting point, we analyze, from these two perspectives, certain barriers and inequalities produced by the use of language.
Murad Abdu Saeed, Atef Odeh AbuSa’aleek, Huda Suleiman Al Qunayeer
How teachers can provide effective feedback that promotes students' active responses to and use of it is a central question in current research. Formulating feedback in the form of questions alleviates teachers' authoritative role in the process. Therefore, this study explored how teacher feedback given as questions in Google Docs on the assignments of 14 pairs of undergraduates at a Malaysian university fostered their responses to and uptake of feedback in writing. The results revealed that the feedback questions fall into single yes/no questions, single wh-questions, and combinations of both, which served to elicit responses, elicit information, seek clarification, make requests, check certainty, and invite learners to respond to and interact over the e-feedback before using it to revise their texts. Findings indicate that Google Docs functions as an interactive platform where students diversify their responses to e-feedback, such as commenting on the e-feedback, interacting around the e-feedback issues, seeking further feedback, resolving the e-feedback, and addressing the e-feedback through edits and text revisions. Furthermore, the way e-feedback questions are formulated influences how students respond to and use e-feedback in revising their assignments. The study provides valuable suggestions for teacher feedback practices in graduate courses in higher education institutions.
Special aspects of education, Language. Linguistic theory. Comparative grammar
Probabilistic context-free grammars have a long record of use as generative models in machine learning and symbolic regression. When used for symbolic regression, they generate algebraic expressions. We define the latter as equivalence classes of strings derived by a grammar and address the problem of calculating the probability of deriving a given expression with a given grammar. We show that the problem is undecidable in general. We then present specific grammars for generating linear, polynomial, and rational expressions for which algorithms for calculating the probability of a given expression exist. For those grammars, we design algorithms for calculating the exact probability and for efficient approximation with arbitrary precision.
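The underlying quantity can be illustrated on a toy PCFG for linear expressions: the probability of a string is the sum of the probabilities of all derivations yielding it. The brute-force enumerator below (with illustrative rule probabilities, not the paper's grammars or algorithms) expands leftmost non-terminals up to a depth bound:

```python
from fractions import Fraction

# Toy PCFG: each non-terminal maps to (right-hand side, probability) pairs.
RULES = {
    "E": [(["E", "+", "T"], Fraction(2, 5)),
          (["T"],           Fraction(3, 5))],
    "T": [(["x"], Fraction(1, 2)),
          (["c"], Fraction(1, 2))],
}

def derivations(sent, prob, depth):
    """Yield (terminal string, probability) for full expansions of `sent`,
    always expanding the leftmost non-terminal, up to `depth` expansions."""
    for i, sym in enumerate(sent):
        if sym in RULES:
            if depth <= 0:
                return
            for rhs, p in RULES[sym]:
                yield from derivations(sent[:i] + rhs + sent[i + 1:],
                                       prob * p, depth - 1)
            return
    yield "".join(sent), prob          # all symbols are terminal

def string_probability(s, depth=8):
    return sum(p for t, p in derivations(["E"], Fraction(1), depth) if t == s)

print(string_probability("x+c"))  # → 3/50
```

For "x+c" the unique derivation E → E+T → T+T → x+T → x+c has probability 2/5 · 3/5 · 1/2 · 1/2 = 3/50; the paper's undecidability result concerns computing such probabilities over whole equivalence classes of strings without a depth bound.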
Boyi Li, Rodolfo Corona, Karttikeya Mangalam
et al.
Are multimodal inputs necessary for grammar induction? Recent work has shown that multimodal training inputs can improve grammar induction. However, these improvements are based on comparisons to weak text-only baselines that were trained on relatively little textual data. To determine whether multimodal inputs are needed in regimes with large amounts of textual training data, we design a stronger text-only baseline, which we refer to as LC-PCFG. LC-PCFG is a C-PCFG that incorporates embeddings from text-only large language models (LLMs). We use a fixed grammar family to directly compare LC-PCFG to various multimodal grammar induction methods, comparing performance on four benchmark datasets. LC-PCFG provides up to a 17% relative improvement in Corpus-F1 over state-of-the-art multimodal grammar induction methods. LC-PCFG is also more computationally efficient, providing up to an 85% reduction in parameter count and an 8.8x reduction in training time compared to multimodal approaches. These results suggest that multimodal inputs may not be necessary for grammar induction, and emphasize the importance of strong vision-free baselines for evaluating the benefit of multimodal approaches.
Antoine Amarilli, Louis Jachiet, Martín Muñoz
et al.
We introduce annotated grammars, an extension of context-free grammars which allows annotations on terminals. Our model extends the standard notion of regular spanners, and is more expressive than the extraction grammars recently introduced by Peterfreund. We study the enumeration problem for annotated grammars: fixing a grammar, and given a string as input, enumerate all annotations of the string that form a word derivable from the grammar. Our first result is an algorithm for unambiguous annotated grammars, which preprocesses the input string in cubic time and enumerates all annotations with output-linear delay. This improves over Peterfreund's result, which needs quintic time preprocessing to achieve this delay bound. We then study how we can reduce the preprocessing time while keeping the same delay bound, by making additional assumptions on the grammar. Specifically, we present a class of grammars which only have one derivation shape for all outputs, for which we can enumerate with quadratic time preprocessing. We also give classes that generalize regular spanners for which linear time preprocessing suffices.
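The enumeration task can be made concrete with a naive sketch: terminals are (letter, annotation) pairs, and given an input string we list every annotation vector whose annotated word the grammar derives. The exponential brute force below illustrates the problem only, not the paper's output-linear-delay algorithms; the toy grammar marks exactly one 'a' in the input:

```python
from itertools import product

# Annotated grammar: terminals are (letter, annotation) pairs.
# S derives words with exactly one annotated 'a' (the M symbol).
RULES = {
    "S": [["P", "M", "P"], ["M", "P"], ["P", "M"], ["M"]],
    "P": [["C", "P"], ["C"]],
    "C": [[("a", 0)], [("b", 0)]],
    "M": [[("a", 1)]],
}

def derives(syms, word):
    """Can the sentential form `syms` derive the annotated word `word`?"""
    if not syms:
        return not word
    head, rest = syms[0], syms[1:]
    if head in RULES:
        return any(derives(rhs + rest, word) for rhs in RULES[head])
    return bool(word) and word[0] == head and derives(rest, word[1:])

def annotations(s):
    """Enumerate all 0/1 annotation vectors of s accepted by the grammar."""
    for bits in product([0, 1], repeat=len(s)):
        if derives(["S"], list(zip(s, bits))):
            yield bits

print(list(annotations("aba")))  # → [(0, 0, 1), (1, 0, 0)]
```

For "aba" the two valid annotations mark either the first or the last 'a'; the paper's contribution is enumerating such outputs with output-linear delay after low-degree polynomial preprocessing rather than by exhaustive search.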
This study aims to demonstrate the need for learner-corpus-informed applications and proposes methods of application that promote the proper use of Korean topic and nominative markers. This study extracted 3004 errors from the error-annotated corpus of the Korean Learners’ Corpus, the largest Korean learner corpus to date. A detailed examination of these data was conducted to subdivide the types of substitution errors related to the topic and nominative markers, and to analyze the error rate according to the type of error and level of proficiency. The statistical data revealed no consistent correlation between error rate and proficiency level. Furthermore, based on the proportion of error types by proficiency level, this study proposes the use of common-mistake boxes containing real errors; these errors are generally committed by learners of all proficiency levels and are not presumed problematic by grammarians or intuition-based teachers. These boxes can, therefore, be utilized as a practical tool for inclusion in pedagogical materials, such as learner’s dictionaries and textbooks.
Special aspects of education, Language acquisition
Linear conjunctive grammars are a family of formal grammars with an explicit conjunction operation allowed in the rules, which is notable for its computational equivalence to one-way real-time cellular automata, also known as trellis automata. This paper investigates the LL($k$) subclass of linear conjunctive grammars, defined by analogy with the classical LL($k$) grammars: these are grammars that admit top-down linear-time parsing with $k$-symbol lookahead. Two results are presented. First, every LL($k$) linear conjunctive grammar can be transformed to an LL(1) linear conjunctive grammar, and, accordingly, the hierarchy with respect to $k$ collapses. Secondly, a parser for these grammars that works in linear time and uses logarithmic space is constructed, showing that the family of LL($k$) linear conjunctive languages is contained in the complexity class $L$.
Sound change in the form of plosive mergers has been reported for a variety of languages and is the result of a reduction of phonetic distance between two (or more) sounds. The present study is concerned with the opposite development of phonetic differentiation in plosives (akin to a phonetic split), a less commonly reported phenomenon that is currently taking place in Austrian German. A previously small (or null) phonetic distinction between fortis and lenis plosives – a presumed near-merger – is gradually developing into a clear phonetic contrast in younger speakers. In the present study, voice onset time of word-initial plosives was measured in two generations of Austrian speakers (born in the middle and at the end of the 20th century), revealing an ongoing phonetic differentiation in which the voice onset time of lenis consonants is shortened while, at the same time, that of fortis consonants is lengthened. These results offer insight into the recent diachronic development of Austrian German and the changes in plosive production that are currently taking place.
Martin Eberlein, Yannic Noller, Thomas Vogel
et al.
A fuzzer provides randomly generated inputs to a targeted software system to expose erroneous behavior. To efficiently detect defects, generated inputs should conform to the structure of the input format, and thus grammars can be used to generate syntactically correct inputs. In this context, fuzzing can be guided by probabilities attached to competing rules in the grammar, leading to the idea of probabilistic grammar-based fuzzing. However, the optimal assignment of probabilities to individual grammar rules to effectively expose erroneous behavior for individual systems under test is an open research question. In this paper, we present EvoGFuzz, an evolutionary grammar-based fuzzing approach that optimizes the probabilities to generate test inputs that may be more likely to trigger exceptional behavior. The evaluation shows the effectiveness of EvoGFuzz in detecting defects compared to probabilistic grammar-based fuzzing (the baseline). Applied to ten real-world applications with common input formats (JSON, JavaScript, or CSS3), the evaluation shows that EvoGFuzz achieved a significantly larger median line coverage for all subjects, by up to 48% compared to the baseline. Moreover, EvoGFuzz managed to expose 11 unique defects, five of which were not detected by the baseline.
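The generation side of probabilistic grammar-based fuzzing can be sketched as follows: each competing rule carries a weight, and derivations are sampled accordingly. EvoGFuzz evolves such weights toward defect-triggering inputs; the toy JSON-like grammar and weights below are illustrative only.

```python
import random

# Weighted grammar: (right-hand side, weight) pairs per non-terminal.
# By convention, the FIRST rule of each symbol is non-recursive and is
# used as a fallback when the recursion depth budget runs out.
GRAMMAR = {
    "value":  [(["number"], 0.5),
               (["object"], 0.3),
               (["[", "value", "]"], 0.2)],
    "object": [(["{", '"k":', "value", "}"], 1.0)],
    "number": [(["0"], 0.5), (["17"], 0.5)],
}

def fuzz(symbol="value", rng=random, depth=8):
    """Sample one input string from the weighted grammar."""
    if symbol not in GRAMMAR:                  # terminal: emit as-is
        return symbol
    rules = GRAMMAR[symbol]
    if depth <= 0:
        rhs = rules[0][0]                      # budget exhausted: fallback
    else:
        rhs = rng.choices([r for r, _ in rules],
                          weights=[w for _, w in rules])[0]
    return "".join(fuzz(s, rng, depth - 1) for s in rhs)

print(fuzz(rng=random.Random(7)))
```

Raising the weight of a rule makes its constructs appear more often in generated inputs, which is exactly the knob an evolutionary search can tune per system under test.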
Daniel S. N. Nunes, Felipe A. Louza, Simon Gog
et al.
A grammar compression algorithm, called GCIS, is introduced in this work. GCIS is based on the induced suffix sorting algorithm SAIS, presented by Nong et al. in 2009. The proposed solution builds on the factorization performed by SAIS during suffix sorting. A context-free grammar is used to replace factors with non-terminals, and the algorithm is then applied recursively to the shorter sequence of non-terminals. The resulting grammar is encoded by exploiting redundancies, such as common prefixes between right-hand sides of rules sorted according to SAIS. GCIS excels in the low space and time required for compression while obtaining competitive compression ratios. Our experiments on regular and repetitive, moderate and very large texts show that GCIS is a very convenient choice compared to well-known compressors such as Gzip, 7-Zip, and RePair, the gold standard in grammar compression. In exchange, GCIS is slow at decompression. Yet, grammar compressors are more convenient than Lempel-Ziv compressors in that one can access text substrings directly in compressed form, without ever decompressing the text. We demonstrate that GCIS is an excellent candidate for this scenario, proving competitive among RePair-based alternatives. We also show how GCIS's relation to SAIS makes it a good intermediate structure for building the suffix array and the LCP array during decompression of the text.
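To illustrate what a grammar compressor such as GCIS or RePair produces, here is a minimal Re-Pair-style sketch: it repeatedly replaces the most frequent adjacent pair with a fresh non-terminal, yielding a start sequence plus rules. (GCIS itself factorizes via SAIS-style induced suffix sorting; this sketch shows only the shape of the output.)

```python
from collections import Counter

def repair(text):
    """Greedy Re-Pair-style compression: returns (sequence, rules)."""
    seq = list(text)
    rules, next_id = {}, 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:                      # no pair repeats: done
            break
        nt = f"R{next_id}"
        next_id += 1
        rules[nt] = pair                   # new rule: nt -> pair
        out, i = [], 0
        while i < len(seq):                # replace non-overlapping occurrences
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

def expand(sym, rules):
    """Decompress one symbol by recursively expanding its rule."""
    if sym in rules:
        a, b = rules[sym]
        return expand(a, rules) + expand(b, rules)
    return sym

seq, rules = repair("abababab")
assert "".join(expand(s, rules) for s in seq) == "abababab"
```

Because every rule expands independently, a substring of the text can be reconstructed by expanding only the symbols that cover it, which is the random-access advantage of grammar compressors mentioned above.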