Results for "Comparative grammar"

Showing 20 of ~3706648 results · from arXiv, DOAJ, CrossRef, Semantic Scholar

arXiv Open Access 2026
Automated Grammar-based Algebraic Multigrid Design With Evolutionary Algorithms

Dinesh Parthasarathy, Wayne Mitchell, Arjun Gambhir et al.

Although multigrid is asymptotically optimal for solving many important partial differential equations, its efficiency relies heavily on the careful selection of the individual algorithmic components. In contrast to recent approaches that can optimize certain multigrid components using deep learning techniques, we adopt a complementary strategy, employing evolutionary algorithms to construct efficient multigrid cycles from proven algorithmic building blocks. Here, we will present its application to generate efficient algebraic multigrid methods with so-called flexible cycling, that is, level-specific smoothing sequences and non-recursive cycling patterns. The search space with such non-standard cycles is intractable to navigate manually, and is generated using genetic programming (GP) guided by context-free grammars. Numerical experiments with the linear algebra library hypre demonstrate the potential of these non-standard GP cycles to improve multigrid performance both as a solver and a preconditioner.

en cs.CE, cs.AI
arXiv Open Access 2025
Multidimensional Sorting: Comparative Statics

Job Boerma, Andrea Ottolini, Aleh Tsyvinski

In sorting literature, comparative statics for multidimensional assignment models with general output functions and input distributions is an important open question. We provide a complete theory of comparative statics for technological change in general multidimensional assignment models. Our main result is that any technological change is uniquely decomposed into two distinct components. The first component (gradient) gives a characterization of changes in marginal earnings through a Poisson equation. The second component (divergence-free) gives a characterization of labor reallocation. For U.S. data, we quantify equilibrium responses in sorting and earnings with respect to cognitive skill-biased technological change.

en econ.GN, math.OC
arXiv Open Access 2025
Efficient Story Point Estimation With Comparative Learning

Monoshiz Mahbub Khan, Xiaoyin Xi, Andrew Meneely et al.

Story points are unitless, project-specific effort estimates that help developers plan their sprints. Traditionally, developers have collaboratively estimated story points using planning poker or other manual techniques. Machine learning can reduce this burden, but only with sufficient context from the historical decisions made by the project team. That is, state-of-the-art models, such as GPT2SP and FastText-SVM, only make accurate (within-project) predictions when they are trained on data from the same project. The goal of this study is to streamline story point estimation by evaluating a comparative learning-based framework for calibrating project-specific story point prediction models. Instead of assigning a specific story point value to every backlog item, developers are presented with pairs of items and asked to indicate which item requires more effort. Using these comparative judgments, a machine learning model was trained to predict the story point estimates. We empirically evaluated our technique using data from 23,313 manual estimates across 16 projects. The model trained on comparative judgments achieved, on average, a 0.34 Spearman's rank correlation coefficient between its predictions and the ground truth story points. This is similar to, if not better than, the performance of a state-of-the-art regression model trained on ground truth story points. Through human subject experiments, the advantages of comparative judgments were validated - higher confidence, lower annotation time, and comparable agreement were observed for comparative judgments compared to direct ratings. In summary, the proposed comparative learning approach is more efficient than regression-based approaches, given its better performance, lower required annotation time, and higher training data reliability.

en cs.AI, cs.SE
arXiv Open Access 2025
SAUCE: Selective Concept Unlearning in Vision-Language Models with Sparse Autoencoders

Qing Li, Jiahui Geng, Derui Zhu et al.

Unlearning methods for vision-language models (VLMs) have primarily adapted techniques from large language models (LLMs), relying on weight updates that demand extensive annotated forget sets. Moreover, these methods perform unlearning at a coarse granularity, often leading to excessive forgetting and reduced model utility. To address this issue, we introduce SAUCE, a novel method that leverages sparse autoencoders (SAEs) for fine-grained and selective concept unlearning in VLMs. Briefly, SAUCE first trains SAEs to capture high-dimensional, semantically rich sparse features. It then identifies the features most relevant to the target concept for unlearning. During inference, it selectively modifies these features to suppress specific concepts while preserving unrelated information. We evaluate SAUCE on two distinct VLMs, LLaVA-v1.5-7B and LLaMA-3.2-11B-Vision-Instruct, across two types of tasks: concrete concept unlearning (objects and sports scenes) and abstract concept unlearning (emotions, colors, and materials), encompassing a total of 60 concepts. Extensive experiments demonstrate that SAUCE outperforms state-of-the-art methods by 18.04% in unlearning quality while maintaining comparable model utility. Furthermore, we investigate SAUCE's robustness against widely used adversarial attacks, its transferability across models, and its scalability in handling multiple simultaneous unlearning requests. Our findings establish SAUCE as an effective and scalable solution for selective concept unlearning in VLMs.

en cs.CV, cs.AI
arXiv Open Access 2024
Computational Model for Parsing Expression Grammars

Alexander Rubtsov, Nikita Chudinov

We present a computational model for Parsing Expression Grammars (PEGs). The predecessors of PEGs, top-down parsing languages (TDPLs), were discovered by A. Birman and J. Ullman in the 1960s; B. Ford showed in 2004 that both formalisms recognize the same class, named Parsing Expression Languages (PELs). A. Birman and J. Ullman established important properties, such as that TDPLs generate any DCFL and some non-context-free languages like $a^nb^nc^n$; a linear-time parsing algorithm was constructed as well. But since this parsing algorithm was impractical in the 1960s, TDPLs were abandoned and later upgraded by B. Ford to PEGs, and the parsing algorithm was improved (from the practical point of view) as well. Now PEGs are actively used in compilers (e.g., Python replaced its LL(1) parser with a PEG one) as well as for text processing. In this paper, we present a computational model for PEGs and obtain structural properties of PELs, namely a proof that PELs contain the Boolean closure of the regular closure of DCFLs and that PELs are closed under left concatenation with the regular closure of DCFLs. We present an extension of the PELs class based on an extension of our computational model. Our model is an upgrade of deterministic pushdown automata (DPDA) such that, during the pop of a symbol, it is allowed to return the head to the position of the push of that symbol. We provide a linear-time simulation algorithm for the 2-way version of this model, which is similar to the famous S. Cook linear-time simulation algorithm for 2-way DPDA.

en cs.FL
DOAJ Open Access 2024
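Które rośliny czynią człowieka chorym? Próba kontrastywnej analizy językowego obrazu świata w polskich i niemieckich nazwach chorób z elementem roślinnym [Which plants make a person ill? An attempt at a contrastive analysis of the linguistic worldview in Polish and German disease names containing a plant element]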

Piotr Aleksander Owsiński

The article presents the results of a linguistic-cognitive analysis of selected names of diseases and ailments that contain a plant element. The aim of the study is to attempt to answer the question of whether Polish and German medical terms are isomorphic with respect to the linguistic worldview entrenched in both languages, which is characteristic of a particular cultural sphere. The study supports the conclusion that most of the analysed terms denoting a disease or ailment show full or partial equivalence, which, not only in foreign language learning but also in contacts between doctor and patient, appears to be an important factor supporting both the process of teaching or learning a foreign language and the understanding of the communication partner and the reception of the conveyed content in a specific situational context.

Language. Linguistic theory. Comparative grammar
DOAJ Open Access 2024
Developing Process Writing Ability in Virtual Learning Environment via (Reinforced) Metalinguistic Corrective Feedback

Maryam Naderi Farsani, پرویز علوی نیا, Mehdi Sarkhosh

The current study investigated the impact of metalinguistic oral and written corrective feedback on learners' process writing ability in a virtual learning environment. To this end, a total of 66 Iranian EFL students at Shahrekord University participated in the study. A sample of IELTS expository writing (Writing Task 1) was first administered to all participants for homogeneity purposes. Then, each of the two classes was divided into two parts, and each was randomly assigned to one of the four comparison groups (oral metalinguistic feedback, written metalinguistic feedback, oral metalinguistic + error logs, and written metalinguistic + error logs). Next, the writing pretest (a process writing task) was given to participants prior to instruction. The treatment lasted for eight weeks, after which the process writing posttest was administered. The results revealed that all groups made progress from pretest to posttest. However, no significant difference was found among the four types of metalinguistic corrective feedback. The implications of the findings are discussed throughout the paper.

Language and Literature, Language. Linguistic theory. Comparative grammar
DOAJ Open Access 2023
LA CULTURE DU BROUILLON DANS LES PRODUCTIONS ÉCRITES [The culture of drafting in written productions]

Tahar Kasmi

Our research aims to study the intentions with which the culture of drafting could be integrated into the process of learning to produce written texts. It was carried out in the plurilingual Algerian context with pupils in 4èAM (fourth year of middle school), with a view to helping them gain access to writing in a foreign language. It adopts a qualitative empirical methodology and opens up sociodidactic perspectives that focus on the pupil's strengths (what he or she can do) rather than his or her difficulties (what he or she cannot do), and on the benefit derived from classroom interactions.

Philology. Linguistics, Language. Linguistic theory. Comparative grammar
arXiv Open Access 2022
The Better Your Syntax, the Better Your Semantics? Probing Pretrained Language Models for the English Comparative Correlative

Leonie Weissweiler, Valentin Hofmann, Abdullatif Köksal et al.

Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasising the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of language, i.e., linguistic units of different granularity that combine syntax and semantics. As a first step towards assessing the compatibility of CxG with the syntactic and semantic knowledge demonstrated by state-of-the-art pretrained language models (PLMs), we present an investigation of their capability to classify and understand one of the most commonly studied constructions, the English comparative correlative (CC). We conduct experiments examining the classification accuracy of a syntactic probe on the one hand and the models' behaviour in a semantic application task on the other, with BERT, RoBERTa, and DeBERTa as the example PLMs. Our results show that all three investigated PLMs are able to recognise the structure of the CC but fail to use its meaning. While human-like performance of PLMs on many NLP tasks has been alleged, this indicates that PLMs still suffer from substantial shortcomings in central domains of linguistic knowledge.

en cs.CL
arXiv Open Access 2021
Interactive Dimensionality Reduction for Comparative Analysis

Takanori Fujiwara, Xinhai Wei, Jian Zhao et al.

Finding the similarities and differences between groups of datasets is a fundamental analysis task. For high-dimensional data, dimensionality reduction (DR) methods are often used to find the characteristics of each group. However, existing DR methods provide limited capability and flexibility for such comparative analysis as each method is designed only for a narrow analysis target, such as identifying factors that most differentiate groups. This paper presents an interactive DR framework where we integrate our new DR method, called ULCA (unified linear comparative analysis), with an interactive visual interface. ULCA unifies two DR schemes, discriminant analysis and contrastive learning, to support various comparative analysis tasks. To provide flexibility for comparative analysis, we develop an optimization algorithm that enables analysts to interactively refine ULCA results. Additionally, the interactive visualization interface facilitates interpretation and refinement of the ULCA results. We evaluate ULCA and the optimization algorithm to show their efficiency as well as present multiple case studies using real-world datasets to demonstrate the usefulness of this framework.

en cs.LG, cs.HC
arXiv Open Access 2021
Comparative evaluation of point process forecasts

Jonas Brehmer, Tilmann Gneiting, Marcus Herrmann et al.

Stochastic models of point patterns in space and time are widely used to issue forecasts or assess risk, and often they affect societally relevant decisions. We adapt the concept of consistent scoring functions and proper scoring rules, which are statistically principled tools for the comparative evaluation of predictive performance, to the point process setting, and place both new and existing methodology in this framework. With reference to earthquake likelihood model testing, we demonstrate that extant techniques apply in much broader contexts than previously thought. In particular, the Poisson log-likelihood can be used for theoretically principled comparative forecast evaluation in terms of cell expectations. We illustrate the approach in a simulation study and in a comparative evaluation of operational earthquake forecasts for Italy.

arXiv Open Access 2021
DeepCPCFG: Deep Learning and Context Free Grammars for End-to-End Information Extraction

Freddy C. Chua, Nigel P. Duffy

We address the challenge of extracting structured information from business documents without detailed annotations. We propose Deep Conditional Probabilistic Context Free Grammars (DeepCPCFG) to parse two-dimensional complex documents and use Recursive Neural Networks to create an end-to-end system for finding the most probable parse that represents the structured information to be extracted. This system is trained end-to-end with scanned documents as input and only relational-records as labels. The relational-records are extracted from existing databases avoiding the cost of annotating documents by hand. We apply this approach to extract information from scanned invoices achieving state-of-the-art results despite using no hand-annotations.

en cs.CL, cs.AI
arXiv Open Access 2020
Institutions and China's comparative development

Paul Minard

Robust assessment of the institutionalist account of comparative development is hampered by problems of omitted variable bias and reverse causation, since institutional quality is not randomly assigned with respect to geographic and human capital endowments. A recent series of papers has applied spatial regression discontinuity designs to estimate the impact of institutions on incomes at international borders, drawing inference from the abrupt discontinuity in governance at borders, whereas other determinants of income vary smoothly across borders. I extend this literature by assessing the importance of sub-national variation in institutional quality at provincial borders in China. Employing nighttime lights emissions as a proxy for income, across multiple specifications I find no evidence in favour of an institutionalist account of the comparative development of Chinese provinces.

DOAJ Open Access 2020
Guatemalan Spanish in contact: Prosody and intonation

Yolanda Congosto Martín

This study forms part of the Geoprosodic project: the Geoprosodic and Sociodialectal Study of North American Spanish. The main objective of this project is to describe and compare the prosody of three geographical areas, Los Angeles, Mexico and Guatemala, that are closely related historically, socially and linguistically due to the contact established over time and the coexistence of people and cultures. The sole aim is to examine the prosodic differences between the Spanish spoken by Latin American people living in Los Angeles (mainly Mexicans, Salvadorans and Guatemalans) and the Spanish spoken by those who have never left their native countries, and to determine whether the spatio-temporal distance of the former and their immersion in a different sociolinguistic sphere, in contact with the English language and other varieties of Spanish, have brought about differences of a geoprosodic nature. In this case, the study focuses on the Guatemalan Spanish of various speakers. From a methodological point of view, the research is linked to the international project AMPER. The analytical methods of AMPER are followed and the research is limited to the study of female intonation and sentences with a subject–verb–object (SVO) structure from corpus 1 (declaratives and absolute interrogatives). The results of this research corroborate the initial hypothesis and establish the melodic differences between both groups of speakers, particularly regarding declaratives.

Language. Linguistic theory. Comparative grammar
DOAJ Open Access 2020
Predicate formation and verb-stranding ellipsis in Uzbek

Vera Gribanova

This paper investigates the interaction between head movement of the verb and ellipsis of vP (verb-stranding ellipsis, VSE) in Uzbek, an understudied Turkic language of Central Asia. I argue that Uzbek verbal predicates are formed by head movement, while non-verbal predicates are formed by a species of Local Dislocation (Embick & Noyer 2001; Embick 2003). Uzbek has two distinct ellipsis strategies that yield similar strings: argument ellipsis (AE) and VSE. VSE occurs only with (head-moved) verbs, and can elide non-verbal predicates, while AE cannot. Uzbek VSE imposes a strict identity requirement on the heads extracted from the ellipsis site (the Verbal Identity Condition (Goldberg 2005b)). Both the genuine existence of this condition, and its source, have recently come under scrutiny; this paper presents Uzbek evidence in support of the claim that the Verbal Identity Condition is genuinely present in a subset of typologically diverse languages with VSE (see Gribanova 2018b). Variable crosslinguistic behavior with respect to the Verbal Identity Condition is predicted by an independently supported view of head movement (Harizanov & Gribanova 2019) in which certain types of head movement are syntactic, yielding the potential for mismatches of extracted material by analogy with phrasal movement (Merchant 2001), while others are postsyntactic (yielding the Uzbek-type VSE pattern). The Uzbek investigation therefore provides crucial evidence in favor of a particular view of the crosslinguistic landscape of VSE, and moves us a step closer to explaining why head movement out of ellipsis domains varies systematically in its behavior across languages.

Language. Linguistic theory. Comparative grammar
arXiv Open Access 2019
A Comparative Survey of Recent Natural Language Interfaces for Databases

Katrin Affolter, Kurt Stockinger, Abraham Bernstein

Over the last few years, natural language interfaces (NLIs) for databases have gained significant traction both in academia and industry. These systems use very different approaches, as described in recent survey papers. However, these systems have not been systematically compared against a set of benchmark questions in order to rigorously evaluate their functionalities and expressive power. In this paper, we give an overview of 24 recently developed NLIs for databases. Each of the systems is evaluated using a curated list of ten sample questions to show their strengths and weaknesses. We categorize the NLIs into four groups based on the methodology they use: keyword-, pattern-, parsing-, and grammar-based NLIs. Overall, we learned that keyword-based systems are enough to answer simple questions. To solve more complex questions involving subqueries, the system needs to apply some sort of parsing to identify structural dependencies. Grammar-based systems are overall the most powerful ones, but are highly dependent on their manually designed rules. In addition to providing a systematic analysis of the major systems, we derive lessons learned that are vital for designing NLIs that can answer a wide range of user questions.

en cs.DB, cs.CL
arXiv Open Access 2019
Computational Induction of Prosodic Structure

Dafydd Gibbon

The present study has two goals relating to the grammar of prosody, understood as the rhythms and melodies of speech. First, an overview is provided of the computable grammatical and phonetic approaches to prosody analysis which use hypothetico-deductive methods and are based on learned hermeneutic intuitions about language. Second, a proposal is presented for an inductive grounding in the physical signal, in which prosodic structure is inferred using a language-independent method from the low-frequency spectrum of the speech signal. The overview includes a discussion of computational aspects of standard generative and post-generative models, and suggestions for reformulating these to form inductive approaches. Also included is a discussion of linguistic phonetic approaches to analysis of annotations (pairs of speech unit labels with time-stamps) of recorded spoken utterances. The proposal introduces the inductive approach of Rhythm Formant Theory (RFT) and the associated Rhythm Formant Analysis (RFA) method, with the aim of closing a gap in the linguistic hypothetico-deductive cycle by grounding in a language-independent inductive procedure of speech signal analysis. The validity of the method is demonstrated and applied to rhythm patterns in read-aloud Mandarin Chinese, finding differences from English which are related to lexical and grammatical differences between the languages, as well as individual variation. The overall conclusions are (1) that normative language-to-language phonological or phonetic comparisons of rhythm, for example of Mandarin and English, are too simplistic, in view of diverse language-internal factors due to genre and style differences as well as utterance dynamics, and (2) that language-independent empirical grounding of rhythm in the physical signal is called for.

en cs.CL, cs.SD
DOAJ Open Access 2019
Autori reali e agenti fittizi (narratori fittizi, autori fittizi) [Real authors and fictional agents (fictional narrators, fictional authors)]

Alberto Voltolini

In this article, I will argue that a plausible account of fictional narration must involve a conceptual distinction between at least the following three figures: real authors, fictional narrators, and fictional authors. Real authors may coincide, though rarely, with fictional narrators or with fictional authors. A fictional narrator, however, can never coincide with a fictional author, because one or the other is the 'fictional agent', the contextual factor that contributes to providing a (truth-conditional) semantic content to the fiction-involving sentences that, in their fictional use, are narrated by one or the other. Precisely for this reason, however, the reasons why we need a fictional author as distinct from a fictional narrator coincide only partially with those given by Currie (1990). We need a fictional author for the same 'semantic' reasons that make a fictional narrator necessary; that is, as anticipated, to account for the fictional truth conditions and fictional truth values that fiction-involving sentences have in their fictional use. Indeed, we need a fictional narrator or a fictional author in order to have an 'agent' of the relevant fictional context that allows a fiction-involving sentence, in its fictional use, to 'say something' in pretence, that is, to have a fictional (truth-conditional) semantic content, and hence also a fictional truth value. But we do not need a fictional author for 'epistemic' reasons having to do with reliability in narration; pace Currie (1990), such an author need not be omniscient, just as the fictional narrator is not.

Geography. Anthropology. Recreation, Language. Linguistic theory. Comparative grammar
