Results for "North Germanic. Scandinavian"

arXiv Open Access 2025

Family Matters: Language Transfer and Merging for Adapting Small LLMs to Faroese

Jenny Kunz, Iben Nyholm Debess, Annika Simonsen

We investigate strategies for adapting small, efficient language models to Faroese, a low-resource North Germanic language. Starting from English-pretrained models, we apply continued pre-training on related Scandinavian languages -- individually or combined via model merging -- before fine-tuning on Faroese. We compare full fine-tuning with parameter-efficient adaptation via LoRA, assessing their effects on general language modeling performance, linguistic accuracy, and text comprehension. To address the lack of existing Faroese evaluation resources, we construct two new minimal-pair probing benchmarks, one for linguistic acceptability and one for text comprehension, and complement them with human evaluations conducted by native Faroese linguists. Our results show that transfer from related languages is essential, but the optimal source language is task-dependent: Icelandic improves linguistic accuracy, while Danish boosts reading comprehension. The choice of adaptation method likewise depends on the target task: LoRA yields stronger linguistic acceptability and marginally higher human evaluation scores, whereas full fine-tuning produces better comprehension performance and more robust downstream fine-tuning. Merging multiple related languages under full fine-tuning (but not LoRA) improves general language modeling, though its benefits in the linguistic acceptability and comprehension probes are less consistent.

en cs.CL

arXiv Open Access 2025

LLMs for Legal Subsumption in German Employment Contracts

Oliver Wardas, Florian Matthes

Legal work, characterized by its text-heavy and resource-intensive nature, presents unique challenges and opportunities for NLP research. While data-driven approaches have advanced the field, their lack of interpretability and trustworthiness limits their applicability in dynamic legal environments. To address these issues, we collaborated with legal experts to extend an existing dataset and explored the use of Large Language Models (LLMs) and in-context learning to evaluate the legality of clauses in German employment contracts. Our work evaluates the ability of different LLMs to classify clauses as "valid," "unfair," or "void" under three legal context variants: no legal context, full-text sources of laws and court rulings, and distilled versions of these (referred to as examination guidelines). Results show that full-text sources moderately improve performance, while examination guidelines significantly enhance recall for void clauses and weighted F1-Score, reaching 80%. Despite these advancements, LLMs' performance when using full-text sources remains substantially below that of human lawyers. We contribute an extended dataset, including examination guidelines, referenced legal sources, and corresponding annotations, alongside our code and all log files. Our findings highlight the potential of LLMs to assist lawyers in contract legality review while also underscoring the limitations of the methods presented.

en cs.CL

CrossRef Open Access 2025

Old French exploitation toponyms in the northern Low Countries and their significance for medieval Dutch settlement history

Alexia E. Kerkhof

This article explores the influence of Old French terminology on medieval settlement history in the Low Countries, revealing lexical exchanges that disseminated from the bilingual zone in Belgium to the southern Netherlands. It argues that the southern Dutch toponyms saert, triest, and mortel reflect the high medieval dissemination of southern technological and organizational expertise and proposes that feudal officials played a role in introducing Romance onomastic material into the Dutch toponymic landscape.

en

arXiv Open Access 2024

How Entangled is Factuality and Deception in German?

Aswathy Velutharambath, Amelie Wührl, Roman Klinger

The statement "The earth is flat" is factually inaccurate, but if someone truly believes and argues in its favor, it is not deceptive. Research on deception detection and fact checking often conflates factual accuracy with the truthfulness of statements. This assumption makes it difficult to (a) study subtle distinctions and interactions between the two and (b) gauge their effects on downstream tasks. The belief-based deception framework disentangles these properties by defining texts as deceptive when there is a mismatch between what people say and what they truly believe. In this study, we assess if presumed patterns of deception generalize to German-language texts. We test the effectiveness of computational models in detecting deception using an established corpus of belief-based argumentation. Finally, we gauge the impact of deception on the downstream task of fact checking and explore if this property confounds verification models. Surprisingly, our analysis finds no correlation with established cues of deception. Previous work claimed that computational models can outperform humans in deception detection accuracy; however, our experiments show that both traditional and state-of-the-art models struggle with the task, performing no better than random guessing. For fact checking, we find that Natural Language Inference-based verification performs worse on non-factual and deceptive content, while prompting Large Language Models for the same task is less sensitive to these properties.

en cs.CL

CrossRef Open Access 2023

Statistical evidence and surprise unified under possibility theory

David R. Bickel

Sander Greenland argues that reported results of hypothesis tests should include the surprisal, the base‐2 logarithm of the reciprocal of a p‐value. The surprisal measures how many bits of evidence in the data warrant rejecting the null hypothesis. A generalization of surprisal also can measure how much the evidence justifies rejecting a composite hypothesis such as the complement of a confidence interval. That extended surprisal, called surprise, quantifies how many bits of astonishment an agent believing a hypothesis would experience upon observing the data. While surprisal is a function of a point in hypothesis space, surprise is a function of a subset of hypothesis space. Satisfying the conditions of conditional min‐plus probability, surprise inherits a wealth of tools from possibility theory. The equivalent compatibility function has been recently applied to the replication crisis, to adjusting p‐values for prior information, and to comparing scientific theories.
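
The surprisal defined in this abstract, the base-2 logarithm of the reciprocal of a p-value, can be illustrated with a minimal sketch. The function name `surprisal_bits` and the sample p-values are assumptions chosen for illustration; they do not come from the article, and the sketch covers only the point surprisal, not the generalized "surprise" for composite hypotheses.

```python
import math


def surprisal_bits(p_value: float) -> float:
    """Return the surprisal in bits: log2(1 / p) = -log2(p).

    Hypothetical helper for illustration; the name is not from the article.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must lie in the interval (0, 1]")
    return -math.log2(p_value)


# Illustrative p-values (assumed, not taken from the article).
for p in (0.5, 0.25, 0.05):
    print(f"p = {p}: {surprisal_bits(p):.2f} bits")
```

Read this way, a p-value of 0.25 carries as much evidence against the null hypothesis as seeing two fair coin tosses both land heads.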

3 citations en

arXiv Open Access 2022

I still have Time(s): Extending HeidelTime for German Texts

Andy Lücking, Manuel Stoeckel, Giuseppe Abrami et al.

HeidelTime is one of the most widespread and successful tools for detecting temporal expressions in texts. Since HeidelTime's pattern matching system is based on regular expressions, it can be extended in a convenient way. We present such an extension for the German resources of HeidelTime: HeidelTime-EXT. The extension has been brought about by means of observing false negatives within real world texts and various time banks. The gain in coverage is 2.7% or 8.5%, depending on the admitted degree of potential overgeneralization. We describe the development of HeidelTime-EXT, its evaluation on text samples from various genres, and share some linguistic observations. HeidelTime-EXT can be obtained from https://github.com/texttechnologylab/heideltime.

en cs.CL

arXiv Open Access 2022

White-Box Attacks on Hate-speech BERT Classifiers in German with Explicit and Implicit Character Level Defense

Shahrukh Khan, Mahnoor Shahid, Navdeeppal Singh

In this work, we evaluate the adversarial robustness of BERT models trained on German Hate Speech datasets. We also complement our evaluation with two novel white-box character and word level attacks thereby contributing to the range of attacks available. Furthermore, we also perform a comparison of two novel character-level defense strategies and evaluate their robustness with one another.

en cs.CL

arXiv Open Access 2022

Enhancing the German Transmission Grid Through Dynamic Line Rating

Philipp Glaum, Fabian Hofmann

The German government recently announced that 80% of the power supply should come from renewable energy by 2030. One key task lies in reorganizing the transmission system such that power can be transported from sites with good renewable potentials to the load centers. Dynamic Line Rating (DLR), which allows the dynamic calculation of transmission line capacities based on prevailing weather conditions rather than conservative invariant ratings, offers the potential to exploit existing grid capacities better. In this paper, we analyze the effect of DLR using a detailed power system model of Germany including all of today's extra high voltage transmission lines and substations. The evolving synergies between DLR and an increased wind power generation lead to savings of around 400 million euro per year in the short term and 900 million per year in a scenario for 2030.

en physics.soc-ph

arXiv Open Access 2022

A Transfer Learning Based Model for Text Readability Assessment in German

Salar Mohtaj, Babak Naderi, Sebastian Möller et al.

Text readability assessment has a wide range of applications for different target groups, from language learners to people with disabilities. The fast pace of textual content production on the web makes it impossible to measure text complexity without the benefit of machine learning and natural language processing techniques. Although various studies have addressed the readability assessment of English text in recent years, there is still room for improvement of the models for other languages. In this paper, we propose a new model for text complexity assessment for German text based on transfer learning. Our results show that the model outperforms more classical solutions based on linguistic feature extraction from input text. The best model, based on the BERT pre-trained language model, achieved a Root Mean Square Error (RMSE) of 0.483.

en cs.CL, cs.AI

arXiv Open Access 2021

High-resolution CARMA Observation of Molecular Gas in the North America and Pelican Nebulae

Shuo Kong, Héctor G. Arce, John M. Carpenter et al.

We present the first results from a CARMA high-resolution $^{12}$CO(1-0), $^{13}$CO(1-0), and C$^{18}$O(1-0) molecular line survey of the North America and Pelican (NAP) Nebulae. CARMA observations have been combined with single-dish data from the Purple Mountain 13.7m telescope to add short spacings and produce high-dynamic-range images. We find that the molecular gas is predominantly shaped by the W80 HII bubble that is driven by an O star. Several bright rims are probably remnant molecular clouds heated and stripped by the massive star. Matching these rims in molecular lines and optical images, we construct a model of the three-dimensional structure of the NAP complex. Two groups of molecular clumps/filaments are on the near side of the bubble, one being pushed toward us, whereas the other is moving toward the bubble. Another group is on the far side of the bubble and moving away. The young stellar objects in the Gulf region reside in three different clusters, each hosted by a cloud from one of the three molecular clump groups. Although all gas content in the NAP is impacted by feedback from the central O star, some regions show no signs of star formation, while other areas clearly exhibit star formation activity. Other molecular gas being carved by feedback includes the cometary structures in the Pelican Head region and the boomerang features at the boundary of the Gulf region. The results show that the NAP complex is an ideal place for the study of feedback effects on star formation.

en astro-ph.GA, astro-ph.SR

S2 Open Access 2020

Decadal Variability in the Impact of Atmospheric Circulation Patterns on the Winter Climate of Northern Russia

G. Marshall

The Arctic continues to warm at a much faster rate than the global average. One process contributing to “Arctic amplification” involves changes in low-frequency macroscale atmospheric circulation patterns and their consequent influence on regional climate. Here, using ERA5 data, we examine decadal changes in the impact of seven such patterns on winter near-surface temperature (SAT) and precipitation (PPN) in northern Russia and calculate the temporal consistency of any statistically significant relationships. We demonstrate that the 40-yr climatology hides considerable decadal variability in the spatial extent of such circulation pattern–climate relationships across the region, with few areas where their temporal consistency exceeds 60%. This is primarily a response to the pronounced decadal expansion/contraction and/or mobility of the circulation patterns’ centers of action. The North Atlantic Oscillation (NAO) is the dominant pattern (having the highest temporal consistency) affecting SAT west of the Urals. Farther east, the Scandinavian (SCA), Polar/Eurasian (POL), and West Pacific patterns are successively the dominant pattern influencing SAT across the West Siberian Plains, Central Siberian Plateau, and mountains of Far East Siberia, respectively. From west to east, the SCA, POL, and Pacific–North American patterns exert the most consistent decadal influence on PPN. The only temporally invariant significant decadal relationships occur between the NAO and SAT and the SCA and PPN in small areas of the North European Plain.

8 citations en Environmental Science

CrossRef Open Access 2020

Tense and Aspect in Germanic Languages

Kristin Melum Eide

1 citation en

CrossRef Open Access 2020

Grammatical Reflexes of Information Structure in Germanic Languages

Caroline Féry

1 citation en

arXiv Open Access 2020

Inflecting when there's no majority: Limitations of encoder-decoder neural networks as cognitive models for German plurals

Kate McCurdy, Sharon Goldwater, Adam Lopez

Can artificial neural networks learn to represent inflectional morphology and generalize to new words as human speakers do? Kirov and Cotterell (2018) argue that the answer is yes: modern Encoder-Decoder (ED) architectures learn human-like behavior when inflecting English verbs, such as extending the regular past tense form -(e)d to novel words. However, their work does not address the criticism raised by Marcus et al. (1995): that neural models may learn to extend not the regular, but the most frequent class -- and thus fail on tasks like German number inflection, where infrequent suffixes like -s can still be productively generalized. To investigate this question, we first collect a new dataset from German speakers (production and ratings of plural forms for novel nouns) that is designed to avoid sources of information unavailable to the ED model. The speaker data show high variability, and two suffixes evince 'regular' behavior, appearing more often with phonologically atypical inputs. Encoder-decoder models do generalize the most frequently produced plural class, but do not show human-like variability or 'regular' extension of these other plural markers. We conclude that modern neural models may still struggle with minority-class generalization.

en cs.CL

CrossRef Open Access 2020

Second Language Acquisition of Germanic Languages

Carrie Jackson

en

S2 Open Access 2020

Minni and Muninn: Memory in Medieval Nordic Culture ed. by Pernille Hermann, Stephen A. Mitchell, and Agnes S. Arnórsdóttir (review)

Ingunn Ásdísardóttir

en

arXiv Open Access 2017

The large scale impact of offshore wind farm structures on pelagic primary productivity in the southern North Sea

Kaela Slavik, Carsten Lemmen, Wenyan Zhang et al.

The increasing demand for renewable energy is projected to result in a 40-fold increase in offshore wind electricity in the European Union by 2030. Despite a great number of local impact studies for selected marine populations, the regional ecosystem impacts of offshore wind farm structures are not yet well assessed nor understood. Our study investigates whether the accumulation of epifauna, dominated by the filter feeder Mytilus edulis (blue mussel), on turbine structures affects pelagic primary productivity and ecosystem functioning in the southern North Sea. We estimate the anthropogenically increased potential distribution based on the current projections of turbine locations and reported patterns of M. edulis settlement. This distribution is integrated through the Modular Coupling System for Shelves and Coasts to state-of-the-art hydrodynamic and ecosystem models. Our simulations reveal non-negligible potential changes in regional annual primary productivity of up to 8% within the offshore wind farm area, and induced maximal increases of the same magnitude in daily productivity also far from the wind farms. Our setup and modular coupling are effective tools for system scale studies of other environmental changes arising from large-scale offshore wind-farming such as ocean physics and distributions of pelagic top predators.

en q-bio.PE

S2 Open Access 1976

A Caledonian plate tectonic model

W. E. Adrian Phillips, C. Stillman, T. Murphy

363 citations en Geology

CrossRef Open Access 2016

Prescriptive infinitives in the modern North Germanic languages: An ancient phenomenon in child-directed speech

Janne Bondi Johannessen

The prescriptive infinitive can be found in the North Germanic languages, is very old, and yet is largely unnoticed and undescribed. It is used in a very limited pragmatic context of a pleasant atmosphere by adults towards very young children, or towards pets or (more rarely) adults. It has a set of syntactic properties that distinguishes it from the imperative: Negation is pre-verbal, subjects are pre-verbal, subjects are third person and are only expressed by lexical DPs, not personal pronouns. It can be found in modern child language corpora, but probably originated before AD 500. The paper is largely descriptive, but some theoretical solutions to the puzzles of this construction are proposed.

2 citations en

arXiv Open Access 2016

Statistical state dynamics based theory for the formation and equilibration of Saturn's north polar jet

Brian F. Farrell, Petros J. Ioannou

Coherent jets with most of the kinetic energy of the flow are common in atmospheric turbulence. In the gaseous planets these jets are maintained by incoherent turbulence excited by small-scale convection. Large-scale coherent waves are sometimes observed to coexist with the jets; a prominent example is Saturn's hexagonal North polar jet (NPJ). The mechanism responsible for forming and maintaining such a turbulent state remains elusive. The coherent planetary-scale component of the turbulence arises and is maintained by interaction with the incoherent small-scale turbulence component. Theoretical understanding of the dynamics of the jet/wave/turbulence coexistence regime is gained by employing a statistical state dynamics (SSD) model. Here, a second-order closure implementation of a two-layer beta-plane SSD is used to develop a theory that accounts for the structure and dynamics of the NPJ. Asymptotic analysis of the SSD equilibrium in the weak jet damping limit predicts a universal jet structure in agreement with NPJ observations. This asymptotic theory also predicts the wavenumber (six) of the prominent jet perturbation. Analysis with this model of the jet/wave/turbulence regime dynamics reveals that jet formation is controlled by the effective value of $β$; the required value of this parameter for correspondence with observation is obtained. As this is a robust prediction it is taken as an indirect observation of a deep poleward sloping stable layer beneath the NPJ. The slope required is obtained from observations of NPJ structure as is the small-scale turbulence excitation required to maintain the jet. The observed jet structure is then predicted by the theory as is the wave-six disturbance. This wave, which is identified with the least stable mode of the equilibrated jet, is shown to be primarily responsible for equilibrating the jet with the observed structure and amplitude.

en physics.ao-ph, astro-ph.EP
