Results for "North Germanic. Scandinavian"

Showing 20 of ~948886 results · from CrossRef, DOAJ, arXiv, Semantic Scholar

arXiv Open Access 2025
GG-BBQ: German Gender Bias Benchmark for Question Answering

Shalaka Satheesh, Katrin Klug, Katharina Beckh et al.

Within the context of Natural Language Processing (NLP), fairness evaluation is often associated with the assessment of bias and reduction of associated harm. In this regard, the evaluation is usually carried out by using a benchmark dataset, for a task such as Question Answering, created for the measurement of bias in the model's predictions along various dimensions, including gender identity. In our work, we evaluate gender bias in German Large Language Models (LLMs) using the Bias Benchmark for Question Answering by Parrish et al. (2022) as a reference. Specifically, the templates in the gender identity subset of this English dataset were machine translated into German. The errors in the machine translated templates were then manually reviewed and corrected with the help of a language expert. We find that manual revision of the translation is crucial when creating datasets for gender bias evaluation because of the limitations of machine translation from English to a language such as German with grammatical gender. Our final dataset is comprised of two subsets: Subset-I, which consists of group terms related to gender identity, and Subset-II, where group terms are replaced with proper names. We evaluate several LLMs used for German NLP on this newly created dataset and report the accuracy and bias scores. The results show that all models exhibit bias, both along and against existing social stereotypes.

en cs.CL, cs.CY
arXiv Open Access 2025
Modeling the Cosmic Ultraviolet Background at the North Galactic Pole

Jayant Murthy

I explore models of the dust-scattered component of the Cosmic Ultraviolet Background (CUVB) at the North Galactic Pole (NGP) in order to develop a framework for calculating the dust-scattered light as a function of the optical depths. As expected, I find that the dust-scattered emission scales linearly with reddening up to $E(B-V) \approx 0.1$ mag and derive a parametric model for this dependence. I have applied these models to fit the far-ultraviolet (1350–1800 Å) observations from the Galaxy Evolution Explorer (GALEX), finding that the optical constants of the interstellar dust grains, albedo ($a$) and phase function asymmetry factor ($g$), are consistent with predictions from the Astrodust model ($a = 0.33$, $g = 0.68$). I detect an isotropic offset of $267 \pm 7$ ph cm$^{-2}$ s$^{-1}$ sr$^{-1}$ Å$^{-1}$, half of which remains unaccounted for by known Galactic or extragalactic sources. I will now extend my analysis to wider sky regions with the goal of generating high-resolution extinction maps.

en astro-ph.IM
arXiv Open Access 2025
Climate adaptation of millet and sorghum varieties in North-Eastern Senegal: cross-referencing rainfall, thermal and phenological parameters

Awa Amadou Sall, Elhadji Faye, Pierre Guillemin et al.

Millet (Pennisetum glaucum) and sorghum (Sorghum bicolor) are the main rainfed cereals grown in North-Eastern Senegal. However, faced with constraints such as falling rainfall, rising temperatures and frequent dry spells, their production is tending to decline. This article examines the climatic constraints and other shocks suffered by the rainfed millet varieties Souna 3, ICTP 8203, GB 8735, Gawane and Chakti, as well as the sorghum varieties CE 180-33, Payenne and Golobé, which are the main varieties released and currently grown in north-eastern Senegal. Based on data collected in Podor, Matam and Linguère, the article analyses the adaptation of different millet and sorghum varieties to climatic conditions and their evolution over time. The results show a rainfall deficit since the early 1970s, combined with greater thermal constraints. Analysis of the differences between cumulative rainfall and maximum evapotranspiration for varieties at different growth stages reveals constant water deficits for Souna 3 millet and CE 180-33 sorghum. In contrast, Chakti millet shows positive water balances in over 80% of years in the east and west of the study area, and in 47% of cases in the north. Only Chakti and ICTP 8203 are adapted to the climatic conditions of the eastern and western zones, with a probability of suitability of over 80% for the periods 1931-1969 and 1999-2020. However, none of the varieties is adapted to the climatic conditions in the north. In addition to these climatic constraints, the interviewed farmers attribute the decline in agricultural production to livestock straying, attacks by bird pests and parasitic infestations, which exacerbate agricultural losses. It is therefore essential to develop complementary strategies including wider dissemination of varieties better adapted to current climatic conditions, such as Chakti and ICTP 8203, and the strengthening of crop protection systems, notably through biological control and integrated pest management.

en physics.geo-ph
arXiv Open Access 2025
Analysis of Traffic Congestion in North Campus, Delhi University Using Continuous Time Models

Siddhartha Mahajan, Harsh Raj, Sonam Tanwar

This project investigates traffic congestion within North Campus, Delhi University (DU), using continuous time simulations implemented in UXSim to model vehicle movement and interaction. The study focuses on several key intersections, identifies recurring congestion points, and evaluates the effectiveness of conventional traffic management measures. Implementing signal timing optimization and modest intersection reconfiguration resulted in measurable improvements in simulated traffic flow. The results provide practical insights for local traffic management and illustrate the value of continuous time simulation methods for informing short-term interventions and longer-term planning.

en eess.SY, math.OC
arXiv Open Access 2025
Functoriality of Enriched Data Types

Lukas Mulder, Paige Randall North, Maximilien Péroux

In previous work, categories of algebras of endofunctors were shown to be enriched in categories of coalgebras of the same endofunctor, and the extra structure of that enrichment was used to define a generalization of inductive data types. These generalized inductive data types are parametrized by a coalgebra $C$, so we call them $C$-inductive data types; we call the morphisms induced by their universal property $C$-inductive functions. We extend that work by incorporating natural transformations into the theory: given a suitable natural transformation between endofunctors, we show that this induces enriched functors between their categories of algebras which preserve $C$-inductive data types and $C$-inductive functions. Such $C$-inductive data types are often finite versions of the corresponding inductive data type, and we show how our framework can extend classical initial algebra semantics to these types. For instance, we show that our theory naturally produces partially inductive functions on lists, changes in list element types, and tree pruning functions.

en math.CT, cs.LO
S2 Open Access 2024
Climate extremes in Svalbard over the last two millennia are linked to atmospheric blocking

François Lapointe, A. Karmalkar, R. S. Bradley et al.

Arctic precipitation in the form of rain is forecast to become more prevalent in a warmer world but with seasonal and interannual changes modulated by natural modes of variability. Experiencing rapid hydroclimatic changes in the Arctic, Svalbard serves as an ideal study location due to its exposure to oceanic and atmospheric variability in the North Atlantic region. Here we use climate data from paleoproxies, observations, and a climate model to demonstrate that wet and warm extremes in Svalbard over the last two millennia are linked to the presence of atmospheric blocking regimes over Scandinavia and the Ural mountain region. Rainfall episodes lead to the deposition of coarse sediment particles and high levels of calcium in Linnévatnet, a lake in southwest Svalbard, with the coarsest sediments consistently deposited during atmospheric blocking events. A unique annually resolved sediment record from Linnévatnet confirms that this linkage has been persistent over the past 2000 years. Our record also shows that a millennial-scale decline in Svalbard precipitation ended around the middle of the 19th century, followed by several unprecedented extreme events in recent years. As warming continues and sea ice recedes, future Svalbard floods will become more intense during episodes of Scandinavian and Ural blocking. The study links heavy past rainfall events in Svalbard to Scandinavian blocking. Lake sediment data spanning the last 2 millennia warns of worse floods with continued warming, especially intense during atmospheric blocking conditions.

13 citations en Medicine
CrossRef Open Access 2024
Old and Middle English adverbs of degree in their wider West Germanic context

Lourens Visser

Research on adverbs of degree in Old and Middle English has been largely self-contained and has paid little attention to developments that were happening in the neighbouring West Germanic languages. While research on these other languages is less extensive, “Middle” Germanic has been identified as a period of convergence for the usage of adverbs of degree (Visser 2023). The present study analyses the usage patterns of seven adverbs in both Old and Middle English using data from different corpora: swīðe/swīthe, ful, miċle/muchel, sāre/sǭre, ġearwe/yāre, fela/fę̄le, and hearde/harde. It is found that their development differs strikingly from their Continental West Germanic counterparts, and they appear to preserve more primary usage patterns.

arXiv Open Access 2024
German also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset

Laura Mascarell, Ribin Chalumattu, Annette Rios

The advent of Large Language Models (LLMs) has led to remarkable progress on a wide range of natural language processing tasks. Despite the advances, these large-sized models still suffer from hallucinating information in their output, which poses a major issue in automatic text summarization, as we must guarantee that the generated summary is consistent with the content of the source document. Previous research addresses the challenging task of detecting hallucinations in the output (i.e. inconsistency detection) in order to evaluate the faithfulness of the generated summaries. However, these works primarily focus on English and recent multilingual approaches lack German data. This work presents absinth, a manually annotated dataset for hallucination detection in German news summarization and explores the capabilities of novel open-source LLMs on this task in both fine-tuning and in-context learning settings. We open-source and release the absinth dataset to foster further research on hallucination detection in German.

en cs.CL, cs.AI
arXiv Open Access 2024
Measuring data types

Lukas Mulder, Paige Randall North, Maximilien Péroux

In this article, we combine Sweedler's classic theory of measuring coalgebras (by which $k$-algebras are enriched in $k$-coalgebras for $k$ a field) with the theory of W-types (by which the categorical semantics of inductive data types in functional programming languages are understood). In our main theorem, we find that under some hypotheses, algebras of an endofunctor are enriched in coalgebras of the same endofunctor, and we find polynomial endofunctors provide many interesting examples of this phenomenon. We then generalize the notion of initial algebra of an endofunctor using this enrichment, thus generalizing the notion of W-type. This article is an extended version of arXiv:2303.16793; it adds expository introductions to the original theories of measuring coalgebras and W-types along with some improvements to the main theory and many explicitly worked examples.

en math.CT, cs.LO
arXiv Open Access 2023
MEDBERT.de: A Comprehensive German BERT Model for the Medical Domain

Keno K. Bressem, Jens-Michalis Papaioannou, Paul Grundmann et al.

This paper presents medBERTde, a pre-trained German BERT model specifically designed for the German medical domain. The model has been trained on a large corpus of 4.7 Million German medical documents and has been shown to achieve new state-of-the-art performance on eight different medical benchmarks covering a wide range of disciplines and medical document types. In addition to evaluating the overall performance of the model, this paper also conducts a more in-depth analysis of its capabilities. We investigate the impact of data deduplication on the model's performance, as well as the potential benefits of using more efficient tokenization methods. Our results indicate that domain-specific models such as medBERTde are particularly useful for longer texts, and that deduplication of training data does not necessarily lead to improved performance. Furthermore, we found that efficient tokenization plays only a minor role in improving model performance, and attribute most of the improved performance to the large amount of training data. To encourage further research, the pre-trained model weights and new benchmarks based on radiological data are made publicly available for use by the scientific community.

en cs.CL, cs.AI
S2 Open Access 2020
A regime view of future atmospheric circulation changes in northern mid-latitudes

F. Fabiano, V. Meccia, P. Davini et al.

Future wintertime atmospheric circulation changes in the Euro–Atlantic (EAT) and Pacific–North American (PAC) sectors are studied from a weather regimes perspective. The Coupled Model Intercomparison Project phases 5 and 6 (CMIP5 and CMIP6) historical simulation performance in reproducing the observed regimes is first evaluated, showing a general improvement in the CMIP6 models, which is more evident for EAT. The circulation changes projected by CMIP5 and CMIP6 scenario simulations are analysed in terms of the change in the frequency and persistence of the regimes. In the EAT sector, significant positive trends are found for the frequency and persistence of NAO+ (North Atlantic Oscillation) for SSP2–4.5, SSP3–7.0 and SSP5–8.5 scenarios with a concomitant decrease in the frequency of the Scandinavian blocking and Atlantic Ridge regimes. For PAC, the Pacific Trough regime shows a significant increase, while the Bering Ridge is predicted to decrease in all scenarios analysed. The spread among the model responses is linked to different levels of warming in the polar stratosphere, the tropical upper troposphere, the North Atlantic and the Arctic.

73 citations en Environmental Science
S2 Open Access 2020
Enhanced extended‐range predictability of the 2018 late‐winter Eurasian cold spell due to the stratosphere

Lisa‐Ann Kautz, I. Polichtchouk, T. Birner et al.

A severe cold spell with surface temperatures reaching 10 K below its climatology hit Eurasia during late February/early March 2018. This cold spell was associated with a Scandinavian blocking pattern followed by an extreme negative North Atlantic Oscillation (NAO) phase. Here we explore the predictability of this cold spell/NAO event using ensemble forecasts from the Subseasonal‐to‐Seasonal (S2S) archive of the European Centre for Medium‐Range Weather Forecasts. We find that this event was predicted with the observed strength roughly 10 days in advance. However, the probability of the cold spell occurring doubled up to 25 days in advance, when a sudden stratospheric warming (SSW) occurred. Our results indicate that the amplitude of the cold spell was increased by a regime shift to the negative NAO phase at the end of February, which was likely favoured by the SSW. We quantify the contribution of the SSW to the enhanced extended‐range forecast skill for this particular event by running forecast ensembles in which the evolution of the stratosphere is nudged to (a) the observed evolution, and (b) a time‐invariant state. In the experiment with nudging to the observed stratospheric evolution, the probability of a strong cold spell occurring is enhanced to 45%, while it is at its climatological value of 5% when the stratosphere is nudged to a time‐invariant state. These results showing enhanced predictability of surface extremes following SSWs extend previous observational evidence, which is mostly based on composite analyses, to a single event. Our results suggest that it is the subsequent evolution throughout the lower stratosphere following the SSW, rather than the occurrence of the SSW itself, that is crucial in coupling to large‐scale tropospheric flow patterns. However, we caution that probabilistic gain in predictability alone is insufficient to conclude a causal link between the SSW and the cold spell event.

68 citations en Environmental Science
arXiv Open Access 2022
Univalent foundations and the equivalence principle

Benedikt Ahrens, Paige Randall North

In this paper, we explore the 'equivalence principle' (EP): roughly, statements about mathematical objects should be invariant under an appropriate notion of equivalence for the kinds of objects under consideration. In set theoretic foundations, EP may not always hold: for instance, the statement '$1 \in N$' is not invariant under isomorphism of sets. In univalent foundations, on the other hand, EP has been proven for many mathematical structures. We first give an overview of earlier attempts at designing foundations that satisfy EP. We then describe how univalent foundations validates EP.

en math.LO, cs.LO
arXiv Open Access 2022
Agent-Based Model Framework for the North Carolina Modeling Infectious Diseases Program (NC MInD ABM) Overview, Design Concepts, and Details Protocol

Kasey Jones, Emily Hadley, Caroline Kery et al.

To help facilitate a variety of simulations related to healthcare facilities in North Carolina, we have developed an agent-based model (ABM) to accurately simulate patient (i.e., agent) movement to and from these facilities. This is an Overview, Design Concepts, and Details (ODD) Protocol, a standardized method for describing ABMs. This ODD provides detailed information on healthcare facilities in North Carolina, the agent movement to and between them, and any decisions that were made during the creation of this model. This ABM is intended to be used alongside disease-specific submodels. It can be used for purposes such as simulating the success of interventions on reducing disease transmission, simulating strain on facility resources (including staff and materials), and forecasting hospital capacity. Disease-specific ODDs should accompany this document. No details related to any submodels that use this ABM as a base model are included.

en stat.AP
S2 Open Access 2021
Improving Zero-Shot Cross-lingual Transfer Between Closely Related Languages by Injecting Character-Level Noise

Noëmi Aepli, Rico Sennrich

Cross-lingual transfer between a high-resource language and its dialects or closely related language varieties should be facilitated by their similarity. However, current approaches that operate in the embedding space do not take surface similarity into account. This work presents a simple yet effective strategy to improve cross-lingual transfer between closely related varieties. We propose to augment the data of the high-resource source language with character-level noise to make the model more robust towards spelling variations. Our strategy shows consistent improvements over several languages and tasks: Zero-shot transfer of POS tagging and topic identification between language varieties from the Finnic, West and North Germanic, and Western Romance language branches. Our work provides evidence for the usefulness of simple surface-level noise in improving transfer between language varieties.

26 citations en Computer Science
CrossRef Open Access 2021
Overt subject pronoun in Gothic vs null subject in Greek

Carla Falluomini

The Gothic translation of the Bible is a word-for-word rendition of a lost Greek Vorlage (reconstructed by W. Streitberg in 1908; 2nd revised edition in 1919). As previous studies have pointed out, one of the most interesting features of this version is the presence of the overt subject pronoun in instances where there is a null subject in Greek. Considering that Gothic is a null subject language, how is it possible to justify this feature? Based on a new collation that uses biblical textual witnesses not considered by Streitberg (i.e. Greek majuscule and minuscule manuscripts, Church Fathers, commentaries, lectionaries, and Vetus Latina manuscripts), this paper analyses the Gothic-Greek divergences involving the presence of the overt subject pronoun in the Gospel of John, in order to verify previous hypotheses and shed new light on this debated topic.

arXiv Open Access 2021
Algebraic Presentations of Type Dependency

Benedikt Ahrens, Jacopo Emmenegger, Paige Randall North et al.

C-systems were defined by Cartmell as the algebraic structures that correspond exactly to generalised algebraic theories. B-systems were defined by Voevodsky in his quest to formulate and prove an initiality conjecture for type theories. They play a crucial role in Voevodsky's construction of a syntactic C-system from a term monad. In this work, we construct an equivalence between the category of C-systems and the category of B-systems, thus proving a conjecture by Voevodsky. We construct this equivalence as the restriction of an equivalence between more general structures, called CE-systems and E-systems, respectively. To this end, we identify C-systems and B-systems as "stratified" CE-systems and E-systems, respectively; that is, systems whose contexts are built iteratively via context extension, starting from the empty context.

arXiv Open Access 2020
A Dataset of German Legal Documents for Named Entity Recognition

Elena Leitner, Georg Rehm, Julián Moreno-Schneider

We describe a dataset developed for Named Entity Recognition in German federal court decisions. It consists of approx. 67,000 sentences with over 2 million tokens. The resource contains 54,000 manually annotated entities, mapped to 19 fine-grained semantic classes: person, judge, lawyer, country, city, street, landscape, organization, company, institution, court, brand, law, ordinance, European legal norm, regulation, contract, court decision, and legal literature. The legal documents were, furthermore, automatically annotated with more than 35,000 TimeML-based time expressions. The dataset, which is available under a CC-BY 4.0 license in the CoNLL-2002 format, was developed for training an NER service for German legal documents in the EU project Lynx.

en cs.CL, cs.IR
