Results for "North Germanic. Scandinavian"

arXiv Open Access 2025

A validated coupled three-dimensional hydrodynamic and spectral wind-wave model for the western north Atlantic Ocean

Maria Venolia, Reza Marsooli, Jaime R. Calzada

Wind-wave and ocean current interactions affect critical coastal and oceanic processes, yet modeling these interactions presents significant challenges. The western North Atlantic Ocean provides an ideal test environment for coupled hydrodynamics and wind wave models, thanks to its energetic surface currents such as the Gulf Stream. This study evaluates a high-resolution coupled SCHISM WWM III model, utilizing NOAA's 'STOFS-3D-Atlantic' computational mesh, while incorporating three-dimensional baroclinic dynamics to account for density stratification effects. We evaluate the model's calculated water level and tidal predictions against NOAA tide gauge measurements during December 2016. The coupled model demonstrates robust skills in reproducing tidal constituents, non-tidal components, and total water level predictions along the U.S. East and Gulf of Mexico Coasts. In addition, we systematically evaluate three wave physics parameterizations (Ardhuin, Makin and Stam, and Cycle Three) in the spectral wave model to quantify their effects on the modeled wave characteristics. This validated modeling framework enhances our ability to understand and predict complex coastal and oceanic processes, offering significant applications for coastal management, maritime operations, and climate adaptation planning throughout the western North Atlantic region.

en physics.ao-ph

arXiv Open Access 2025

MisinfoTeleGraph: Network-driven Misinformation Detection for German Telegram Messages

Lu Kalkbrenner, Veronika Solopova, Steffen Zeiler et al.

Connectivity and message propagation are central, yet often underutilized, sources of information in misinformation detection -- especially on poorly moderated platforms such as Telegram, which has become a critical channel for misinformation dissemination, namely in the German electoral context. In this paper, we introduce Misinfo-TeleGraph, the first German-language Telegram-based graph dataset for misinformation detection. It includes over 5 million messages from public channels, enriched with metadata, channel relationships, and both weak and strong labels. These labels are derived via semantic similarity to fact-checks and news articles using M3-embeddings, as well as manual annotation. To establish reproducible baselines, we evaluate both text-only models and graph neural networks (GNNs) that incorporate message forwarding as a network structure. Our results show that GraphSAGE with LSTM aggregation significantly outperforms text-only baselines in terms of Matthews Correlation Coefficient (MCC) and F1-score. We further evaluate the impact of subscribers, view counts, and automatically versus human-created labels on performance, and highlight both the potential and challenges of weak supervision in this domain. This work provides a reproducible benchmark and open dataset for future research on misinformation detection in German-language Telegram networks and other low-moderation social platforms.
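The two metrics named in this abstract can be computed directly from binary confusion-matrix counts. The following is a minimal sketch for illustration only; the function names `mcc_from_counts` and `f1_from_counts` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def mcc_from_counts(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here; the paper does not state how it handles this case).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1-score (harmonic mean of precision and recall) from counts."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0
    return 2 * tp / denominator


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the two metrics side by side.
    print(mcc_from_counts(tp=90, fp=10, fn=10, tn=90))
    print(f1_from_counts(tp=90, fp=10, fn=10))
```

Unlike F1, MCC also uses the true-negative count, which is one reason the two are often reported together on imbalanced label distributions.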

en cs.CL

arXiv Open Access 2024

LLäMmlein: Transparent, Compact and Competitive German-Only Language Models from Scratch

Jan Pfister, Julia Wunderle, Andreas Hotho

We create two German-only decoder models, LLäMmlein 120M and 1B, transparently from scratch and publish them, along with the training data, for the German NLP research community to use. The model training involved several key steps, including extensive data preprocessing, the creation of a custom German tokenizer, the training itself, as well as the evaluation of the final models on various benchmarks. Throughout the training process, multiple checkpoints were saved and analyzed using the SuperGLEBer benchmark to monitor the models' learning dynamics. Compared to state-of-the-art models on the SuperGLEBer benchmark, both LLäMmlein models performed competitively, consistently matching or surpassing models with similar parameter sizes. The results show that the models' quality scales with size as expected, but performance improvements on some tasks plateaued early, offering valuable insights into resource allocation for future model development.

en cs.CL, cs.AI

arXiv Open Access 2023

Scientometric Rules as a Guide to Transform Science Systems in the Middle East & North Africa

Jamal El-Ouahi

This study explores how scientometric data and indicators are used to transform science systems in a selection of countries in the Middle East and North Africa. I propose that scientometric-based rules inform such transformation. First, the research shows how research managers adopt scientometrics as 'global standards'. I also show how several scientometric data and indicators are adopted following a 'glocalization' process. Finally, I demonstrate how research managers use this data to inform decision-making and policymaking processes. This study contributes to a broader understanding of the usage of scientometric indicators in the context of assessing research institutions and researchers based on their publishing activities. Related to these assessments, I also discuss how such data transforms and adapts local science systems to meet so-called 'global standards'.

en cs.DL

arXiv Open Access 2023

Structure and Dynamics of a Lattice of Tetragonal Germanates R2Ge2O7 (R = Tb-Lu, Y): Ab Initio Calculation

V. S. Ryumshin, V. A. Chernyshev

The crystal structure, phonon spectrum, and elastic constants of a series of rare-earth germanates (including yttrium germanate R2Ge2O7 (R = Tb-Lu, Y)) with a tetragonal structure have been ab initio calculated within the density functional theory. The frequencies and types of fundamental vibrations and the intensities of IR and Raman modes are determined. The degrees of participation of ions in each mode are determined by analyzing the displacement vectors obtained as a result of the ab initio calculations. The calculations have been performed for the first time; there are no corresponding experimental data for the entire series of compounds (except for the IR and Raman spectra of yttrium germanate). The performed calculations made it possible to interpret and supplement the known data in the literature on IR and Raman spectra of yttrium germanate Y2Ge2O7.

en cond-mat.str-el

arXiv Open Access 2023

Automatic Generation of German Drama Texts Using Fine Tuned GPT-2 Models

Mariam Bangura, Kristina Barabashova, Anna Karnysheva et al.

This study is devoted to the automatic generation of German drama texts. We suggest an approach consisting of two key steps: fine-tuning a GPT-2 model (the outline model) to generate outlines of scenes based on keywords and fine-tuning a second model (the generation model) to generate scenes from the scene outline. The input for the neural model comprises two datasets: the German Drama Corpus (GerDraCor) and German Text Archive (Deutsches Textarchiv or DTA). In order to estimate the effectiveness of the proposed method, our models are compared with baseline GPT-2 models. Our models perform well according to automatic quantitative evaluation, but, conversely, manual qualitative analysis reveals a poor quality of generated texts. This may be due to the quality of the dataset or training inputs.

en cs.CL

arXiv Open Access 2022

Bicategorical type theory: semantics and syntax

Benedikt Ahrens, Paige Randall North, Niels van der Weide

We develop semantics and syntax for bicategorical type theory. Bicategorical type theory features contexts, types, terms, and directed reductions between terms. This type theory is naturally interpreted in a class of structured bicategories. We start by developing the semantics, in the form of comprehension bicategories. Examples of comprehension bicategories are plentiful; we study both specific examples as well as classes of examples constructed from other data. From the notion of comprehension bicategory, we extract the syntax of bicategorical type theory, that is, judgment forms and structural inference rules. We prove soundness of the rules by giving an interpretation in any comprehension bicategory. The semantic aspects of our work are fully checked in the Coq proof assistant, based on the UniMath library.

en cs.LO, math.CT

arXiv Open Access 2022

Klexikon: A German Dataset for Joint Summarization and Simplification

Dennis Aumiller, Michael Gertz

Traditionally, Text Simplification is treated as a monolingual translation task where sentences between source texts and their simplified counterparts are aligned for training. However, especially for longer input documents, summarizing the text (or dropping less relevant content altogether) plays an important role in the simplification process, which is currently not reflected in existing datasets. Simultaneously, resources for non-English languages are scarce in general and prohibitive for training new solutions. To tackle this problem, we pose core requirements for a system that can jointly summarize and simplify long source documents. We further describe the creation of a new dataset for joint Text Simplification and Summarization based on German Wikipedia and the German children's lexicon "Klexikon", consisting of almost 2900 documents. We release a document-aligned version that particularly highlights the summarization aspect, and provide statistical evidence that this resource is well suited to simplification as well. Code and data are available on Github: https://github.com/dennlinger/klexikon

en cs.CL

arXiv Open Access 2020

Complex Network Analysis of North American Institutions of Higher Education on Twitter

Dmitry Zinoviev, Shana Cote, Robert Diaz

North American institutions of higher education (IHEs) -- universities, 4- and 2-year colleges, and trade schools -- are heavily present and followed on Twitter. An IHE Twitter account, on average, has 20,000 subscribers. Many of them follow more than one IHE, making it possible to construct an IHE network based on the number of co-followers. In this paper, we explore the structure of a network of 1,435 IHEs on Twitter. We discovered significant correlations between the network attributes -- various centralities and clustering coefficients -- and IHEs' attributes, such as enrollment, tuition, and religious/racial/gender affiliations. We uncovered the community structure of the network linked to homophily -- such that similar followers follow similar colleges. Additionally, we analyzed the followers' self-descriptions and identified twelve overlapping topics that can be traced to the followers' group identities.

en cs.SI

arXiv Open Access 2018

microNER: A Micro-Service for German Named Entity Recognition based on BiLSTM-CRF

Gregor Wiedemann, Raghav Jindal, Chris Biemann

For named entity recognition (NER), bidirectional recurrent neural networks became the state-of-the-art technology in recent years. Competing approaches vary with respect to pre-trained word embeddings as well as models for character embeddings to represent sequence information most effectively. For NER in German language texts, these model variations have not been studied extensively. We evaluate the performance of different word and character embeddings on two standard German datasets and with a special focus on out-of-vocabulary words. With F-Scores above 82% for the GermEval'14 dataset and above 85% for the CoNLL'03 dataset, we achieve (near) state-of-the-art performance for this task. We publish several pre-trained models wrapped into a micro-service based on Docker to allow for easy integration of German NER into other applications via a JSON API.

en cs.CL

arXiv Open Access 2018

Possible connection between the asymmetry of North Polar Spur and Loop I with Fermi Bubbles

Kartick Chandra Sarkar

The origin of North Polar Spur (NPS) and Loop-I has been debated over almost half a century and is still unresolved. Most of the confusion is caused by the absence of any prominent counterparts of these structures in the southern Galactic hemisphere (SGH). This has also led to doubts over the claimed connection between the NPS and Fermi Bubbles (FBs). I show in this paper that such asymmetries of NPS and Loop-I in both X-rays and $\gamma$-rays can be easily produced if the circumgalactic medium (CGM) density in the southern hemisphere is smaller by only $\approx 20\%$ than the northern counterpart in case of a star formation driven wind scenario. The required mechanical luminosity, $\mathcal{L} \approx 4-5\times 10^{40}$ erg s$^{-1}$ (reduces to $\approx 0.3$ M$_\odot$ yr$^{-1}$ including the non-thermal pressure) and the age of the FBs, $t_{\rm age} \approx 28$ Myr, are consistent with previous estimations in case of a star formation driven wind scenario. One of the main reasons for the asymmetry is projection effects at the Solar location. Such a proposition is also consistent with the fact that the southern FB is $\approx 5^\circ$ bigger than the northern one. The results, therefore, point to the possibility of a common origin of the NPS, Loop-I and FBs from the Galactic centre (GC). I also estimate the average sky brightness in X-ray towards the south Galactic pole and North Galactic pole in the ROSAT-R67 band and find that the error in average brightness is far too large to allow any estimate of the deficiency in the southern hemisphere.

en astro-ph.GA

CrossRef Open Access 2016

Growing syntax: The development of a DP in North Germanic

Kersti Börjars, Pauline Harries, Nigel Vincent

Grammaticalization as standardly conceived is a change whereby an item develops from a lexical to a grammatical or functional meaning, or from being less to more grammatical. In this article we show that this can only be part of the story; for a full account we need to understand the syntactic structures into which grammaticalizing elements fit and how they too develop. To achieve this end we consider in detail the history of definiteness marking within the noun phrase in North Germanic, and in particular in Faroese. We show how this change requires us to distinguish between projecting and nonprojecting categories, and how a category can emerge over time and only subsequently develop into a head with its own associated functional projection. The necessary structure, rather than being intrinsic to an aprioristic universal grammar, grows over time as part of the grammaticalization process. We suggest that this in turn argues for a parallel correspondence theory of grammar such as the one adopted here, LEXICAL-FUNCTIONAL GRAMMAR, in which different dimensions of linguistic structure can change at different rates.

17 citations en

arXiv Open Access 2007

The radius and mass of the subgiant star bet Hyi from interferometry and asteroseismology

J. R. North, J. Davis, T. R. Bedding et al.

We have used the Sydney University Stellar Interferometer (SUSI) to measure the angular diameter of beta Hydri. This star is a nearby G2 subgiant whose mean density was recently measured with high precision using asteroseismology. We determine the radius and effective temperature of the star to be 1.814+/-0.017 R_sun (0.9%) and 5872+/-44 K (0.7%) respectively. By combining this value with the mean density, as estimated from asteroseismology, we make a direct estimate of the stellar mass. We find a value of 1.07+/-0.03 M_sun (2.8%), which agrees with published estimates based on fitting in the H-R diagram, but has much higher precision. These results place valuable constraints on theoretical models of beta Hyi and its oscillation frequencies.
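The step from radius and mean density to mass can be written out as a short worked relation. This is a sketch assuming a spherical star and uncorrelated uncertainties; the abstract does not quote the mean density itself, so the density ratio below is only back-computed from the radius and mass it does quote.

```latex
% Mass from radius R and mean density \bar{\rho} (spherical star assumed)
M = \frac{4}{3}\pi R^{3}\,\bar{\rho}
\quad\Longrightarrow\quad
\frac{M}{M_\odot} = \frac{\bar{\rho}}{\bar{\rho}_\odot}
                    \left(\frac{R}{R_\odot}\right)^{3}

% Error propagation, assuming uncorrelated uncertainties
\left(\frac{\sigma_M}{M}\right)^{2}
  = \left(3\,\frac{\sigma_R}{R}\right)^{2}
  + \left(\frac{\sigma_{\bar{\rho}}}{\bar{\rho}}\right)^{2}

% Density ratio implied by the quoted values (back-computed, not from the source)
\frac{\bar{\rho}}{\bar{\rho}_\odot}
  \approx \frac{1.07}{1.814^{3}} \approx 0.179
```

With a radius uncertainty of about 0.9%, the radius term alone gives roughly 3 x 0.9% = 2.8%, which matches the quoted mass uncertainty and suggests that the asteroseismic density contributes little to the error budget.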

en astro-ph

arXiv Open Access 1996

A high-resolution map of the cosmic microwave background around the north celestial pole

Max Tegmark, Angelica de Oliveira-Costa, Marc Devlin et al.

We present a Wiener filtered map of the Cosmic Microwave Background (CMB) fluctuations in a disk with 15 degree diameter, centered at the North Celestial Pole. The map is based on the 1993-1995 data from the Saskatoon (SK) experiment, with an angular resolution around 1 degree in the frequency range 27.6-44.1 GHz. The signal-to-noise ratio in the map is of order two, and some individual hot and cold spots are significant at the 5 sigma level. The spatial features are found to be consistent from year to year, reinforcing the conclusion that the SK results are not dominated by residual atmospheric contamination or other non-celestial signals.

en astro-ph

arXiv Open Access 1996

Extraction of V-N-Collocations from Text Corpora: A Feasibility Study for German

Elisabeth Breidt

The usefulness of a statistical approach suggested by Church et al. (1991) is evaluated for the extraction of verb-noun (V-N) collocations from German text corpora. Some problematic issues of that method arising from properties of the German language are discussed and various modifications of the method are considered that might improve extraction results for German. The precision and recall of all variant methods are evaluated for V-N collocations containing support verbs, and the consequences for further work on the extraction of collocations from German corpora are discussed. With a sufficiently large corpus (>= 6 million word tokens), the average rate of wrong extractions can be reduced to 2.2% (97.8% precision) with the most restrictive method, though at a loss in data of almost 50% compared to a less restrictive method that still reaches 87.6% precision. Depending on the goal to be achieved, emphasis can be put on high recall for lexicographic purposes or on high precision for automatic lexical acquisition, in each case unfortunately at the expense of the other measure. Low recall can still be acceptable if very large corpora (i.e. 50 - 100 million words) are available or if corpora for special domains are used in addition to the data found in machine readable (collocation) dictionaries.

en cs.CL

arXiv Open Access 2003

A Deep, Wide Field, Optical, and Near Infrared Catalog of a Large Area around the Hubble Deep Field North

P. Capak, L. L. Cowie, E. M. Hu et al.

We have conducted a deep multi-color imaging survey of 0.2 degrees^2 centered on the Hubble Deep Field North (HDF-N). We shall refer to this region as the Hawaii-HDF-N. Deep data were collected in U, B, V, R, I, and z' bands over the central 0.2 degrees^2 and in HK' over a smaller region covering the Chandra Deep Field North (CDF-N). The data were reduced to have accurate relative photometry and astrometry across the entire field to facilitate photometric redshifts and spectroscopic followup. We have compiled a catalog of 48,858 objects in the central 0.2 degrees^2 detected at 5 sigma significance in a 3" aperture in either R or z' band. Number counts and color-magnitude diagrams are presented and shown to be consistent with previous observations. Using color selection we have measured the density of objects at 3<z<7. Our multi-color data indicates that samples selected at z>5.5 using the Lyman break technique suffer from more contamination by low redshift objects than suggested by previous studies.

en astro-ph

arXiv Open Access 1998

Mercury and platinum abundances in mercury-manganese stars

C. M. Jomaron, M. M. Dworetsky, D. A. Bohlender

We report new results for the elemental and isotopic abundances of the normally rare elements mercury and platinum in HgMn stars. Typical overabundances can be 4 dex or more. The isotopic patterns do not follow the fractionation model of White et al (1976).

en astro-ph

arXiv Open Access 1998

Magnetic field distribution and element concentration on the CP2 star CU Virginis

Yu. V. Glagolevskij, E. Gerth, G. Hildebrandt et al.

We search for a relation between the published distributions of different elements and the calculated magnetic field structure, following from a dipole-quadrupole configuration, of the CP2 star CU Vir. The highest concentration of individual chemical elements on the stellar surface coincides obviously with the regions of the highest values of the magnetic field strength.

en astro-ph

arXiv Open Access 1998

Effective temperatures from A5 to G5 spectral types, using Balmer line profiles

C. van 't Veer-Menneret, C. Bentolila, D. Katz

We show how previous works (Fuhrmann et al., van 't Veer-Menneret & Megessier) demonstrate the efficiency of the use of Balmer line profiles for effective temperature determination. In agreement with them, we insist on the physical interest of this method based on the behaviour of these lines with the variations of the parameters involved in the treatment of the convective transport. The comparison between Fuhrmann's results and ours, independently obtained, exhibits a quite good agreement. We show new results of effective temperature, gravity and metallicities for a few of our programme stars, ranging from solar to overabundant metallicities.

en astro-ph

arXiv Open Access 1998

A Freely Available Morphological Analyzer, Disambiguator and Context Sensitive Lemmatizer for German

Wolfgang Lezius, Reinhard Rapp, Manfred Wettler

In this paper we present Morphy, an integrated tool for German morphology, part-of-speech tagging and context-sensitive lemmatization. Its large lexicon of more than 320,000 word forms plus its ability to process German compound nouns guarantee a wide morphological coverage. Syntactic ambiguities can be resolved with a standard statistical part-of-speech tagger. By using the output of the tagger, the lemmatizer can determine the correct root even for ambiguous word forms. The complete package is freely available and can be downloaded from the World Wide Web.

en cs.CL
