Results for "German literature"

Showing 20 of ~8498765 results · from arXiv, DOAJ, CrossRef, Semantic Scholar

arXiv Open Access 2026
Multilingual Stutter Event Detection for English, German, and Mandarin Speech

Felix Haas, Sebastian P. Bayerl

This paper presents a multi-label stuttering detection system trained on multi-corpus, multilingual data in English, German, and Mandarin. By leveraging annotated stuttering data from three languages and four corpora, the model captures language-independent characteristics of stuttering, enabling robust detection across linguistic contexts. Experimental results demonstrate that multilingual training achieves performance comparable to, and in some cases even exceeding, that of previous systems. These findings suggest that stuttering exhibits cross-linguistic consistency, which supports the development of language-agnostic detection systems. Our work demonstrates the feasibility and advantages of using multilingual data to improve generalizability and reliability in automated stuttering detection.

en cs.SD, cs.CL
DOAJ Open Access 2025
Die Breslauer Schriftstellerin Bianca Bobertag (1846–1900) im akademischen, literarischen und frauenrechtlerischen Umfeld

Julianna Redlich

The article offers insight into the life and career of the nearly forgotten Breslau writer Bianca Bobertag (née Marbach). It outlines the themes most frequently discussed in her work, above all the social conditions in the intellectual circles of the university city of Breslau, which she depicted in her fiction, as well as women's rights, in particular equality in education and marriage. In addition, the article traces the family of Bianca Bobertag, all of whose members pursued academic or literary careers.

Germanic languages. Scandinavian languages, German literature
DOAJ Open Access 2025
Skandinavische Forschungen am Institut für Germanistik der Universität Wrocław im Zeitraum 2010–2024

Józef Jarosz

This bibliography continues the existing bibliographic documentation of the scholarly activity of the staff of the Research Centre for Scandinavian Studies (Forschungsstelle für Skandinavistik) at the University of Wrocław. It covers publications that were produced at the Institute of German Studies in the period 2010–2024 and that deal with the Scandinavian countries, their languages, and their cultures. The entries are listed in chronological-alphabetical order. The type of publication serves as an additional criterion: book titles come first, that is, monographs, academic textbooks, and dictionaries, followed by articles, reviews, translations, and other publications.

Germanic languages. Scandinavian languages, German literature
DOAJ Open Access 2025
A Performative Approach to Foreign Language Teaching: Experience in Organizing Poetry Slam Competitions

Maria V. Petrova

Background. Modern education strives to develop well-rounded individuals capable of effective interaction in multicultural environments, rather than focusing solely on specialized knowledge. In foreign language teaching methodologies, performative approaches are gaining increasing relevance. These approaches allow students to immerse themselves more deeply in the language environment through creative forms of expression, activating speech activity and enriching cultural awareness. Poetry competitions in the format of poetry slams in the target language represent a performative practice that combines artistic creativity, practical application of language skills, and interactive competition. Objectives. This includes identifying the strengths and weaknesses of the poetry slam format, motivational factors and barriers to student participation, as well as ways to improve the effectiveness of the technique in developing language skills, creativity and motivation. Methods. The research methodology includes a qualitative analysis combining a theoretical review of the literature on performative didactics and the history of poetry slam with an empirical study based on a poetry tournament in German among students of the Faculty of Journalism of Moscow State University in November-December 2024. Study Participants. The data were collected through three anonymous online questionnaires (Google Forms) using mixed methods: rating scales, multiple choice, open-ended questions. The sample included 16 participants of the tournament, 27 non-participating students and 4 teachers. Results. The analysis showed that the performative technique of poetry slam is effective for developing language skills, combining practical performances with creative and analytical work on texts. However, for maximum effectiveness, it is necessary to adapt the format to the level of students’ training and develop a clear methodology, including objective criteria for evaluating performances. 
A subjective assessment based solely on the viewer’s vote proved insufficient. As part of the study, evaluation criteria were developed. They are proposed to be combined with traditional viewer voting to preserve the authenticity of the format. Conclusions. The Poetry Slam demonstrates significant didactic potential for the development of creative thinking, working with text and public communication. Nevertheless, the study revealed the need for additional language training for students, especially in the field of stylistics and phonetics. For the successful integration of the Poetry Slam into the educational process, a systematic approach is needed: the development of methodological recommendations along with the use of a developed performance evaluation system, combining objective criteria with subjective audience voting. This approach preserves the authenticity of the format while minimizing subjectivity and enhancing the didactic component.

Education (General), Theory and practice of education
arXiv Open Access 2024
SpeakGer: A meta-data enriched speech corpus of German state and federal parliaments

Kai-Robin Lange, Carsten Jentsch

The application of natural language processing on political texts as well as speeches has become increasingly relevant in political sciences due to the ability to analyze large text corpora which cannot be read by a single person. But such text corpora often lack critical meta information, detailing for instance the party, age or constituency of the speaker, that can be used to provide an analysis tailored to more fine-grained research questions. To enable researchers to answer such questions with quantitative approaches such as natural language processing, we provide the SpeakGer data set, consisting of German parliament debates from all 16 federal states of Germany as well as the German Bundestag from 1947-2023, split into a total of 10,806,105 speeches. This data set includes rich meta data in the form of information on both reactions from the audience towards the speech as well as information about the speaker's party, their age, their constituency and their party's political alignment, which enables a deeper analysis. We further provide three exploratory analyses, detailing topic shares of different parties throughout time, a descriptive analysis of the development of the age of an average speaker as well as a sentiment analysis of speeches of different parties with regard to the COVID-19 pandemic.

en cs.CL
arXiv Open Access 2024
Revisiting the Phenomenon of Syntactic Complexity Convergence on German Dialogue Data

Yu Wang, Hendrik Buschmeier

We revisit the phenomenon of syntactic complexity convergence in conversational interaction, originally found for English dialogue, which has theoretical implications for dialogical concepts such as mutual understanding. We use a modified metric to quantify syntactic complexity based on dependency parsing. The results show that syntactic complexity convergence can be statistically confirmed in one of three selected German datasets that were analysed. Given that the dataset which shows such convergence is much larger than the other two selected datasets, the empirical results indicate a certain degree of linguistic generality of syntactic complexity convergence in conversational interaction. We also found a different type of syntactic complexity convergence in one of the datasets, although further investigation is still necessary.

en cs.CL
arXiv Open Access 2024
OMoS-QA: A Dataset for Cross-Lingual Extractive Question Answering in a German Migration Context

Steffen Kleinle, Jakob Prange, Annemarie Friedrich

When immigrating to a new country, it is easy to feel overwhelmed by the need to obtain information on financial support, housing, schooling, language courses, and other issues. If relocation is rushed or even forced, the necessity for high-quality answers to such questions is all the more urgent. Official immigration counselors are usually overbooked, and online systems could guide newcomers to the requested information or a suitable counseling service. To this end, we present OMoS-QA, a dataset of German and English questions paired with relevant trustworthy documents and manually annotated answers, specifically tailored to this scenario. Questions are automatically generated with an open-source large language model (LLM) and answer sentences are selected by crowd workers with high agreement. With our data, we conduct a comparison of 5 pretrained LLMs on the task of extractive question answering (QA) in German and English. Across all models and both languages, we find high precision and low-to-mid recall in selecting answer sentences, which is a favorable trade-off to avoid misleading users. This performance even holds up when the question language does not match the document language. When it comes to identifying unanswerable questions given a context, there are larger differences between the two languages.

en cs.CL
arXiv Open Access 2023
Founding a mathematical diffusion model in linguistics. The case study of German syntactic features in the North-Eastern Italian dialects

I. Lazzizzera

The initial motivation for this work was the linguistic case of the spread of Germanic syntactic features into Romance dialects of North-Eastern Italy, which occurred after the immigration of German people to Tyrol during the High Middle Ages. To obtain a representation of the data over the territory suitable for a mathematical formulation, an interactive map is produced as a first step, using tools of what is called Geographic Data Science. A smooth two-dimensional surface G is introduced, expressing locally which fraction of territory uses a given German language feature: it is obtained by a piecewise cubic curvature minimizing interpolant of the discrete function that says if at any surveyed locality that feature is used or not. This surface G is thought of as the value at the present time of a function describing a diffusion-convection phenomenon in two dimensions (here called tidal mode), which is subjected in a very natural way to the same equation used in physics, introducing a contextual diffusivity concept: it is shown that with two different assumptions about diffusivity, solutions of this equation, evaluated at the present time, fit well with the data interpolated by G, thus providing two convincing different pictures of diffusion-convection in the case under study, albeit with simplifications and approximations. Very importantly, it is shown that the linguistic diffusion model known to linguists as Schmidt waves can be counted among the solutions of the diffusion equation.

arXiv Open Access 2023
German CheXpert Chest X-ray Radiology Report Labeler

Alessandro Wollek, Sardi Hyska, Thomas Sedlmeyr et al.

This study aimed to develop an algorithm to automatically extract annotations for chest X-ray classification models from German thoracic radiology reports. An automatic label extraction model was designed based on the CheXpert architecture, and a web-based annotation interface was created for iterative improvements. Results showed that automated label extraction can reduce time spent on manual labeling and improve overall modeling performance. The model trained on automatically extracted labels performed competitively to manually labeled data and strongly outperformed the model trained on publicly available data.

en cs.CL
arXiv Open Access 2023
Preliminary Results of a Scientometric Analysis of the German Information Retrieval Community 2020-2023

Philipp Schaer, Svetlana Myshkina, Jüri Keller

The German Information Retrieval community is located in two different sub-fields: Information and computer science. There are no current studies that investigate these communities on a scientometric level. Available studies only focus on the information scientific part of the community. We generated a data set of 401 recent IR-related publications extracted from six core IR conferences from a mainly computer scientific background. We analyze this data set at the institutional and researcher level. The data set is publicly released, and we also demonstrate a mapping use case.

en cs.IR, cs.DL
DOAJ Open Access 2023
WORD-FORMATION IN AUSTRIAN STANDARD GERMAN IN THE CONTEXT OF LEXEME “KRAUT”

Minara A. Radovich, Yana V. Lazareva

Despite numerous studies, the aspect of word formation of nouns in the modern literary German language of Germany and Austria is yet to be fully covered. The purpose of this study is to determine patterns in the word-formation of substantives in modern Austrian Standard German using the example of the lexeme “Kraut” (cabbage). We used the continuous sampling method and lexicographical literature for Standard German and Austrian German. As a result of the study, universal and unique language characteristics for this variant were identified. For example, in Austrian German the lexeme is replaced by a synonymous one with identical semantics, forming derivative lexical units. Word formation occurs according to the model of attributive composition, while the addition of root morphemes takes place without a connecting element, which is typical for word formation according to the composition model in Standard German. The results obtained provide a perspective for further research in patterns of word-formation in modern Standard German and modern Austrian Standard German.

Social Sciences
S2 Open Access 2018
A Corpus for Multilingual Document Classification in Eight Languages

Holger Schwenk, Xian Li

Cross-lingual document classification aims at training a document classifier on resources in one language and transferring it to a different language without any additional resources. Several approaches have been proposed in the literature and the current best practice is to evaluate them on a subset of the Reuters Corpus Volume 2. However, this subset covers only a few languages (English, German, French and Spanish) and almost all published works focus on the transfer between English and German. In addition, we have observed that the class prior distributions differ significantly between the languages. We argue that this complicates the evaluation of the multilinguality. In this paper, we propose a new subset of the Reuters corpus with balanced class priors for eight languages. By adding Italian, Russian, Japanese and Chinese, we cover languages which are very different with respect to syntax, morphology, etc. We provide strong baselines for all language transfer directions using multilingual word and sentence embeddings respectively. Our goal is to offer a freely available framework to evaluate cross-lingual document classification, and we hope by these means to foster research in this important area.

149 citations en Computer Science
arXiv Open Access 2022
White-Box Attacks on Hate-speech BERT Classifiers in German with Explicit and Implicit Character Level Defense

Shahrukh Khan, Mahnoor Shahid, Navdeeppal Singh

In this work, we evaluate the adversarial robustness of BERT models trained on German Hate Speech datasets. We also complement our evaluation with two novel white-box character and word level attacks, thereby contributing to the range of attacks available. Furthermore, we also perform a comparison of two novel character-level defense strategies and evaluate their robustness against one another.

en cs.CL

Page 14 of 424939