Results for "Norwegian literature"

arXiv Open Access 2025

NorEval: A Norwegian Language Understanding and Generation Evaluation Benchmark

Vladislav Mikhailov, Tita Enstad, David Samuel et al.

This paper introduces NorEval, a new and comprehensive evaluation suite for large-scale standardized benchmarking of Norwegian generative language models (LMs). NorEval consists of 24 high-quality human-created datasets -- of which five are created from scratch. In contrast to existing benchmarks for Norwegian, NorEval covers a broad spectrum of task categories targeting Norwegian language understanding and generation, establishes human baselines, and focuses on both of the official written standards of the Norwegian language: Bokmål and Nynorsk. All our datasets and a collection of over 100 human-written prompts are integrated into LM Evaluation Harness, ensuring flexible and reproducible evaluation. We describe the NorEval design and present the results of benchmarking 19 open-source pre-trained and instruction-tuned LMs for Norwegian in various scenarios. Our benchmark, evaluation framework, and annotation materials are publicly available.

en cs.CL, cs.AI

arXiv Open Access 2025

Benchmarking Abstractive Summarisation: A Dataset of Human-authored Summaries of Norwegian News Articles

Samia Touileb, Vladislav Mikhailov, Marie Kroka et al.

We introduce a dataset of high-quality human-authored summaries of news articles in Norwegian. The dataset is intended for benchmarking the abstractive summarisation capabilities of generative language models. Each document in the dataset is provided with three different candidate gold-standard summaries written by native Norwegian speakers, and all summaries are provided in both of the written variants of Norwegian -- Bokmål and Nynorsk. The paper describes details on the data creation effort as well as an evaluation of existing open LLMs for Norwegian on the dataset. We also provide insights from a manual human evaluation, comparing human-authored to model-generated summaries. Our results indicate that the dataset provides a challenging LLM benchmark for Norwegian summarisation capabilities.

en cs.CL

S2 Open Access 2024

Development and Evaluation of Pre-trained Language Models for Historical Danish and Norwegian Literary Texts

Ali Al-Laith, Alexander Conroy, Jens Bjerring-Hansen et al.

We develop and evaluate the first pre-trained language models specifically tailored for historical Danish and Norwegian texts. Three models are trained on a corpus of 19th-century Danish and Norwegian literature: two directly on the corpus with no prior pre-training, and one with continued pre-training. To evaluate the models, we utilize an existing sentiment classification dataset, and additionally introduce a new annotated word sense disambiguation dataset focusing on the concept of fate. Our assessment reveals that the model employing continued pre-training outperforms the others in two downstream NLP tasks on historical texts. Specifically, we observe substantial improvement in sentiment classification and word sense disambiguation compared to models trained on contemporary texts. These results highlight the effectiveness of continued pre-training for enhancing performance across various NLP tasks in historical text analysis.

9 citations en Computer Science

DOAJ Open Access 2024

Partners in crime: Convenience case study of Norwegian publishing cartel

Petter Gottschalk

The theory of convenience addresses white-collar and corporate crime. The theory is applied in this article to a case study of Norwegian publishing houses having to pay infringement fees because of a competition act violation. Cartel members agreed and coordinated a boycott of a distribution channel. This article reviews the research literature on cartels before presenting the convenience case study. Combatting cartels is a matter of reducing the attractiveness and convenience of joining cartels. Guardianship, oversight, and controls are at the core of reducing deviance convenience. Detection is an element of oversight. However, detection is rare, as illustrated in this case by an email sent by mistake. Combatting cartels is a matter of control at the top of organizations where typically each chief executive officer (CEO) is involved. Therefore, the corporate compliance officer should never report to the CEO but rather to the chairperson of the board and to the external auditor.

Social pathology. Social and public welfare. Criminology

arXiv Open Access 2024

Whispering in Norwegian: Navigating Orthographic and Dialectic Challenges

Per E Kummervold, Javier de la Rosa, Freddy Wetjen et al.

This article introduces NB-Whisper, an adaptation of OpenAI's Whisper, specifically fine-tuned for Norwegian language Automatic Speech Recognition (ASR). We highlight its key contributions and summarise the results achieved in converting spoken Norwegian into written forms and translating other languages into Norwegian. We show that we are able to improve the Norwegian Bokmål transcription by OpenAI Whisper Large-v3 from a WER of 10.4 to 6.6 on the Fleurs Dataset and from 6.8 to 2.2 on the NST dataset.

en cs.CL

arXiv Open Access 2024

A Norwegian Approach to Downscaling

Rasmus E. Benestad

A comprehensive geoscientific downscaling model strategy is presented outlining an approach that has evolved over the last 20 years, together with an explanation for its development, its technical aspects, and evaluation scheme. This effort has resulted in an open-source and free R-based tool, 'esd', for the benefit of sharing and improving the reproducibility of the downscaling results. Furthermore, a set of new metrics was developed as an integral part of the downscaling approach which assesses model performance with an emphasis on regional information for society (RifS). These metrics involve novel ways of comparing model results with observational data and have been developed for downscaling large multi-model global climate model ensembles. This paper presents for the first time an overview of the comprehensive framework adopted by the Norwegian Meteorological Institute for downscaling aimed at supporting climate change adaptation. A literature search suggests that this comprehensive downscaling strategy and evaluation scheme are not widely used within the downscaling community. In addition, this strategy involves a new convention for storing large datasets of ensemble results that provides fast access to information and drastically saves data volume.

en physics.geo-ph

CrossRef Open Access 2024

Sexualities and Environments in the Norwegian Twentieth Century

Per Esben Svelstad

en

S2 Open Access 2023

The role of state agency in path development: a longitudinal study of two Norwegian manufacturing regions

M. Steen, H. B. Lund, Asbjørn Karlsen

The role of the state remains underdeveloped in the regional path development literature. This paper analyses how the Norwegian state via different roles (regulator, purchaser, owner, facilitator) directly and indirectly has enabled and influenced path development in two defence-related high-tech manufacturing regions in Norway since the end of the Second World War, notably by contributing to the modification of localised assets and the strategic coupling of those assets to extra-regional defence-related and civilian markets.

14 citations en

DOAJ Open Access 2023

“that which is common to us all”: Karl Ove Knausgaard as Reader of Joyce

Tarso do Amaral de Souza Cruz

In his monumental autobiographical series of novels My Struggle, acclaimed Norwegian novelist Karl Ove Knausgaard devotes a considerable number of pages to discussing James Joyce’s fictional works. In the last volume of the series – The End –, practically the entire body of Joyce’s fiction – from early works such as Stephen Hero and Dubliners to the modernist masterpieces Ulysses and Finnegans Wake – is included in a discussion on the Irish novelist’s literature. Only one among Joyce’s major works is not tackled by Knausgaard in The End: A Portrait of the Artist as a Young Man. Nonetheless, it is precisely Knausgaard who writes the preface to a celebrated Centennial edition of Joyce’s first novel in which, amidst other topics, he ponders over what he understands to be “the very essence of literature.” The article aims at highlighting some key aspects of Knausgaard’s take on Joyce’s fictional output and providing enough evidence to support the hypothesis that the Norwegian writer’s conceptualization of the literary phenomenon, including Joyce’s work, is based upon questionable essentialist premises.

History (General) and history of Europe

DOAJ Open Access 2023

Nordic Humour

Lita Lundquist

Starting from my former empirical studies but supplemented with fresh fictional “data” from Lars von Trier’s latest TV series Riget Exodus (2022), I first describe how Danes use humour in very characteristic ways, also in cross-cultural professional settings. Next, I explain not only Danish humour but all national humour with the notion of humour socialisation, which integrates and combines national humour with the national language on the one hand, and the specific national process of civilisation on the other hand. Moving to Nordic humour, I focus on how Danes and Swedes perceive each other’s humour, and then explain divergences between the humour of these two Nordic countries. These differences, I conclude, are the result mainly of differences in their respective civilising processes, while I am waiting and hoping for deeper comparative linguistic studies of the use of ‘humour warning signals’ in Danish and Swedish.

Norwegian literature

DOAJ Open Access 2023

What is a ‘rare’ language in translation? The experience of distance reading

Svetlana Yu. Bochaver, Ekaterina V. Tereshko

This article examines the perception of ‘rare’ and ‘common’ languages through literary translations. The study is based on the materials from De Bezige Bij Publishing House in the Netherlands, comparing the periods of 2010–2013 and 2020–2023. A significant increase in the role of translators is reflected in the rise of translation share in the publishing house. There is an observed growth in the number of source languages for translation, with a decrease in the proportion of English. Translations from French, Italian, German, Scandinavian languages, Portuguese, and Japanese have emerged. A comparison with the Polyandria Russian Publishing House during the period of 2020–2023 reveals common and distinct source languages. Both publishers translate literature from Danish, Finnish, and French to a similar extent. The Russian publishing house represents Norwegian and Japanese to a greater extent, while the Dutch publishing house releases more translations from German, Swedish, Turkish, and Italian. The Russian publisher also includes Icelandic, Albanian, Korean, and Croatian, while the Dutch publisher includes Hebrew, Romanian, and Portuguese. Both publishers encompass a total of 20 source languages, which is a small number compared to the global linguistic diversity. Comparing the volumes of source languages also indicates differences in preferences. Central European languages are chosen in the Netherlands, while Norwegian and Icelandic are favored in Russia. These differences may be influenced by the cost of rights to works, editorial preferences, and translator availability. The analysis results indicate that neither typological similarity between the source language and the target language, nor association with a specific language group, influences the preference for translating books from a particular language. This highlights the importance of sociocultural factors.

Philology. Linguistics

arXiv Open Access 2023

Boosting Norwegian Automatic Speech Recognition

Javier de la Rosa, Rolv-Arild Braaten, Per Egil Kummervold et al.

In this paper, we present several baselines for automatic speech recognition (ASR) models for the two official written languages in Norway: Bokmål and Nynorsk. We compare the performance of models of varying sizes and pre-training approaches on multiple Norwegian speech datasets. Additionally, we measure the performance of these models against previous state-of-the-art ASR models, as well as on out-of-domain datasets. We improve the state of the art on the Norwegian Parliamentary Speech Corpus (NPSC) from a word error rate (WER) of 17.10% to 7.60%, with models achieving 5.81% for Bokmål and 11.54% for Nynorsk. We also discuss the challenges and potential solutions for further improving ASR models for Norwegian.

en cs.CL

arXiv Open Access 2023

NorBench -- A Benchmark for Norwegian Language Models

David Samuel, Andrey Kutuzov, Samia Touileb et al.

We present NorBench: a streamlined suite of NLP tasks and probes for evaluating Norwegian language models (LMs) on standardized data splits and evaluation metrics. We also introduce a range of new Norwegian language models (both encoder and encoder-decoder based). Finally, we compare and analyze their performance, along with other existing LMs, across the different benchmark tests of NorBench.

en cs.CL

arXiv Open Access 2023

NoCoLA: The Norwegian Corpus of Linguistic Acceptability

Matias Jentoft, David Samuel

While there has been a surge of large language models for Norwegian in recent years, we lack any tool to evaluate their understanding of grammaticality. We present two new Norwegian datasets for this task. NoCoLA_class is a supervised binary classification task where the goal is to discriminate between acceptable and non-acceptable sentences. On the other hand, NoCoLA_zero is a purely diagnostic task for evaluating the grammatical judgement of a language model in a completely zero-shot manner, i.e. without any further training. In this paper, we describe both datasets in detail, show how to use them for different flavors of language models, and conduct a comparative study of the existing Norwegian language models.

en cs.CL, cs.AI

arXiv Open Access 2023

Aligning the Norwegian UD Treebank with Entity and Coreference Information

Tollef Emil Jørgensen, Andre Kåsen

This paper presents a merged collection of entity and coreference annotated data grounded in the Universal Dependencies (UD) treebanks for the two written forms of Norwegian: Bokmål and Nynorsk. The aligned and converted corpora are the Norwegian Named Entities (NorNE) and Norwegian Anaphora Resolution Corpus (NARC). While NorNE is aligned with an older version of the treebank, NARC is misaligned and requires extensive transformation from the original annotations to the UD structure and CoNLL-U format. We here demonstrate the conversion and alignment processes, along with an analysis of discovered issues and errors in the data - some of which include data split overlaps in the original treebank. These procedures and the developed system may prove helpful for future corpus alignment and coreference annotation endeavors. The merged corpora comprise the first Norwegian UD treebank enriched with named entities and coreference information.

en cs.CL

arXiv Open Access 2023

NorQuAD: Norwegian Question Answering Dataset

Sardana Ivanova, Fredrik Aas Andreassen, Matias Jentoft et al.

In this paper we present NorQuAD: the first Norwegian question answering dataset for machine reading comprehension. The dataset consists of 4,752 manually created question-answer pairs. We here detail the data collection procedure and present statistics of the dataset. We also benchmark several multilingual and Norwegian monolingual language models on the dataset and compare them against human performance. The dataset will be made freely available.

en cs.CL

arXiv Open Access 2023

NLEBench+NorGLM: A Comprehensive Empirical Analysis and Benchmark Dataset for Generative Language Models in Norwegian

Peng Liu, Lemei Zhang, Terje Farup et al.

Norwegian, spoken by a population of only 5 million, is under-represented in the most impressive breakthroughs in NLP tasks. To the best of our knowledge, at the time of writing there has not yet been a comprehensive evaluation of the existing language models (LMs) on Norwegian generation tasks. To fill this gap, we 1) compiled the existing Norwegian dataset and pre-trained 4 Norwegian Open Language Models varying in parameter scale and architecture, collectively called NorGLM; 2) introduced a comprehensive benchmark, NLEBench, for evaluating natural language generation capabilities in Norwegian, encompassing translation and human annotation. Based on the investigation, we find that: 1) the mainstream, English-dominated LM GPT-3.5 has limited capability in understanding the Norwegian context; 2) the increase in model parameter scales demonstrates limited impact on the performance of downstream tasks when the pre-training dataset is constrained in size; 3) smaller models also demonstrate the reasoning capability through Chain-of-Thought; 4) a multi-task dataset that includes synergy tasks can be used to verify the generalizability of LLMs on natural language understanding and, meanwhile, test the interconnectedness of these NLP tasks. We share our resources and code for reproducibility under a CC BY-NC 4.0 license.

en cs.CL

arXiv Open Access 2023

AI Literature Review Suite

David A. Tovar

The process of conducting literature reviews is often time-consuming and labor-intensive. To streamline this process, I present an AI Literature Review Suite that integrates several functionalities to provide a comprehensive literature review. This tool leverages the power of open access science, large language models (LLMs) and natural language processing to enable the searching, downloading, and organizing of PDF files, as well as extracting content from articles. Semantic search queries are used for data retrieval, while text embeddings and summarization using LLMs present succinct literature reviews. Interaction with PDFs is enhanced through a user-friendly graphical user interface (GUI). The suite also features integrated programs for bibliographic organization, interaction and query, and literature review summaries. This tool presents a robust solution to automate and optimize the process of literature review in academic and industrial research.

en cs.DL, cs.AI

CrossRef Open Access 2023

Presenting Norwegian Literature in Czechoslovakia: Norwegian Literature in Czech Translations 1945–1968

Adéla Ficová

Translations contribute to spreading but also shaping of cultural memory. While the choice of titles which get to be translated is contingent on many factors which the publishers take into consideration, decision-making in totalitarian countries is fettered. In communist Czechoslovakia, the final selection of books, and therefore memories, had to meet yet another criterion which deformed the natural literary development – censorship. The article focuses on Norwegian literature which was introduced into Czech between 1945 and 1968. Norwegian literature had already had a strong position on the Czechoslovak literary market since the end of the 19th century and in the first half of the 20th century thanks to several publishing houses, translators, and the introduction of the Nobel Prize in literature. This tradition was first interrupted by WWII and shortly after again by the communist coup in 1948. Although the restrictions began loosening later, the Soviet intervention in 1968 installed the restrictions again. The object is to present and examine the image of Norwegian literature in Czech literary memory as it was shaped by the cultural policies of totalitarian Czechoslovakia; and to show and explain which type of literature could enter Czech bookshops and libraries. The focus often shifted to a specific literary genre, republishing the earlier works of the Norwegian canon, or works by authors whose work was translated into Czech although they were marginalized in Norway and did not make it into the Norwegian national canon. An important part of such a perception is not only remembering but also forgetting. The article therefore also maps the active suppressing of memories by black-listing particular authors or works. Lastly, the article is also concerned with peritexts of translation, namely introductions and afterwords, as these often contributed to mediation of the transfer.

en

S2 Open Access 2022

Patient involvement in rare diseases research: a scoping review of the literature and mixed method evaluation of Norwegian researchers’ experiences and perceptions

Gry Velvin, Thale Hartman, Trine Bathen

Background: Patient involvement (PI) in research is recognized as a valuable strategy for increasing quality, developing more targeted research, and speeding up more innovative research dissemination. Nevertheless, patient involvement in rare diseases research (PI-RDR) is scarce. The aims were to study Norwegian researchers' experiences and perceptions of PI-RDR and to review the literature on PI-RDR. Methods: 1. A systematic scoping review of the literature on PI-RDR. 2. A cross-sectional questionnaire study with close-ended and open-ended questions to investigate the researchers' experiences. Results: In the scoping review, 608 articles were read in full text and 13 articles (one review and twelve primary studies) were included. The heterogeneity of the design, methodology and results was large. Most studies described several benefits of PI, but few described methods for measuring impacts and effectiveness of PI-RDR. In the cross-sectional part of this study, 145 of 251 employees working in the nine Norwegian Centers on Rare Diseases participated; of these, 69 were researchers. Most (95%) of the researchers claimed that rare diseases research is more challenging than research on the more common diseases. The majority (95%) argued that PI-RDR may increase the quality of the studies and the relevance, and most (89%) agreed that PI-RDR in dissemination may increase the awareness and public interest for rare diseases. In the open-ended questions, several researchers also reported challenges related to PI-RDR, and many had proposals for improving PI and promotion of rare disease research. Conclusion: Both the literature and researchers emphasized that PI-RDR is important for improving research quality and increasing public attention on rare diseases, but what constitutes effective PI-RDR still remains unclear. More research on the design, methodology and assessment for measuring the impact of PI-RDR is warranted.

13 citations en Medicine
