Michal Rubáš
Results for "Germanic languages. Scandinavian languages"
Showing 20 of ~11765 results · from DOAJ, arXiv, Semantic Scholar
Schulte Michael
Mahmoud Samir Fayed
Most visual programming languages (VPLs) are domain-specific, with few general-purpose VPLs like Programming Without Coding Technology (PWCT). These general-purpose VPLs are developed using textual programming languages, and improving them requires textual programming. In this thesis, we designed and developed PWCT2, a dual-language (Arabic/English), general-purpose, self-hosting visual programming language. Before doing so, we specifically designed a textual programming language called Ring for its development. Ring is a dynamically typed language with a lightweight implementation, offering syntax customization features. It permits the creation of domain-specific languages through new features that extend object-oriented programming, allowing for specialized languages resembling Cascading Style Sheets (CSS) or the Supernova language. The Ring Compiler and Virtual Machine are designed using the PWCT visual programming language, where the visual implementation is composed of 18,945 components that generate 24,743 lines of C code, which increases the abstraction level and hides unnecessary details. Using PWCT to develop Ring allowed us to identify several issues in PWCT, which led to the development of the PWCT2 visual programming language using the Ring textual programming language. PWCT2 provides approximately 36 times faster code generation and requires 20 times less storage for visual source files. It also allows for the conversion of Ring code into visual code, enabling the creation of a self-hosting VPL that can be developed using itself. PWCT2 consists of approximately 92,000 lines of Ring code and comes with 394 visual components. PWCT2 is distributed to many users through the Steam platform and has received positive feedback. On Steam, 1,772 users have launched the software, and the total recorded usage time exceeds 17,000 hours, encouraging further research and development.
Brian DeRenzi, Anna Dixon, Mohamed Aymane Farhi et al.
Speech technology remains out of reach for most of the over 2300 languages in Africa. We present the first systematic assessment of large-scale synthetic voice corpora for African ASR. We apply a three-step process: LLM-driven text creation, TTS voice synthesis, and ASR fine-tuning. Eight out of ten languages for which we create synthetic text achieved readability scores above 5 out of 7. We evaluated ASR improvement for three (Hausa, Dholuo, Chichewa) and created more than 2,500 hours of synthetic voice data at below 1% of the cost of real data. Fine-tuned Wav2Vec-BERT-2.0 models trained on 250h real and 250h synthetic Hausa matched a 500h real-data-only baseline, while 579h real and 450h to 993h synthetic data created the best performance. We also present gender-disaggregated ASR performance evaluation. For very low-resource languages, gains varied: Chichewa WER improved about 6.5% relative with a 1:2 real-to-synthetic ratio; a 1:1 ratio for Dholuo showed similar improvements on some evaluation data, but not on others. Investigating intercoder reliability, ASR errors and evaluation datasets revealed the need for more robust reviewer protocols and more accurate evaluation data. All data and models are publicly released to invite further work to improve synthetic data for African languages.
Boqi Chen, Ou Wei, Bingzhou Zheng et al.
Graph model generation from natural language description is an important task with many applications in software engineering. With the rise of large language models (LLMs), there is a growing interest in using LLMs for graph model generation. Nevertheless, LLM-based graph model generation typically produces partially correct models that suffer from three main issues: (1) syntax violations: the generated model may not adhere to the syntax defined by its metamodel, (2) constraint inconsistencies: the structure of the model might not conform to some domain-specific constraints, and (3) inaccuracy: due to the inherent uncertainty in LLMs, the models can include inaccurate, hallucinated elements. While the first issue is often addressed through techniques such as constraint decoding or filtering, the latter two remain largely unaddressed. Motivated by recent self-consistency approaches in LLMs, we propose a novel abstraction-concretization framework that enhances the consistency and quality of generated graph models by considering multiple outputs from an LLM. Our approach first constructs a probabilistic partial model that aggregates all candidate outputs and then refines this partial model into the most appropriate concrete model that satisfies all constraints. We evaluate our framework on several popular open-source and closed-source LLMs using diverse datasets for model generation tasks. The results demonstrate that our approach significantly improves both the consistency and quality of the generated graph models.
William J. Bowman
We present the design and implementation of a macro-embedding of a family of compiler intermediate languages, from a Scheme-like language to x86-64, into Racket. This embedding is used as part of a testing framework for a compilers course to derive interpreters for all the intermediate languages. The embedding implements features including safe, functional abstractions as well as unsafe assembly features, and the interactions between the two at various intermediate stages. This paper aims to demonstrate language-oriented techniques and abstractions for implementing (1) a large family of languages and (2) interoperability between low- and high-level languages. The primary strength of this approach is the high degree of code reuse and interoperability compared to implementing each interpreter separately. The design emphasizes modularity and compositionality of an open set of language features by local macro expansion into a single host language, rather than implementing a language pre-defined by a closed set of features. This enables reuse from both the host language (Racket) and between intermediate languages, and enables interoperability between high- and low-level features, simplifying development of the intermediate language semantics. It also facilitates extending or redefining individual language features in intermediate languages, and exposing multiple interfaces to the embedded languages.
Ruben Becker, Giuseppa Castiglione, Giovanna D'Agostino et al.
The notion of Wheeler languages is rooted in the Burrows-Wheeler transform (BWT), one of the most central concepts in data compression and indexing. The BWT has been generalized to finite automata, the so-called Wheeler automata, by Gagie et al. [Theor. Comput. Sci. 2017]. Wheeler languages have subsequently been defined as the class of regular languages for which there exists a Wheeler automaton accepting them. Besides their advantages in data indexing, these Wheeler languages also satisfy many interesting properties from a language-theoretic point of view [Alanko et al., Inf. Comput. 2021]. A characteristic yet unsatisfying feature of Wheeler languages, however, is that their definition depends on a fixed order of the alphabet. In this paper we introduce the Universally Wheeler languages UW, i.e., the regular languages that are Wheeler with respect to all orders of a given alphabet. Our first main contribution is to relate UW to some very well known regular language classes. We first show that the Strictly Locally Testable languages are strictly included in UW. After noticing that UW is not closed under taking the complement, we prove that the class of languages for which both the language and its complement are in UW exactly coincides with those languages that are Definite or Reverse Definite. Secondly, we prove that deciding if a regular language given by a DFA is in UW can be done in quadratic time. We also show that this is optimal unless the Strong Exponential Time Hypothesis (SETH) fails.
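As background for the abstract above, the following is a minimal sketch of the classical Burrows-Wheeler transform on a plain string, not of Wheeler automata or of anything from the paper itself. It uses the textbook sorted-rotations construction. The function names `bwt` and `inverse_bwt`, the `order` parameter, and the `$` end marker are illustrative assumptions. The `order` argument is there only to show that the result depends on the chosen alphabet order, which is the dependence the abstract describes.

```python
def bwt(text, order=None, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations (quadratic, for illustration).

    `order` is a hypothetical parameter: a string listing every symbol
    (including the sentinel) from smallest to largest.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    if order is None:
        # Default alphabet order: plain code point order.
        order = "".join(sorted(set(s)))
    rank = {ch: i for i, ch in enumerate(order)}
    rotations = [s[i:] + s[:i] for i in range(len(s))]
    # Sort rotations lexicographically with respect to the chosen order.
    rotations.sort(key=lambda rot: [rank[ch] for ch in rot])
    # The transform is the last column of the sorted rotation table.
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(transformed, sentinel="$"):
    """Invert a transform produced with the default (code point) order."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        # Prepend the transform as a new first column, then re-sort the rows.
        table = sorted(ch + row for ch, row in zip(transformed, table))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    print(bwt("banana"))                # default order: $ < a < b < n
    print(bwt("banana", order="$nba"))  # a different alphabet order
    print(inverse_bwt(bwt("banana")))
```

Running the sketch with two different alphabet orders on the same input produces different transforms, which is why a class defined "with respect to all orders" is a strictly stronger requirement than one defined for a single fixed order.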
Andrew Kostakis
Corina-Emanuela Margarit
The study of Romance-Germanic languages and their linguistic and literary interactions is essential for understanding the evolution of European language and literature. The Romance and Germanic languages, as two fundamental branches of the Indo-European family, have profoundly influenced the formation of cultural and linguistic identities in Europe. This paper aims to explore recent trends in the linguistic and literary studies of these languages, analyzing how new theories and practices in comparative linguistics and literature contribute to the understanding of the connections between Romance-Germanic languages. In an era of rapid globalization and digitalization, interactions and borrowings between these languages are constantly evolving, leading to a reevaluation of research methods and traditional approaches in the study of language and literature.
Anna Maciejewska
The “Declaration of the letter of the Kraków Voivode, delivered among the people in Stężyca”, the authorship of which is attributed to Szczęsny Kryski, states the following: “If you so chastise Machiavel’s teachings that one ought to reign judiciously, I do not know if it will please you should the noble-born rise up against the king and the peasants against the noble-born, and, having slaughtered the noble-born, the peasants themselves should rule, as had unfolded in Holland; for such ‘Regestra’ were left by William to Maurice”. The passage refers to William I of Orange and the Dutch Revolt (1568–1648). During this insurrection, an armed conflict broke out between the Protestant population of the Low Countries and Spain. The conflict was widely considered as a religious war. In my article, I will analyse why Szczęsny Kryski, in the “Declaration of the letter of the Kraków Voivode, delivered among the people in Stężyca” (approx. 1606–1607), alluded to the Eighty Years’ War and William I of Orange, referring to Niccolò Machiavelli. I will also show why this Dutch Revolt was generally condemned in the Polish-Lithuanian Commonwealth.
Yate Ge, Yi Dai, Run Shan et al.
End-user development allows everyday users to tailor service robots or applications to their needs. One user-friendly approach is natural language programming. However, it encounters challenges such as an expansive user expression space and limited support for debugging and editing, which restrict its application in end-user programming. The emergence of large language models (LLMs) offers promising avenues for the translation and interpretation between human language instructions and the code executed by robots, but their application in end-user programming systems requires further study. We introduce Cocobo, a natural language programming system with interactive diagrams powered by LLMs. Cocobo employs LLMs to understand users' authoring intentions, generate and explain robot programs, and facilitate the conversion between executable code and flowchart representations. Our user study shows that Cocobo has a low learning curve, enabling even users with zero coding experience to customize robot programs successfully.
Leif Andersen, Cameron Moy, Stephen Chang et al.
The dominant programming languages support only linear text to express ideas. Visual languages offer graphical representations for entire programs, when viewed with special tools. Hybrid languages, with support from existing tools, allow developers to express their ideas with a mix of textual and graphical syntax tailored to an application domain. This mix puts both kinds of syntax on equal footing and, importantly, the enriched language does not disrupt a programmer's typical workflow. This paper presents a recipe for equipping existing textual programming languages as well as accompanying IDEs with a mechanism for creating and using graphical interactive syntax. It also presents the first hybrid language and IDE created using the recipe.
Roland Scheel
Scandinavian Studies in Germany are usually conceived of as comparative literary and cultural studies, encompassing the historical and current spaces where Northern Germanic languages were or are spoken. The article focuses on the current situation of Medieval Scandinavian Studies—one of the three branches of the discipline—in the German-speaking area, explaining their comparatively strong institutional position as a result of the long and peculiar history of the research and its entanglements with political ideology. Against this background, an overview is presented of the present research projects, and current structural and political problems, as well as challenges for the future are discussed.
Frick Karina
Leonie Helen Eckrich, Rogéria Costa Pereira
The COVID-19 pandemic made in-person teaching impossible in many places and prompted teachers and learners to explore new possibilities for online instruction. The heterogeneity of the learners in the courses of the Casa de Cultura Alemã at the Universidade Federal do Ceará (CCA), together with the online teaching carried out from August 2020 onward, also prompted us to identify new teaching methods. We found that the heterogeneity in the CCA's German courses, already considerable before the pandemic, grew further in the new teaching setting: on the one hand because of the higher degree of self-organization and self-motivation required, and on the other because of the technical circumstances of the individual participants. To do justice to this heterogeneity, we asked how internal differentiation measures can be implemented in online teaching with the support of online tools. Particularly large differences exist with regard to the participants' writing competence. This contribution therefore presents internal differentiation measures for training the skill of writing in an online format.
Rufus H. Gouws, D.J. Prinsloo
This article, the third in a series of three on lexicographic data boxes, firstly focuses on a number of aspects of data boxes in bilingual dictionaries with the emphasis on different approaches in bilingual dictionaries with an African language as one of the members of the treated language pair. It is not possible to provide a comprehensive discussion within the limitations of an article. Then the discussion proceeds by looking at some new ways of using data boxes in online dictionaries. It is shown that the possibilities of the new medium allow lexicographers to employ data boxes in both traditional and non-traditional ways. It is argued that data boxes are expected to fulfil a variety of purposes ranging from navigational information and the provision of salient information to giving access to relevant data in dictionary-internal and dictionary-external sources. Lexicographers of online dictionaries have introduced new ways of using data boxes that have not yet been fully discussed in metalexicographic literature. This article gives an identification and a brief discussion of some of these innovative uses of data boxes. It stresses the potential that the online environment offers lexicography. Practical and theoretical lexicographers need to be aware of these possibilities and challenges. By embarking on a more comprehensive use of data boxes dictionaries can become even better containers of knowledge and can serve their users in an optimal way.
Jiřina Malá
Caleb Helbling, Fırat Aksoy
The difficulty associated with storing closures in a stack-based environment is known as the funarg problem. The funarg problem was first identified with the development of Lisp in the 1970s and hasn't received much attention since then. The modern solution taken by most languages is to allocate closures on the heap, or to apply static analysis to determine when closures can be stack allocated. This is not a problem for most computing systems as there is an abundance of memory. However, embedded systems often have limited memory resources where heap allocation may cause memory fragmentation. We present a simple extension to the prenex fragment of System F that allows closures to be stack-allocated. We demonstrate a concrete implementation of this system in the Juniper functional reactive programming language, which is designed to run on extremely resource limited Arduino devices. We also discuss other solutions present in other programming languages that solve the funarg problem but haven't been formally discussed in the literature.
Dieter Merlin
Abstract (English): Auditory dimensions of film reception, discussed using the example of selected scenes from Don't Come Knocking (2005) and The Lone Ranger (2013; TV 1949). Teaching film has a firm place in German lessons at public schools, even if there are regional differences in this regard. The oral and written analysis of film scenes, as well as the viewing of film excerpts as a starting point for fostering classical process-oriented skills, is widely accepted, for example as a basis for drafting various informational or argumentative texts, as research background when planning a speech or a contribution to a discussion, or as a key moment for initiating a creative and participative approach to the medium of film. However, a closer look at the auditory structures that significantly influence the film experience and, thus, the possibilities of film analysis and interpretation is often neglected. Therefore, this article focuses on those categories of cinematic sound analysis that can be applied to film material without any special musical education. Using this concept in German lessons will enable learners to make more far-reaching statements about the text-immanent and context-dependent semantics of film scenes. Examples have been taken from three western films because many students are already familiar with the dramaturgy of this genre.
Important note: This article contains several film clips that can be activated and viewed only after downloading the article and opening it in certain readers (the MiDU editorial team recommends Adobe Acrobat Reader). Because of the size of the article file (approx. 50 MB), longer loading and download times should be expected.
Ryoma Sin'ya
This paper investigates a new property of formal languages called REG-measurability where REG is the class of regular languages. Intuitively, a language \(L\) is REG-measurable if there exists an infinite sequence of regular languages that "converges" to \(L\). A language without REG-measurability has a complex shape in some sense so that it can not be (asymptotically) approximated by regular languages. We show that several context-free languages are REG-measurable (including languages with transcendental generating function and transcendental density, in particular), while a certain simple deterministic context-free language and the set of primitive words are REG-immeasurable in a strong sense.