Results for "Translating and interpreting"

Showing 20 of ~137772 results · from DOAJ, arXiv, CrossRef, Semantic Scholar

CrossRef Open Access 2026
Translating (in) the Public Service – When interpreting facilitates migrants’ understanding of the institutional context

Daniele Urlotti

In contexts where migrants need to access public services, intercultural mediators have been shown to make autonomous choices while providing their interpreting services. Thus, they do not act merely as translators but also as coordinators of the exchanges. One of the problems potentially arising in this type of context is that migrants might struggle to correctly understand how institutions function from an administrative or organizational viewpoint. When service providers do not compensate for this lack of understanding, intercultural mediators are observed to do so in a variety of ways. Based on audio-recorded mediator-interpreted gynaecological visits and parent-teacher meetings gathered in the North of Italy, this study aims to shed light on this phenomenon, investigating the types of strategies that intercultural mediators employ to help migrants understand the functioning of institutions and the operations needed to achieve their aims. For the analysis two methodologies have been applied, Conversation Analysis and Wadensjö’s taxonomy of interpreters’ renditions. The results of the analysis show that intercultural mediators orient to employing three types of strategies, from rephrasing specialised lexis to expanding service providers’ utterances, and integrating additional explanations. These practices prove useful to prevent potential inconveniences for migrants using public services, and to reach interactional success between migrants and the institutions.

arXiv Open Access 2025
A U-Net and Transformer Pipeline for Multilingual Image Translation

Siddharth Sahay, Radhika Agarwal

This paper presents an end-to-end multilingual translation pipeline that integrates a custom U-Net for text detection, the Tesseract engine for text recognition, and a from-scratch sequence-to-sequence (Seq2Seq) Transformer for Neural Machine Translation (NMT). Our approach first utilizes a U-Net model, trained on a synthetic dataset, to accurately segment and detect text regions from an image. These detected regions are then processed by Tesseract to extract the source text. This extracted text is fed into a custom Transformer model trained from scratch on a multilingual parallel corpus spanning 5 languages. Unlike systems reliant on monolithic pre-trained models, our architecture emphasizes full customization and adaptability. The system is evaluated on its text detection accuracy, text recognition quality, and translation performance via BLEU scores. The complete pipeline demonstrates promising results, validating the viability of a custom-built system for translating text directly from images.

en cs.LG, cs.CL
arXiv Open Access 2025
From TOWER to SPIRE: Adding the Speech Modality to a Translation-Specialist LLM

Kshitij Ambilduke, Ben Peters, Sonal Sannigrahi et al.

We introduce Spire, a speech-augmented language model (LM) capable of both translating and transcribing speech input from English into 10 other languages as well as translating text input in both language directions. Spire integrates the speech modality into an existing multilingual LM via speech discretization and continued pre-training using only 42.5K hours of speech. In particular, we adopt the pretraining framework of multilingual LMs and treat discretized speech input as an additional translation language. This approach not only equips the model with speech capabilities, but also preserves its strong text-based performance. We achieve this using significantly less data than existing speech LMs, demonstrating that discretized speech input integration as an additional language is feasible during LM adaptation. We make our code and models available to the community.

en cs.CL
arXiv Open Access 2025
Translating the Grievance Dictionary: a psychometric evaluation of Dutch, German, and Italian versions

Isabelle van der Vegt, Bennett Kleinberg, Marilu Miotto et al.

This paper introduces and evaluates three translations of the Grievance Dictionary, a psycholinguistic dictionary for the analysis of violent, threatening or grievance-fuelled texts. Considering the relevance of these themes in languages beyond English, we translated the Grievance Dictionary to Dutch, German, and Italian. We describe the process of automated translation supplemented by human annotation. Psychometric analyses are performed, including internal reliability of dictionary categories and correlations with the LIWC dictionary. The Dutch and German translations perform similarly to the original English version, whereas the Italian dictionary shows low reliability for some categories. Finally, we make suggestions for further validation and application of the dictionary, as well as for future dictionary translations following a similar approach.

en cs.CL
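
The Grievance Dictionary abstract above mentions checking the internal reliability of dictionary categories. The abstract does not name the statistic used, so the following is only a minimal sketch that assumes Cronbach's alpha, a common choice for this kind of check. The function name `cronbach_alpha` and the toy scores are hypothetical and are not taken from the paper.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one dictionary category (assumed statistic).

    item_scores: list of items (e.g. words in a category), each a list of
    per-document scores; all items must cover the same documents.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    n_docs = len(item_scores[0])
    if any(len(item) != n_docs for item in item_scores):
        raise ValueError("all items must have the same number of scores")

    # Total score per document: sum over all items in the category.
    totals = [sum(item[d] for item in item_scores) for d in range(n_docs)]
    total_var = pvariance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    item_var_sum = sum(pvariance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Toy data (hypothetical): two words scored over four documents.
toy_category = [[1, 2, 3, 4], [2, 1, 4, 3]]
print(round(cronbach_alpha(toy_category), 3))
```

Values closer to 1 indicate that the items in a category vary together; a low value for a translated category would be one possible sign of the kind of reliability problem the abstract reports for some Italian categories.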
CrossRef Open Access 2025
Reactivating Traumatic Memory through Paratexts: Translating Titles and Covers of The Rape of Nanking

Luomei Cui

Translation creates a connection that allows present-day individuals to directly engage with and immerse themselves in the memory of the past, fostering a tangible link between the present and historical events. This study examines Iris Chang’s The Rape of Nanking (1997) and its four Chinese translations to explore how the translations recreate traumatic memories through the paratextual perspective. Iris Chang’s work reveals Japan’s invasion of China during the Second Sino-Japanese War (1937-1945), which stored the memory of the decades of pain and suffering it brought to the people of Nanjing and even to all Chinese people. The study found that the four Chinese translations evoked contemporary people’s memories of Nanjing through paratextual elements such as the cover and title. This research is embedded within the interdisciplinary frameworks of memory studies and translation studies. The research aims to achieve three interrelated goals: first, to evoke national collective memory among contemporary Chinese readers; second, to deepen public understanding of this traumatic historical period; and third, to inspire patriotic sentiment and foster a heightened awareness of peace. The study not only sheds light on the cultural and emotional impact of translation and retranslation, but also contributes to advancing theoretical and practical insights within both memory studies and translation studies.

arXiv Open Access 2024
Fine-Grained and Multi-Dimensional Metrics for Document-Level Machine Translation

Yirong Sun, Dawei Zhu, Yanjun Chen et al.

Large language models (LLMs) have excelled in various NLP tasks, including machine translation (MT), yet most studies focus on sentence-level translation. This work investigates the inherent capability of instruction-tuned LLMs for document-level translation (docMT). Unlike prior approaches that require specialized techniques, we evaluate LLMs by directly prompting them to translate entire documents in a single pass. Our results show that this method improves translation quality compared to translating sentences separately, even without document-level fine-tuning. However, this advantage is not reflected in BLEU scores, which often favor sentence-based translations. We propose using the LLM-as-a-judge paradigm for evaluation, where GPT-4 is used to assess document coherence, accuracy, and fluency in a more nuanced way than n-gram-based metrics. Overall, our work demonstrates that instruction-tuned LLMs can effectively leverage document context for translation. However, we caution against using BLEU scores for evaluating docMT, as they often provide misleading outcomes, failing to capture the quality of document-level translation. Code and the outputs from GPT4-as-a-judge are available at https://github.com/EIT-NLP/BLEUless_DocMT

en cs.CL, cs.AI
arXiv Open Access 2024
EuSQuAD: Automatically Translated and Aligned SQuAD2.0 for Basque

Aitor García-Pablos, Naiara Perez, Montse Cuadros et al.

The widespread availability of Question Answering (QA) datasets in English has greatly facilitated the advancement of the Natural Language Processing (NLP) field. However, the scarcity of such resources for minority languages, such as Basque, poses a substantial challenge for these communities. In this context, the translation and alignment of existing QA datasets plays a crucial role in narrowing this technological gap. This work presents EuSQuAD, the first initiative dedicated to automatically translating and aligning SQuAD2.0 into Basque, resulting in more than 142k QA examples. We demonstrate EuSQuAD's value through extensive qualitative analysis and QA experiments supported with EuSQuAD as training data. These experiments are evaluated with a new human-annotated dataset.

en cs.CL
CrossRef Open Access 2024
SIMULTANEOUS INTERPRETING STRATEGIES FOR TRANSLATING FACEBOOK CEO’S INTERVIEW ON HBO’S AXIOS FROM ENGLISH TO INDONESIAN

Okta Viani Dwi Lestari, Ramadan Adianto Budiman

This study analyzes simultaneous translation strategies in Facebook CEO Mark Zuckerberg’s interview on Axios HBO from English to Indonesian. Using a descriptive qualitative approach, this study identifies various strategies applied by the interpreter, including paraphrasing, transposition, explication, adaptation, and modulation. Data collection methods include literature study, note-taking, reading, and observation, while data analysis utilizes Gile’s Simultaneous Interpretation theory and Kopczynski’s strategies. The results show that the success of simultaneous translation depends not only on linguistic ability but also on the understanding of technology and contemporary social issues, as well as the ability to manage technical terms and political nuances. This research contributes to the field of simultaneous translation studies in the context of technology and media. This research provides insights into practical approaches for interpreters in media settings, with broader implications for simultaneous interpretation in technology-related contexts.

CrossRef Open Access 2023
Translating Interviews, interpreting lives: bi-lingual research analysis informing less westernised views of international student mobility

Zhao Qun, Neil Carey

There are increasing instances in which researchers study their migrant co-nationals in one language but report their research findings in another language. This raises significant issues regarding the mediating role played by bilingual researcher-translators when translating research data: the decisions they make when bringing the Other’s world to their readers, and the strategies they adopt when making such decisions. These issues of data translation, as well as the unique experiences of the researcher-translators and the valuable knowledge that they generate from this process, have not yet been given adequate attention in the academic literature. In response, this article explores a translation analysis which allows the researcher-translator to reflect in detail on the methodological challenges that researcher-translators are likely to encounter. Introducing Poblete’s five operations of translation, we highlight the processes that the researcher-translator adopts in recognising, reflecting and negotiating with the (un)translatability of culturally embedded linguistic expression. Focusing on International Student Mobility (ISM) as a particular instance of research translation/analysis as cultural mediation, we demonstrate how our intention to attune to students’ own ISM journey in their own language reverberates in the mediation and interventions the researcher-translator conducted through the translation analysis. The article thus emphasises how translating interview scripts as part of the research is more than seeking linguistic correspondence; it is also about understanding non-western lives and life-words through a second language.

11 citations en
arXiv Open Access 2023
How To Build Competitive Multi-gender Speech Translation Models For Controlling Speaker Gender Translation

Marco Gaido, Dennis Fucci, Matteo Negri et al.

When translating from notional gender languages (e.g., English) into grammatical gender languages (e.g., Italian), the generated translation requires explicit gender assignments for various words, including those referring to the speaker. When the source sentence does not convey the speaker's gender, speech translation (ST) models either rely on the possibly-misleading vocal traits of the speaker or default to the masculine gender, the most frequent in existing training corpora. To avoid such biased and not inclusive behaviors, the gender assignment of speaker-related expressions should be guided by externally-provided metadata about the speaker's gender. While previous work has shown that the most effective solution is represented by separate, dedicated gender-specific models, the goal of this paper is to achieve the same results by integrating the speaker's gender metadata into a single "multi-gender" neural ST model, easier to maintain. Our experiments demonstrate that a single multi-gender model outperforms gender-specialized ones when trained from scratch (with gender accuracy gains up to 12.9 for feminine forms), while fine-tuning from existing ST models does not lead to competitive results.

en cs.CL
arXiv Open Access 2023
Translating away Translationese without Parallel Data

Rricha Jalota, Koel Dutta Chowdhury, Cristina España-Bonet et al.

Translated texts exhibit systematic linguistic differences compared to original texts in the same language, and these differences are referred to as translationese. Translationese has effects on various cross-lingual natural language processing tasks, potentially leading to biased results. In this paper, we explore a novel approach to reduce translationese in translated texts: translation-based style transfer. As there are no parallel human-translated and original data in the same language, we use a self-supervised approach that can learn from comparable (rather than parallel) mono-lingual original and translated data. However, even this self-supervised approach requires some parallel data for validation. We show how we can eliminate the need for parallel validation data by combining the self-supervised loss with an unsupervised loss. This unsupervised loss leverages the original language model loss over the style-transferred output and a semantic similarity loss between the input and style-transferred output. We evaluate our approach in terms of original vs. translationese binary classification in addition to measuring content preservation and target-style fluency. The results show that our approach is able to reduce translationese classifier accuracy to a level of a random classifier after style transfer while adequately preserving the content and fluency in the target original style.

en cs.CL
arXiv Open Access 2023
Automatically Testing Functional Properties of Code Translation Models

Hasan Ferit Eniser, Valentin Wüstholz, Maria Christakis

Large language models are becoming increasingly practical for translating code across programming languages, a process known as transpiling. Even though automated transpilation significantly boosts developer productivity, a key concern is whether the generated code is correct. Existing work initially used manually crafted test suites to test the translations of a small corpus of programs; these test suites were later automated. In contrast, we devise the first approach for automated, functional, property-based testing of code translation models. Our general, user-provided specifications about the transpiled code capture a range of properties, from purely syntactic to purely semantic ones. As shown by our experiments, this approach is very effective in detecting property violations in popular code translation models, and therefore, in evaluating model quality with respect to given properties. We also go a step further and explore the usage scenario where a user simply aims to obtain a correct translation of some code with respect to certain properties without necessarily being concerned about the overall quality of the model. To this purpose, we develop the first property-guided search procedure for code translation models, where a model is repeatedly queried with slightly different parameters to produce alternative and potentially more correct translations. Our results show that this search procedure helps to obtain significantly better code translations.

en cs.SE, cs.LG
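
The code-translation abstract above describes property-based testing of translated programs. As a rough illustration only, and not the authors' tool, the sketch below checks a single semantic property (the translation returns the same output as the original on random inputs). All function names here are hypothetical, and the "translations" are hand-written stand-ins for model output.

```python
import random


def find_counterexample(reference, candidate, gen_input, trials=200, seed=0):
    """Return an input on which candidate disagrees with reference, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = gen_input(rng)
        try:
            agrees = reference(x) == candidate(x)
        except Exception:
            # Any exception during the comparison counts as a violation.
            agrees = False
        if not agrees:
            return x
    return None


def reference_abs_sum(xs):
    """Original program: sum of absolute values."""
    return sum(abs(x) for x in xs)


def good_translation(xs):
    """Stand-in for a correct translation."""
    total = 0
    for x in xs:
        total += x if x >= 0 else -x
    return total


def buggy_translation(xs):
    """Stand-in for a faulty translation that drops the abs() call."""
    total = 0
    for x in xs:
        total += x
    return total


def gen_int_list(rng):
    """Random list of 0 to 5 integers in [-10, 10]."""
    return [rng.randint(-10, 10) for _ in range(rng.randint(0, 5))]


print(find_counterexample(reference_abs_sum, good_translation, gen_int_list))
print(find_counterexample(reference_abs_sum, buggy_translation, gen_int_list))
```

Output equivalence is only one possible property; the abstract describes user-provided specifications ranging from purely syntactic to purely semantic, which this sketch does not attempt to cover.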
CrossRef Open Access 2023
Interpreting and Translating Transnational Activism

Johanna Gehmacher

This chapter takes the case of the International Woman Suffrage Alliance (IWSA) to discuss the relevance of translation and interpretation for transnational political movements. It asks how the IWSA dealt with the diversity of languages among their delegates and analyses the (at times entangled) roles of the interpreters and journalists in the context of the regular transnational conferences and on the board of the organisation. The chapter gauges how translation was used to facilitate the transfer of political arguments, but also helped to reinforce hierarchies of cultures and languages.

S2 Open Access 2022
A call for community-informed translation

R. Attig

This article considers the Spanish and French translations of nonbinary pronouns in Netflix’s One Day at a Time, a social-justice-oriented sitcom. The article compares the source text with six parallel translations taken from one episode and isolates two main translation strategies. In the first strategy, translators rely on calque translations from English that demonstrate a misunderstanding of the source text. The second strategy shows an active engagement on the part of translators with Hispanic and Francophone Queer communities, replicating authentic Queer language practices. The article goes on to describe the implications of both strategies on reception and outlines several reasons why community-informed translation should be established as a best practice for Queer-oriented texts.

DOAJ Open Access 2022
Machine translation in everyone’s hands - Adoption and changes among general users of MT

David Orrego-Carmona

Over the 20 years of the Revista Tradumàtica, we have seen machine translation become part of the everyday lives of its regular users. Drawing on 17 responses, this article reflects on the use of MT among people who are not translation professionals. After discussing the use of MT as a dictionary, to read the news, to access information, or to produce texts in situations that users perceive as low- or high-risk, the article delves into users' awareness of MT accuracy and of the need to engage with the output to improve translation quality. The results also indicate that MT use affects not only production in the target language but also the drafting of the source texts intended for translation. Based on the responses, the article analyzes the impact of MT in terms of accessibility and democratization, reviewing how MT and AI have the potential to support social change but also to deepen inequality, reproduce biases, and reduce the agency of human actors. Finally, the article calls for a critical and conscious application of MT to support human-computer interaction as a tool for the development of society.

Translating and interpreting
arXiv Open Access 2022
A Theory of Unsupervised Translation Motivated by Understanding Animal Communication

Shafi Goldwasser, David F. Gruber, Adam Tauman Kalai et al.

Neural networks are capable of translating between languages -- in some cases even between two languages where there is little or no access to parallel translations, in what is known as Unsupervised Machine Translation (UMT). Given this progress, it is intriguing to ask whether machine learning tools can ultimately enable understanding animal communication, particularly that of highly intelligent animals. We propose a theoretical framework for analyzing UMT when no parallel translations are available and when it cannot be assumed that the source and target corpora address related subject domains or possess similar linguistic structure. We exemplify this theory with two stylized models of language, for which our framework provides bounds on necessary sample complexity; the bounds are formally proven and experimentally verified on synthetic data. These bounds show that the error rates are inversely related to the language complexity and amount of common ground. This suggests that unsupervised translation of animal communication may be feasible if the communication system is sufficiently complex.

en cs.CL, cs.LG

Page 26 of 6889