Results for "Translating and interpreting"
Showing 20 of ~90321 results · from CrossRef, DOAJ, arXiv
Chiara Polli
Elena S. Bell
Ana Cristina Sánchez López
Neologisms are a key element of science fiction and its world-building, and their proper translation is essential if the complexity of the genre, with its typically multi-layered plots, is to be fully conveyed in the target language. However, the perception of science fiction and its characteristic futuristic, technological worlds may have changed in recent decades due to the breakthroughs in technology and science experienced by societies around the world. This study extracts the neologisms related to technical and scientific breakthroughs found in four English-language science fiction novels and in their translations and retranslations into Spanish, builds a contrastive corpus, and analyses whether the approach to their translation has evolved. The novels used are Brave New World (Aldous Huxley, 1932), Nineteen Eighty‑Four (George Orwell, 1949), Fahrenheit 451 (Ray Bradbury, 1953) and Do Androids Dream of Electric Sheep? (Philip K. Dick, 1968).
Antonio Aguilera, Esperança Bielsa, Fruela Fernández et al.
It has been over one hundred years since the original publication of Walter Benjamin's essay "The Task of the Translator", considered a central twentieth-century text on translation. In a public debate, held on 20 February 2024, the philosopher Antonio Aguilera, the sociologist Esperança Bielsa and the translator Fruela Fernández discussed their recent collaboration on the book Benjamin y la traducción (Ediciones del Subsuelo, 2024). Bielsa and Aguilera co-authored the book, which also provided new translations (by Fernández) of the three texts that Benjamin wrote on translation at different times in his career. Researcher Mattea Cussel chaired the debate. They discussed the importance of translation in Walter Benjamin's work, while reflecting on how the dual and enduring task of interpreting and translating Benjamin helps us illuminate the present. Benjamin y la traducción and the debate documented here offer an actualisation of Benjamin's work that puts translation at the centre: a ubiquitous social activity that many philosophers and sociologists underestimate or ignore, but in which, as Benjamin recognised, the key to our survival may be found. This is a transcription into English of the debate, which was held in Catalan and Spanish at the Centre de Cultura Contemporània de Barcelona (CCCB). It is followed by the last chapter of the book Benjamin y la traducción, titled "Theses on translation" (Tesis sobre la traducción). The original version of this article is available in Catalan and Castilian.
Joe Hare, Leo Freitas, Ken Pierce
As the complexity of safety-critical medical devices increases, so does the need for clear, verifiable software requirements. This paper explores the use of Kapture, a formal modelling tool developed by D-RisQ, to translate an existing formal VDM model of CANDO, a medical implant for treating focal epilepsy. The work was undertaken without prior experience in formal methods. The paper assesses Kapture's usability, the challenges of formal modelling, and the effectiveness of the translated model. The result is a Kapture model that covers over 90% of the original VDM model and produces matching traces of results. While several issues were encountered during design and implementation, mainly due to the initial learning curve, this paper demonstrates that complex systems can be effectively modelled in Kapture by inexperienced users, and it highlights some difficulties in translating VDM specifications to Kapture.
Joshua Otten, Antonios Anastasopoulos, Kevin Moran
Python is one of the most commonly used programming languages in industry and education. Its English keywords and built-in functions/modules allow it to come close to pseudo-code in terms of its readability and ease of writing. However, those who do not speak English may not experience these advantages. In fact, they may even be hindered in their ability to understand Python code, as the English nature of its terms creates an additional layer of overhead. To that end, we introduce the task of automatically translating Python's natural modality (keywords, error types, identifiers, etc.) into other human languages. This presents a unique challenge, considering the abbreviated nature of these forms, as well as potential untranslatability of advanced mathematical/programming concepts across languages. We therefore create an automated pipeline to translate Python into other human languages, comparing strategies using machine translation and large language models. We then use this pipeline to acquire translations from five common Python libraries (pytorch, pandas, tensorflow, numpy, and random) in seven languages, and do a quality test on a subset of these terms in French, Greek, and Bengali. We hope this will provide a clearer path forward towards creating a universal Python, accessible to anyone regardless of nationality or language background.
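The keyword-localization idea described above can be sketched as a simple lookup-table rewrite. This is a minimal illustration, not the authors' pipeline: the French mapping below is invented for the example, and a real system would also need to handle string literals, comments, and identifiers.

```python
# Hypothetical sketch: map a handful of Python keywords to French
# equivalents and rewrite whole-word occurrences in source text.
import re

# Illustrative mapping only; not taken from the paper.
KEYWORD_MAP_FR = {
    "if": "si",
    "else": "sinon",
    "for": "pour",
    "while": "tantque",
    "return": "retourner",
}

def localize_keywords(source: str, mapping: dict[str, str]) -> str:
    """Replace whole-word keyword occurrences; string literals and
    comments are ignored for simplicity in this sketch."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], source)

code = "for x in xs:\n    if x > 0:\n        return x"
print(localize_keywords(code, KEYWORD_MAP_FR))
# pour x in xs:
#     si x > 0:
#         retourner x
```

A production pipeline would run this kind of substitution through a proper tokenizer so that keywords inside strings are left untouched.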
Minghan Wang, Viet-Thanh Pham, Farhad Moghimifar et al.
Despite achieving remarkable performance, machine translation (MT) research remains underexplored in terms of translating cultural elements in languages, such as idioms, proverbs, and colloquial expressions. This paper investigates the capability of state-of-the-art neural machine translation (NMT) and large language models (LLMs) in translating proverbs, which are deeply rooted in cultural contexts. We construct a translation dataset of standalone proverbs and proverbs in conversation for four language pairs. Our experiments show that the studied models can achieve good translation between languages with similar cultural backgrounds, and LLMs generally outperform NMT models in proverb translation. Furthermore, we find that current automatic evaluation metrics such as BLEU, CHRF++ and COMET are inadequate for reliably assessing the quality of proverb translation, highlighting the need for more culturally aware evaluation metrics.
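The inadequacy of surface-overlap metrics for proverbs can be seen with a toy stand-in for BLEU: a bag-of-words unigram precision. A culturally adapted rendering of a proverb can be a perfectly good translation yet share no words with the reference, scoring zero. The sentences below are illustrative examples, not items from the paper's dataset.

```python
# Toy demonstration: surface-overlap scoring penalizes valid paraphrases.
from collections import Counter

def unigram_precision(hypothesis: str, reference: str) -> float:
    """Fraction of hypothesis tokens that also appear in the reference
    (clipped counts), a crude stand-in for n-gram metrics like BLEU."""
    hyp = Counter(hypothesis.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(count, ref[word]) for word, count in hyp.items())
    return overlap / max(sum(hyp.values()), 1)

reference = "the early bird catches the worm"
literal   = "the early bird catches the worm"   # literal rendering
adapted   = "first come first served"           # culturally adapted rendering

print(unigram_precision(literal, reference))   # 1.0
print(unigram_precision(adapted, reference))   # 0.0
```

Both outputs convey the proverb's meaning, but the overlap metric rejects the second entirely, which is the failure mode the authors point to for BLEU, chrF++ and, to a lesser degree, learned metrics.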
An Trieu, Phuong Nguyen, Minh Le Nguyen
Entity-aware machine translation (EAMT) is a complicated task in natural language processing, owing not only to the shortage of translation data for the entities to be translated but also to the complexity of the context that must be processed while translating those entities. In this paper, we propose a method that applies multi-task learning to jointly optimize the two subtasks, named entity recognition and machine translation, which improves the final performance on the entity-aware machine translation task. Results and analysis are reported on the dataset provided by the organizers of Task 2 of the SemEval 2025 competition.
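The multi-task setup described above typically reduces, at training time, to a single objective that mixes the two subtask losses. The sketch below shows only that mixing step; the weight and loss values are placeholders, not taken from the paper.

```python
# Hedged sketch of multi-task loss mixing for NER + MT training.
# alpha trades the NER objective against the MT objective.
def multitask_loss(ner_loss: float, mt_loss: float, alpha: float = 0.5) -> float:
    """Weighted sum of the two subtask losses sharing one encoder."""
    return alpha * ner_loss + (1 - alpha) * mt_loss

# With alpha = 0.25, the MT objective dominates the combined gradient signal.
print(multitask_loss(0.8, 0.4, alpha=0.25))
```

In practice the two losses come from separate heads over a shared encoder, and alpha is tuned on a development set.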
Cornelia Griebel, Ivana Havelka
Yaonaiming Zhao, Qiang Zou
Triply periodic minimal surfaces (TPMS) are emerging as an important way of designing microstructures. However, there has been limited use of commercial CAD/CAM/CAE software packages for TPMS design and manufacturing. This is mainly because TPMS are consistently described in the functional representation (F-rep) format, while modern CAD/CAM/CAE tools are built upon the boundary representation (B-rep) format. One possible solution to this gap is translating TPMS to STEP, the standard data exchange format of CAD/CAM/CAE. Following this direction, this paper proposes a new translation method with error-controlling and $C^2$ continuity-preserving features. It is based on an approximation-error-driven TPMS sampling algorithm and a constrained-PIA algorithm. The sampling algorithm controls the deviation between the original and translated models: an error bound of $2\varepsilon$ on the deviation can be ensured if two conditions, called $\varepsilon$-density and $\varepsilon$-approximation, are satisfied. The constrained-PIA algorithm enforces $C^2$ continuity constraints during TPMS approximation while attaining high efficiency. A theoretical convergence proof of this algorithm is also given. The effectiveness of the translation method has been demonstrated by a series of examples and comparisons.
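To make the F-rep starting point concrete: a TPMS such as the gyroid is given by an implicit function $f(x,y,z)=0$, and any translation to B-rep begins by sampling that function densely enough. The sketch below samples the standard gyroid equation on a grid; the grid resolution is a crude stand-in for the paper's $\varepsilon$-density condition, not the authors' sampling algorithm.

```python
# Sketch: sampling the gyroid's implicit (F-rep) function on a grid.
# Zero-crossings of f locate the surface to be approximated by B-rep patches.
import math

def gyroid(x: float, y: float, z: float) -> float:
    """Standard gyroid level-set function; f = 0 on the surface."""
    return (math.sin(x) * math.cos(y)
            + math.sin(y) * math.cos(z)
            + math.sin(z) * math.cos(x))

def sample_grid(n: int, period: float = 2 * math.pi):
    """Sample f on an n^3 grid over one period; finer n approximates
    a denser (smaller-epsilon) sampling."""
    step = period / n
    return [
        (i * step, j * step, k * step, gyroid(i * step, j * step, k * step))
        for i in range(n) for j in range(n) for k in range(n)
    ]

samples = sample_grid(8)
print(len(samples))  # 512 sample points
```

The paper's contribution sits downstream of this step: fitting the sampled points with $C^2$-continuous spline patches whose deviation from $f=0$ is provably bounded.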
Niko Moritz, Ruiming Xie, Yashesh Gaur et al.
We propose the joint speech translation and recognition (JSTAR) model that leverages the fast-slow cascaded encoder architecture for simultaneous end-to-end automatic speech recognition (ASR) and speech translation (ST). The model is transducer-based and uses a multi-objective training strategy that optimizes both ASR and ST objectives simultaneously. This allows JSTAR to produce high-quality streaming ASR and ST results. We apply JSTAR in a bilingual conversational speech setting with smart-glasses, where the model is also trained to distinguish speech from different directions corresponding to the wearer and a conversational partner. Different model pre-training strategies are studied to further improve results, including training of a transducer-based streaming machine translation (MT) model for the first time and applying it for parameter initialization of JSTAR. We demonstrate superior performances of JSTAR compared to a strong cascaded ST model in both BLEU scores and latency.
Sophie Ling-chia Wei
In the translation history of late imperial China, the Jesuit enterprise played a significant role in translating Western scientific knowledge, a role they performed in tandem with proselytization. The Jesuit Figurists’ re-interpreting and re-writing of the ancient Chinese classics pivoted on symbols, figures, and Chinese characters. The father at the helm of this journey, Joachim Bouvet (1656–1730), embarked on his own Figurist path, navigating by the symbols, figures, and Chinese characters from the <i>Yijing</i>. His followers Joseph Henri Marie de Prémare (1666–1736) and Jean François Foucquet (1665–1741) continued on this track, each further developing his own interpretation of the <i>Dao</i>. Here I will present and explore Foucquet’s journey of the <i>Dao</i> and his presentation of the Christian God and Jesus Christ as Daoist sages by investigating his Chinese, French, and Latin manuscripts that discuss his reinterpretation of the <i>Dao</i> in the Chinese classics, especially the <i>Yijing</i> and <i>Daodejing</i>. In these manuscripts, Foucquet adopted typological exegesis and exhibited his inheritance of the Confucian-Christian-<i>Dao</i> synthesis from his senior Bouvet; he also identified the <i>Dao</i> as Deus and the Oneness of the Dao as the unity of the Holy Trinity. This micro-historical case study of Foucquet’s interpretation of the <i>Dao</i> shows how his navigating the strait between the Scylla and Charybdis of the emperor and the Holy See factored into his trajectory of interpreting the <i>Dao</i>; it also demonstrates that in response to being challenged by his own brothers in the Catholic Church, he cleaved to typological exegesis and Confucian-Christian-<i>Dao</i> synthesis. The significance of this paper lies in that the early understanding of the <i>Dao</i> was manipulated, especially among the Figurists, both as a tool for proselytization and as a bridge to link the East with the West.
Rangeet Pan, Ali Reza Ibrahimzada, Rahul Krishna et al.
Code translation aims to convert source code from one programming language (PL) to another. Given the promising abilities of large language models (LLMs) in code synthesis, researchers are exploring their potential to automate code translation. The prerequisite for advancing the state of LLM-based code translation is to understand their promises and limitations over existing techniques. To that end, we present a large-scale empirical study to investigate the ability of general LLMs and code LLMs for code translation across pairs of different languages, including C, C++, Go, Java, and Python. Our study, which involves the translation of 1,700 code samples from three benchmarks and two real-world projects, reveals that LLMs are yet to be reliably used to automate code translation -- with correct translations ranging from 2.1% to 47.3% for the studied LLMs. Further manual investigation of unsuccessful translations identifies 15 categories of translation bugs. We also compare LLM-based code translation with traditional non-LLM-based approaches. Our analysis shows that these two classes of techniques have their own strengths and weaknesses. Finally, insights from our study suggest that providing more context to LLMs during translation can help them produce better results. To that end, we propose a prompt-crafting approach based on the symptoms of erroneous translations; this improves the performance of LLM-based code translation by 5.5% on average. Our study is the first of its kind, in terms of scale and breadth, that provides insights into the current limitations of LLMs in code translation and opportunities for improving them. Our dataset -- consisting of 1,700 code samples in five PLs with 10K+ tests, 43K+ translated code, 1,748 manually labeled bugs, and 1,365 bug-fix pairs -- can help drive research in this area.
Raden Arief Nugroho, Alfian Yoga Prananta, Syaiful Ade Septemuryantoro et al.
Pilar Godayol
Mary Wollstonecraft (1759-1797) is a touchstone of feminist literature, and her A Vindication of the Rights of Woman (1792) one of its foundational treatises. The essay was translated into Catalan in 2014, more than two hundred years after the author's death, almost forty years after the first Spanish translation (1977) and ten after the Galician version (2004). Published by the Girona-based publisher L'Art de la Memòria, translated and prefaced by Joan Josep Mussarra Roca, Vindicació dels drets de la dona arrived amid a new surge of publishing interest in international feminism. Four years earlier, in 2010, the Valencian publisher Tres i Quatre had issued Cartes sobre Suècia, Noruega i Dinamarca, translated and introduced by Òscar Sabata i Teixidó. After briefly presenting the English Enlightenment intellectual, tracing her translations into the other languages of Spain, and contextualizing the feminisms translated in Catalonia since the 1960s, this article focuses on the Catalan reception of Mary Wollstonecraft and on the factors that converged to explain why one of the classic writers in the history of feminism was introduced into Catalan so late.
Shaolei Zhang, Yang Feng
Simultaneous translation (ST) outputs translation while receiving the source inputs, and hence requires a policy to determine whether to translate a target token or wait for the next source token. The major challenge of ST is that each target token can only be translated based on the current received source tokens, where the received source information will directly affect the translation quality. So naturally, how much source information is received for the translation of the current target token is supposed to be the pivotal evidence for the ST policy to decide between translating and waiting. In this paper, we treat the translation as information transport from source to target and accordingly propose an Information-Transport-based Simultaneous Translation (ITST). ITST quantifies the transported information weight from each source token to the current target token, and then decides whether to translate the target token according to its accumulated received information. Experiments on both text-to-text ST and speech-to-text ST (a.k.a., streaming speech translation) tasks show that ITST outperforms strong baselines and achieves state-of-the-art performance.
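The read/write policy described above can be sketched as a simple threshold rule: for each target token, keep reading source tokens and accumulating their transported information weight until the total passes a threshold, then write. The weights and threshold below are invented for illustration; in ITST they are learned attention-like quantities.

```python
# Toy sketch of an accumulated-information read/write policy.
def itst_policy(info_weights, threshold: float):
    """info_weights[t][s]: information transported from source token s
    to target token t. Returns, per target token, how many source
    tokens are read before that token is written."""
    reads = []
    for weights in info_weights:
        total, received = 0.0, 0
        for w in weights:
            received += 1
            total += w
            if total >= threshold:  # enough information: write now
                break
        reads.append(received)
    return reads

# Target token 0 can be written after two source tokens; token 1 needs three.
weights = [[0.3, 0.5, 0.2],
           [0.1, 0.2, 0.6]]
print(itst_policy(weights, threshold=0.7))  # [2, 3]
```

Raising the threshold trades latency for quality: the policy waits for more source context before committing to each target token.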
Zilu Tang, Derry Wijaya
Incorporating tagging into neural machine translation (NMT) systems has shown promising results in helping translate rare words such as named entities (NEs). However, translating NEs in low-resource settings remains a challenge. In this work, we investigate the effect of using tags and NE hypernyms from knowledge graphs (KGs) in parallel corpora under different levels of resource conditions. We find that the tag-and-copy mechanism (tag the NEs in the source sentence and copy them to the target sentence) improves translation in high-resource settings only. Introducing copying also has polarizing effects on translating different parts of speech (POS). Interestingly, we find that copy accuracy for hypernyms is consistently higher than that for entities. As a way of avoiding "hard" copying and utilizing hypernyms to bootstrap rare entities, we introduce a "soft" tagging mechanism and find consistent improvements in both high- and low-resource settings.
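The tag-and-copy preprocessing step amounts to wrapping entity spans in the source with markers the model learns to copy verbatim. A minimal sketch, with an invented tag name and a hand-supplied entity list (real pipelines obtain the spans from an NER model or a knowledge graph):

```python
# Sketch of tag-and-copy preprocessing: wrap named entities in markers
# so the translation model learns to copy the span into the target.
def tag_entities(sentence: str, entities: list[str]) -> str:
    """Wrap each known entity span with <ne> ... </ne> markers.
    Tag name is illustrative; spans are assumed non-overlapping."""
    for entity in entities:
        sentence = sentence.replace(entity, f"<ne> {entity} </ne>")
    return sentence

src = "Barack Obama visited Kenya"
print(tag_entities(src, ["Barack Obama", "Kenya"]))
# <ne> Barack Obama </ne> visited <ne> Kenya </ne>
```

The "soft" variant the abstract mentions avoids forcing a verbatim copy, letting the model weigh the tagged span (or its hypernym) against a normal translation.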
Ali Araabi, Christof Monz, Vlad Niculae
Neural Machine Translation (NMT) is an open-vocabulary problem. As a result, dealing with words that do not occur during training (a.k.a. out-of-vocabulary (OOV) words) has long been a fundamental challenge for NMT systems. The predominant method to tackle this problem is Byte Pair Encoding (BPE), which splits words, including OOV words, into sub-word segments. BPE has achieved impressive results for a wide range of translation tasks in terms of automatic evaluation metrics. While it is often assumed that by using BPE, NMT systems are capable of handling OOV words, the effectiveness of BPE in translating OOV words has not been explicitly measured. In this paper, we study to what extent BPE is successful in translating OOV words at the word level. We analyze the translation quality of OOV words based on word type, number of segments, cross-attention weights, and the frequency of segment n-grams in the training data. Our experiments show that while careful BPE settings seem to be fairly useful in translating OOV words across datasets, a considerable percentage of OOV words are translated incorrectly. Furthermore, we highlight the slightly higher effectiveness of BPE in translating OOV words in special cases, such as named entities and when the languages involved are linguistically close to each other.
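The mechanics of BPE segmentation help explain the paper's question: an OOV word never fails to tokenize, because BPE can always back off to characters, but the segments it produces may be ones the model has rarely seen in useful contexts. A minimal sketch with a tiny, invented merge table (real BPE learns thousands of merges from corpus statistics):

```python
# Illustrative BPE segmentation: apply learned merge rules, in priority
# order, to an out-of-vocabulary word's character sequence.
def bpe_segment(word: str, merges: list[tuple[str, str]]) -> list[str]:
    """Greedily apply each merge rule left-to-right over the token list."""
    tokens = list(word)
    for a, b in merges:
        i = 0
        while i < len(tokens) - 1:
            if tokens[i] == a and tokens[i + 1] == b:
                tokens[i:i + 2] = [a + b]  # merge the adjacent pair
            else:
                i += 1
    return tokens

# Invented merge table; "thering" is treated as an OOV word.
merges = [("t", "h"), ("th", "e"), ("i", "n"), ("in", "g")]
print(bpe_segment("thering", merges))  # ['the', 'r', 'ing']
```

The word is representable, but whether the model can translate it depends on how informative those segments are, which is what the paper's segment-frequency and cross-attention analyses probe.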
Marta R. Costa-jussà, Eric Smith, Christophe Ropers et al.
Machine Translation systems can produce different types of errors, some of which are characterized as critical or catastrophic due to the specific negative impact that they can have on users. In this paper we focus on one type of critical error: added toxicity. We evaluate and analyze added toxicity when translating a large evaluation dataset (HOLISTICBIAS, over 472k sentences, covering 13 demographic axes) from English into 164 languages. An automatic toxicity evaluation shows that added toxicity across languages varies from 0% to 5%. The output languages with the most added toxicity tend to be low-resource ones, and the demographic axes with the most added toxicity include sexual orientation, gender and sex, and ability. We also perform human evaluation on a subset of 8 translation directions, confirming the prevalence of true added toxicity. We use a measurement of the amount of source contribution to the translation, where a low source contribution implies hallucination, to interpret what causes toxicity. Making use of the input attributions allows us to explain toxicity, because the source contributions significantly correlate with toxicity for 84% of languages studied. Given our findings, our recommendations to reduce added toxicity are to curate training data to avoid mistranslations, mitigate hallucination and check unstable translations.
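The hallucination heuristic described above (low source contribution implies the output is driven by the target prefix rather than the input) can be sketched as a simple ratio test over attribution scores. The scores and cutoff below are invented for illustration; the paper computes real input attributions.

```python
# Toy sketch of the source-contribution check for flagging
# potentially hallucinated (and hence toxicity-prone) outputs.
def source_contribution(src_attr: list[float], tgt_attr: list[float]) -> float:
    """Fraction of total attribution mass assigned to source tokens."""
    total = sum(src_attr) + sum(tgt_attr)
    return sum(src_attr) / total if total else 0.0

def flag_hallucination(src_attr, tgt_attr, cutoff: float = 0.4) -> bool:
    """Flag when the translation leans mostly on the target prefix."""
    return source_contribution(src_attr, tgt_attr) < cutoff

print(flag_hallucination([0.1, 0.1], [0.4, 0.4]))  # True: mostly target-driven
print(flag_hallucination([0.5, 0.3], [0.1, 0.1]))  # False: source-grounded
```

Flagged outputs are candidates for the "unstable translations" check the authors recommend, since a detached output is more likely to introduce toxicity absent from the source.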
Page 12 of 4517