RISE: Rule-Driven SQL Dialect Translation via Query Reduction
Xudong Xie, Yuwei Zhang, Wensheng Dou
et al.
Translating SQL dialects across different relational database management systems (RDBMSs) is crucial for migrating RDBMS-based applications to the cloud. Traditional SQL dialect translation tools rely on manually-crafted rules, necessitating significant manual effort to support new RDBMSs and dialects. Although large language models (LLMs) can assist in translating SQL dialects, they often struggle with lengthy and complex SQL queries. In this paper, we propose RISE, a novel LLM-based SQL dialect translation approach that can accurately handle lengthy and complex SQL queries. Given a complex source query $Q_c$ that contains a SQL dialect $d$, we first employ a dialect-aware query reduction technique to derive a simplified query $Q_{s}$ by removing $d$-irrelevant SQL elements from $Q_c$. Subsequently, we utilize LLMs to translate $Q_{s}$ into $Q_{s^{'}}$, and automatically extract the translation rule $r_d$ for dialect $d$ based on the relationship between $Q_{s}$ and $Q_{s^{'}}$. By applying $r_d$ to $Q_c$, we can effectively translate the dialect $d$ within $Q_c$, thereby bypassing the complexity of the source query $Q_c$. We evaluate RISE on two real-world benchmarks, i.e., TPC-DS and SQLProcBench, comparing its performance against both the traditional rule-based tools and the LLM-based approaches with respect to translation accuracy. RISE achieves accuracies of 97.98% on TPC-DS and 100% on SQLProcBench, outperforming the baselines by an average improvement of 24.62% and 238.41%, respectively.
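The rule-extraction and rule-application steps described above can be illustrated with a minimal sketch. It is not the RISE implementation: the reduced query $Q_s$ and its translation $Q_{s'}$ are written by hand instead of coming from query reduction and an LLM, the rule is a plain token substitution found with `difflib`, and the `IFNULL` to `COALESCE` pair (MySQL to PostgreSQL) is an assumed example that does not come from the paper. The helper names `tokenize`, `extract_rule`, and `apply_rule` are hypothetical.

```python
import difflib
import re


def tokenize(sql):
    # Naive tokenizer: words/numbers and single punctuation characters.
    # It does not treat quoted string literals specially.
    return re.findall(r"\w+|[^\w\s]", sql)


def extract_rule(q_s, q_s_translated):
    """Return (source_tokens, target_tokens) pairs where the two queries differ."""
    a, b = tokenize(q_s), tokenize(q_s_translated)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [(a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag == "replace"]


def apply_rule(q_c, rule):
    """Rewrite every occurrence of each source token run in the complex query."""
    tokens = tokenize(q_c)
    for src, dst in rule:
        out, i = [], 0
        while i < len(tokens):
            if tokens[i:i + len(src)] == src:
                out.extend(dst)
                i += len(src)
            else:
                out.append(tokens[i])
                i += 1
        tokens = out
    return " ".join(tokens)


# Hand-written stand-ins for the reduced query and its LLM translation.
q_s = "SELECT IFNULL(a, 0) FROM t"
q_s_translated = "SELECT COALESCE(a, 0) FROM t"
rule = extract_rule(q_s, q_s_translated)

# The rule learned on the small query is applied to a longer one.
q_c = "SELECT c.name, IFNULL(SUM(o.total), 0) FROM customers c"
translated = apply_rule(q_c, rule)
```

The point of the sketch is only that a rule learned from a short query pair can be replayed on a longer query without translating the longer query directly.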
Translational surfaces and iterated resultants
Matthew Weaver
A translational surface is a tensor product surface constructed from two space curves by translating one along the other. These surfaces are common within geometric modeling and, since their description is parametric, it is desirable to obtain the implicit equation of such a surface. These surfaces have been studied thoroughly by Goldman and Wang, where a particular set of syzygies was identified and shown to yield the implicit equation through an inhomogeneous resultant. As this method may fail in the presence of ill-behaved basepoints of the parameterization, we offer an alternative method in this article using iterated homogeneous resultants. The algorithm presented here involves smaller Sylvester matrices overall, potentially resulting in faster computation, and succeeds in many instances where the previous method cannot be applied.
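As a small worked illustration of implicitization by successive elimination, take two assumed space curves (chosen here for convenience, not taken from the article) and form the surface as their sum. The computation below uses ordinary affine resultants, one variable at a time; it is only meant to show the idea of iterating resultants and is not the homogeneous, syzygy-based algorithm of the article. Signs depend on the resultant convention.

```latex
\begin{align*}
f(s) &= (s,\ s^2,\ 0), \qquad g(t) = (0,\ t,\ t^2), \\
(x, y, z) &= f(s) + g(t) = (s,\ s^2 + t,\ t^2), \\
R_1(t) &= \operatorname{Res}_s\bigl(s - x,\ s^2 + t - y\bigr) = x^2 + t - y, \\
F(x, y, z) &= \operatorname{Res}_t\bigl(R_1(t),\ t^2 - z\bigr) = (y - x^2)^2 - z .
\end{align*}
```

Substituting the parameterization back gives $(s^2 + t - s^2)^2 - t^2 = 0$, so every point of the surface satisfies $F = 0$.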
Visualizing translation dynamics at atomic detail inside a bacterial cell
Liang Xue, Swantje Lenz, Maria Zimmermann-Kogadeeva
et al.
Translation is the fundamental process of protein synthesis and is catalysed by the ribosome in all living cells [1]. Here we use advances in cryo-electron tomography and sub-tomogram analysis [2, 3] to visualize the structural dynamics of translation inside the bacterium Mycoplasma pneumoniae. To interpret the functional states in detail, we first obtain a high-resolution in-cell average map of all translating ribosomes and build an atomic model for the M. pneumoniae ribosome that reveals distinct extensions of ribosomal proteins. Classification then resolves 13 ribosome states that differ in their conformation and composition. These recapitulate major states that were previously resolved in vitro, and reflect intermediates during active translation. On the basis of these states, we animate translation elongation inside native cells and show how antibiotics reshape the cellular translation landscapes. During translation elongation, ribosomes often assemble in defined three-dimensional arrangements to form polysomes [4]. By mapping the intracellular organization of translating ribosomes, we show that their association into polysomes involves a local coordination mechanism that is mediated by the ribosomal protein L9. We propose that an extended conformation of L9 within polysomes mitigates collisions to facilitate translation fidelity. Our work thus demonstrates the feasibility of visualizing molecular processes at atomic detail inside cells. Cryo-electron tomography is used to reveal the structural dynamics and functional diversity of translating ribosomes in Mycoplasma pneumoniae, providing insight into the translation elongation cycle inside cells and how it is reshaped by antibiotics.
152 citations
en
Medicine, Biology
Deep learning with convolutional neural network for objective skill evaluation in robot-assisted surgery
Ziheng Wang, A. M. Fey
Purpose: With the advent of robot-assisted surgery, the role of data-driven approaches to integrate statistics and machine learning is growing rapidly, with prominent interest in objective surgical skill assessment. However, most existing work requires translating robot motion kinematics into intermediate features or gesture segments that are expensive to extract, lack efficiency, and require significant domain-specific knowledge. Methods: We propose an analytical deep learning framework for skill assessment in surgical training. A deep convolutional neural network is implemented to map multivariate time series data of the motion kinematics to individual skill levels. Results: We perform experiments on the public minimally invasive surgical robotic dataset, JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS). Our proposed learning model achieved competitive accuracies of 92.5%, 95.4%, and 91.3% in the standard training tasks: Suturing, Needle-passing, and Knot-tying, respectively. Without the need for engineered features or carefully tuned gesture segmentation, our model can successfully decode skill information from raw motion profiles via end-to-end learning. Meanwhile, the proposed model is able to reliably interpret skills within a 1–3 second window, without needing to observe the entire training trial. Conclusion: This study highlights the potential of deep architectures for efficient online skill assessment in modern surgical training.
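The idea of scoring skill from a short window instead of a whole trial can be sketched as a windowing step that would precede any classifier. This is an illustrative assumption about preprocessing, not the authors' code: the function name `sliding_windows`, the 30 Hz sampling rate, the 4 dummy channels, and the 2 s window with 1 s stride are all placeholders chosen for the example.

```python
def sliding_windows(series, rate_hz, window_s, stride_s):
    """Cut a multivariate time series (one feature list per frame) into fixed-length windows."""
    size = int(round(window_s * rate_hz))   # frames per window
    step = int(round(stride_s * rate_hz))   # frames between window starts
    if size <= 0 or step <= 0:
        raise ValueError("window and stride must cover at least one frame")
    # Only full windows are kept; a trailing partial window is dropped.
    return [series[start:start + size]
            for start in range(0, len(series) - size + 1, step)]


# Dummy 10-second trial: 300 frames (assumed 30 Hz) with 4 placeholder channels.
trial = [[0.0, 0.0, 0.0, 0.0] for _ in range(300)]
windows = sliding_windows(trial, rate_hz=30, window_s=2.0, stride_s=1.0)
```

Each element of `windows` could then be passed to a model independently, which is what makes online assessment during a trial possible.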
247 citations
en
Computer Science, Medicine
Teaching LMMs for Image Quality Scoring and Interpreting
Zicheng Zhang, Haoning Wu, Ziheng Jia
et al.
Image quality scoring and interpreting are two fundamental components of Image Quality Assessment (IQA). The former quantifies image quality, while the latter enables descriptive question answering about image quality. Traditionally, these two tasks have been addressed independently. However, from the perspective of the Human Visual System (HVS) and the Perception-Decision Integration Model, they are inherently interconnected: interpreting serves as the foundation for scoring, while scoring provides an abstract summary of interpreting. Thus, unifying these capabilities within a single model is both intuitive and logically coherent. In this paper, we propose Q-SiT (Quality Scoring and Interpreting joint Teaching), a unified framework that enables large multimodal models (LMMs) to learn both image quality scoring and interpreting simultaneously. We achieve this by transforming conventional IQA datasets into learnable question-answering datasets and incorporating human-annotated quality interpreting data for training. Furthermore, we introduce an efficient scoring & interpreting balance strategy, which first determines the optimal data mix ratio on lightweight LMMs and then maps this ratio to primary LMMs for fine-tuning adjustment. This strategy not only mitigates task interference and enhances cross-task knowledge transfer but also significantly reduces computational costs compared to direct optimization on full-scale LMMs. With this joint learning framework and corresponding training strategy, we develop Q-SiT, the first model capable of simultaneously performing image quality scoring and interpreting tasks, along with its lightweight variant, Q-SiT-mini. Experimental results demonstrate that Q-SiT achieves strong performance in both tasks with superior generalization IQA abilities. Project page at https://github.com/Q-Future/Q-SiT.
Beyond English: The Impact of Prompt Translation Strategies across Languages and Tasks in Multilingual LLMs
Itai Mondshine, Tzuf Paz-Argaman, Reut Tsarfaty
Despite advances in the multilingual capabilities of Large Language Models (LLMs) across diverse tasks, English remains the dominant language for LLM research and development. As a result, when working with a different language, pre-translation, i.e., translating the task prompt into English before inference, has become a widespread practice. Selective pre-translation, a more surgical approach, focuses on translating specific prompt components. However, its current use is sporadic and lacks a systematic research foundation. Consequently, the optimal pre-translation strategy for various multilingual settings and tasks remains unclear. In this work, we aim to uncover the optimal setup for pre-translation by systematically assessing its use. Specifically, we view the prompt as a modular entity, composed of four functional parts: instruction, context, examples, and output, each of which could be translated or not. We evaluate pre-translation strategies across 35 languages covering both low and high-resource languages, on various tasks including Question Answering (QA), Natural Language Inference (NLI), Named Entity Recognition (NER), and Abstractive Summarization. Our experiments show the impact of factors such as similarity to English, translation quality, and the size of pre-trained data on model performance with pre-translation. We suggest practical guidelines for choosing optimal strategies in various multilingual settings.
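Treating the prompt as four parts that are independently translated or left in the source language gives a small, enumerable space of strategies. The sketch below simply lists that space; the names `PARTS` and `pretranslation_configs` are hypothetical, and whether the study evaluates every combination is not stated in the abstract.

```python
from itertools import product

# The four functional prompt parts named in the abstract.
PARTS = ("instruction", "context", "examples", "output")


def pretranslation_configs():
    """Yield every choice of which prompt parts are translated into English."""
    for flags in product((False, True), repeat=len(PARTS)):
        yield dict(zip(PARTS, flags))


configs = list(pretranslation_configs())
no_translation = configs[0]     # every part kept in the source language
full_translation = configs[-1]  # every part translated (plain pre-translation)
```

Under this view, selective pre-translation corresponds to any configuration strictly between those two extremes.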
Read it in Two Steps: Translating Extremely Low-Resource Languages with Code-Augmented Grammar Books
Chen Zhang, Jiuheng Lin, Xiao Liu
et al.
While large language models (LLMs) have shown promise in translating extremely low-resource languages using resources like dictionaries, the effectiveness of grammar books remains debated. This paper investigates the role of grammar books in translating extremely low-resource languages by decomposing it into two key steps: grammar rule retrieval and application. To facilitate the study, we introduce ZhuangRules, a modularized dataset of grammar rules and their corresponding test sentences. Our analysis reveals that rule retrieval constitutes a primary bottleneck in grammar-based translation. Moreover, although LLMs can apply simple rules for translation when explicitly provided, they encounter difficulties in handling more complex rules. To address these challenges, we propose representing grammar rules as code functions, considering their similarities in structure and the benefit of code in facilitating LLM reasoning. Our experiments show that using code rules significantly boosts both rule retrieval and application, ultimately resulting in a 13.1% BLEU improvement in translation.
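What "grammar rules as code functions" might look like can be sketched with two invented rules. These are hypothetical placeholders and are not taken from ZhuangRules or from any real grammar; the tokens are schematic tags, not words of an actual language, and retrieval is reduced to an exact-match lookup for brevity.

```python
def adjective_after_noun(noun, adjective):
    """Hypothetical rule: an attributive adjective follows the noun it modifies."""
    return [noun, adjective]


def negator_before_verb(verb, negator):
    """Hypothetical rule: the negator directly precedes the verb."""
    return [negator, verb]


# Rule "book": a description mapped to the function that encodes the rule.
RULES = {
    "noun phrase with adjective": adjective_after_noun,
    "negated verb": negator_before_verb,
}


def retrieve(description):
    # Step 1 (retrieval): find the rule that matches the construction.
    return RULES[description]


# Step 2 (application): call the retrieved rule on schematic tokens.
phrase = retrieve("noun phrase with adjective")("NOUN:house", "ADJ:big")
negated = retrieve("negated verb")("VERB:go", "NEG")
```

Separating the two steps this way mirrors the decomposition in the abstract, where retrieval and application are evaluated as distinct sources of error.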
Neural Machine Translation for Coptic-French: Strategies for Low-Resource Ancient Languages
Nasma Chaoui, Richard Khoury
This paper presents the first systematic study of strategies for translating Coptic into French. Our comprehensive pipeline systematically evaluates: pivot versus direct translation, the impact of pre-training, the benefits of multi-version fine-tuning, and model robustness to noise. Utilizing aligned biblical corpora, we demonstrate that fine-tuning with a stylistically-varied and noise-aware training corpus significantly enhances translation quality. Our findings provide crucial practical insights for developing translation tools for historical languages in general.
Program Skeletons for Automated Program Translation
Bo Wang, Tianyu Li, Ruishi Li
et al.
Translating software between programming languages is a challenging task, for which automated techniques have been elusive and hard to scale up to larger programs. A key difficulty in cross-language translation is that one has to re-express the intended behavior of the source program into idiomatic constructs of a different target language. This task needs abstracting away from the source language-specific details, while keeping the overall functionality the same. In this work, we propose a novel and systematic approach for making such translation amenable to automation based on a framework we call program skeletons. A program skeleton retains the high-level structure of the source program by abstracting away and effectively summarizing lower-level concrete code fragments, which can be mechanically translated to the target programming language. A skeleton, by design, permits many different ways of filling in the concrete implementation for fragments, which can work in conjunction with existing data-driven code synthesizers. Most importantly, skeletons can conceptually enable sound decomposition, i.e., if each individual fragment is correctly translated, taken together with the mechanically translated skeleton, the final translated program is deemed to be correct as a whole. We present a prototype system called Skel embodying the idea of skeleton-based translation from Python to JavaScript. Our results show promising scalability compared to prior works. For 9 real-world Python programs, some with more than about 1k lines of code, 95% of their code fragments can be automatically translated, while about 5% require manual effort. All the final translations are correct with respect to whole-program test suites.
Opening the Door of the Translation Classroom. The Service-Learning Model within the Framework of the Outward Turn: Experiences on Wikipedia
Ingrid Cáceres-Würsig, Lorena Silos Ribas
This article aims to underline the transformative nature of translation, highlight its social relevance, and explore how university programmes in translation and interpreting need to make an “outward turn” (Bassnett/Johnston 2019). We consider that, within the academic and teaching sphere, this turn requires opening up the classroom so as to allow a more proactive pedagogical model that is connected with its surroundings and helps to give new meaning to translation in communication processes. Within the framework of the project described in the article, carried out in the Wikipedia environment, we confirm that the service-learning methodology is aligned with the postulates of the “outward turn” put forward by Bassnett and Johnston and pursues the same objectives. The results of our research show that this methodology is the ideal pedagogical framework for strengthening students' competence profile with regard to their translation and editing skills, their critical thinking and, above all, their civic values and social commitment.
Translating and interpreting
Interpretable deep learning translation of GWAS and multi-omics findings to identify pathobiology and drug repurposing in Alzheimer’s disease
Jielin Xu, Chengsheng Mao, Yuan Hou
et al.
SUMMARY Translating human genetic findings (genome-wide association studies [GWAS]) to pathobiology and therapeutic discovery remains a major challenge for Alzheimer’s disease (AD). We present a network topology-based deep learning framework to identify disease-associated genes (NETTAG). We leverage non-coding GWAS loci effects on quantitative trait loci, enhancers and CpG islands, promoter regions, open chromatin, and promoter flanking regions under the protein-protein interactome. Via NETTAG, we identified 156 AD-risk genes enriched in druggable targets. Combining network-based prediction and retrospective case-control observations with 10 million individuals, we identified that usage of four drugs (ibuprofen, gemfibrozil, cholecalciferol, and ceftriaxone) is associated with reduced likelihood of AD incidence. Gemfibrozil (an approved lipid regulator) is significantly associated with 43% reduced risk of AD compared with simvastatin using an active-comparator design (95% confidence interval 0.51–0.63, p < 0.0001). In summary, NETTAG offers a deep learning methodology that utilizes GWAS and multi-genomic findings to identify pathobiology and drug repurposing in AD.
Utilizing Large Language Models to Translate RFC Protocol Specifications to CPSA Definitions
Martin Duclos, Ivan A. Fernandez, Kaneesha Moore
et al.
This paper proposes the use of Large Language Models (LLMs) for translating Request for Comments (RFC) protocol specifications into a format compatible with the Cryptographic Protocol Shapes Analyzer (CPSA). This novel approach aims to reduce the complexities and efforts involved in protocol analysis, by offering an automated method for translating protocol specifications into structured models suitable for CPSA. In this paper we discuss the implementation of an RFC Protocol Translator, its impact on enhancing the accessibility of formal methods analysis, and its potential for improving the security of internet protocols.
Elisabetta Bartoli, Carla Francellini, Roberto Ludovico (eds.), “Dossier I pianeti della fortuna: Renato Poggioli intellettuale, critico e traduttore” (Semicerchio: Rivista di poesia comparata, LXIX, 2023/2)
Francesco Chianese
Review of the volume “Dossier I pianeti della fortuna: Renato Poggioli intellettuale, critico e traduttore” (Semicerchio: Rivista di poesia comparata, LXIX, 2023/2), edited by Elisabetta Bartoli, Carla Francellini, and Roberto Ludovico.
Geography. Anthropology. Recreation, Language. Linguistic theory. Comparative grammar
Key Guidelines for Spanish Sign Language Interpreting in Religious Settings
Raico H. González-Montesino
Religious (con)texts are one of the great challenges for interpreters and translators because of the symbolism and language that characterise them. Interpreting them into and from sign languages requires taking into account the difference in modality between the two languages. Given the scarcity of resources for interpreters of Spanish Sign Language (LSE), this paper offers a theoretical proposal of guidelines for these professionals in this setting. To this end, Escandell Vidal's (2005) model of human communication was used and some of its postulates were adapted to signed interpreting, supplemented with references from various fields. The aim is for this document to serve as a guide for students, professionals and trainers in LSE interpreting and, consequently, to have an impact on the lives of deaf people.
Translating and interpreting
Niemieckie i polskie nazwy szkół wyższych w ujęciu języko- i przekładoznawczym (XX i XXI wiek)
Maciej Stanaszek
German and Polish Names of Higher Schools from a Linguistic and Translational Perspective (20th and 21st Centuries)
The present paper is an attempt to describe the nomination usus (name-giving linguistic custom) regarding higher schools of the German-speaking area on the one hand, and of the Polish-speaking one on the other. The specific semantic categories of nominal components correspond to grammatical realisations, which are subject to certain structural constraints and exhibit more and less typical forms. Informed by such observations – initiated by Jan Iluk’s overview paper [2000] and refined by an analysis of much broader material, encompassing Austrian and Swiss names as well – I tried to show the name-giving possibilities and tendencies in this field in both German and Polish, mainly with a view to a translation-related application of such contrastive analysis. The findings presented here, based on a considerable number of higher school names in the examined area and combined with the presentation of about 100 German and some 45 Polish examples (provided with translations), should help both in evaluating already published equivalents, which are hardly ever official in the case of German, and in proposing one’s own, not least because some equivalents do not exist in any publication, not even an Internet-based one.
Translating and interpreting
A History of Shakespeare at Losada
Pablo Ingberg
Editorial Losada began publishing translations of Shakespeare's plays in 1939 and reached the final volume of the Obras completas in 2009. This article traces the history of that seventy-year process: the relevant part of the publisher's trajectory, how it came by those translations and who was involved, who their authors were, how and when they were published, how the edition of the Obras completas was gradually organised, and what its criteria and characteristics were. The main aim is to offer a source for those seeking data and bibliographical references on the translations of Shakespeare's works published by Losada.
Translating and interpreting
Many-to-English Machine Translation Tools, Data, and Pretrained Models
Thamme Gowda, Zhao Zhang, Chris A Mattmann
et al.
While there are more than 7000 languages in the world, most translation research efforts have targeted a few high-resource languages. Commercial translation systems support only one hundred languages or fewer, and do not make these models available for transfer to low resource languages. In this work, we present useful tools for machine translation research: MTData, NLCodec, and RTG. We demonstrate their usefulness by creating a multilingual neural machine translation model capable of translating from 500 source languages to English. We make this multilingual model readily downloadable and usable as a service, or as a parent model for transfer-learning to even lower-resource languages.
The Role of Conceptual Metaphors in the Translation of Sahifa Sajadieh From the Linguistics Perspective and the Model of Lakoff and Johnson
Yosra Shadman, Mohammad Nabi Ahmadi, Sahar Malekian
Cognitive metaphor theory is a term used in linguistics for understanding one idea or conceptual field on the basis of another idea or concept. Examples from the book Sahifa Sajadieh and its two translations by Mousavi Garmaroodi and Elahi Ghomshei have been selected in order to investigate how cognitive metaphors, at two main levels (similar metaphors and special metaphors), relate to the act of translating, with attention to the translators. In these examples, the metaphorical element was extracted and the performance of each translator in dealing with it was investigated. The results suggest that although the translators have taken different approaches to translating the meanings of the source text, their success in translating metaphors has been similar. Mousavi Garmaroodi, as a translator of literature, has used creative metaphors, while Elahi Ghomshei has, in some instances, translated abstract concepts without metaphor in order to facilitate translation. In this research, the translation of Sahifa Sajadieh's conceptual metaphors is examined using the descriptive-analytical method and a comparative approach.
Translating and interpreting
Interior estimates for Translating solitons of the $Q_k$-flows in $\mathbb{R}^{n+1}$
Jose Torres Santaella
We prove interior gradient and second-order estimates for the $Q_k$-flow and $Q_k$-translators in $\mathbb{R}^{n+1}$. In addition, we show that $Q_k$-translators which are asymptotic to $o(|x|)$ cannot exist.
Improving Zero-Shot Translation by Disentangling Positional Information
Danni Liu, Jan Niehues, James Cross
et al.
Multilingual neural machine translation has shown the capability of directly translating between language pairs unseen in training, i.e. zero-shot translation. Despite being conceptually attractive, it often suffers from low output quality. The difficulty of generalizing to new translation directions suggests the model representations are highly specific to those language pairs seen in training. We demonstrate that a main factor causing the language-specific representations is the positional correspondence to input tokens. We show that this can be easily alleviated by removing residual connections in an encoder layer. With this modification, we gain up to 18.5 BLEU points on zero-shot translation while retaining quality on supervised directions. The improvements are particularly prominent between related languages, where our proposed model outperforms pivot-based translation. Moreover, our approach allows easy integration of new languages, which substantially expands translation coverage. By thorough inspections of the hidden layer outputs, we show that our approach indeed leads to more language-independent representations.