Lost in Literalism: How Supervised Training Shapes Translationese in LLMs
March 6, 2025
Authors: Yafu Li, Ronghao Zhang, Zhilin Wang, Huajian Zhang, Leyang Cui, Yongjing Yin, Tong Xiao, Yue Zhang
cs.AI
Abstract
Large language models (LLMs) have achieved remarkable success in machine
translation, demonstrating impressive performance across diverse languages.
However, translationese, characterized by overly literal and unnatural
translations, remains a persistent challenge in LLM-based translation systems.
Despite their pre-training on vast corpora of natural utterances, LLMs exhibit
translationese errors and generate unexpectedly unnatural translations, stemming
from biases introduced during supervised fine-tuning (SFT). In this work, we
systematically evaluate the prevalence of translationese in LLM-generated
translations and investigate its roots during supervised training. We introduce
methods to mitigate these biases, including polishing golden references and
filtering unnatural training instances. Empirical evaluations demonstrate that
these approaches significantly reduce translationese while improving
translation naturalness, validated by human evaluations and automatic metrics.
Our findings highlight the need for training-aware adjustments to optimize LLM
translation outputs, paving the way for more fluent and
target-language-consistent translations. We release the data and code at
https://github.com/yafuly/LLM_Translationese.
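One of the two mitigations named in the abstract is filtering unnatural training instances before SFT. The paper's actual scoring method is not described here, so the sketch below is a minimal, hypothetical illustration of the idea: it flags parallel pairs whose target side shows crude translationese signals (verbatim source-token carryover and extreme length ratios). The heuristics, thresholds, and function names are all assumptions for illustration only.

```python
# Hypothetical sketch of filtering unnatural (translationese-like) training
# pairs before supervised fine-tuning. The scoring heuristics below are
# illustrative stand-ins, NOT the paper's actual naturalness metric.

from typing import List, Tuple


def literalness_score(source: str, target: str) -> float:
    """Crude translationese proxy: fraction of target tokens copied
    verbatim from the source. High carryover often signals overly
    literal output (e.g., untranslated spans)."""
    src_tokens = set(source.lower().split())
    tgt_tokens = target.lower().split()
    if not tgt_tokens:
        return 1.0
    copied = sum(1 for tok in tgt_tokens if tok in src_tokens)
    return copied / len(tgt_tokens)


def length_ratio(source: str, target: str) -> float:
    """Target/source token-length ratio; extreme values can indicate
    unnatural, overly literal translations."""
    return len(target.split()) / max(len(source.split()), 1)


def filter_training_pairs(
    pairs: List[Tuple[str, str]],
    max_literalness: float = 0.3,
    ratio_bounds: Tuple[float, float] = (0.5, 2.0),
) -> List[Tuple[str, str]]:
    """Keep only (source, target) pairs that pass both naturalness checks."""
    kept = []
    for src, tgt in pairs:
        ratio = length_ratio(src, tgt)
        if (literalness_score(src, tgt) <= max_literalness
                and ratio_bounds[0] <= ratio <= ratio_bounds[1]):
            kept.append((src, tgt))
    return kept


if __name__ == "__main__":
    data = [
        ("Das Meeting wurde verschoben.", "The meeting was postponed."),
        # Heavy source-token carryover; filtered as translationese-like:
        ("Das Meeting wurde verschoben.", "Das meeting was verschoben."),
    ]
    print(filter_training_pairs(data))
```

In practice, a stronger naturalness signal (e.g., a target-language LM score or human judgments, as the abstract's evaluation suggests) would replace these toy heuristics; the point of the sketch is only the filtering structure, not the scoring function.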