TESS 2: A Large-Scale Generalist Diffusion Language Model

Abstract

We introduce TESS 2, a general instruction-following diffusion language model that outperforms contemporary instruction-tuned diffusion models and matches, and sometimes exceeds, strong autoregressive (AR) models. We train TESS 2 by first adapting a strong AR model via continued pretraining with the usual cross-entropy as the diffusion loss, and then performing further instruction tuning. We find that adaptation training and the choice of base model are both crucial for training good instruction-following diffusion models. We further propose reward guidance, a novel and modular inference-time guidance procedure that aligns model outputs without needing to train the underlying model. Finally, we show that TESS 2 improves further with increased inference-time compute, highlighting a practical advantage of diffusion LMs: fine-grained control over the amount of compute used at inference time. Code and models are available at https://github.com/hamishivi/tess-2.
