Word Sense Linking: Disambiguating Outside the Sandbox

December 12, 2024
Authors: Andrei Stefan Bejgu, Edoardo Barba, Luigi Procopio, Alberte Fernández-Castro, Roberto Navigli
cs.AI

Abstract

Word Sense Disambiguation (WSD) is the task of associating a word in a given context with its most suitable meaning among a set of possible candidates. While the task has recently witnessed renewed interest, with systems achieving performance above the estimated inter-annotator agreement, at the time of writing it still struggles to find downstream applications. We argue that one of the reasons behind this is the difficulty of applying WSD to plain text. Indeed, in the standard formulation, models work under the assumptions that a) all the spans to disambiguate have already been identified, and b) all the possible candidate senses of each span are provided, both of which are far from trivial requirements. In this work, we present a new task called Word Sense Linking (WSL) where, given an input text and a reference sense inventory, systems have to both identify which spans to disambiguate and then link them to their most suitable meaning. We put forward a transformer-based architecture for the task and thoroughly evaluate both its performance and that of state-of-the-art WSD systems scaled to WSL, iteratively relaxing the assumptions of WSD. We hope that our work will foster easier integration of lexical semantics into downstream applications.
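
To make the difference in task formulation concrete, the following is a minimal Python sketch of the WSL input/output contract: raw tokens plus a sense inventory go in, and (span, sense) pairs come out, with no pre-identified spans or candidate lists. It is not the transformer-based architecture described in the abstract. The toy inventory, the sense identifiers, the `LinkedSpan` type, and the `link_senses` function are all hypothetical, and the span matcher (greedy longest match) and sense scorer (gloss/context word overlap, in the spirit of Lesk) are simple stand-ins chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical toy sense inventory: surface form -> list of (sense_id, gloss).
# A real inventory (e.g. WordNet) would be far larger and keyed by lemma.
INVENTORY = {
    "bank": [
        ("bank.n.01", "sloping land beside a river or lake"),
        ("bank.n.02", "financial institution that accepts deposits and makes a loan"),
    ],
    "interest rate": [
        ("interest_rate.n.01", "percentage of a sum of money charged for its use"),
    ],
}


@dataclass(frozen=True)
class LinkedSpan:
    start: int      # index of the first token in the span (inclusive)
    end: int        # index one past the last token in the span (exclusive)
    sense_id: str   # identifier of the chosen sense in the inventory


def _overlap(gloss, context):
    """Count how many distinct gloss words also occur in the context."""
    return len(set(gloss.split()) & context)


def link_senses(tokens, inventory, max_len=3):
    """Toy WSL: find spans by greedy longest match, then pick one sense each.

    Unlike standard WSD, neither the spans nor their candidate senses are
    given as input; both are derived from the text and the inventory.
    """
    lowered = [t.lower() for t in tokens]
    context = set(lowered)
    spans = []
    i = 0
    while i < len(lowered):
        matched = False
        # Try the longest candidate span first so multiword entries win.
        for n in range(min(max_len, len(lowered) - i), 0, -1):
            key = " ".join(lowered[i:i + n])
            if key in inventory:
                # Stand-in scorer: sense whose gloss shares most words with the text.
                best = max(inventory[key], key=lambda s: _overlap(s[1], context))
                spans.append(LinkedSpan(i, i + n, best[0]))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return spans


financial = link_senses("The bank raised its interest rate on every loan".split(), INVENTORY)
riverside = link_senses("We sat on the bank of the river".split(), INVENTORY)
```

Here `financial` contains two linked spans, one for "bank" and one for the multiword expression "interest rate", while `riverside` contains a single span for "bank" linked to the river sense. A WSD-only system would instead receive those spans and their candidate lists as part of the input.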
