Hypencoder: Hypernetworks for Information Retrieval

Abstract

The vast majority of retrieval models depend on vector inner products to produce a relevance score between a query and a document. This naturally limits the expressiveness of the relevance score that can be employed. We propose a new paradigm: instead of producing a vector to represent the query, we produce a small neural network that acts as a learned relevance function. This small neural network takes in a representation of the document (in this paper, a single vector) and produces a scalar relevance score. To produce the small neural network we use a hypernetwork, a network that produces the weights of other networks, as our query encoder; we call it a Hypencoder. Experiments on in-domain search tasks show that Hypencoder significantly outperforms strong dense retrieval models and achieves higher metrics than reranking models and models an order of magnitude larger. Hypencoder is also shown to generalize well to out-of-domain search tasks. To assess the extent of Hypencoder's capabilities, we evaluate on a set of hard retrieval tasks, including tip-of-the-tongue retrieval and instruction-following retrieval, and find that the performance gap widens substantially compared to standard retrieval tasks. Furthermore, to demonstrate the practicality of our method, we implement an approximate search algorithm and show that our model can search 8.8M documents in under 60ms.
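
The contrast between an inner-product score and a query-specific scoring network can be illustrated with a minimal sketch. It is not the paper's implementation: the dimensions, the linear "heads" that emit the weights, the one-hidden-layer ReLU scorer, and every function name below are assumptions made for illustration, and the hypernetwork parameters are random stand-ins for weights that would normally be learned.

```python
import random

random.seed(0)

QUERY_DIM = 4   # query embedding size (assumed)
DOC_DIM = 4     # size of the single-vector document representation (assumed)
HIDDEN_DIM = 3  # hidden width of the generated scoring network (assumed)


def random_matrix(rows, cols):
    """Build a rows x cols matrix of small random values."""
    return [[random.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Hypothetical hypernetwork parameters: linear heads that map a query
# embedding to the weights of the scoring network. Random here, learned
# in a real system.
HEAD_W1 = random_matrix(HIDDEN_DIM * DOC_DIM, QUERY_DIM)
HEAD_B1 = random_matrix(HIDDEN_DIM, QUERY_DIM)
HEAD_W2 = random_matrix(HIDDEN_DIM, QUERY_DIM)


def generate_scoring_network(query_embedding):
    """Turn one query embedding into the weights of a small per-query MLP."""
    flat_w1 = matvec(HEAD_W1, query_embedding)
    w1 = [flat_w1[i * DOC_DIM:(i + 1) * DOC_DIM] for i in range(HIDDEN_DIM)]
    b1 = matvec(HEAD_B1, query_embedding)
    w2 = matvec(HEAD_W2, query_embedding)
    return w1, b1, w2


def score_document(scoring_network, doc_vector):
    """Run the generated network on one document vector; returns a scalar."""
    w1, b1, w2 = scoring_network
    hidden = [max(0.0, h + b) for h, b in zip(matvec(w1, doc_vector), b1)]  # ReLU
    return sum(w * h for w, h in zip(w2, hidden))


def inner_product_score(query_embedding, doc_vector):
    """Baseline relevance score used by standard dense retrieval."""
    return sum(q * d for q, d in zip(query_embedding, doc_vector))


query_embedding = [random.gauss(0.0, 1.0) for _ in range(QUERY_DIM)]
doc_vectors = random_matrix(5, DOC_DIM)

# The scoring network is generated once per query, then reused for every document.
scoring_network = generate_scoring_network(query_embedding)
scores = [score_document(scoring_network, d) for d in doc_vectors]
ranking = sorted(range(len(doc_vectors)), key=lambda i: scores[i], reverse=True)
baseline = [inner_product_score(query_embedding, d) for d in doc_vectors]
```

In this sketch the document side is unchanged from dense retrieval (one precomputed vector per document); only the query side differs, replacing a fixed inner product with a nonlinear function whose weights depend on the query.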
