How Many Van Goghs Does It Take to Van Gogh? Finding the Imitation Threshold
October 19, 2024
Authors: Sahil Verma, Royi Rassin, Arnav Das, Gantavya Bhatt, Preethi Seshadri, Chirag Shah, Jeff Bilmes, Hannaneh Hajishirzi, Yanai Elazar
cs.AI
Abstract
Text-to-image models are trained using large datasets collected by scraping
image-text pairs from the internet. These datasets often include private,
copyrighted, and licensed material. Training models on such datasets enables
them to generate images with such content, which might violate copyright laws
and individual privacy. This phenomenon is termed imitation -- generation of
images with content that has recognizable similarity to its training images. In
this work we study the relationship between a concept's frequency in the
training dataset and the ability of a model to imitate it. We seek to determine
the point at which a model was trained on enough instances to imitate a concept
-- the imitation threshold. We pose this question as a new problem: Finding
the Imitation Threshold (FIT) and propose an efficient approach that estimates
the imitation threshold without incurring the colossal cost of training
multiple models from scratch. We experiment with two domains -- human faces and
art styles -- for which we create four datasets, and evaluate three
text-to-image models which were trained on two pretraining datasets. Our
results reveal that the imitation threshold of these models is in the range of
200-600 images, depending on the domain and the model. The imitation threshold
can provide an empirical basis for copyright violation claims and act as a
guiding principle for text-to-image model developers who aim to comply with
copyright and privacy laws. We release the code and data at
https://github.com/vsahil/MIMETIC-2.git and the project's website is
hosted at https://how-many-van-goghs-does-it-take.github.io.
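The abstract does not spell out how the threshold is estimated, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes we already have, for each concept, its image count in the training data and some imitation score for the model's generations (both hypothetical inputs here). It then fits a two-level step function by least squares and reports the count at which the scores jump. The function name `estimate_threshold` and the sample numbers are illustrative and synthetic, not results from the paper.

```python
from statistics import mean


def estimate_threshold(counts, scores):
    """Return the training-image count at which scores shift most sharply.

    Sorts concepts by count, tries every split point, and keeps the split
    that minimizes the squared error of a two-level (low/high) step fit.
    This is a generic single change-point heuristic (an assumption of this
    sketch), not the procedure used in the paper.
    """
    if len(counts) != len(scores) or len(counts) < 2:
        raise ValueError("need at least two (count, score) pairs")
    pairs = sorted(zip(counts, scores))
    xs = [c for c, _ in pairs]
    ys = [s for _, s in pairs]

    best_cost, best_idx = float("inf"), 1
    for i in range(1, len(ys)):
        left, right = ys[:i], ys[i:]
        m_left, m_right = mean(left), mean(right)
        cost = sum((y - m_left) ** 2 for y in left)
        cost += sum((y - m_right) ** 2 for y in right)
        if cost < best_cost:
            best_cost, best_idx = cost, i
    # First count on the "high" side of the best split.
    return xs[best_idx]


# Synthetic example: scores stay low until a concept has ~400 images.
counts = [10, 50, 100, 200, 400, 450, 800, 1500]
scores = [0.10, 0.12, 0.11, 0.13, 0.55, 0.60, 0.58, 0.62]
print(estimate_threshold(counts, scores))
```

Because the estimate uses only observed counts and scores from already-trained models, a sketch like this avoids retraining, which matches the motivation stated in the abstract; real data would be noisier and would likely need a more robust change-detection method.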