Papers Read on AI

July 14, 2022  

The Web Is Your Oyster - Knowledge-Intensive NLP against a Very Large Web Corpus

We propose a new setup for evaluating existing knowledge-intensive tasks in which we generalize the background corpus to a universal web snapshot. We investigate a slate of NLP tasks that rely on knowledge, either factual or common sense, and ask systems to use a subset of CCNet, the SPHERE corpus, as a knowledge source. In contrast to Wikipedia, otherwise a common background corpus in KI-NLP, SPHERE is orders of magnitude larger and better reflects the full diversity of knowledge on the web.

2021: Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Dmytro Okhonko, Samuel Broscheit, Gautier Izacard, Patrick Lewis, Barlas Oğuz, Edouard Grave, Wen-tau Yih, Sebastian Riedel

https://arxiv.org/pdf/2112.09924v2.pdf