Papers Read on AI

July 13, 2022  

Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection


In this work, we propose to address this problem by performing object-centric alignment of the language embeddings from the CLIP model. Furthermore, we visually ground the objects with only image-level supervision using a pseudo-labeling process that provides high-quality object proposals and helps expand the vocabulary during training. We establish a bridge between the above two object-alignment strategies via a novel weight transfer function that aggregates their complementary strengths. In essence, the proposed model seeks to minimize the gap between object-centric and image-centric representations in the open-vocabulary detection (OVD) setting.
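To make the idea of aligning object regions with language embeddings more concrete, here is a minimal sketch of how an open-vocabulary classifier head can score region embeddings against text embeddings by cosine similarity. It is not the paper's implementation: the function names, the toy 3-dimensional vectors standing in for CLIP embeddings, and the temperature value are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def classify_regions(region_embeddings, text_embeddings, temperature=0.01):
    """Score each region against every class-name embedding.

    Hypothetical helper: returns one (best_label, probabilities) pair per region.
    The temperature is an assumed value, not one taken from the paper.
    """
    labels = list(text_embeddings)
    text_unit = [l2_normalize(text_embeddings[name]) for name in labels]

    results = []
    for region in region_embeddings:
        region_unit = l2_normalize(region)
        # Cosine similarity with each class embedding, sharpened by the temperature.
        logits = [
            sum(a * b for a, b in zip(region_unit, t)) / temperature
            for t in text_unit
        ]
        # Numerically stable softmax over the vocabulary.
        peak = max(logits)
        exps = [math.exp(value - peak) for value in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        best = labels[probs.index(max(probs))]
        results.append((best, probs))
    return results


# Toy stand-ins for text embeddings of class names (assumed values).
text_embeddings = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "zebra": [0.0, 0.0, 1.0],
}

# Toy stand-ins for embeddings of two detected regions (assumed values).
region_embeddings = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.95]]

predictions = classify_regions(region_embeddings, text_embeddings)
for label, probs in predictions:
    print(label, [round(p, 3) for p in probs])
```

Because the vocabulary is just a dictionary of text embeddings, adding a new class name at inference time only requires adding another entry, which is the property that makes this style of classifier open-vocabulary.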

2022: Hanoona Rasheed, Muhammad Maaz, Muhammad Uzair Khattak, Salman Khan, F. Khan

https://arxiv.org/pdf/2207.03482v1.pdf