Wednesday Dec 21, 2022
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
![Revisiting Classifier: Transferring Vision-Language Models for Video Recognition](https://pbcdn1.podbean.com/imglogo/image-logo/12375367/papers_read_ai_300x300.png)
Transferring knowledge from task-agnostic pre-trained deep models to downstream tasks is an important topic in computer vision research. Along with the growth of computational capacity, we now have open-source vision-language pre-trained models at large scale in both model architecture and amount of data. In this study, we focus on transferring knowledge for video classification tasks. Conventional methods randomly initialize the linear classifier head for vision classification, leaving the use of the text encoder for downstream visual recognition tasks unexplored. In this paper, we revisit the role of the linear classifier and replace it with knowledge from the pre-trained model.

2022: Wenhao Wu, Zhun Sun, Wanli Ouyang

Ranked #1 on Action Recognition on ActivityNet

https://arxiv.org/pdf/2207.01297v3.pdf
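The core idea, replacing a randomly initialized classifier head with weights derived from the pre-trained text encoder, can be sketched as follows. This is a minimal illustration only: the `fake_text_encoder`, the prompt template, and the class names are stand-in assumptions, not the paper's actual implementation, which uses a real vision-language model's text encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: classifier weights come from a pre-trained text encoder
# applied to class-name prompts, instead of random initialization.
embed_dim = 512
class_names = ["archery", "bowling", "surfing", "juggling"]

def fake_text_encoder(prompt: str) -> np.ndarray:
    # Stand-in for a real vision-language text encoder (e.g. CLIP's);
    # returns an L2-normalized embedding for the given prompt.
    v = rng.standard_normal(embed_dim)
    return v / np.linalg.norm(v)

# Build the classifier head from text embeddings (one row per class).
W = np.stack([fake_text_encoder(f"a video of {n}") for n in class_names])

# Classify a video feature by similarity against the text-derived weights.
video_feat = rng.standard_normal(embed_dim)
video_feat /= np.linalg.norm(video_feat)
logits = W @ video_feat
pred = class_names[int(np.argmax(logits))]
print(pred)
```

Because the weights already encode the semantics of the class names, the classifier starts from a meaningful initialization rather than noise, which is the motivation the abstract describes.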