Vision GNN: An Image is Worth Graph of Nodes

Network architecture plays a key role in deep learning-based computer vision systems. The widely used convolutional neural network and transformer treat the image as a grid or sequence structure, which is not flexible enough to capture irregular and complex objects. In this paper, we propose to represent the image as a graph structure and introduce a new Vision GNN (ViG) architecture to extract graph-level features for visual tasks.

2022: Kai Han, Yunhe Wang, Jianyuan Guo, Yehui Tang, E. Wu https://arxiv.org/pdf/2206.00272v1.pdf
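To make the idea of "an image as a graph" concrete, here is a minimal sketch. The abstract excerpt above does not spell out how the graph is built, so this assumes the common construction of treating image patches as nodes and connecting each node to its nearest neighbors in feature space. The function names `image_to_patches` and `knn_graph` are hypothetical, and raw pixel values stand in for the learned patch embeddings a real model would use.

```python
from math import dist


def image_to_patches(image, patch):
    """Split a 2D grayscale image (list of rows) into flattened patch vectors."""
    height, width = len(image), len(image[0])
    nodes = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            nodes.append([image[top + r][left + c]
                          for r in range(patch) for c in range(patch)])
    return nodes


def knn_graph(nodes, k):
    """Map each node index to the indices of its k nearest nodes (Euclidean)."""
    graph = {}
    for i, feat in enumerate(nodes):
        others = [j for j in range(len(nodes)) if j != i]
        others.sort(key=lambda j: dist(feat, nodes[j]))
        graph[i] = others[:k]
    return graph


# Toy 4x4 image (assumed data): left half dark, right half bright.
image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 1, 9, 8],
         [0, 0, 9, 9]]

nodes = image_to_patches(image, patch=2)   # four 2x2 patches -> four nodes
graph = knn_graph(nodes, k=1)
print(graph)
```

In this toy case the two dark patches link to each other and the two bright patches link to each other, even though the grid layout would pair each patch with its horizontal neighbor. That is the flexibility the abstract contrasts with grid or sequence structures.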
