ZipIt! Merging Models from Different Tasks without Training

Typical deep visual recognition models are capable of performing the one task they were trained on. In this paper, we tackle the extremely difficult problem of combining completely distinct models with different initializations, each solving a separate task, into one multi-task model without any additional training. Prior work in model merging permutes one model into the space of the other and then adds them together. While this works for models trained on the same task, we find that it fails to account for the differences between models trained on disjoint tasks. Thus, we introduce "ZipIt!", a general method for merging two arbitrary models of the same architecture that incorporates two simple strategies.

2023: George Stoica, Daniel Bolya, J. Bjorner, Taylor N. Hearn, Judy Hoffman
https://arxiv.org/pdf/2305.03053v1.pdf
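To make the prior-work baseline mentioned in the abstract concrete, here is a minimal sketch of "permute one model into the space of the other, then add them together" on a toy two-layer ReLU network. It illustrates that baseline only, not ZipIt! itself. Everything in it is an assumption made for illustration: the function names (`forward`, `greedy_match`, `permute_and_average`), the bias-free MLP, the use of first-layer weight distance as the matching criterion, and the greedy matching, which stands in for whatever assignment procedure a given method actually uses.

```python
import numpy as np


def forward(w1, w2, x):
    """Toy two-layer ReLU MLP without biases: x -> w2 @ relu(w1 @ x)."""
    return w2 @ np.maximum(w1 @ x, 0.0)


def greedy_match(w1_a, w1_b):
    """Pair each hidden unit of model A with its closest unused unit of model B.

    Returns perm such that row perm[i] of w1_b is matched to row i of w1_a.
    Greedy matching is a simplification chosen for this sketch.
    """
    # Squared Euclidean distance between every pair of first-layer rows.
    cost = ((w1_a[:, None, :] - w1_b[None, :, :]) ** 2).sum(axis=-1)
    perm = np.zeros(cost.shape[0], dtype=int)
    used = np.zeros(cost.shape[1], dtype=bool)
    for i in range(cost.shape[0]):
        masked = np.where(used, np.inf, cost[i])  # skip units already taken
        perm[i] = int(np.argmin(masked))
        used[perm[i]] = True
    return perm


def permute_and_average(w1_a, w2_a, w1_b, w2_b):
    """Align B's hidden units to A's, then average the two sets of weights."""
    perm = greedy_match(w1_a, w1_b)
    w1_b_aligned = w1_b[perm]      # reorder hidden units (rows of layer 1)
    w2_b_aligned = w2_b[:, perm]   # reorder the matching columns of layer 2
    return 0.5 * (w1_a + w1_b_aligned), 0.5 * (w2_a + w2_b_aligned)


# Usage: model B is model A with its hidden units shuffled, so both compute
# the same function even though their weight matrices differ entry by entry.
rng = np.random.default_rng(0)
w1_a = rng.normal(size=(4, 3))
w2_a = rng.normal(size=(2, 4))
shuffle = np.array([2, 0, 3, 1])
w1_b, w2_b = w1_a[shuffle], w2_a[:, shuffle]

x = rng.normal(size=3)
w1_m, w2_m = permute_and_average(w1_a, w2_a, w1_b, w2_b)
merged_out = forward(w1_m, w2_m, x)
reference_out = forward(w1_a, w2_a, x)
```

In this toy setup the two models differ only by a permutation of hidden units, so aligning before averaging recovers the original function. The abstract's point is that for models trained on disjoint tasks no such clean correspondence can be assumed, which is the case this baseline does not handle.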
