Post
  • From Twitter

Xavi says: This paper is interesting in many ways. By using multimodal fine-tuning of smaller (i.e., 1B-parameter) models like T5, the authors show that they can beat GPT-3.5 on visual QA tasks, surpassing visual performance in several tasks.
