Latest from Google AI – Modular visual question answering via code generation
Posted by Sanjay Subramanian, PhD student, UC Berkeley, and Arsha Nagrani, Research Scientist, Google Research, Perception Team

Visual question answering (VQA) is a machine learning task that requires a model to answer a question about an image or a set of images. Conventional VQA approaches need a large amount of labeled training data consisting of…