Latest from Google AI – F-VLM: Open-vocabulary object detection upon frozen vision and language models
Posted by Weicheng Kuo and Anelia Angelova, Research Scientists, Google Research
Detection is a fundamental vision task that aims to localize and recognize objects in an image. However, the data collection process of manually annotating bounding boxes or instance masks is tedious and costly, which limits the modern detection vocabulary size to roughly 1,000 object…