Latest from Google AI – Multimodal Bottleneck Transformer (MBT): A New Model for Modality Fusion
Posted by Arsha Nagrani and Chen Sun, Research Scientists, Google Research, Perception Team

People interact with the world through multiple sensory streams (e.g., we see objects, hear sounds, read words, feel textures and taste flavors), combining information and forming associations between senses. As real-world data consists of various signals that co-occur, such as video frames…