Multimodal AI takes in different kinds of input, such as text, images and video, allowing digital assistants to better understand both the world and you. Its capabilities are supercharged when it can run directly on your device.
As smart as generative artificial intelligence (AI) can be, its capabilities are limited by how well it understands the world around it. That's where large multimodal models (LMMs) come in: they allow AI to analyze voice queries, text, images, videos, and even radio-frequency and sensor data to provide more accurate and relevant answers. (OnQ)