Google Home Gemini "Live Search": Multimodal Vision

Google has brought Gemini's multimodal capabilities to the physical world with "Live Search" for Nest Cam owners. The feature lets users query their live video feeds in natural language, as if the footage were a searchable database.

Semantic Video Querying

Instead of scrubbing through timelines, users can ask questions such as "Did the delivery driver leave the package behind the gate?" or "Is the dog on the couch?" Gemini scans the live frames, identifies objects and actions, and returns a text confirmation.
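To make the query flow described above concrete, here is a toy sketch in Python. It is not Google's implementation or any real Gemini or Nest API. It assumes a vision model has already turned each frame into a list of detections, and that the user's question has already been parsed into a target object and a location. The names `Detection`, `answer_query`, and the sample `frames` data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One object found in a frame (hypothetical vision-model output)."""
    label: str   # what was seen, e.g. "dog"
    region: str  # named zone the object overlaps, e.g. "couch"


def answer_query(frames, label, region):
    """Scan (timestamp, detections) pairs in order and return a text confirmation.

    Returns a "Yes" message for the first frame where `label` appears in
    `region`, or a "No" message if no scanned frame matches.
    """
    for timestamp, detections in frames:
        for det in detections:
            if det.label == label and det.region == region:
                return f"Yes: {label} seen at {region} ({timestamp})."
    return f"No: {label} not seen at {region} in the scanned frames."


# Sample data standing in for a short window of analyzed frames (assumed values).
frames = [
    ("10:00", [Detection("person", "porch")]),
    ("10:01", [Detection("dog", "couch"), Detection("person", "porch")]),
]

# "Is the dog on the couch?" parsed (by assumption) into label="dog", region="couch".
print(answer_query(frames, "dog", "couch"))
```

A production system would presumably reason over the video directly rather than over a fixed label vocabulary. The sketch only illustrates the shape of the interaction: a question goes in, frames are scanned, and a short text answer comes out.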

This marks Gemini's transition from a chatbot to a vision intelligence layer for the smart home.