Google Supercharges Gemini with new AI features at I/O 2025: 13 Updates that you shouldn't miss
Google has rolled out a set of major updates to its Gemini app, aiming to broaden its appeal and give users more ways to interact with AI. The updates were announced at the annual Google I/O 2025 event and include tools for visual assistance, media generation and research, now available on Android and iOS. Here is every AI update announced by the American tech giant:

Gemini Live
Gemini Live, which lets users share their camera view or screen during conversations, is now available for free. The feature is designed to help users show what they mean instead of typing out questions. According to Google, Gemini Live conversations run longer than text-only chats, which the company attributes to the more interactive format. Google plans to integrate the tool more deeply into its broader ecosystem in the coming weeks. Users will be able to connect Gemini Live to apps such as Maps, Calendar, Tasks and Keep. For example, asking about restaurant options can link directly to Google Maps, or a group chat may result in an appointment being added to the calendar.

Imagen 4
Gemini now includes Imagen 4, a new image-generation model that supports improved visual detail and better text rendering within images. Users can create graphics and visuals for different uses, including presentations and social media posts.

Veo 3
For video creation, Veo 3 is being introduced. It supports text-to-video generation and can also add ambient sounds and basic character dialogue. Veo 3 is currently available only to Google AI Ultra subscribers in the US, which limits access for international users and those on the free plan.

Deep Research
The Deep Research feature now lets users upload personal files, such as PDFs or images, for inclusion in AI-generated reports. The goal is to produce more personalised and contextual results by combining private and public data sources. Google has announced plans to expand this functionality to include content from Google Drive and Gmail in the near future.

Project Astra
Project Astra showcases the real-time capabilities of Google's Gemini models, with the initial features now integrated into Gemini Live. This upgraded version can actively use a phone's camera and interpret on-screen content in real time. Among the latest improvements are a more natural, expressive voice powered by native audio generation, improved memory and advanced computer-control capabilities. During the keynote at Google I/O 2025, a live demonstration highlighted Gemini Live's ability to communicate fluently with users, responding with expressive speech, handling interruptions seamlessly and continuing conversations without losing context. It also showed multitasking capabilities, such as making phone calls, browsing documents and browsing the web, all in real time.

Google Flow
Google Flow is an AI-powered filmmaking tool designed to let creatives easily generate cinematic videos. It combines Google's advanced models, Veo (video generation), Imagen (image generation) and Gemini (natural-language prompting), to help users turn everyday-language directions into high-quality visual scenes. Flow enables consistent character and scene creation, allowing seamless continuity across multiple clips. It is built to make storytelling faster, more intuitive and visually striking.

Agent Mode
At the Google I/O 2025 event, Sundar Pichai revealed a new feature called Agent Mode for the Gemini app. This upcoming experimental tool, initially available to subscribers, is designed to handle complicated tasks and planning on behalf of the user. With Agent Mode, Gemini moves beyond simple responses to take more autonomous actions, organising, scheduling and executing multi-step tasks. Google has also announced that these agentic AI capabilities will extend to Chrome, Search and the Gemini platform, an important step towards AI that can proactively manage tasks rather than just responding to instructions.
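For readers curious what "executing multi-step tasks" might look like under the hood, here is a minimal, hypothetical Python sketch of a plan-and-execute loop, the general pattern agentic assistants of this kind follow. The planner, tool names and example task are all invented for illustration; this is not Google's implementation or API.

# Hypothetical sketch of an agentic "plan, then execute" loop.
# The planner, tools and task below are invented for illustration only.

def plan(goal: str) -> list[str]:
    # A real agent would ask a model to decompose the goal into steps;
    # here we hard-code a plausible decomposition.
    return [
        "search listings matching the user's criteria",
        "filter results by budget and location",
        "schedule viewings on the user's calendar",
    ]

# Stand-in tools keyed by the action word they handle.
TOOLS = {
    "search": lambda step: f"found 12 results for: {step}",
    "filter": lambda step: "3 listings remain after filtering",
    "schedule": lambda step: "2 viewings added to calendar",
}

def pick_tool(step: str):
    # Naive keyword routing; a real agent would let the model choose a tool.
    for name, tool in TOOLS.items():
        if name in step:
            return tool
    return lambda s: f"no tool found for: {s}"

def run_agent(goal: str) -> None:
    # Execute each planned step and report the observation.
    for step in plan(goal):
        observation = pick_tool(step)(step)
        print(f"{step} -> {observation}")

run_agent("find an apartment and book viewings")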
Google Jules
Jules is an autonomous, agentic coding assistant that works directly with your codebase. Unlike traditional code-completion tools, Jules clones your repository into a secure Google Cloud VM, understands your project context, and independently handles tasks such as writing tests, fixing bugs, building features and more. It works asynchronously, so you can focus elsewhere while it completes tasks and returns with a detailed plan, reasoning and code changes. Jules is now in public beta and prioritises privacy, keeping your code safe and isolated.

AI Mode in Search
Google is introducing a new feature called AI Mode, aimed at people who want a more advanced and interactive search experience. AI Mode, first tested in Labs, is now available to everyone in the US, with a wider global launch expected later. A new tab for AI Mode will soon appear in the Google app and on desktop. AI Mode uses a technique called 'query fan-out': it breaks your query into smaller sub-queries and runs many searches at the same time, which helps it dig deeper and return more useful, detailed answers from across the internet. It also uses Gemini 2.5, Google's most advanced AI model yet. With AI Mode, users can ask follow-up questions, get interactive links and even use images or live video to search in real time. It is not just about answering questions; Google wants to help people get things done, from booking tickets to comparing data.
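As a rough illustration of the fan-out idea, the Python sketch below splits one query into sub-queries, runs the searches in parallel and merges the results. The sub-query generation and the search function are simple stand-ins, not Google's actual system.

# Toy illustration of "query fan-out": split one query into sub-queries,
# search them concurrently, and merge the results. The sub-query list and
# search() are placeholders, not Google's pipeline.
from concurrent.futures import ThreadPoolExecutor

def fan_out(query: str) -> list[str]:
    # A real system would use a model to generate sub-queries; we fake it.
    aspects = ["price", "reviews", "availability"]
    return [f"{query} {aspect}" for aspect in aspects]

def search(sub_query: str) -> list[str]:
    # Stand-in for an actual web-search call.
    return [f"result for '{sub_query}'"]

def ai_mode_answer(query: str) -> list[str]:
    sub_queries = fan_out(query)
    with ThreadPoolExecutor() as pool:  # issue all searches at once
        result_lists = pool.map(search, sub_queries)
    # Merge and de-duplicate before a model would synthesise a final answer.
    return sorted({hit for hits in result_lists for hit in hits})

print(ai_mode_answer("noise-cancelling headphones"))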
Google Meet speech translation
Google has launched a groundbreaking AI-powered speech-translation feature in Google Meet that enables real-time audio-to-audio translation during calls. Built on DeepMind's AudioLM technology and integrated with the Gemini AI model, the system translates spoken language into the listener's preferred language while preserving the speaker's original voice, tone and emotional expression. Unlike traditional caption-based translation, this feature transforms the speech directly, delivering natural-sounding audio in real time. Users hear the translated voice layered subtly over the original, which increases clarity and retains conversational context. Although there is a slight processing delay, the experience closely mimics having a live interpreter on the call.

Google Beam
Google Beam is a new 3D video-communication platform that transforms ordinary 2D video calls into immersive 3D experiences. Announced at Google I/O 2025, it uses multiple cameras and AI to create realistic, real-time 3D imagery with depth and eye contact. Beam, which is powered by Google Cloud and designed for business use, also supports precise head tracking and is expected to offer real-time speech translation. It will launch on HP devices later this year.

Gemma 3n
Gemma 3n is Google's first open AI model designed specifically for on-device use, bringing fast, multimodal intelligence to phones, tablets and laptops. It is built on a new architecture developed with partners such as Qualcomm, MediaTek and Samsung, and promotes real-time, private AI experiences without relying on the cloud. Gemma 3n also forms the basis for the next generation of Gemini Nano, allowing developers to explore advanced AI directly on everyday devices.

Try On
The 'Try On' feature in Google Search lets users see how clothes such as shirts, dresses, pants and skirts would look on them by uploading a full-length photo. The tool is available through Search Labs in the US and uses AI to generate an image of the outfit on the user. It also lets users save the pictures or share them for feedback before making a purchase.