It’s the moment you’ve been waiting for all year: Google I/O keynote day! Google kicks off its developer conference each year with a rapid-fire stream of announcements, unveiling much of what it has been working on. Brian already kicked us off by sharing what we were expecting.
We know you don’t always have time to watch the whole two-hour presentation today, so we’re doing it for you and delivering quick hits of the biggest news from the keynote as it’s announced, all in an easy-to-digest, easy-to-skim list. Here we go!
Pixel 8a
Google couldn’t wait until I/O to show off the latest addition to the Pixel line and announced the new Pixel 8a last week. The handset starts at $499 and ships Tuesday. The updates are what we’ve come to expect from these refreshes, topped by the addition of the Tensor G3 chip. Read more
Pixel Tablet
Google’s Pixel Tablet is now available. If you recall, Brian reviewed the Pixel Tablet around this same time last year, and all he talked about was the base. Interestingly enough, the tablet is now available without it. Read more
Ask Photos
Google Photos is getting an AI infusion with the launch of an experimental feature, Ask Photos, powered by Google’s Gemini AI model. The new addition, which rolls out later this summer, will allow users to search across their Google Photos collection using natural-language queries that leverage the AI’s understanding of their photos’ content and other metadata.
Users could already search for specific people, places, or things in their photos thanks to natural language processing; the AI upgrade promises to make finding the right content more intuitive and less of a manual search process.
And the example was cute, too. Who doesn’t love a stuffed tiger and Golden Retriever band duo called “Golden Stripes”? Read more
All About Gemini
Gemini 1.5 Pro: Another upgrade to the generative AI: Gemini can now analyze longer documents, codebases, videos, and audio recordings than before.
In a private preview of a new version of Gemini 1.5 Pro, the company’s current flagship model, it was revealed that the model can now take in up to 2 million tokens, double the previous maximum of 1 million. At that level, the new version of Gemini 1.5 Pro supports the largest input of any commercially available model. Read more
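For a rough sense of scale, here’s a back-of-the-envelope sketch of what a 2-million-token window holds. The chars-per-token and words-per-token figures are common English-text heuristics, not properties of Gemini’s tokenizer specifically:

```typescript
// Rough scale of a 2-million-token context window.
const CONTEXT_TOKENS = 2_000_000;
const CHARS_PER_TOKEN = 4; // heuristic for English text, not Gemini-specific
const WORDS_PER_TOKEN = 0.75; // heuristic

const approxChars = CONTEXT_TOKENS * CHARS_PER_TOKEN; // 8,000,000 characters
const approxWords = CONTEXT_TOKENS * WORDS_PER_TOKEN; // 1,500,000 words

console.log(`~${approxChars} characters, ~${approxWords} words`);
```

By those heuristics, that’s roughly a million and a half words of input, which is why long codebases and hour-plus recordings fit in a single prompt.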
Gemini Live: The company previewed a new experience in Gemini called Gemini Live, which lets users have “in-depth” voice chats with Gemini on their smartphones. Users can interrupt Gemini while the chatbot’s speaking to ask clarifying questions, and it’ll adapt to their speech patterns in real time. And Gemini can see and respond to users’ surroundings, either via photos or video captured by their smartphones’ cameras.
At first glance, Live doesn’t seem like a drastic upgrade over existing tech. But Google claims it taps newer techniques from the generative AI field to deliver superior, less error-prone image analysis — and combines these techniques with an enhanced speech engine for more consistent, emotionally expressive and realistic multi-turn dialog. Read more
Gemini Nano: Now for a tiny announcement. Google is also building Gemini Nano, the smallest of its AI models, directly into the Chrome desktop client, starting with Chrome 126. This, the company says, will enable developers to use the on-device model to power their own AI features. Google itself plans to use the capability to power features like the existing “help me write” tool from Workspace Labs in Gmail, for example. Read more
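To illustrate the kind of pattern this enables, here’s a hypothetical sketch of a web app preferring an on-device model when the browser exposes one and falling back to a server call otherwise. The `OnDeviceModel` interface and its `prompt` method are illustrative stand-ins, not Chrome’s actual built-in AI API:

```typescript
// Illustrative stand-in for a browser-provided on-device model;
// not Chrome's real API surface.
interface OnDeviceModel {
  prompt(text: string): string;
}

function helpMeWrite(
  draft: string,
  local: OnDeviceModel | undefined,
  serverFallback: (draft: string) => string,
): string {
  if (local) {
    // On-device path: no network round-trip; the draft never leaves the machine.
    return local.prompt(`Improve this draft: ${draft}`);
  }
  // Otherwise fall back to a conventional server-side model call.
  return serverFallback(draft);
}
```

The appeal of the on-device branch is exactly what Google is pitching: latency and privacy, since the text never has to leave the user’s machine.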
Gemini on Android: Google’s Gemini on Android, its AI replacement for Google Assistant, will soon take advantage of deeper integration with the Android operating system and Google’s apps. Users will be able to drag and drop AI-generated images directly into Gmail, Google Messages, and other apps. Meanwhile, YouTube users will be able to tap “Ask this video” to find specific information from within that YouTube video, Google says. Read more
Gemini on Google Maps: Gemini model capabilities are coming to the Google Maps platform for developers, starting with the Places API. Developers can show generative AI summaries of places and areas in their own apps and websites. The summaries are created based on Gemini’s analysis of insights from Google Maps’ community of more than 300 million contributors. Better yet, developers will no longer have to write their own custom descriptions of places. Read more
Tensor Processing Units get a performance boost
Google unveiled the sixth generation of its Tensor Processing Unit (TPU) AI chips. Dubbed Trillium, they will launch later this year. If you recall, announcing the next generation of TPUs is something of a tradition at I/O, even though the chips only roll out later in the year.
These new TPUs deliver a 4.7x boost in compute performance per chip compared to the fifth generation. Maybe even more important, though, Trillium features the third generation of SparseCore, which Google describes as “a specialized accelerator for processing ultra-large embeddings common in advanced ranking and recommendation workloads.” Read more
AI in search
Google is adding more AI to its search, a move meant to assuage concerns that the company is losing ground to competitors like ChatGPT and Perplexity. It is rolling out AI-powered overviews to users in the U.S. The company is also looking to use Gemini as an agent for things like trip planning. Read more
Google plans to use generative AI to organize the entire results page for some searches. That’s in addition to the existing AI Overview feature, which creates a short snippet with aggregated information about the topic you searched for. The AI Overview feature becomes generally available Tuesday, after a stint in Google’s Search Labs program. Read more
Generative AI upgrades
Google announced Imagen 3, the latest in the tech giant’s Imagen generative AI model family.
Demis Hassabis, head of DeepMind, Google’s AI research division, said that Imagen 3 more accurately understands the text prompts that it translates into images versus its predecessor, Imagen 2, and is more “creative and detailed” in its generations. In addition, the model produces fewer “distracting artifacts” and errors, he said.
“This is [also] our best model yet for rendering text, which has been a challenge for image generation models,” Hassabis added. Read more
Gemma 2 updates
Gemma 2, the next generation of Google’s Gemma models, will launch with a 27 billion parameter model in June. Read more
Project IDX
Project IDX, the company’s next-gen, AI-centric browser-based development environment, is now in open beta. This update brings an integration of the Google Maps Platform into the IDE, helping developers add geolocation features to their apps, as well as integrations with Chrome DevTools and Lighthouse to help debug applications. Soon, Google will also enable deploying apps to Cloud Run, Google Cloud’s serverless platform for running front- and back-end services. Read more
Veo
Google’s gunning for OpenAI’s Sora with Veo, an AI model that can create 1080p video clips around a minute long given a text prompt. Veo can capture different visual and cinematic styles, including shots of landscapes and timelapses, and make edits and adjustments to already-generated footage.
It also builds on Google’s preliminary commercial work in video generation, previewed in April, which tapped the company’s Imagen 2 family of image-generating models to create looping video clips. Read more
Circle to Search
The AI-powered Circle to Search feature, which allows Android users to get instant answers using gestures like circling, will now be able to solve more complex problems across physics and math word problems. It’s designed to make it more natural to engage with Google Search from anywhere on the phone through simple actions like circling, highlighting, scribbling, or tapping. Oh, and it’s also better at helping kids with their homework directly from supported Android phones and tablets. Read more
Firebase Genkit
There’s a new addition to the Firebase platform, called Firebase Genkit, that aims to make it easier for developers to build AI-powered applications in JavaScript/TypeScript, with Go support coming soon. It’s an open-source framework, using the Apache 2.0 license, that enables developers to quickly build AI into new and existing applications.
Some of the use cases the company is highlighting Tuesday include many of the standard GenAI tasks: content generation and summarization, text translation, and image generation. Read more
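To give a flavor of the pattern Genkit is built around, here’s an illustrative mock of a “flow”: a named, typed step that wraps a model call. The names below are stand-ins for this sketch, not Genkit’s actual API (which lives in the @genkit-ai packages), and the model call is stubbed out:

```typescript
// Illustrative mock of a Genkit-style "flow": a named, typed unit of AI work.
// Not Genkit's real API; the real framework wires flows to actual models.
type Flow<I, O> = { name: string; run: (input: I) => O };

function defineMockFlow<I, O>(name: string, fn: (input: I) => O): Flow<I, O> {
  return { name, run: fn };
}

// Stand-in for a generative model call (a real flow would call Gemini, etc.).
const summarizeFlow = defineMockFlow(
  "summarize",
  (text: string) => `summary (${text.length} chars)`,
);

console.log(summarizeFlow.run("Genkit aims to make AI features easy to build."));
```

Packaging AI steps as named flows like this is what lets a framework offer tracing, local testing, and deployment of each step as a unit.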
We’ll be updating this post throughout the day …