Google unveils Project Mariner: AI agents to use the web for you

Google unveiled on Wednesday its first-ever AI agent that can take actions on the web, a research prototype from the company’s DeepMind division called Project Mariner. The Gemini-powered agent takes control of your Chrome browser, moves the cursor on your screen, clicks buttons, and fills out forms, allowing it to use and navigate websites much like a human would.

The company is starting out by releasing its AI agent to a small group of preselected testers on Wednesday, Google says.

Google is continuing to experiment with new ways for Gemini to read, summarize, and now use websites. A Google executive tells TechCrunch this is part of a “fundamentally new UX paradigm shift”: moving users away from directly interacting with websites, and instead interacting with a generative AI system that does it for you.

A first look at Project Mariner. Image Credits: Google

These shifts could affect millions of businesses — from publishers like TechCrunch, to retailers like Walmart — which have historically relied on Google to send real people to visit and use their websites.

In a demo with TechCrunch, Google Labs director Jaclyn Konzelmann showed how Project Mariner works.

Once the AI agent is set up through a Chrome extension, a chat window pops up to the right of your browser. You can instruct the agent to do things like “create a shopping cart from a grocery store based on this list.”

Here’s what Project Mariner looks like when in use. Image Credits: Google

From there, the AI agent navigated to a grocery store’s website (in this case, Safeway) and then searched for items and added them to a virtual shopping cart. One thing that’s immediately evident is how slow the agent is: There were about 5 seconds of delay between each cursor movement. At times, the agent stopped its task and returned to the chat window to ask for clarification about certain items (how many carrots, etc.).

Google’s agent cannot check out, as it’s not supposed to fill out credit card numbers or billing information. Project Mariner also won’t accept cookies for users or sign a terms of service agreement. Google says it purposefully doesn’t allow the agent to do these things, in order to give users more control.

Behind the scenes, Google’s agent is taking screenshots of your browser window, something users must agree to in the terms of service, and sending them to Gemini in the cloud for processing. Gemini then sends instructions back to your computer to navigate the web page.
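The loop described above (capture a screenshot, send it to a model, receive an instruction, act on it) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Google’s implementation: `run_agent`, `Action`, `BLOCKED_KINDS`, and every callable passed in are hypothetical names invented for this example, and the blocked categories simply mirror the restrictions the article mentions.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical labels for things the article says the agent will not do.
BLOCKED_KINDS = {"enter_payment_info", "accept_cookies", "accept_terms"}


@dataclass
class Action:
    kind: str         # e.g. "click", "type", "ask_user", "done" (assumed labels)
    detail: str = ""  # free-form description of the target or the question


def run_agent(
    task: str,
    take_screenshot: Callable[[], bytes],
    ask_model: Callable[[str, bytes], Action],
    perform: Callable[[Action], None],
    ask_user: Callable[[str], str],
    max_steps: int = 20,
) -> List[Action]:
    """Run a screenshot-driven loop and return every action the model proposed."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()        # capture the active tab
        action = ask_model(task, screenshot)  # remote model picks the next step
        history.append(action)
        if action.kind == "done":
            break
        if action.kind in BLOCKED_KINDS:
            # Stop and leave this step for the human to do.
            break
        if action.kind == "ask_user":
            # Pause for clarification, then fold the answer into the task.
            answer = ask_user(action.detail)
            task = f"{task}\nUser clarification: {answer}"
            continue
        perform(action)                       # move the cursor, click, type
    return history
```

Because each step waits on a fresh screenshot and a round trip to the model, a loop shaped like this would be consistent with the multi-second pauses seen in the demo, though the article does not say what actually causes them.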

Project Mariner can also be used to find flights and hotels, shop for household items, find recipes, and handle other tasks that currently require users to click through the web.

One major caveat is that Project Mariner only works on a Chrome browser’s foremost active tab, which means you can’t use your computer for other things while the agent works in the background — you need to watch Gemini slowly click around. Google DeepMind’s chief technology officer, Koray Kavukcuoglu, says this was a very intentional decision so that users know what Google’s AI agent is doing.

“Because [Gemini] is now taking actions on a user’s behalf, it’s important to take this step-by-step,” said Kavukcuoglu in an interview with TechCrunch. “It’s complementary. You, as an individual, can use websites, and now your agent can do everything that you do on a website as well.”

Website owners may be relieved to hear that Google’s AI agent works on your computer screen, because that means publishers and retailers still get your eyeballs on their pages. However, Google’s AI agent could mean that users are less engaged with the websites they visit, and one day, it may not require users to use these websites at all.

“[Project Mariner] is a fundamentally new UX paradigm shift that we’re seeing right now,” Konzelmann told TechCrunch. “We need to figure out what is the right way for all of this to change the way users interact with the web, and the way publishers can create experiences for users, as well as for agents, in the future.”

Besides Project Mariner, Google on Wednesday also unveiled several other AI agents for more specific tasks.

One AI agent, Deep Research, aims to help users explore complex topics by creating multistep research plans. It seems to compete with OpenAI’s o1, which can also do multistep reasoning. However, a Google spokesperson notes the agent is not designed to solve math and logical reasoning problems, write code, or do data analysis. The AI agent is rolling out in Gemini Advanced today and will come to the Gemini app in 2025.

When prompted with a difficult or large question, Deep Research will create a multistep action plan to answer it. After the user approves the plan, Deep Research spends a few minutes searching the web to answer the question, and then generates a lengthy report on its findings.
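The plan, approve, execute, report flow described above can be sketched as follows. This is an illustrative outline only: `research` and the four callables it takes are hypothetical names, and nothing here reflects how Deep Research is actually built.

```python
from typing import Callable, List, Optional


def research(
    question: str,
    make_plan: Callable[[str], List[str]],
    approve: Callable[[List[str]], bool],
    run_step: Callable[[str], str],
    write_report: Callable[[str, List[str]], str],
) -> Optional[str]:
    """Draft a plan, wait for user approval, run each step, then write a report."""
    plan = make_plan(question)   # multistep plan for the question
    if not approve(plan):        # nothing runs until the user signs off
        return None
    findings = [run_step(step) for step in plan]  # e.g. one web search per step
    return write_report(question, findings)
```

Keeping approval as an explicit gate before any step runs matches the article’s description of the user reviewing the plan first.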

Another new AI agent from Google, Jules, aims to help developers with coding tasks. It integrates directly into GitHub workflows, allowing Jules to view your existing work and make changes directly in GitHub. Jules is rolling out to a select group of beta testers today and will be available later in 2025.

Finally, Google DeepMind says it’s working on an AI agent to help you navigate video games, building on its long history of creating game-playing AI. Google is working with game developers, like Supercell, to test Gemini’s ability to interpret gaming worlds such as Clash of Clans. Google didn’t offer any release date for this prototype but says this work is helping it build AI agents that can navigate physical worlds, as well as virtual ones.

It’s unclear when Project Mariner will roll out to Google’s massive user base, but when it does, agents like this will have a significant impact on the broader web. The web is designed for humans to use, but Google’s AI agents could change that standard.
