Google's Interactions API is now the default for Gemini agents
Google DeepMind promoted the Interactions API to GA on June 22, 2026 and made it the default for new Gemini models, agents, and developer docs. Here is what changed and why it matters.
Google's Interactions API is now the default for Gemini agents. On June 22, 2026, Google DeepMind promoted the Interactions API to general availability, made it the default interface across Google AI Studio, the Gemini API, and the official documentation, and set it up to be the only place where frontier long-running Gemini models and new agent capabilities will land. The Gemini API reference is now an Interactions API reference.
That is a bigger change than it sounds. The generateContent endpoint has been the single way to call Gemini since launch, and thousands of products, SDKs, agent frameworks, and internal pipelines have built around its role-based schema. The team is explicit that frontier capabilities for long-running models and agents will increasingly land exclusively on the Interactions API. The blog post is from Ali Çevik, Group Product Manager at Google DeepMind, with Philipp Schmid from developer relations, and the API first went to public beta in December 2025.
If you have already bet on agentic Gemini work, the move is significant. Our recent look at how Google DeepMind treats its own AI agents as insider threats is one example of the kind of long-running, stateful agent run that the new endpoint is being designed to carry.
What the Interactions API actually does
The Interactions API replaces the old role-based generateContent schema with a session-shaped resource called an Interaction. Where the legacy API treated every call as an independent prompt and response, the Interactions API treats each call as one step in a session that the server keeps for you. The Interaction object is a chronological sequence of typed execution steps, including user_input, thought, function_call, and model_output, so the whole transcript is a structured, queryable record instead of a free-form chat log.
The practical effect for developers is that long-running workflows no longer require the client to manage the conversation. You pass a model ID for inference, an agent ID for autonomous tasks, and you set background=True on anything long-running. The server runs the interaction asynchronously, returns an Interaction ID, and you can come back to it later. A 55-day retention window on the paid tier means past interactions are retrievable for almost two months, which is long enough to debug a multi-day agent run without keeping the entire transcript in your own state.
The API was designed to be the single endpoint for both model calls and agent calls. Same URL, same Interaction resource, just different inputs. The Google team is explicit about this in the launch post: "Whether you're calling a model or running an agent, the Interactions API gets you there in a few lines of code." That is the pitch. One endpoint, one mental model, instead of separate surfaces for chat, function calling, structured output, and code execution.
A migration guide is published for anyone running existing generateContent integrations, and Google's own docs include a toggle to switch code snippets back to the legacy format. The team frames this as a transition at your own pace, not a forced cutover. In practice the team is also saying that legacy endpoint will get the mainline Gemini models for the foreseeable future, but anything beyond that is going to be Interactions API only.
The new capabilities that ship with GA
The GA release ships with four capabilities that were missing or rough in beta, and they are the reason this is more than a version bump.
Managed Agents. A single API call now provisions a remote Linux sandbox where an agent can reason, execute code, browse the web, and manage files. The default agent is Antigravity, which is the same name as the antigravity-preview-05-2026 model in the Gemini API, and developers can define their own custom agents with instructions, skills, and data sources. This collapses what used to be three or four separate integrations (a code-execution sandbox, a browser tool, a retrieval layer, and a state store) into one API call.
Background execution. background=True on any call puts the interaction on a server-side worker and returns immediately. The Interaction object is the handle. This is the part that makes long-running, multi-step agent runs feel like a normal database query, and it is also the part that finally makes "fire and check back later" a first-class pattern instead of a hack with webhooks.
Tool combination. Built-in tools like Google Search and Google Maps can now be mixed with custom function calls in a single request, and tool results can return images alongside text. For product teams that have been stringing together separate Search-grounded calls and vision calls to get a single answer, this is a meaningful latency and cost improvement.
Deep Research upgrades. Two new agent versions land, one tuned for speed and one tuned for depth, with collaborative planning, native charts and infographics, and multimodal grounding that handles images, PDFs, and audio. Deep Research has been one of the most useful Gemini features for enterprise use cases, and the upgrade positions it more directly against Claude's research agents and OpenAI's deep research mode.
A new schema called Steps replaces the old role structure. Every action in a run, including user_input, thought, function_call, and model_output, is its own typed step. Errors now pinpoint the exact field that failed, which is a small change that has a big effect on debug time. The team also added Flex and Priority inference tiers, with Flex offering a 50% cost reduction for workloads that can tolerate higher latency.
Why Google is consolidating on one Gemini endpoint
The strategic reasoning is the easiest part to see. Google is the only major model provider that has shipped a separate API surface for agents. OpenAI ships the Responses API on top of the same Chat Completions endpoint, Anthropic ships Messages, and the convention is that one endpoint handles both inference and agentic work. By collapsing the Gemini surface to a single Interactions API, Google is bringing Gemini into line with how the rest of the industry is building, and the team is explicit that frontier capabilities for long-running models and agents will land there first.
The second reason is the developer experience argument. The Interactions API ships with an MCP-exposed skill called gemini-interactions-api that injects best-practice patterns (streaming, function calling, structured output, Deep Research) directly into a coding agent's context. That is a small detail, but it matters. The team is building for a world where the primary reader of the API documentation is not a human, it is a coding agent. If the agent can pull the current schema, the current function signatures, and the current recommended patterns straight from the docs MCP, the migration story is "tell your agent to use the new API" rather than "re-train every developer on your team."
The third reason is the partner ecosystem. Google says it is working with ecosystem partners to make it the default interface across 3P SDKs and Libraries, and lists LiteLLM, Eigent, and Agno as the first integrations. LiteLLM in particular is widely used in the Python AI infrastructure stack, and once it routes by default to Interactions, the long tail of LLM applications that depend on it will follow. That is a much faster path to default than waiting for every developer to migrate by hand.
The Interactions API is available today through the Python and JavaScript SDKs, and Google has published the full API reference, the migration guide, and a developer forum for feedback. The Interactions API page on the Gemini API docs was last updated 2026-06-22, the same day as the announcement, and the team is treating GA as a milestone, not a finish line. The clearest signal of how Google reads the next year of Gemini work is the line in the launch post: "Going forward, new models and capabilities beyond the core mainline family, along with new agentic capabilities and tools, will launch on the Interactions API." The endpoint is set. The future of Gemini is being written against it.
If you are picking a primary AI API surface, this is a useful moment to look at how the major model providers are consolidating their interfaces. The Claude vs GPT vs Gemini comparison walks through the practical differences in how each provider handles stateful agent work and what that means for product teams shipping agents without re-architecting their backend every quarter.
Weekly newsletter
Get a weekly summary of our most popular articles
Every week we send one email with a summary of the most popular articles on AIntelligenceHub so you can stay up-to-date on the latest AI trends and topics.
Comments
Every comment is reviewed before it appears on the site.
Related articles
Microsoft and Chevron sign a 20-year, 2-gigawatt power deal for Pecos
On June 22, 2026, Microsoft and Chevron announced a 20-year, 2.67-gigawatt behind-the-meter power deal to run a new Microsoft AI datacenter campus in Pecos, Texas.
Micron and Anthropic sign a deal on HBM, Claude, and Series H funding
On June 22, 2026, Micron and Anthropic announced a strategic agreement covering HBM co-design, a multi-year supply line, Claude deployment inside Micron, and a Series H investment from Micron.
Philippines DICT puts Gemini Enterprise in 50,000 government desks
Philippines DICT and Google Cloud announced a deal on June 22, 2026 to put Gemini Enterprise on 50,000 government desks this year, scaling to 200,000 over 18 months. A 56-agency Cybershield alliance ships alongside.