
Google Split Its New AI Chips by Job, One for Training and One for Inference

AIntelligenceHub · 6 min read

At Cloud Next 2026, Google introduced TPU 8t for training and TPU 8i for inference. The split points to a new infrastructure playbook for AI teams that need speed in model development and lower latency in production.

A year ago, many infrastructure plans still treated AI compute as one capacity bucket. On April 22, 2026, Google made that model harder to defend. At Cloud Next, the company announced two different eighth-generation TPU designs, TPU 8t for training and TPU 8i for inference, and positioned the split as the right shape for an agent-heavy AI era.

In Sundar Pichai’s Cloud Next 2026 announcement, Google says TPU 8t scales to 9,600 TPUs and 2 petabytes of shared high-bandwidth memory in a superpod, while TPU 8i connects 1,152 TPUs in a pod and adds 3x more on-chip SRAM to reduce latency for high-concurrency agent workloads. The same update says TPU 8t reaches 3x the processing power of Ironwood and up to 2x the performance per watt.

Those are strong claims, but the bigger signal is about architecture discipline. Google is explicitly separating training and inference as different compute jobs with different hardware priorities. That shift matters for every company trying to move from AI pilots to reliable production systems that can survive real traffic.

If you are evaluating providers or deciding where to place workloads, our AI Infrastructure in 2026 guide gives broader context on capacity, procurement, and operating tradeoffs.

What Google Announced and Why It Matters

Conference weeks can blur facts, so precision helps. Google announced a dual-chip TPU strategy. TPU 8t is training-focused, aimed at large model development and synchronized distributed runs. TPU 8i is inference-focused, aimed at lower-latency serving and high-concurrency agent execution. This is not just a routine part refresh. It is a design split at the silicon level.

The training side is framed around scale. Google highlights a TPU 8t superpod envelope of up to 9,600 TPUs with 2 petabytes of shared high-bandwidth memory. If those numbers hold in customer environments, the practical effect is shorter iteration loops for teams that retrain or fine-tune large systems frequently. The value is not abstract speed for its own sake. It is faster model improvement cycles and less idle wait between experiments.

The inference side is framed around response behavior. Google says TPU 8i connects 1,152 TPUs in a pod, with 3x more on-chip SRAM and design choices aimed at reducing latency under high parallel load. That matters because modern production AI is moving away from one-shot prompt-response flows. A single customer action can trigger several model calls, tool invocations, and retrieval steps. Even small latency penalties multiply quickly.

Google also tied this to usage growth. In the same Cloud Next post, it says first-party models are processing more than 16 billion tokens per minute through direct customer API use, up from 10 billion last quarter. That type of growth changes infrastructure pressure fast. A generalized one-chip approach can still work, but inefficiencies become expensive sooner.

This is where the software layer and hardware layer align. In our earlier analysis of Google’s unified Gemini Enterprise platform, the central management question was how to operate large fleets of agents with governance and observability. Once workloads shift toward high-volume, multi-step agent behavior, specialized infrastructure becomes less optional and more operationally rational.

The key takeaway is simple. Google is treating training and inference as two distinct systems problems. Most enterprise AI teams should do the same.

How TPU 8t and TPU 8i Change Planning

The first change is how you classify workloads. If you still plan infrastructure as one combined AI pool, you lose visibility into where performance gains and cost leaks actually occur. Split your inventory into training-heavy, inference-heavy, and mixed transitional workloads. The mixed bucket matters because many teams are in migration mode and need a temporary bridge rather than an immediate full separation.
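The three buckets above can be made concrete with a small classification pass over your workload inventory. This is an illustrative sketch only: the field names, the accelerator-hours signal, and the 10% threshold for the "mixed" band are assumptions, not anything Google prescribes.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    accel_hours_training: float  # accelerator hours spent on training or fine-tuning
    accel_hours_serving: float   # accelerator hours spent serving inference

def classify(w: Workload, mixed_band: float = 0.10) -> str:
    """Bucket a workload as training-heavy, inference-heavy, or mixed.

    A workload counts as 'mixed' when the minority side still consumes
    more than `mixed_band` of total accelerator hours (10% here, an
    assumed policy knob).
    """
    total = w.accel_hours_training + w.accel_hours_serving
    if total == 0:
        return "mixed"  # no usage data yet; keep it in the transitional bucket
    train_share = w.accel_hours_training / total
    if train_share >= 1 - mixed_band:
        return "training-heavy"
    if train_share <= mixed_band:
        return "inference-heavy"
    return "mixed"
```

A workload that spends 40% of its hours on retraining and 60% on serving lands in the mixed bucket, which is exactly the migration-mode case the text describes.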

The second change is how you benchmark. Peak throughput by itself is not enough. For training lanes, measure time to acceptable model quality under your real data pipeline conditions. For inference lanes, measure tail latency and cost per completed business action, not only tokens per second. Agent systems can inflate request chains, so per-action economics reveal problems that token-level metrics can hide.
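The two lane-level metrics above can be sketched in a few lines. The record shape (`latency_ms`, `cost_usd`, `completed`) is hypothetical; substitute whatever your telemetry pipeline actually emits. The point of the second function is that retries and abandoned agent chains inflate the cost figure instead of hiding inside tokens-per-second averages.

```python
import math

def p_tail(latencies_ms, q=0.99):
    """Nearest-rank tail latency (e.g. p99) over a list of samples."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    rank = math.ceil(q * len(ordered))  # nearest-rank percentile method
    return ordered[rank - 1]

def cost_per_completed_action(records):
    """Total spend divided by *completed* business actions.

    Every request in a chain contributes cost, but only chains that
    finished the user's action count in the denominator.
    """
    total_cost = sum(r["cost_usd"] for r in records)
    completed = sum(1 for r in records if r["completed"])
    if completed == 0:
        raise ValueError("no completed actions in this window")
    return total_cost / completed
```

Comparing this per-action number across the training and inference lanes, rather than a blended average, is what exposes where the cost leaks actually are.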

The third change is ownership. Hardware specialization does not fix cross-team misalignment. If one team owns model quality and another owns serving economics, define shared targets before moving significant traffic. Otherwise, you risk sending expensive workloads into low-latency lanes that should be reserved for user-critical paths, while batch jobs crowd out production demand.

The fourth change is procurement structure. Ask providers for explicit lane-level commitments. Capacity, quota behavior, support expectations, and timeline guarantees should reflect training and inference separately. Many teams accept broad promises and then discover that one lane is constrained while the other has spare headroom. That creates expensive workaround cycles.

The fifth change is observability depth. Track memory pressure, queue behavior, retry rates, and degraded-mode triggers by workload class. Latency incidents in agent systems are often orchestration incidents, not just raw model speed incidents. Without lane-level telemetry, you can spend weeks tuning the wrong layer.
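A minimal version of lane-level telemetry is just counters keyed by workload class, so an incident can be attributed to a lane before anyone starts tuning model speed. Class and metric names here are illustrative assumptions, not a vendor schema.

```python
from collections import defaultdict

class LaneTelemetry:
    """Accumulate retries, queue waits, and degraded-mode triggers
    per workload class ("training", "inference", "mixed")."""

    def __init__(self):
        self._counters = defaultdict(int)

    def record(self, workload_class: str, metric: str, value: int = 1):
        self._counters[(workload_class, metric)] += value

    def get(self, workload_class: str, metric: str) -> int:
        return self._counters[(workload_class, metric)]

    def retry_rate(self, workload_class: str) -> float:
        requests = self._counters[(workload_class, "requests")]
        retries = self._counters[(workload_class, "retries")]
        return retries / requests if requests else 0.0
```

With this in place, a latency incident that shows a spiking inference-lane retry rate but flat training-lane counters points at orchestration or routing, not raw chip speed.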

The sixth change is fallback design. Specialized lanes improve efficiency but can increase fragility if failover is unclear. Define what happens when a low-latency inference path saturates, when routing service degrades, or when a tool endpoint slows down. Teams that test fallback early usually protect user experience and avoid emergency policy changes later.
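The fallback decision described above can be written down as an explicit routing policy rather than left implicit in timeouts. Everything in this sketch is an assumption for illustration: the lane names, the queue-depth signal, and the thresholds.

```python
# Hypothetical lane state; in practice this would come from a scheduler
# or load-balancer health feed, not a module-level dict.
LANES = {
    "low-latency": {"queue_depth": 0, "max_depth": 50, "healthy": True},
    "general":     {"queue_depth": 0, "max_depth": 500, "healthy": True},
}

def route(request_priority: str) -> str:
    """Send user-critical traffic to the low-latency lane while it has
    headroom; spill to the general lane next; degrade explicitly last,
    rather than timing out silently."""
    fast = LANES["low-latency"]
    if (request_priority == "user-critical"
            and fast["healthy"]
            and fast["queue_depth"] < fast["max_depth"]):
        return "low-latency"
    general = LANES["general"]
    if general["healthy"] and general["queue_depth"] < general["max_depth"]:
        return "general"
    return "degraded-mode"  # e.g. cached answer or reduced-capability model
```

The value of writing this down is that the "degraded-mode" branch exists before the incident, which is exactly what the failure drills in the next section exercise.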

There is also an energy angle that finance and operations teams should not ignore. Google’s performance-per-watt claims are part of this launch for a reason. Power limits remain a real deployment boundary in several regions. Efficiency gains can be the difference between scaling a feature and delaying it. But teams still need their own measurements, because actual power and thermal behavior depend on utilization shape and scheduling policy.

None of this means every organization should migrate immediately. It means every organization should make explicit choices. The worst outcome is accidental architecture, where traffic patterns decide your infrastructure strategy for you.

Risks, Execution Steps, and What Comes Next

The dual-chip model is strategically coherent, but execution risk remains high for buyers that move too quickly. New infrastructure generations can carry availability limits, migration overhead, or immature tooling in the first wave. The smart move is staged adoption, not all-at-once replacement.

A practical near-term playbook fits in one quarter. First, run a focused pilot on one high-value training workflow and one high-value inference workflow. Second, benchmark lane-specific outcomes with business metrics attached. Third, codify routing policies so critical inference traffic does not compete with opportunistic or experimental demand. Fourth, run failure drills that include queue spikes and partial service degradation.
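The third step, codifying routing policy, can start as a one-function admission rule that keeps opportunistic demand from competing with critical inference traffic. The 0.7 utilization cutoff is an assumed policy knob, not a vendor default.

```python
def admit(traffic_class: str, lane_utilization: float,
          experimental_cap: float = 0.7) -> bool:
    """Always admit critical traffic; admit experimental or opportunistic
    traffic only while the lane has spare headroom below the cap."""
    if traffic_class == "critical":
        return True
    return lane_utilization < experimental_cap
```

Even a rule this simple, reviewed and version-controlled, beats the accidental-architecture outcome where experimental load quietly sets the latency floor for production users.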

Governance is another risk point. If training and inference evolve as separate silos, visibility can drop and local optimizations can hurt platform-wide reliability. Keep one cross-functional review path for model, infrastructure, security, and finance stakeholders. Without that discipline, even good hardware choices can produce disappointing business outcomes.

Timing also matters. Launch-day messaging and broad production readiness are not always synchronized. Treat roadmap claims as directional signals, then verify concrete dates, quota mechanics, and support boundaries in writing. This is especially important for teams with regulated workloads or strict uptime commitments.

Market concentration remains a strategic concern too. Specialized infrastructure can improve economics while reducing portability if your architecture cannot shift workloads quickly. Keep portability as a design requirement where possible, even when one provider currently looks attractive.

The larger point is that Google’s TPU 8t and TPU 8i launch marks a change in how leading platforms describe the AI stack. Training and inference are diverging into different optimization problems, and hardware is following that split. Organizations that map infrastructure directly to workload behavior can gain faster iteration and steadier serving economics. Organizations that keep one undifferentiated AI pool will likely keep paying for hidden inefficiency.

The next phase of enterprise AI competition will be decided less by isolated benchmark screenshots and more by operating discipline. Clear workload classification, lane-level observability, realistic fallback plans, and shared ownership between engineering and finance will decide who translates hardware progress into durable product advantage.


