Anthropic ships Claude Science, an AI workbench for researchers
Anthropic ships Claude Science, an AI workbench that bundles a coordinating agent with 60-plus curated skills and a reviewer agent that audits citations and figures. Available in beta for Pro, Max, Team, and Enterprise.
Anthropic on Tuesday released Claude Science, an AI workbench that bundles a coordinating agent with 60-plus curated skills and a reviewer agent that audits citations and figures. The launch turns Anthropic's life-sciences push from a model and partnership story into a packaged product that runs on a researcher's laptop, an HPC node, or a Modal account. The bet is that biomedical research will be shaped by the surrounding agent environment, not the underlying model.
Anthropic's announcement carries the company's usual framing: scientific work is fragmented across dozens of databases, bespoke file formats, and a tool chain that no single person can keep in their head, and a capable agent environment can absorb that fragmentation. What sets Claude Science apart is the breadth of the launch: native rendering of proteins, single-cell tracks, genome browser views, and chemical structures, plus explicit hooks for Modal on-demand compute and SSH access to existing HPC clusters. The launch is paired with a new Team plan discount for academic and nonprofit labs and a 50-project credit program for AI for Science work, with applications open through July 15, 2026.
The workbench model Anthropic is shipping
Claude Science presents itself as a single research surface rather than a chat box. Researchers bring their own data, choose between a local macOS or Linux install, a remote SSH session, or a Modal-backed GPU pool, and then talk to a generalist coordinating agent that has access to a curated library of more than 60 skills and connectors pre-configured for genomics, single-cell, proteomics, structural biology, and cheminformatics. That coordinating agent can spin up specialist agents, run the resulting pipelines, and hand the work off to a reviewer agent that flags incorrect citations, untraceable numbers, and figures that no longer match the code that produced them. Every output keeps an auditable history of the code, the inputs, and the message thread so a result can be reproduced months later, which is the standard researchers have wanted from agentic tooling since OpenAI's first function-calling demos and which Anthropic is now packaging as a product.
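The reviewer's figure check described above can be pictured with a small sketch. This is not Anthropic's implementation: the `Session`, `Artifact`, and `review` names are hypothetical, and a hash of the source text stands in for whatever provenance record the product actually keeps. It only illustrates how an append-only history plus a code fingerprint lets a reviewer spot a figure that no longer matches its code.

```python
import hashlib
from dataclasses import dataclass, field


def digest(text: str) -> str:
    """Return a short, stable fingerprint of a piece of source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


@dataclass
class Artifact:
    """A figure plus a fingerprint of the code that produced it (hypothetical)."""
    step: str
    name: str
    code_digest: str


@dataclass
class Session:
    """Hypothetical session: current code, artifacts, append-only history."""
    code: dict = field(default_factory=dict)       # step name -> source text
    artifacts: dict = field(default_factory=dict)  # step name -> Artifact
    history: list = field(default_factory=list)    # audit trail of events

    def run_step(self, step: str, source: str) -> Artifact:
        # Stand-in for a specialist agent executing one pipeline step.
        self.code[step] = source
        artifact = Artifact(step, f"{step}.png", digest(source))
        self.artifacts[step] = artifact
        self.history.append(("run", step, artifact.code_digest))
        return artifact

    def edit_code(self, step: str, source: str) -> None:
        # The code changes but the figure is not regenerated, so it goes stale.
        self.code[step] = source
        self.history.append(("edit", step, digest(source)))


def review(session: Session) -> list:
    """Reviewer pass: list figures whose recorded code fingerprint is out of date."""
    stale = [
        artifact.name
        for step, artifact in session.artifacts.items()
        if digest(session.code[step]) != artifact.code_digest
    ]
    session.history.append(("review", tuple(stale)))
    return stale
```

In this sketch, editing a step's code without rerunning it makes `review` return that step's figure name, and rerunning the step clears the flag. Every run, edit, and review is appended to `history`, which is the part that would make a result traceable later.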
Three details matter most for evaluation. First, the work runs in a session that holds context in memory, so a large dataset only needs to be loaded once. Second, only the context required for each step is sent to Claude, which means the raw data can stay on the researcher's laptop, a lab cluster, or a Modal account and never leave the system it is already on. Third, every figure generated by the agent comes with the code, the environment, and the message history that produced it, so a reviewer can rerun the figure, change an axis to log scale, and see the change propagate. These are the same properties that enterprise AI governance checklists call out for any deployment of agentic AI: an audit trail, a human-in-the-loop checkpoint, and a clear separation between the data plane and the inference plane.
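The first two properties, loading data once and sending only per-step context, can be sketched in a few lines. The `LocalSession` class and its methods are hypothetical and are not part of any Anthropic API; the point is only to show a data plane that keeps raw rows in local memory and an inference-plane payload that carries summary statistics instead.

```python
import statistics


class LocalSession:
    """Hypothetical session that keeps raw data in memory on the local machine."""

    def __init__(self):
        self._tables = {}    # table name -> list of row dicts (never sent out)
        self.load_count = 0  # how many times data was actually loaded

    def load(self, name, rows):
        # Load once; later steps reuse the in-memory copy.
        if name not in self._tables:
            self._tables[name] = list(rows)
            self.load_count += 1
        return self._tables[name]

    def context_for_step(self, name, column):
        """Build the small payload that would be sent to the model for one step."""
        values = [row[column] for row in self._tables[name]]
        return {
            "table": name,
            "column": column,
            "n_rows": len(values),
            "mean": statistics.fmean(values),
            "min": min(values),
            "max": max(values),
        }
```

Calling `load` twice with the same table name leaves `load_count` at 1, and the dictionary returned by `context_for_step` contains only the table name, column name, and aggregates, not the rows themselves. What a real deployment counts as "required context" is a design decision this sketch does not settle.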
Named users and the workflow evidence Anthropic is showing
Anthropic is not launching Claude Science with a marketing video. The announcement is anchored on three named research programs that have used the workbench in beta. Manifold Bio, the tissue-targeting medicines startup, used Claude Science to nominate targets for its latest round of candidate binders, ranking each target on surface expression, trafficking, and safety with the context of its own proprietary data; what set Claude Science apart, the company said, was the end-to-end execution with judgment, not just code. Jérôme Lecoq at the Allen Institute built a multi-agent computational review template of about 20 custom skills that reads thousands of papers, pulls the central claim and the key quantitative finding, and stores them in an evidence state database; the pipeline then constructs a narrative arc, writing each section through a dedicated sub-agent with an actor-critic pair, and a separate reviewer agent checks the citation fidelity. Stephen Francis at the UCSF Brain Tumor Center used Claude Science to support glioma epidemiology work and reported that full germline workups now take about one-tenth the time they used to, with results his lab independently validated. The throughline in each case is the same: the agent environment replaces the tool chain, and the audit trail replaces the lab notebook.
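The evidence-database pattern in the Allen Institute example lends itself to a short illustration. The sketch below is an assumption-laden stand-in, not Lecoq's pipeline: the table layout, the function names, and the exact-match rule for citation fidelity are all hypothetical, and it uses Python's built-in `sqlite3` module with an in-memory database.

```python
import sqlite3


def build_evidence_db(records):
    """Store one central claim and one key quantitative finding per paper."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE evidence (paper_id TEXT PRIMARY KEY, claim TEXT, finding TEXT)"
    )
    db.executemany("INSERT INTO evidence VALUES (?, ?, ?)", records)
    db.commit()
    return db


def check_citations(db, draft_citations):
    """Return citations whose paper is missing or whose quoted finding differs."""
    problems = []
    for paper_id, quoted_finding in draft_citations:
        row = db.execute(
            "SELECT finding FROM evidence WHERE paper_id = ?", (paper_id,)
        ).fetchone()
        if row is None:
            problems.append((paper_id, "not in evidence database"))
        elif row[0] != quoted_finding:
            problems.append((paper_id, "finding does not match stored value"))
    return problems
```

A drafting sub-agent would write against the stored claims, and a separate reviewer would call something like `check_citations` on each section. Exact string matching is the simplest possible fidelity rule; a real reviewer would presumably need to tolerate paraphrase and unit conversions.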
The launch is also a competitive signal. The Claude family has been the model layer for life-sciences use cases since the Claude Sonnet 5 launch, but the workbench is the product layer, and Anthropic is now competing for the same seat that Benchling, Dotmatics, and the larger electronic-lab-notebook incumbents have held for a decade. The pricing story is also pointed: a Team plan discount for academic and nonprofit labs, up to $30,000 in credits for 50 AI for Science projects, and up to $2,000 in Modal compute per project, with applications open through July 15, 2026 and award notifications by July 31. The expectation is that early academic wins will seed the long-term enterprise pipeline the same way Anthropic's earlier Claude Code rollout seeded the agent-coding market, where the same agent primitives and orchestration patterns now show up in production codebases.
Where this fits in the broader Claude platform
The workbench is a natural extension of the agent direction Anthropic has been pushing since the spring, and it lands the same week as the Fable 5 redeployment and the Claude Sonnet 5 launch. Read together, the three moves describe a single platform bet: a frontier model family for the underlying intelligence, a runtime for agent coordination and tool use, and a series of vertical workbenches that turn the runtime into a product for a specific buyer. Claude Design was the first vertical workbench, and the Fable 5 redeployment returned the model family to a clean regulatory footing for enterprise customers. Claude Science is the second vertical workbench and the first one that runs on the customer's own infrastructure by default, which is a meaningful step in the enterprise AI story for any research-driven buyer.
The realistic ceiling for this launch is lower than Anthropic's framing suggests. Benchling has spent a decade building the lab-data side that Claude Science treats as a skill that connects to existing systems, and the gap between "agent can read my files" and "agent can write to my LIMS" is the difference between a research assistant and a regulated workflow. The reviewer agent is also the least proven of the three agent roles; an actor-critic pair is only as good as the critic, and Anthropic's examples show reviewers catching citation and arithmetic errors, not the harder cases of methodological drift or p-hacking. Researchers will run their own validation, and the press release points to the independent validation of Claude Science's results by Stephen Francis's lab as a model. For now the launch is best read as Anthropic packaging the agent primitives that life-sciences teams have been building themselves for the last year and turning them into a product a lab can buy with a Team plan instead of building internally.