Anthropic ships Claude Science, an AI workbench for researchers

AIntelligenceHub

Claude Science pairs a coordinating agent with more than 60 curated skills and a reviewer agent that audits citations and figures. It is available in beta on the Pro, Max, Team, and Enterprise plans.

Anthropic on Tuesday released Claude Science, an AI workbench that bundles a coordinating agent with 60-plus curated skills and a reviewer agent that audits citations and figures. The launch turns Anthropic's life-sciences push from a model and partnership story into a packaged product that runs on a researcher's laptop, an HPC node, or a Modal account. The bet is that biomedical research will be shaped by the surrounding agent environment, not the underlying model.

Anthropic's announcement lays out the launch with the company's typical framing: scientific work is fragmented across dozens of databases, bespoke file formats, and a tool chain that no single person can keep in their head, and a capable agent environment can absorb that fragmentation. What sets Claude Science apart is its breadth: native rendering of proteins, single-cell tracks, genome browser views, and chemical structures, plus explicit hooks for Modal on-demand compute and SSH access to existing HPC clusters. The launch is paired with a new Team plan discount for academic and nonprofit labs and a 50-project credit program for AI for Science work, with applications open through July 15, 2026.

The workbench model Anthropic is shipping

Claude Science presents itself as a single research surface rather than a chat box. Researchers bring their own data and choose among a local macOS or Linux install, a remote SSH session, or a Modal-backed GPU pool. They then talk to a generalist coordinating agent that has access to a curated library of more than 60 skills and connectors pre-configured for genomics, single-cell, proteomics, structural biology, and cheminformatics. That coordinating agent can spin up specialist agents, run the resulting pipelines, and hand the work off to a reviewer agent that flags incorrect citations, untraceable numbers, and figures that no longer match the code that produced them. Every output keeps an auditable history of the code, the inputs, and the message thread so a result can be reproduced months later. That is the standard researchers have wanted from agentic tooling since OpenAI's first function-calling demos, and Anthropic is now packaging it as a product.
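To make the "untraceable numbers" check concrete, here is a minimal sketch of what such a reviewer step could look like. It is not Anthropic's implementation: the function name `find_untraceable_numbers`, the sample draft, and the idea of keeping traced values as a set of strings are all assumptions made for illustration.

```python
import re


def find_untraceable_numbers(draft: str, traced_values: set[str]) -> list[str]:
    """Return numeric tokens in the draft that are absent from the audit trail."""
    # Match plain integers and decimals such as "128" or "0.04".
    # Thousands separators and scientific notation are not handled in this sketch.
    tokens = re.findall(r"\d+(?:\.\d+)?", draft)
    return [token for token in tokens if token not in traced_values]


# Hypothetical draft text and hypothetical values logged by earlier pipeline steps.
draft = "We observed 128 cells with a mean expression of 3.7 and a p-value of 0.04."
traced = {"128", "3.7"}

flagged = find_untraceable_numbers(draft, traced)
print(flagged)  # ['0.04']
```

A real reviewer would need to handle rounding, units, and derived quantities, which is where a simple string match like this one stops being enough.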

Three details matter most for evaluation. First, the work runs in a session that holds context in memory, so a large dataset only needs to be loaded once. Second, only the context required for each step is sent to Claude, which means the raw data can stay on the researcher's laptop, a lab cluster, or a Modal account and never leave the system it is already on. Third, every figure generated by the agent comes with the code, the environment, and the message history that produced it, so a reviewer can rerun the figure, change an axis to log scale, and see the change propagate. These are the same properties the enterprise AI governance checklist calls out for any enterprise deployment of agentic AI: an audit trail, a human-in-the-loop checkpoint, and a clear separation between the data plane and the inference plane.
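The third property, a figure that travels with the code and environment that produced it, can be sketched as a fingerprint over those pieces. The announcement does not describe how Claude Science stores provenance, so everything here (`FigureRecord`, `fingerprint`, `is_stale`, the file name and version strings) is a hypothetical illustration of the idea rather than the product's mechanism.

```python
import hashlib
import json
from dataclasses import dataclass


def fingerprint(code: str, inputs: dict, environment: dict) -> str:
    """Hash the plotting code, input references, and environment together."""
    payload = json.dumps(
        {"code": code, "inputs": inputs, "environment": environment},
        sort_keys=True,  # stable key order so equal content gives an equal hash
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class FigureRecord:
    """Hypothetical provenance record stored next to a rendered figure."""
    name: str
    digest: str


def is_stale(record: FigureRecord, code: str, inputs: dict, environment: dict) -> bool:
    """A figure is stale when its stored digest no longer matches the current state."""
    return record.digest != fingerprint(code, inputs, environment)


# Illustrative values only.
code_v1 = "ax.plot(x, y)"
code_v2 = "ax.plot(x, y); ax.set_yscale('log')"
inputs = {"dataset": "counts.csv"}
env = {"python": "3.11", "matplotlib": "3.8"}

record = FigureRecord(name="fig1", digest=fingerprint(code_v1, inputs, env))
print(is_stale(record, code_v1, inputs, env))  # False
print(is_stale(record, code_v2, inputs, env))  # True
```

Note that only references to the inputs are hashed here, not the data itself, which mirrors the idea that raw data stays where it already lives.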

Named users and the workflow evidence Anthropic is showing

Anthropic is not launching Claude Science with a marketing video. The announcement is anchored on three named research programs that have used the workbench in beta. Manifold Bio, the tissue-targeting medicines startup, used Claude Science to nominate targets for its latest round of candidate binders, ranking each target on surface expression, trafficking, and safety with the context of its own proprietary data. What set Claude Science apart, the company said, was the end-to-end execution with judgment, not just code. Jérôme Lecoq at the Allen Institute built a multi-agent computational review template of about 20 custom skills that reads thousands of papers, pulls the central claim and the key quantitative finding from each, and stores them in an evidence state database. The pipeline then constructs a narrative arc, writing each section through a dedicated sub-agent with an actor-critic pair, and a separate reviewer agent checks citation fidelity. Stephen Francis at the UCSF Brain Tumor Center used Claude Science to support glioma epidemiology work and reported that full germline workups now take about one-tenth the time they used to, with results his lab independently validated. The throughline in each case is the same: the agent environment replaces the tool chain, and the audit trail replaces the lab notebook.
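The actor-critic pattern in the Allen Institute example can be sketched in a few lines. This is a toy under stated assumptions, not Lecoq's pipeline: the `evidence` dictionary stands in for the evidence state database, the citation keys and claims are placeholders, and `actor` is a stub that returns canned drafts where a real system would call a model.

```python
import re

# Hypothetical evidence store: citation key -> claim extracted from a paper.
evidence = {
    "paper_001": "Marker A is enriched in tumor tissue.",
    "paper_002": "Marker B expression falls after treatment.",
}


def actor(round_number: int) -> str:
    """Stand-in for a drafting agent; returns canned text for illustration."""
    if round_number == 0:
        # First draft cites a key that is not in the evidence store.
        return "Marker A is enriched [paper_001]. Marker C is protective [paper_999]."
    return "Marker A is enriched [paper_001]. Marker B falls after treatment [paper_002]."


def critic(draft: str) -> list[str]:
    """Return cited keys that are missing from the evidence store."""
    cited = re.findall(r"\[([^\]]+)\]", draft)
    return [key for key in cited if key not in evidence]


def write_section(max_rounds: int = 3) -> tuple[str, list[str]]:
    """Alternate actor and critic until the draft is clean or rounds run out."""
    draft, problems = "", []
    for round_number in range(max_rounds):
        draft = actor(round_number)
        problems = critic(draft)
        if not problems:
            break
    return draft, problems


draft, problems = write_section()
print(problems)  # []
```

The critic here only checks that a cited key exists. Checking that the cited paper actually supports the sentence is the harder problem, and the sketch does not attempt it.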

The launch is also a competitive signal. The Claude family has been the model layer for life-sciences use cases since the Claude Sonnet 5 launch, but the workbench is the product layer, and Anthropic is now competing for the same seat that Benchling, Dotmatics, and the larger electronic-lab-notebook incumbents have held for a decade. The pricing story is also pointed: a Team plan discount for academic and nonprofit labs, up to $30,000 in credits for 50 AI for Science projects, and up to $2,000 in Modal compute per project, with applications open through July 15, 2026 and award notifications by July 31. The expectation is that early academic wins will seed the long-term enterprise pipeline the same way Anthropic's earlier Claude Code rollout seeded the agent-coding market, where those agent primitives now show up as orchestration patterns in production codebases.

Where this fits in the broader Claude platform

The workbench is a natural extension of the agent direction Anthropic has been pushing since the spring, and it lands the same week as the Fable 5 redeployment and the Claude Sonnet 5 launch. Read together, the three moves describe a single platform bet: a frontier model family for the underlying intelligence, a runtime for agent coordination and tool use, and a series of vertical workbenches that turn the runtime into a product for a specific buyer. Claude Design was the first vertical workbench, and the Fable 5 redeployment returned the model family to a clean regulatory footing for enterprise customers. Claude Science is the second vertical workbench and the first one that runs on the customer's own infrastructure by default, which is a meaningful step in the enterprise AI story for any research-driven buyer.

The realistic ceiling for this launch is lower than Anthropic's framing suggests. Benchling has spent a decade building the lab-data side that Claude Science treats as a skill that connects to existing systems, and the gap between "agent can read my files" and "agent can write to my LIMS" is the difference between a research assistant and a regulated workflow. The reviewer agent is also the least proven of the three agent roles. An actor-critic pair is only as good as the critic, and Anthropic's examples show reviewers catching citation and arithmetic errors, not the harder cases of methodological drift or p-hacking. Researchers will run their own validation, and the press release points to the Stephen Francis lab's independent validation of Claude Science's results as a model. For now the launch is best read as Anthropic packaging the agent primitives that life-sciences teams have been building themselves for the last year and turning them into a product a lab can buy with a Team plan instead of building internally.
