Straiker raises $64M to secure the AI agent workforce
Straiker, the agentic security startup, has closed a $64M Series A led by Marathon to build discovery, pre-deployment testing, and runtime protection for the AI agent workforce.
Research papers, benchmarks, and technical reports
11 articles
Andrej Karpathy, OpenAI co-founder and former Tesla AI director, is joining Anthropic's pre-training team. His mandate: use Claude to accelerate the research that produces the next Claude.
A Princeton and UW study tested 23 AI models with sponsor incentives. Eighteen of them recommended the expensive sponsored flight over cheaper options more than half the time.
Microsoft tested 19 AI models on complex document editing across 52 professional fields. Frontier models corrupted 25 percent of content during long sessions. Adding agentic tools made outcomes worse, not better.
Anthropic introduced dreaming to Claude Managed Agents on May 6, alongside outcomes grading and multiagent orchestration. Legal AI company Harvey saw task completion rates jump roughly 6x in early tests.
OpenAI launched GPT-Rosalind, a model family built for life sciences teams that need stronger biological reasoning, tool use, and literature synthesis across multi-step research workflows.
Fujitsu released One Compression, an open-source post-training quantization toolkit, to help teams reduce model size and serving cost while preserving practical quality targets.
March 2026 OpenClaw-related security research highlighted both attack paths and defense techniques for agentic systems, reinforcing that deployment safety now depends on ongoing adversarial testing.
Microsoft positioned Agent Lightning as a way to improve existing agents without rewriting whole stacks, a practical pitch for teams that already have automation systems in production.
Together AI introduced Aurora on April 1, 2026, and said it achieved an additional 1.25x speedup over a static speculative decoding baseline by learning from live traces.
The Composer 2 technical report argues that coding agents should be trained and measured on long, tool-heavy software tasks instead of short single-turn prompt responses.