The System Prompts Behind 30 AI Coding Tools Are Now Public
A GitHub repo with 137,000 stars exposed the system prompts behind Cursor, Windsurf, Claude Code, and 27 other AI coding tools. Here's what the hidden instructions actually reveal.
A GitHub repository called `system-prompts-and-models-of-ai-tools` hit 137,000 stars and 34,200 forks over the past year, a level of attention that rivals major open-source frameworks. What's driving the traffic isn't code. It's 30,000+ lines of text: the raw system prompts that tell Cursor, Windsurf, Claude Code, Devin AI, Replit, Lovable, and dozens of other AI coding tools exactly how to behave.
The repository, maintained by a developer under the handle x1xhlol and now including contributions from 28 people, has been collecting these prompts since early 2025. Its most recent update landed on April 17, 2026. It now covers more than 30 tools, including every major name in AI-assisted development.
This isn't a security breach in the traditional sense. No servers were compromised. The prompts were extracted using standard model introspection techniques, essentially asking the models to repeat their own instructions back. What makes the collection striking isn't how it was assembled but what it reveals about an industry that ships polished interfaces while keeping the actual reasoning instructions hidden.
What the Repository Exposes About Each Tool
The entries range from a few paragraphs to multi-thousand-line instruction sets. Each tool gets its own directory with raw prompt text files, and in several cases, full JSON tool-schema definitions showing which functions the model is allowed to call and under what conditions.
The tools span IDE agents (Cursor, Windsurf, Claude Code, VSCode Agent), autonomous coding agents like Devin AI and Manus, app-building platforms like Lovable, v0, Replit, and Same.dev, plus search-oriented Perplexity and a long tail of newer names including Cluely, Kiro, Trae, Warp.dev, Traycer AI, Orchids.app, CodeBuddy, Qoder, and Dia. For a broader comparison of how these tools stack up in practice, see our guide to the best AI coding agents in 2026.
**Cursor's Agent Prompt 2.0.** Cursor's entry centers on an instruction set added in November 2025. It commits Cursor to a "lazy edit" mode: never output unchanged code, only write the parts of files that actually change, and mark gaps with `// ... existing code ...` comments. This is a deliberate product decision captured in plain text. Touch what you need to touch, leave everything else alone, and use explicit placeholders so the model doesn't accidentally modify code it was never asked to change. Large language models have a tendency to helpfully rewrite adjacent code while editing a specific function. The changes look reasonable, the tests pass, and then three weeks later you notice the model quietly changed a parameter name or removed an edge-case check it considered redundant. The `// ... existing code ...` pattern is a direct mitigation for this. Cursor also runs synchronous operations that reset context after each task, which explains why it works well for targeted single-file edits but has been weaker at multi-session refactors.
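For illustration, here's roughly what a compliant lazy edit might look like. The function and the change are invented; the placeholder comments are the pattern the prompt mandates:

```typescript
// Hypothetical lazy-edit output: only the changed function is written out.
// The placeholder comments tell the apply step to leave everything else alone.

// ... existing code ...

export function parseConfig(raw: string): Record<string, unknown> {
  // Changed: strip trailing commas before parsing instead of throwing.
  const cleaned = raw.replace(/,\s*([}\]])/g, "$1");
  return JSON.parse(cleaned);
}

// ... existing code ...
```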
**Windsurf's Wave 11.** Windsurf's entries run through "Wave 11" of its tool definitions: eleven public iterations of the tool API in roughly 18 months. That cadence suggests a team responding to real production failures rather than theoretical edge cases, behaviors broken enough to require a new wave of definitions. One specific detail stands out: every tool in Windsurf's schema includes a `toolSummary` parameter. The model is explicitly instructed to briefly describe what it's doing before calling any tool, a transparency mechanism built into the architecture itself. Windsurf's prompts also describe persistent memory and asynchronous orchestration, which explains why it handles long-running project management better than Cursor's synchronous, session-resetting approach.
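The repository carries the full JSON schemas, but the shape is easy to convey. A tool definition in this style might look roughly like the following sketch; everything except the required `toolSummary` field is an assumption for illustration:

```typescript
// Illustrative tool definition: `toolSummary` is a required parameter,
// so the model cannot invoke the tool without first narrating its intent.
const editFileTool = {
  name: "edit_file",
  parameters: {
    type: "object",
    properties: {
      toolSummary: {
        type: "string",
        description: "One sentence explaining what this call is about to do.",
      },
      path: { type: "string", description: "File to modify." },
      patch: { type: "string", description: "Unified diff to apply." },
    },
    required: ["toolSummary", "path", "patch"],
  },
} as const;
```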
**Claude Code and Anthropic's safety-first framing.** Claude Code's prompts show a system designed with careful guardrails: prefer reversible actions, confirm before making changes that can't be undone, explain reasoning at key decision points. That last pattern converges with Windsurf's `toolSummary` requirement. Two different companies, building on different underlying models, arrived at the same design principle: the model should narrate its tool use. Narrating the action appears to force more deliberate reasoning before execution. Anthropic has also been explicit about limits: its Opus 4.7 system card, a 232-page document with quantified hack rates and injection-resistance metrics, discloses that Claude Code is "not hardened against prompt injection." Most AI companies haven't been this transparent about the attack surface of their coding tools.
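The guardrail pattern itself is simple to express. A minimal sketch, with invented names and not Anthropic's implementation:

```typescript
// Illustrative guardrail: classify each action by reversibility and
// require explicit confirmation before anything that can't be undone.
type Action = { kind: "edit" | "delete" | "deploy"; target: string };

// Assumption for the sketch: edits under version control are revertible,
// deletes and deploys are not.
const isReversible = (a: Action): boolean => a.kind === "edit";

async function runWithGuardrail(
  action: Action,
  confirm: (msg: string) => Promise<boolean>,
  execute: (a: Action) => Promise<void>,
): Promise<void> {
  if (!isReversible(action)) {
    const ok = await confirm(
      `About to ${action.kind} ${action.target}. This cannot be undone. Proceed?`,
    );
    if (!ok) return; // prefer doing nothing over an irreversible mistake
  }
  await execute(action);
}
```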
**Devin AI's modular architecture.** The repository includes a separate "DeepWiki prompt" for Devin, a distinct instruction set beyond the standard agent prompt. That separation suggests Devin has moved toward modular instructions rather than a monolithic prompt trying to handle code writing, bug fixing, documentation, and codebase explanation simultaneously. The Augment Code entries add another dimension: Augment's listing includes multi-model support through GPT-5 tool definitions, revealing that the company routes different task types to different underlying models. Amp's separate prompts for Claude Sonnet and GPT-5 show which models handle which features, the kind of internal architecture detail that would be impossible to glean from product documentation.
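None of the listings spell out the dispatch logic, but per-model prompts imply a routing layer sitting in front of the models. A guessed outline, with the task-to-model mapping and file paths invented for illustration:

```typescript
// Invented illustration of task-type routing: each task category maps to
// a model and to the system prompt written for that model.
type TaskKind = "quick_edit" | "long_refactor" | "docs" | "search";

const ROUTES: Record<TaskKind, { model: string; promptFile: string }> = {
  quick_edit: { model: "claude-sonnet", promptFile: "prompts/sonnet.txt" }, // hypothetical
  long_refactor: { model: "gpt-5", promptFile: "prompts/gpt5.txt" },        // hypothetical
  docs: { model: "claude-sonnet", promptFile: "prompts/sonnet.txt" },
  search: { model: "gpt-5", promptFile: "prompts/gpt5.txt" },
};

function routeTask(kind: TaskKind) {
  return ROUTES[kind];
}
```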
**Manus's six-layer agent loop.** Manus's prompt describes a "six-layer agent loop architecture" with modular design for dynamic task execution. The layers handle planning, execution, memory, tool selection, error handling, and output formatting as distinct subsystems with explicit handoff rules; a sketch of the pattern follows below. It's a software architecture philosophy applied to prompt design: build the system as a collection of well-defined components rather than one monolithic instruction. Replit takes yet another approach, with prompts emphasizing operation inside a controlled environment, explicit database restrictions, and safeguards for sensitive data. Its stability-first design isn't a weakness. It's a coherent product decision made visible by the system prompt.
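Here is that layered loop in outline. The six stage names come from Manus's prompt; the types, handoffs, and error path are assumptions:

```typescript
// Sketch of a six-stage agent loop in the spirit of Manus's description.
type Plan = { steps: string[] };
type Outcome = { ok: boolean; output: string };

interface AgentLayers {
  planning: (task: string) => Promise<Plan>;
  toolSelection: (plan: Plan) => Promise<string[]>; // names of tools to invoke
  execution: (tools: string[]) => Promise<Outcome>;
  memory: (task: string, outcome: Outcome) => Promise<void>;
  errorHandling: (task: string, err: unknown) => Promise<Outcome>;
  formatting: (outcome: Outcome) => Promise<string>;
}

// Explicit handoffs: each layer consumes exactly what the previous one
// produced, so responsibilities can't silently bleed across stages.
async function runAgentLoop(layers: AgentLayers, task: string): Promise<string> {
  try {
    const plan = await layers.planning(task);
    const tools = await layers.toolSelection(plan);
    const outcome = await layers.execution(tools);
    await layers.memory(task, outcome);
    return layers.formatting(outcome);
  } catch (err) {
    return layers.formatting(await layers.errorHandling(task, err));
  }
}
```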
The Security Risk of Public System Prompts
The security implications of a 137,000-star collection of AI tool system prompts are not hypothetical. Prompt injection is now a tier-one security risk with documented production exploits.
Recent research puts attack success rates against state-of-the-art defenses above 85% when adaptive strategies are used. Real CVEs assigned in the past 18 months include EchoLeak (CVE-2025-32711), a GitHub Copilot remote code execution vulnerability (CVE-2025-53773), and multiple Cursor IDE issues. These are production exploits with CVSS scores above 9.0.
Microsoft's security team documented a vulnerable path in the Semantic Kernel framework where a malicious document fed to an AI agent with code execution permissions could override the agent's system prompt and issue shell commands. The attack becomes more targeted when the attacker knows the specific instruction architecture they're working against.
Security researcher Simon Willison's "Lethal Trifecta" framing captures why this matters in practice. An AI agent becomes critically vulnerable when three conditions are met simultaneously: access to private data, exposure to untrusted content (such as file contents or web pages), and an exfiltration vector. Any AI coding tool that reads your files and can make network requests meets all three. A prompt injection embedded in a README, a code comment, or a dependency's documentation could, under the right conditions, cause a coding agent to exfiltrate credentials, commit unauthorized changes, or call external services. That's not a theoretical attack scenario. The CVEs above show it's happening in production.
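A contrived sketch makes the trifecta concrete. Each capability below is routine on its own; the combination is what's dangerous:

```typescript
import { readFileSync } from "node:fs";

// Contrived vulnerable agent: it meets all three trifecta conditions.
async function reviewRepo(ask: (prompt: string) => Promise<string>) {
  const secrets = readFileSync(".env", "utf8");     // 1. access to private data
  const readme = readFileSync("README.md", "utf8"); // 2. exposure to untrusted content

  // If README.md contains injected instructions, the model's reply may
  // smuggle in an attacker-chosen action instead of a summary.
  const reply = await ask(`Summarize this project:\n${readme}`);

  // 3. exfiltration vector: the agent can make arbitrary network requests.
  if (reply.startsWith("POST ")) {
    await fetch(reply.slice(5), { method: "POST", body: secrets });
  }
}
```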
Knowing the target tool's system prompt makes crafting those injections substantially easier. The prompt tells an attacker which instructions to override, which safety checks exist, and which tool calls to manipulate. Our earlier coverage of how attackers exploit AI agent tool selection detailed how adversaries are already probing the tool-calling layer, a point the system prompt repository makes considerably more concerning.
Competitive Intelligence and Developer Guidance
System prompts aren't code. They're not protected by licenses or patentable in any meaningful sense. But they represent enormous engineering investment. Cursor's "Agent Prompt 2.0" didn't appear fully formed. It evolved through thousands of hours of testing, user feedback, and deliberate iteration. With the prompt exposed, a competitor can read that output directly and absorb the "lazy edit" approach or the `toolSummary` narration pattern from Windsurf without doing the underlying research. Windsurf's 11-wave progression is an even more direct competitive intelligence window: any team building a competing product can read the wave history and understand what problems Windsurf was solving, in what sequence, and how they approached each one.
For developers choosing between AI coding tools, the system prompt collection answers questions that no product page does. If you need long-running multi-file tasks with persistent context, Windsurf's async orchestration and persistent memory are intentional design choices visible in its prompts. If you need predictable, targeted single-file edits, Cursor's synchronous resets and lazy-edit mode are exactly what the instruction set says they are. Replit's stability-first approach isn't a limitation for database-heavy projects. It's the explicit design goal.
For developers building their own AI-powered tools, the repository functions as a public reference library. The convergence on tool narration (Windsurf's `toolSummary`, Claude Code's explain-before-acting clauses) across tools that don't share instruction sets suggests this pattern works. Multiple teams arrived at it independently, which is about as strong a signal as you get in a field this new. For security-conscious engineering teams, understanding the instruction architecture of the tools you're trusting with your codebase is now part of responsible adoption. Tools that explicitly address prompt injection in their system cards, as Anthropic has done for Claude Code, deserve credit for that transparency.
The obvious reaction from AI tool companies is to work harder at protecting their system prompts. Obfuscation techniques exist, and some companies already use instruction phrasing designed to resist introspection. But this repository makes the limits of those techniques visible. Thirty tools represented. No obfuscation strategy in production seems to have been reliable enough to stay out of the collection.
Some security researchers argue that transparency about instruction architecture is actually better for security than opacity. A published system prompt lets the security community identify attack vectors before adversaries do. Hidden prompts move the discovery process underground, where the findings don't get disclosed to vendors. Anthropic's model cards, Microsoft's security blog disclosures, and the academic prompt injection research now cited in CVE documentation are all part of an emerging norm toward transparency.
For now, the answer is reflected in 137,000 GitHub stars: developers want to understand how these tools actually work. The instructions behind Cursor, Windsurf, Claude Code, and 27 other tools are now public knowledge. That changes the security landscape, the competitive landscape, and what it means to make an informed choice about which AI system you let inside your codebase.