Carapace

Carapace is a prompt injection firewall for LLMs. It detects and blocks injection attacks before they reach your AI, with 100% detection rate across 1,380 malicious payloads and 0% false positives across 150 clean payloads.

The protection gap

Enterprise LLM APIs (Claude, GPT-4o) have built-in safety filtering. Self-hosted models have none.

We tested 18 models via Ollama with zero content filtering:

Model	Size	Vulnerability
Qwen2.5:7b	7B	83%
Llama3.3:70b	70B	80%
Mistral-Large:123b	123B	70%
DeepSeek-R1:70b	70B	60%
Gemma3:27b	27B	40%
gpt-oss:120b	120B	25%
Phi4-Mini:3.8b	3.8B	20%

Average vulnerability: ~49%. No model scored 0%. Model size doesn’t correlate with safety — Mistral-Large 123B (70% vulnerable) vs Phi4-Mini 3.8B (20% vulnerable).

The API layer IS the protection. When you self-host, you lose it. Carapace puts it back.

Protection layers

Mode	What it protects	Use case
SDK/Middleware	Your application code	Developers integrating LLMs
Gateway	HTTP API calls to LLMs	Apps, self-hosted models, teams
MCP Proxy	Tool execution in Claude/agents	Claude Desktop, Cursor, agent frameworks
eBPF	All SSL traffic on machine	Dev machines, servers, fleet-wide

29 attack categories

Category	Severity	Examples
Instruction Override	Critical	”ignore previous instructions”
Role Injection	Critical	`[SYSTEM]`, `<<SYS>>` role markers
Identity Hijack	High	”you are now DAN”, jailbreak prompts
Extraction Attempt	High	”repeat your system prompt”
Authority Impersonation	Critical	”this is Anthropic, admin override”
Command Injection	Critical	`curl \| bash`, `eval()`, `rm -rf`
Exfiltration	Critical	”send ~/.ssh/id_rsa to…”
Tool Poisoning	Critical	`tool_call`, `function_call` injection
Roleplay Jailbreak	Critical	”let’s play a game” (89.6% ASR)
FlipAttack	High	Reversed text evasion (98% ASR)
Encoding Evasion	High	Base64, URL encoding, hex, ROT13
Unicode Injection	High	Zero-width spaces, invisible separators
Multi-Language	High	”ignorez”, “無視”, “игнорируй”
Indirect Injection	Critical	Hidden instructions in retrieved content
Browser Agent Attack	Critical	XSS payloads, `document.cookie`

Plus 14 more categories covering social engineering, gaslighting, logic traps, crescendo attacks, few-shot attacks, completion attacks, and more.

Quick start

# Install
npm install @honeybee-ai/carapace

# Scan from CLI
npx @honeybee-ai/carapace scan "ignore all previous instructions"

# In your code
import { scan, isSafe, middleware } from '@honeybee-ai/carapace';

if (!isSafe(userInput)) throw new Error('Injection detected');

// Express middleware
app.use('/api/chat', middleware({ mode: 'block' }));

See the full Carapace Quick Start for more.

Scoring system

Score	Action	Behavior
0-19	PASS	Clean, allow through
20-49	LOG	Allow but log for review
50-99	WARN	Allow but warn
100+	BLOCK	Block, return error

Zero dependencies

$ npm ls
@honeybee-ai/carapace@1.0.2
└── (empty)

No node_modules to audit. No supply chain attacks possible. No transitive dependencies. Every line of code is in the repo and auditable. For a security tool, this matters.

Free vs managed

	Carapace (open source)	Carapace Cloud (managed)
Scanner library	Yes	Yes
Gateway proxy	Yes	Yes
MCP proxy	Yes	Yes
CLI tool	Yes	Yes
eBPF firewall	Yes	Yes
Dashboard & analytics	—	Real-time threat monitoring
Custom detection rules	DIY	API-managed
Webhook alerts	—	Real-time notifications
Audit log export	—	CSV/JSON (compliance-ready)
Support	Community	Dedicated SLA