Privacy & Data

Honeybee is built for teams running AI agents on sensitive codebases. Privacy is not an afterthought — it’s a design constraint that shapes every data decision.

Core principles

Opt-in everything: No data leaves your machine without explicit configuration
Structural metadata only: Cloud telemetry captures counts, scores, and latency — never content
Local-first audit: The full audit trail is always available locally; the cloud is optional
No phone home: Open source packages (incubator, carapace, CLI, SDK) never contact any server by default

What we never collect

Regardless of configuration, Honeybee telemetry never collects or transmits:

LLM prompts or responses
File contents or diffs
API keys, tokens, or credentials
Source code
User input text
Agent conversation history
Personal information

Telemetry configuration

Open source packages (incubator, carapace, CLI, SDK)

Default: OFF. No telemetry is sent unless you explicitly set two environment variables:

```bash
export TELEMETRY_ENDPOINT=https://your-endpoint.example.com/v1/telemetry
export TELEMETRY_API_KEY=your-key
```

Unless both variables are set, all telemetry data stays local, in JSONL files under ~/.honeyb/projects/<slug>/telemetry/.
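
The "both variables or nothing" rule can be sketched as a small pure function. This is an illustration only, not the SDK's actual code: `resolveTelemetryTarget`, `TelemetryTarget`, and the choice to treat empty or whitespace-only values as unset are all assumptions.

```typescript
// Hypothetical helper for illustration; not part of any Honeybee package.
type Env = Record<string, string | undefined>;

type TelemetryTarget =
  | { mode: "local" }
  | { mode: "cloud"; endpoint: string; apiKey: string };

function resolveTelemetryTarget(env: Env): TelemetryTarget {
  // Assumption: blank values count as "not set".
  const endpoint = env["TELEMETRY_ENDPOINT"]?.trim();
  const apiKey = env["TELEMETRY_API_KEY"]?.trim();

  // Cloud sync needs BOTH variables; anything less keeps data local.
  if (endpoint && apiKey) {
    return { mode: "cloud", endpoint, apiKey };
  }
  return { mode: "local" };
}
```

In a Node.js process this would be called as `resolveTelemetryTarget(process.env)`. Setting only one of the two variables still yields `{ mode: "local" }`.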

What local telemetry records

The local JSONL files record structural metadata about agent execution:

| Field | Example | Purpose |
| --- | --- | --- |
| Event type | `llm_call` | What happened |
| Timestamp | `2026-02-15T10:30:00Z` | When |
| Model name | `claude-sonnet` | Which model |
| Token counts | `500 prompt, 200 completion` | Usage tracking |
| Latency | `1500ms` | Performance |
| Cost estimate | `$0.003` | Budget tracking |
| Tool name | `read_file` | Which tool was called |
| Scan score | `5` | Carapace threat score |
| Scan action | `PASS` | What Carapace decided |
| Exit reason | `done` | Why agent stopped |

Note: tool arguments are not recorded. The telemetry captures that read_file was called, not what file was read.
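
One way to guarantee that arguments never reach the log is to build each event from an explicit allowlist of fields rather than copying the call object. The sketch below assumes hypothetical shapes (`ToolInvocation`, `ToolCallEvent`) and field names; the real event schema may differ.

```typescript
// Hypothetical shapes for illustration; field names are assumptions,
// not the actual Honeybee telemetry schema.
interface ToolInvocation {
  name: string;
  args: Record<string, unknown>; // e.g. { path: "src/config.ts" }
  startedAt: Date;
  durationMs: number;
}

interface ToolCallEvent {
  type: "tool_call";
  timestamp: string; // ISO 8601
  tool: string;
  latencyMs: number;
}

function toTelemetryEvent(call: ToolInvocation): ToolCallEvent {
  // Copy fields one by one (an allowlist). No object spread, so
  // `args` cannot leak into the event by accident.
  return {
    type: "tool_call",
    timestamp: call.startedAt.toISOString(),
    tool: call.name,
    latencyMs: call.durationMs,
  };
}

// One JSON object per line, as in a JSONL file.
function toJsonlLine(event: ToolCallEvent): string {
  return JSON.stringify(event) + "\n";
}
```

The resulting line records that `read_file` ran and how long it took, but contains no trace of the path it was given.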

What cloud sync sends

When cloud telemetry is enabled, only 5-minute aggregated summaries are sent:

```json
{
  "period": "2026-02-15T10:30:00Z/2026-02-15T10:35:00Z",
  "counts": { "llm_call": 42, "tool_call": 18 },
  "totalPromptTokens": 21000,
  "totalCompletionTokens": 8400,
  "totalCostUsd": 0.15,
  "avgLatencyMs": 1200,
  "errorRate": 0.02,
  "guardBlocks": 0
}
```

Individual events are never sent to the cloud. The aggregation runs locally, and only the summary crosses the network.
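
As a rough illustration of local aggregation, the sketch below folds individual events into a summary with the same fields as the example payload. The `LocalEvent` shape, the `summarize` function, and the `"BLOCK"` scan action are assumptions made for this example; only the summary field names come from the payload shown above.

```typescript
// Hypothetical local event shape; the real schema may differ.
interface LocalEvent {
  type: string;       // e.g. "llm_call", "tool_call"
  timestamp: string;  // ISO 8601
  promptTokens?: number;
  completionTokens?: number;
  costUsd?: number;
  latencyMs?: number;
  error?: boolean;
  scanAction?: string; // "BLOCK" is an assumed value for a guard block
}

interface Summary {
  period: string;
  counts: Record<string, number>;
  totalPromptTokens: number;
  totalCompletionTokens: number;
  totalCostUsd: number;
  avgLatencyMs: number;
  errorRate: number;
  guardBlocks: number;
}

const WINDOW_MS = 5 * 60 * 1000;

// Format as ISO 8601 without milliseconds, matching the example payload.
const isoSeconds = (ms: number): string =>
  new Date(ms).toISOString().replace(/\.\d{3}Z$/, "Z");

function summarize(events: LocalEvent[], windowStart: Date): Summary {
  const startMs = windowStart.getTime();
  const endMs = startMs + WINDOW_MS;
  const counts: Record<string, number> = {};
  let total = 0, prompt = 0, completion = 0, cost = 0;
  let latencySum = 0, latencyCount = 0, errors = 0, guardBlocks = 0;

  for (const e of events) {
    const t = Date.parse(e.timestamp);
    // Half-open window [start, end); unparseable timestamps are skipped.
    if (!(t >= startMs && t < endMs)) continue;
    total += 1;
    counts[e.type] = (counts[e.type] ?? 0) + 1;
    prompt += e.promptTokens ?? 0;
    completion += e.completionTokens ?? 0;
    cost += e.costUsd ?? 0;
    if (e.latencyMs !== undefined) {
      latencySum += e.latencyMs;
      latencyCount += 1;
    }
    if (e.error) errors += 1;
    if (e.scanAction === "BLOCK") guardBlocks += 1;
  }

  return {
    period: `${isoSeconds(startMs)}/${isoSeconds(endMs)}`,
    counts,
    totalPromptTokens: prompt,
    totalCompletionTokens: completion,
    totalCostUsd: cost,
    avgLatencyMs: latencyCount > 0 ? latencySum / latencyCount : 0,
    errorRate: total > 0 ? errors / total : 0,
    guardBlocks,
  };
}
```

Only the returned `Summary` would be serialized for upload; the `events` array never leaves the function.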

Colony (managed orchestration)

When using Colony (cloud-hosted hives), additional data is stored:

| Data | Where | Retention | Purpose |
| --- | --- | --- | --- |
| Hive configuration | D1 | Until deleted | Run your hive |
| Provider API keys | D1 (encrypted) | Until deleted | Authenticate to LLM providers |
| Agent execution logs | Durable Object | Session lifetime | Orchestration |
| Monthly usage | D1 | 12 months | Billing |
| Audit events (if enabled) | R2 + D1 | Configurable | Compliance |

Colony processes LLM calls on your behalf (forwarding to providers like Cerebras, Groq, Anthropic). Prompt content passes through Colony to reach the provider but is not stored unless you explicitly enable full audit mode.

Carapace scanning

The Carapace scanner runs entirely locally. When you scan text:

Text is analyzed by the pattern matching engine (in-process, synchronous)
Results (score, findings, action) are returned to the caller
Nothing is sent anywhere

The hosted Carapace dashboard (planned, not yet available) will receive scan metadata (scores, finding categories, timestamps) but never the scanned text itself.
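
The local scan flow described in this section can be pictured as a synchronous function that takes text and returns a result, with no I/O anywhere. The sketch below is not Carapace's API or rule set: the `scan` function, the two rules, their weights, the threshold, and the `"BLOCK"` action are all invented for illustration.

```typescript
// Illustrative only: rules, weights, and threshold are made up and do
// NOT reflect the real @honeybee-ai/carapace rule set or API.
interface Finding {
  category: string;
  index: number; // position of the match; the matched text is not copied
}

interface ScanResult {
  score: number;
  findings: Finding[];
  action: "PASS" | "BLOCK";
}

const RULES: { category: string; pattern: RegExp; weight: number }[] = [
  { category: "prompt_injection", pattern: /ignore (all )?previous instructions/i, weight: 60 },
  { category: "credential_mention", pattern: /\b(api[_-]?key|password)\b/i, weight: 30 },
];

const BLOCK_THRESHOLD = 50; // assumed cutoff

// Synchronous and in-process: no network or file access.
function scan(text: string): ScanResult {
  const findings: Finding[] = [];
  let score = 0;
  for (const rule of RULES) {
    const match = rule.pattern.exec(text);
    if (match) {
      findings.push({ category: rule.category, index: match.index });
      score += rule.weight;
    }
  }
  score = Math.min(score, 100);
  return { score, findings, action: score >= BLOCK_THRESHOLD ? "BLOCK" : "PASS" };
}
```

Because findings carry only a category and an offset, a result like this could be forwarded as metadata without including any of the scanned text.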

Data handling for the eBPF firewall

The eBPF firewall captures SSL plaintext at the kernel level. This data:

Stays in local memory (ring buffer → Node.js process)
Is scanned by the local Carapace instance
Is written to local JSONL audit files (if Nectar is enabled)
Is never transmitted to any remote service unless explicitly configured

Secret management

Local secrets: ~/.secrets/*.env files with 0600 permissions
Cloud secrets: Stored in Colony D1 with encryption at rest (Cloudflare managed)
Session tokens: 48-character hex, generated per-session, not persisted
Auth tokens: Stored in ~/.honeyb/auth/<profile>.json with 0600 permissions

Secrets are never logged, never included in error messages, and never transmitted in telemetry.
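
A common way to keep secrets out of logs and error messages is to redact known secret values before any string is emitted. The helper below is a minimal sketch of that idea, not Honeybee's implementation; `redactSecrets` and the `[REDACTED]` placeholder are hypothetical.

```typescript
// Hypothetical helper for illustration; not Honeybee's implementation.
function redactSecrets(message: string, secrets: readonly string[]): string {
  // Redact longer secrets first so a secret that contains another
  // secret as a substring is replaced as a whole.
  const ordered = [...secrets].sort((a, b) => b.length - a.length);
  let out = message;
  for (const secret of ordered) {
    if (secret.length === 0) continue; // an empty string would match everywhere
    // split/join replaces every occurrence without regex escaping issues.
    out = out.split(secret).join("[REDACTED]");
  }
  return out;
}
```

A logger or error formatter would pass every outgoing message through this function along with the set of secrets currently loaded in the process.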

Open source transparency

All scanning, telemetry, and audit code is open source:

Scanner: @honeybee-ai/carapace
Telemetry: @honeybee-ai/hivemind-sdk/telemetry
Incubator guard: @honeybee-ai/incubator

You can audit exactly what data is collected and where it goes.

Related pages

Security Overview: audit history and checklist
Audit Trail: what Nectar captures
Telemetry Events: event format reference