Open Standard · v1.0 · 2026

THROTTLE.md

// AI Agent Rate Control Protocol

A plain-text file convention for defining rate limits and cost controls in AI agent projects. Define token throughput ceilings, API call rates, spend limits, and automatic slow-down behaviour — before your agent hits a hard wall.

THROTTLE.md
    # THROTTLE

    > Rate control protocol.
    > Spec: https://throttle.md

    ---

    ## LIMITS

    tokens_per_minute: 50000
    api_calls_per_minute: 30
    concurrent_tasks: 3
    cost_per_hour_usd: 10.00
    cost_per_day_usd: 50.00
    file_writes_per_minute: 20

    ## BEHAVIOUR

    warning_threshold: 0.80
    throttle_threshold: 0.95
    on_warning:
      action: log_and_continue
      reduce_rate_by: 0.25
    on_throttle:
      action: slow_and_notify
      reduce_rate_by: 0.50
    on_limit_breach:
      action: pause
      escalate_to: ESCALATE.md

    ## QUEUE

    queue_enabled: true
    queue_max_size: 50
    priority_tasks:
      - human_response
      - safety_check

80% · warning threshold: the agent alerts at 80% of a configured limit
50% · rate reduction applied when the throttle threshold (95%) is reached
$50/day · default daily cost ceiling in the THROTTLE.md spec template
0 · tasks dropped for lack of capacity when the queue is enabled; requests are buffered, not discarded

AGENTS.md tells it what to do.
THROTTLE.md controls how fast.

THROTTLE.md is a plain-text Markdown file you place in the root of any repository that contains an AI agent. It defines the rate limits and cost controls your agent must respect — and what to do when it approaches them.

What problem does THROTTLE.md solve?

AI agents consume tokens, make API calls, write files, and spend money — at whatever rate the underlying model and tools allow. Without explicit rate controls, a busy agent can exhaust a daily budget in minutes, hammer a rate-limited API until it's blocked, or overwhelm a database with concurrent writes.

How does THROTTLE.md work?

Drop THROTTLE.md in your repo root and define: token and API call rate ceilings, hourly and daily cost limits, concurrency caps, and the behaviour at each threshold — warn at 80%, slow at 95%, pause at 100% and hand off to ESCALATE.md. The agent reads it on startup. Your compliance team reads it in the audit.
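The spec does not prescribe a parser, so the following is only a minimal sketch of the "reads it on startup" step. It assumes the file follows the layout of the template shown earlier: `## SECTION` headings, `key: value` lines, and one level of indented children. The function names `parse_throttle` and `_scalar` are hypothetical.

```python
import re


def _scalar(value: str):
    """Convert 'true'/'false' and numbers; leave everything else as a string."""
    if value.lower() in ("true", "false"):
        return value.lower() == "true"
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        return value


def parse_throttle(text: str) -> dict:
    """Parse a THROTTLE.md-style file into {SECTION: {key: value}} (assumed layout)."""
    config = {}
    section = None  # dict for the current '## SECTION'
    parent = None   # name of the key whose indented children we are reading
    for raw in text.splitlines():
        stripped = raw.strip()
        # Skip blank lines, blockquote lines and the horizontal rule.
        if not stripped or stripped.startswith(">") or stripped == "---":
            continue
        heading = re.match(r"##\s+(\w+)$", stripped)
        if heading:
            section = config.setdefault(heading.group(1), {})
            parent = None
            continue
        if stripped.startswith("#") or section is None:
            continue  # title line, or content before the first section
        if raw[0].isspace() and parent is not None:
            if stripped.startswith("- "):
                # List child, e.g. an entry under priority_tasks.
                section.setdefault(parent, []).append(_scalar(stripped[2:].strip()))
            else:
                # Mapping child, e.g. 'action: pause' under on_limit_breach.
                key, _, value = stripped.partition(":")
                section.setdefault(parent, {})[key.strip()] = _scalar(value.strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            continue
        if value.strip():
            section[key.strip()] = _scalar(value.strip())
            parent = None
        else:
            parent = key.strip()  # children follow on indented lines
    return config
```

An agent could call `parse_throttle(open("THROTTLE.md").read())` once at startup and keep the resulting dictionary as its policy.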

What regulations require THROTTLE.md?

Enterprise AI governance frameworks typically call for documented resource controls. The EU AI Act, most of whose provisions apply from 2 August 2026, requires risk management, record-keeping, and human oversight for high-risk AI systems. Gartner's AI Agent Report identifies governance and resource control as critical deployment requirements. THROTTLE.md gives you the documented controls and the audit trail.

How do I add THROTTLE.md to my project?

Copy the template from GitHub and place it in your project root:

your-project/
├── AGENTS.md
├── CLAUDE.md
├── THROTTLE.md ← add this
├── README.md
└── src/

What did teams use before THROTTLE.md?

Before THROTTLE.md, rate control rules were scattered: hardcoded in the system prompt, buried in config files, missing entirely, or documented in a Notion page no one reads. THROTTLE.md makes rate controls version-controlled, auditable, and co-located with your code.

Who benefits from THROTTLE.md?

The AI agent reads it on startup. Your engineer reads it during code review. Your compliance team reads it during audits. Your regulator reads it if something goes wrong. One file serves all four audiences.

A complete protocol.
From slow down to shut down.

THROTTLE.md is one file in a complete twelve-part open specification for AI agent safety. Each file addresses a different level of intervention.

Operational Control

02 / 12 · ESCALATE.md → Raise the alarm
Define which actions require human approval. Configure notification channels. Set approval timeouts and fallback behaviour.

03 / 12 · FAILSAFE.md → Fall back safely
Define what safe state means for your project. Configure auto-snapshots. Specify the revert protocol when things go wrong.

04 / 12 · KILLSWITCH.md → Emergency stop
The nuclear option. Define triggers, forbidden actions, and a three-level escalation path from throttle to full shutdown.

05 / 12 · TERMINATE.md → Permanent shutdown
No restart without human intervention. Preserve evidence. Revoke credentials. For security incidents and end-of-life.

Data Security

06 / 12 · ENCRYPT.md → Secure everything
Define data classification, encryption requirements, secrets handling rules, and forbidden transmission patterns.

07 / 12 · ENCRYPTION.md → Implement the standards
Algorithms, key lengths, TLS configuration, certificate management, and FIPS/SOC2/ISO compliance mapping.

Output Quality

08 / 12 · SYCOPHANCY.md → Prevent bias
Detect agreement without evidence. Require citations. Enforce disagreement protocol for honest, unbiased AI outputs.

09 / 12 · COMPRESSION.md → Compress context
Define summarization rules, what to preserve, what to discard, and post-compression coherence verification checks.

10 / 12 · COLLAPSE.md → Prevent collapse
Detect context exhaustion, model drift, and repetition loops. Enforce recovery checkpoints before coherence degrades.

Accountability

11 / 12 · FAILURE.md → Define failure modes
Map graceful degradation, cascading failure, and silent failure. Specify health checks and per-mode response procedures.

12 / 12 · LEADERBOARD.md → Benchmark agents
Track task completion, accuracy, cost efficiency, and safety scores across sessions. Alert on performance regression.

Frequently asked questions.

What is THROTTLE.md?

A plain-text Markdown file defining rate limits and cost controls for AI agents. It sets ceilings on token throughput, API call rates, concurrent tasks, and spend per hour and per day. When an agent approaches a limit, it slows automatically. When it hits a limit, it pauses and hands off to the escalation protocol.
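To know when it is approaching a per-minute ceiling, an agent needs a running measure of recent usage. One way to get it, sketched here as an assumption rather than anything the spec mandates, is a rolling 60-second window. The class name `SlidingWindowMeter` is hypothetical, and the clock is passed in explicitly so the behaviour is easy to check.

```python
from collections import deque


class SlidingWindowMeter:
    """Tracks usage of one resource over a rolling window (60 s by default)."""

    def __init__(self, limit: float, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self.events = deque()  # (timestamp, amount) pairs, oldest first
        self.total = 0.0

    def record(self, amount: float, now: float) -> None:
        """Register `amount` units (tokens, calls, writes) used at time `now`."""
        self.events.append((now, amount))
        self.total += amount

    def utilisation(self, now: float) -> float:
        """Fraction of the limit consumed within the window ending at `now`."""
        cutoff = now - self.window
        while self.events and self.events[0][0] <= cutoff:
            _, amount = self.events.popleft()
            self.total -= amount
        return self.total / self.limit


meter = SlidingWindowMeter(limit=50000)   # tokens_per_minute from the template
meter.record(20000, now=0.0)
meter.record(20000, now=30.0)
print(meter.utilisation(now=30.0))        # 40000 of 50000 tokens used
```

In a real agent, `now` would typically come from `time.monotonic()`.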

How is THROTTLE.md different from API rate limits?

API rate limits are enforced externally by the service provider — they cut your agent off without warning. THROTTLE.md is your own proactive control layer. It slows the agent gracefully before an external limit is hit, preserves queued work, and notifies you before things go wrong rather than after.
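Slowing down gracefully can be as simple as spacing calls further apart. The sketch below assumes `reduce_rate_by` is a fraction taken off the configured ceiling (the spec excerpt here does not say whether reductions are cumulative), and both function names are hypothetical.

```python
def min_interval_seconds(calls_per_minute: float, reduce_rate_by: float = 0.0) -> float:
    """Seconds to leave between calls so the reduced rate is not exceeded."""
    if calls_per_minute <= 0:
        raise ValueError("calls_per_minute must be positive")
    if not 0.0 <= reduce_rate_by < 1.0:
        raise ValueError("reduce_rate_by must be in [0, 1)")
    # Assumption: the reduction applies to the configured ceiling itself.
    allowed_per_minute = calls_per_minute * (1.0 - reduce_rate_by)
    return 60.0 / allowed_per_minute


def seconds_until_next_call(last_call_at: float, now: float, interval: float) -> float:
    """How long to wait before the next call may be issued (never negative)."""
    return max(0.0, last_call_at + interval - now)
```

With the template's `api_calls_per_minute: 30`, the interval is 2 seconds at full rate and 4 seconds once the 50% reduction applies.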

What happens to queued tasks during throttling?

With the queue enabled (the default), tasks are buffered rather than dropped, and the agent processes them at the reduced rate. Priority tasks (human responses, safety checks) skip the queue entirely. The one exception to buffering is age: tasks older than the configured timeout are dropped and logged.
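The queue behaviour described above could be sketched as follows. Several details are assumptions, not part of the template: the `timeout_seconds` field (the template shows no timeout key), the `"rejected"` result when the queue is full, and the class name `ThrottleQueue`.

```python
from collections import deque
from dataclasses import dataclass, field

# Taken from priority_tasks in the template.
PRIORITY_TASKS = {"human_response", "safety_check"}


@dataclass
class ThrottleQueue:
    max_size: int = 50               # queue_max_size in the template
    timeout_seconds: float = 300.0   # hypothetical: not a field in the template
    pending: deque = field(default_factory=deque)
    dropped: list = field(default_factory=list)

    def submit(self, kind: str, payload, now: float) -> str:
        """Buffer a task, or let it bypass the queue if it is a priority task."""
        if kind in PRIORITY_TASKS:
            return "run_now"
        if len(self.pending) >= self.max_size:
            return "rejected"  # assumption: signal backpressure to the caller
        self.pending.append((now, kind, payload))
        return "queued"

    def next_task(self, now: float):
        """Return the oldest task still within the timeout, or None."""
        while self.pending:
            enqueued_at, kind, payload = self.pending.popleft()
            if now - enqueued_at > self.timeout_seconds:
                self.dropped.append((kind, payload))  # a real agent would log this
                continue
            return kind, payload
        return None
```

The agent would call `next_task` at whatever pace the current throttle state allows, so buffered work drains more slowly but is not lost.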

Can I set different limits for different task types?

Yes. The spec supports priority task lists that bypass queue restrictions, and the limit fields cover distinct resource types (tokens, API calls, file writes, database queries, cost). You can tune each independently per project.
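Because each resource has its own ceiling, an agent has to decide which one governs its behaviour at any moment. A reasonable reading, offered here as an assumption, is that the most heavily used resource wins. The helper names below are hypothetical; the limits are those of the template.

```python
LIMITS = {
    "tokens_per_minute": 50000,
    "api_calls_per_minute": 30,
    "concurrent_tasks": 3,
    "cost_per_hour_usd": 10.00,
    "cost_per_day_usd": 50.00,
    "file_writes_per_minute": 20,
}


def utilisation_by_resource(usage: dict, limits: dict = LIMITS) -> dict:
    """Fraction of each limit currently consumed; unreported resources count as 0."""
    return {name: usage.get(name, 0) / limit for name, limit in limits.items()}


def binding_resource(usage: dict, limits: dict = LIMITS) -> tuple:
    """The resource closest to its ceiling, with its utilisation."""
    ratios = utilisation_by_resource(usage, limits)
    name = max(ratios, key=ratios.get)
    return name, ratios[name]


usage = {"tokens_per_minute": 20000, "api_calls_per_minute": 27, "cost_per_day_usd": 10.0}
print(binding_resource(usage))  # API calls are the tightest constraint here
```

Tuning a single field, for example raising `api_calls_per_minute`, then changes which resource binds without touching the others.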

What is the difference between warning and throttle thresholds?

Warning (default 80%) — agent logs the event and reduces rate by 25%, but continues. Throttle (default 95%) — agent cuts rate by 50% and notifies the operator. Limit breach (100%) — agent pauses all new tasks and hands off to ESCALATE.md for human intervention.
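The three tiers map naturally onto a small decision function. This is a sketch using the template's BEHAVIOUR values. Treating each threshold as inclusive and returning a `"continue"` action below the warning level are assumptions, and `decide` is a hypothetical name.

```python
BEHAVIOUR = {
    "warning_threshold": 0.80,
    "throttle_threshold": 0.95,
    "on_warning": {"action": "log_and_continue", "reduce_rate_by": 0.25},
    "on_throttle": {"action": "slow_and_notify", "reduce_rate_by": 0.50},
    "on_limit_breach": {"action": "pause", "escalate_to": "ESCALATE.md"},
}


def decide(utilisation: float, behaviour: dict = BEHAVIOUR) -> dict:
    """Pick the response for a utilisation expressed as a fraction of the limit."""
    if utilisation >= 1.0:
        return behaviour["on_limit_breach"]
    if utilisation >= behaviour["throttle_threshold"]:
        return behaviour["on_throttle"]
    if utilisation >= behaviour["warning_threshold"]:
        return behaviour["on_warning"]
    # Below the warning threshold: hypothetical no-op action.
    return {"action": "continue", "reduce_rate_by": 0.0}
```

For example, 42,000 tokens in the last minute against a 50,000 ceiling is a utilisation of 0.84, which falls in the warning tier.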

Does THROTTLE.md work with all AI frameworks?

Yes — it is framework-agnostic. It defines the policy; your agent implementation enforces it. Works with LangChain, AutoGen, CrewAI, Claude Code, custom agents, or any AI system that can read its own configuration files.
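Since enforcement lives in your own agent code, one framework-neutral way to apply the policy is a wrapper around whatever function makes the model or tool call. The sketch below is an assumption about how that could look. `enforce`, `LimitBreached`, and the `check` callback (which returns an action name and a delay in seconds) are all hypothetical.

```python
import functools
import time
from typing import Callable, Tuple


class LimitBreached(RuntimeError):
    """Raised when the policy says to pause; the caller hands off to ESCALATE.md."""


def enforce(check: Callable[[], Tuple[str, float]], sleep: Callable[[float], None] = time.sleep):
    """Decorator: consult the policy before every call to the wrapped function."""
    def decorator(call):
        @functools.wraps(call)
        def wrapper(*args, **kwargs):
            action, delay = check()
            if action == "pause":
                raise LimitBreached("limit reached; escalate before continuing")
            if delay > 0:
                sleep(delay)  # slow down instead of failing
            return call(*args, **kwargs)
        return wrapper
    return decorator


# Usage with a stand-in for a real model call.
@enforce(check=lambda: ("log_and_continue", 0.0))
def call_model(prompt: str) -> str:
    return prompt.upper()


print(call_model("hello"))
```

The same decorator could wrap a LangChain tool, a CrewAI task, or a plain HTTP client call, because it only depends on the `check` callback.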

THROTTLE.md is an open specification for AI agent rate and cost control. It defines LIMITS (tokens/min, API calls/min, concurrent tasks, cost/hour, cost/day), BEHAVIOUR thresholds (warn at 80%, throttle at 95%, pause at 100%), QUEUE management (buffer tasks, priority bypass for safety checks and human responses), and AUDIT logging. It is the first layer of the AI safety stack: THROTTLE → ESCALATE → FAILSAFE → KILLSWITCH → TERMINATE → ENCRYPT → ENCRYPTION → SYCOPHANCY → COMPRESSION → COLLAPSE → FAILURE → LEADERBOARD. Released under the MIT licence.
Last updated: 13 March 2026