The Inference Diet Docs
Set up Clawzempic, understand the routing pipeline, and build smarter bots that cost less.
⚡ Quickstart
📡 API Reference
💻 CLI Commands
🧠 How Routing Works
Getting Started
Quickstart
Get running in under two minutes with a single command.
Get started →
What is Clawzempic?
A drop-in LLM proxy that cuts inference costs by 70-95% through intelligent routing, prompt caching, persistent memory, and security checks.
Learn more →
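Because Clawzempic is a drop-in proxy, most OpenAI-compatible SDKs only need their base URL pointed at it. The endpoint URL and key format below are illustrative placeholders, not documented Clawzempic values; check your own deployment.

```shell
# Hypothetical drop-in setup: the URL and key prefix are placeholders.
# Most OpenAI-compatible SDKs read these standard environment variables.
export OPENAI_BASE_URL="https://api.clawzempic.example/v1"
export OPENAI_API_KEY="czk-..."
```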
How It Works
The optimization pipeline: scoring, routing, caching, windowing, memory, scripts.
Read more →
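The pipeline's first two stages can be sketched as a pure function: score a request's complexity, then pick the cheapest tier whose ceiling clears that score. The heuristic, thresholds, and tier names here are illustrative assumptions, not Clawzempic's actual scoring model:

```python
# Illustrative sketch of complexity scoring + cheapest-first routing.
# The heuristic and thresholds are invented for demonstration.
def score_complexity(prompt: str) -> float:
    """Crude proxy for complexity: longer, question-dense prompts score higher."""
    length_score = min(len(prompt) / 2000, 1.0)
    question_score = min(prompt.count("?") / 5, 1.0)
    return 0.7 * length_score + 0.3 * question_score

# Cheapest-first cascade: (max complexity handled, tier name).
TIERS = [
    (0.25, "nano"),
    (0.50, "small"),
    (0.80, "medium"),
    (1.01, "frontier"),
]

def route(prompt: str) -> str:
    """Return the cheapest tier whose ceiling exceeds the prompt's score."""
    score = score_complexity(prompt)
    for ceiling, tier in TIERS:
        if score < ceiling:
            return tier
    return TIERS[-1][1]
```

Short prompts land on the cheapest tier; only genuinely hard requests escalate to frontier pricing.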
Features
Intelligent Routing
Scores each request's complexity and routes it to the cheapest model that can handle it.
Explore →
Prompt Caching
Automatic cache breakpoints save up to 90% on repeated context.
Explore →
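The savings figure follows from cached-token pricing: with Anthropic-style caching, cache reads are billed at roughly 10% of the base input price. A simplified back-of-envelope sketch (prices and the flat cache-write cost are illustrative):

```python
# Back-of-envelope prompt-caching savings. Prices are illustrative,
# and the first turn's cache-write premium is ignored for simplicity.
base_price_per_mtok = 3.00   # $ per million input tokens (assumed)
cached_multiplier = 0.10     # cache reads at ~10% of base input price

context_tokens = 50_000      # shared context resent every turn
turns = 20

without_cache = turns * context_tokens / 1e6 * base_price_per_mtok
with_cache = (context_tokens / 1e6 * base_price_per_mtok                       # first turn writes the cache
              + (turns - 1) * context_tokens / 1e6 * base_price_per_mtok * cached_multiplier)

savings = 1 - with_cache / without_cache
print(f"savings: {savings:.1%}")
```

With 20 turns over a 50k-token context, most of the repeated input is billed at the cached rate, which is where "up to 90%" figures come from.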
God-Tier Memory
Persistent facts and preferences across sessions, installs, and channels.
Explore →
Context Windowing
Compresses long conversations so your bot never burns its budget.
Explore →
Model Cascades
Customizable four-tier routing mix. Tune the IQ dial per client.
Explore →
Security Shield
Injection detection, credential redaction, and tool inspection on every request.
Explore →
Reference
CLI Reference
init, test, status, savings, doctor, store-key, restore, flags.
View commands →
API Reference
Chat completions, models, pricing, insights, settings, and more.
View endpoints →
Integrations
OpenClaw, generic SDKs, environment variables, and provider setup.
View guides →
Help
Anthropic Provider
Direct Claude API with full prompt caching.
Read →
OpenRouter Provider
300+ models, with Anthropic prompt caching on compatible ones.
Read →
Troubleshooting
Connection errors, key formats, common issues.
Get help →