Show HN: Smart model routing directly in Claude, Codex and Cursor
Smart Model Routing for Claude, Codex, and Cursor
🚀 Overview
The workweave/router is a seamless proxy designed for Anthropic, OpenAI, and Gemini. Instead of relying on "vibes-based" prompting, it utilizes a lightweight, local embedder to intelligently select the most effective model for every individual request.
Developed by Weave—the leading engineering intelligence platform trusted by industry leaders like PostHog, Robinhood, and Reducto—this tool optimizes your LLM workflow.
How it Works
The router acts as a middleman. When you point your tools (like Cursor or Claude Code) to localhost:8080, the router evaluates the request using a cluster scorer based on the Avengers-Pro 1 framework to determine the optimal provider.
Key Capabilities
- Broad Model Support: Access DeepSeek, Qwen, GLM, Kimi, Mistral, and Llama via OpenRouter or any OpenAI-compatible API.
- Full Observability: Monitor routing decisions via the Weave dashboard at
http://localhost:8080/ui/dashboardor integrate with your existing stack (e.g., Grafana, Datadog, or Honeycomb).
🛠️ Getting Started
30-Second Quickstart
The fastest way to integrate the router with Claude Code, Codex, or opencode is via a single command. The installer will guide you through tool selection, scope (user vs. project), and API key configuration.
Prerequisites:
- Node.js
jq(required for Claude Code and opencode paths)
Installation Options
| Command | Purpose |
|---|---|
npx @workweave/router --claude | Direct setup for Claude Code |
npx @workweave/router --codex | Direct setup for OpenAI Codex CLI |
npx @workweave/router --opencode | Direct setup for opencode |
npx @workweave/router --scope project | Saves settings to settings.json or .codex/ / opencode.json per repo |
npx @workweave/router --local | Self-host on localhost:8080 |
npx @workweave/router --base-url <url> | Set a custom internal base URL |
npx @workweave/router@0.1.0 | Install a specific version |
Self-Hosting the Stack
To run the router and the management dashboard on your own hardware:
Step 1: Execute
make full-setupResults:
- Router:
http://localhost:8080- Dashboard:
http://localhost:8080/ui/(Default password:admin)- Key: Your
rk_...key will be generated.
💻 Usage & API
Request Examples
You can interact with the router using standard API formats.
Anthropic Style:
curl -sS http://localhost:8080/v1/messages \
-H "Authorization: Bearer rk_..." \
-d '{
"model":"claude-sonnet-4-5",
"max_tokens":256,
"messages":[{"role":"user","content":"hi"}]
}'
OpenAI Style:
curl -sS http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer rk_..." \
-d '{
"model":"gpt-4o-mini",
"messages":[{"role":"user","content":"hi"}]
}'
Routing Preview:
To see which model the router would pick without actually sending the request:
curl -sS http://localhost:8080/v1/route -H "Authorization: Bearer rk_..."
🔌 Tool Integration Details
Claude Code
Run make install-cc to link Claude Code to your local router.
- Control: Use slash commands
/router-on,/router-off, and/router-status.
OpenAI Codex
The router patches ~/.codex/config.toml (or the project-level config) by adding a managed [model_providers.weave] block.
- Auth: Your
OPENAI_API_KEYstill goes to OpenAI for plan passthrough, but the router uses theX-Weave-Router-Keyheader. - Removal: Use
npx @workweave/router --uninstall --codexto clean up the config.
opencode
The router merges a provider.weave entry into ~/.config/opencode/opencode.json.
- Mechanism: It leverages the
@ai-sdk/anthropicprovider, as the router natively speaks the Anthropic Messages API.
Cursor
- Navigate to Settings Models.
- Override the OpenAI Base URL to
http://localhost:8080/v1. - Enter your
rk_...key.
📊 Technical Reference
Endpoints
| Endpoint | Format | Description |
|---|---|---|
POST /v1/messages | Anthropic | Routed Messages API |
POST /v1/chat/completions | OpenAI | Routed Chat Completions |
POST /v1beta/models/:action | Gemini | Routed generateContent |
POST /v1/route | Internal | Returns routing decision only |
GET /v1/models | Anthropic | Passthrough |
POST /v1/messages/count_tokens | Anthropic | Passthrough |
GET /health | System | Liveness and key validation |
Routing Logic
The routing decision can be conceptualized as a function of the input embedding and the cluster weights :
Documentation & Contribution
- Configuration Reference Env vars, OTel, BYOK encryption.
- Contributing Guide Hot-reload dev, migrations, and tests.
- Architecture Package layout and provider recipes.
Scientific Basis: For more on the logic, see the paper: Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing (Avengers-Pro) on arXiv.
✅ Checklist for Users
- Install Node
- Install
jq(if using Claude Code/opencode) - Run
npx @workweave/router - Configure Base URL in Cursor/Codex
- Verify status via
/router-statusor Dashboard