Smart Model Routing for Claude, Codex, and Cursor

🚀 Overview

The workweave/router is a drop-in proxy for Anthropic, OpenAI, and Gemini. Instead of relying on "vibes-based" prompting, it uses a lightweight, local embedder to select the most effective model for each request.

The router is developed by Weave, the engineering intelligence platform used by teams at PostHog, Robinhood, and Reducto.

How it Works

The router acts as a middleman. When you point your tools (such as Cursor or Claude Code) at http://localhost:8080, the router evaluates each request with a cluster scorer based on the Avengers-Pro framework to choose the provider and model that will serve it.

Key Capabilities

Broad Model Support: Access DeepSeek, Qwen, GLM, Kimi, Mistral, and Llama via OpenRouter or any OpenAI-compatible API.
Full Observability: Monitor routing decisions via the Weave dashboard at http://localhost:8080/ui/dashboard or integrate with your existing stack (e.g., Grafana, Datadog, or Honeycomb).

🛠️ Getting Started

30-Second Quickstart

The fastest way to integrate the router with Claude Code, Codex, or opencode is a single command: `npx @workweave/router`. The installer walks you through tool selection, scope (user vs. project), and API key configuration.

Prerequisites:

Node.js ≥ 18
jq (required for Claude Code and opencode paths)
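The two prerequisites above can be checked before running the installer. The following is a minimal sketch, not part of the router itself; the helper names `node_major` and `check_prerequisites` are hypothetical, and the script relies only on `node --version` printing a string such as `v18.17.0`.

```python
import shutil
import subprocess


def node_major(version_text):
    """Return the major version from `node --version` output, e.g. 'v18.17.0' -> 18."""
    return int(version_text.strip().lstrip("v").split(".")[0])


def check_prerequisites(min_node=18):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    node = shutil.which("node")
    if node is None:
        problems.append("node not found on PATH")
    else:
        result = subprocess.run([node, "--version"], capture_output=True, text=True)
        try:
            if node_major(result.stdout) < min_node:
                problems.append(f"Node.js {result.stdout.strip()} is older than {min_node}")
        except ValueError:
            problems.append("could not parse `node --version` output")
    # jq is only needed for the Claude Code and opencode install paths.
    if shutil.which("jq") is None:
        problems.append("jq not found on PATH (needed for the Claude Code and opencode paths)")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("missing:", problem)
```

If the script prints nothing, both tools were found and Node.js meets the minimum version.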

Installation Options

Command	Purpose
`npx @workweave/router --claude`	Direct setup for Claude Code
`npx @workweave/router --codex`	Direct setup for OpenAI Codex CLI
`npx @workweave/router --opencode`	Direct setup for opencode
`npx @workweave/router --scope project`	Saves settings per repo (in `settings.json`, `.codex/`, or `opencode.json`, depending on the tool)
`npx @workweave/router --local`	Self-host on `localhost:8080`
`npx @workweave/router --base-url <url>`	Set a custom internal base URL
`npx @workweave/router@0.1.0`	Run a specific pinned version

Self-Hosting the Stack

To run the router and the management dashboard on your own hardware:

Run `make full-setup`. When it finishes, the following are available:

Router: http://localhost:8080

Dashboard: http://localhost:8080/ui/ (Default password: admin)

Key: A router key (rk_...) is generated for you.

💻 Usage & API

Request Examples

You can interact with the router using standard API formats.

Anthropic Style:

curl -sS http://localhost:8080/v1/messages \
  -H "Authorization: Bearer rk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model":"claude-sonnet-4-5",
    "max_tokens":256,
    "messages":[{"role":"user","content":"hi"}]
  }'

OpenAI Style:

curl -sS http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer rk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model":"gpt-4o-mini",
    "messages":[{"role":"user","content":"hi"}]
  }'

Routing Preview: To see which model the router would pick without actually sending the request upstream, POST to /v1/route:

curl -sS -X POST http://localhost:8080/v1/route \
  -H "Authorization: Bearer rk_..."
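The same requests can be issued from a script. The following is a minimal sketch using only the Python standard library; `build_request` and `send` are hypothetical helper names, the key `rk_example` is a placeholder, and no assumption is made about the shape of the JSON the router returns.

```python
import json
import urllib.request

ROUTER_URL = "http://localhost:8080"  # default self-hosted address from this README


def build_request(path, payload, router_key):
    """Build a POST request for the router without sending it."""
    return urllib.request.Request(
        ROUTER_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {router_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and return the parsed JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Anthropic-style body, copied from the curl example.
payload = {
    "model": "claude-sonnet-4-5",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "hi"}],
}
request = build_request("/v1/messages", payload, "rk_example")
print(request.get_method(), request.full_url)
# To actually call a running router: print(send(request))
```

Splitting construction from sending keeps the script usable when no router is running. Passing "/v1/route" as the path with the same payload is an assumption about what the preview endpoint accepts, so check the configuration reference before relying on it.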

🔌 Tool Integration Details

Claude Code

Run `make install-cc` to link Claude Code to your local router.

Control: Use the slash commands `/router-on`, `/router-off`, and `/router-status`.

OpenAI Codex

The router patches ~/.codex/config.toml (or the project-level config) by adding a managed [model_providers.weave] block.

Auth: Your OPENAI_API_KEY is still sent to OpenAI for plan passthrough, while the router itself authenticates via the X-Weave-Router-Key header.
Removal: Use `npx @workweave/router --uninstall --codex` to clean up the config.

opencode

The router merges a provider.weave entry into ~/.config/opencode/opencode.json.

Mechanism: It leverages the @ai-sdk/anthropic provider, as the router natively speaks the Anthropic Messages API.
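To illustrate what "merging" a provider entry means here, the sketch below adds a `provider.weave` key to an existing config dictionary while leaving other providers and settings untouched. This is not the installer's code: the function name `merge_weave_provider`, the `options.baseURL` layout, and the `/v1` base URL are assumptions made for illustration.

```python
import copy
import json


def merge_weave_provider(config, base_url="http://localhost:8080/v1"):
    """Return a copy of an opencode config dict with provider.weave set.

    Existing providers and unrelated keys are preserved. The field names
    under "weave" are assumptions, not the installer's documented output.
    """
    merged = copy.deepcopy(config)
    providers = merged.setdefault("provider", {})
    providers["weave"] = {
        "npm": "@ai-sdk/anthropic",        # provider package named in this README
        "options": {"baseURL": base_url},  # hypothetical option layout
    }
    return merged


existing = {"theme": "dark", "provider": {"other": {"npm": "some-package"}}}
print(json.dumps(merge_weave_provider(existing), indent=2))
```

Working on a deep copy means the original config is never mutated, so a failed write cannot leave a half-edited file in memory.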

Cursor

Navigate to Settings → Models.
Override the OpenAI Base URL to http://localhost:8080/v1.
Enter your rk_... key.

📊 Technical Reference

Endpoints

Endpoint	Format	Description
`POST /v1/messages`	Anthropic	Routed Messages API
`POST /v1/chat/completions`	OpenAI	Routed Chat Completions
`POST /v1beta/models/:action`	Gemini	Routed `generateContent`
`POST /v1/route`	Internal	Returns routing decision only
`GET /v1/models`	Anthropic	Passthrough
`POST /v1/messages/count_tokens`	Anthropic	Passthrough
`GET /health`	System	Liveness and key validation
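The `GET /health` endpoint is convenient for scripted readiness checks. The sketch below treats an HTTP 200 as healthy and any connection or HTTP error as unhealthy; the helper name `is_healthy`, the choice to pass the key as a Bearer token, and the 200-means-healthy rule are assumptions, since the table only describes the endpoint as covering liveness and key validation.

```python
import urllib.error
import urllib.request


def is_healthy(base_url="http://localhost:8080", router_key=None,
               timeout=5, opener=urllib.request.urlopen):
    """Return True if GET /health answers with HTTP 200, else False.

    `opener` is injectable so the function can be exercised without a
    running router.
    """
    headers = {"Authorization": f"Bearer {router_key}"} if router_key else {}
    request = urllib.request.Request(base_url + "/health", headers=headers)
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and non-2xx HTTP errors.
        return False


if __name__ == "__main__":
    print("router healthy:", is_healthy())
```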

Routing Logic

The routing decision can be viewed as a function of the input embedding $E$ and the cluster weights $W$: the router scores every candidate model $m$ and selects the one with the highest score, $\text{Model}_{\text{selected}} = \arg\max_{m} \text{Score}(m; E, W)$
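As a toy illustration of this argmax, the sketch below assigns an embedding to its nearest cluster by cosine similarity and then picks the model with the highest score for that cluster. It is not the router's actual scorer: the centroids, the model names `model-a` and `model-b`, and the score values are made up, and real embeddings have far more than two dimensions.

```python
import math

# Hypothetical cluster centroids in a 2-dimensional embedding space.
CENTROIDS = {
    "code": [1.0, 0.0],
    "chat": [0.0, 1.0],
}

# W[cluster][model] -> score (higher is better); values are invented.
W = {
    "code": {"model-a": 0.9, "model-b": 0.6},
    "chat": {"model-a": 0.5, "model-b": 0.8},
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_model(embedding):
    """Pick the best-scoring model for the cluster nearest to `embedding`."""
    cluster = max(CENTROIDS, key=lambda c: cosine(embedding, CENTROIDS[c]))
    scores = W[cluster]
    return max(scores, key=scores.get)


print(select_model([0.9, 0.1]))  # nearest cluster is "code" -> model-a
print(select_model([0.2, 0.7]))  # nearest cluster is "chat" -> model-b
```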

Documentation & Contribution

Configuration Reference → Env vars, OTel, BYOK encryption.
Contributing Guide → Hot-reload dev, migrations, and tests.
Architecture → Package layout and provider recipes.

Scientific Basis: For more on the logic, see the paper: Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing (Avengers-Pro) on arXiv.

✅ Checklist for Users

Install Node.js ≥ 18
Install jq (if using Claude Code/opencode)
Run npx @workweave/router
Configure Base URL in Cursor/Codex
Verify status via /router-status or Dashboard
