← Back to news

Show HN: Smart model routing directly in Claude, Codex and Cursor

github.com|37 points|28 comments|by adchurch|Jun 26, 2026

Smart Model Routing for Claude, Codex, and Cursor

Weave Badge Go Tests License: ELv2 Star History Chart

🚀 Overview

The workweave/router is a seamless proxy designed for Anthropic, OpenAI, and Gemini. Instead of relying on "vibes-based" prompting, it utilizes a lightweight, local embedder to intelligently select the most effective model for every individual request.

Developed by Weave—the leading engineering intelligence platform trusted by industry leaders like PostHog, Robinhood, and Reducto—this tool optimizes your LLM workflow.

How it Works

The router acts as a middleman. When you point your tools (like Cursor or Claude Code) to localhost:8080, the router evaluates the request using a cluster scorer based on the Avengers-Pro 1 framework to determine the optimal provider.

Key Capabilities

  • Broad Model Support: Access DeepSeek, Qwen, GLM, Kimi, Mistral, and Llama via OpenRouter or any OpenAI-compatible API.
  • Full Observability: Monitor routing decisions via the Weave dashboard at http://localhost:8080/ui/dashboard or integrate with your existing stack (e.g., Grafana, Datadog, or Honeycomb).

🛠️ Getting Started

30-Second Quickstart

The fastest way to integrate the router with Claude Code, Codex, or opencode is via a single command. The installer will guide you through tool selection, scope (user vs. project), and API key configuration.

Prerequisites:

  • Node.js 18\ge 18
  • jq (required for Claude Code and opencode paths)

Installation Options

CommandPurpose
npx @workweave/router --claudeDirect setup for Claude Code
npx @workweave/router --codexDirect setup for OpenAI Codex CLI
npx @workweave/router --opencodeDirect setup for opencode
npx @workweave/router --scope projectSaves settings to settings.json or .codex/ / opencode.json per repo
npx @workweave/router --localSelf-host on localhost:8080
npx @workweave/router --base-url <url>Set a custom internal base URL
npx @workweave/router@0.1.0Install a specific version

Self-Hosting the Stack

To run the router and the management dashboard on your own hardware:

Step 1: Execute make full-setup

Results:

  • Router: http://localhost:8080
  • Dashboard: http://localhost:8080/ui/ (Default password: admin)
  • Key: Your rk_... key will be generated.

💻 Usage & API

Request Examples

You can interact with the router using standard API formats.

Anthropic Style:

curl -sS http://localhost:8080/v1/messages \
  -H "Authorization: Bearer rk_..." \
  -d '{
    "model":"claude-sonnet-4-5",
    "max_tokens":256, 
    "messages":[{"role":"user","content":"hi"}]
  }'

OpenAI Style:

curl -sS http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer rk_..." \
  -d '{
    "model":"gpt-4o-mini", 
    "messages":[{"role":"user","content":"hi"}]
  }'

Routing Preview: To see which model the router would pick without actually sending the request: curl -sS http://localhost:8080/v1/route -H "Authorization: Bearer rk_..."


🔌 Tool Integration Details

Claude Code

Run make install-cc to link Claude Code to your local router.

  • Control: Use slash commands /router-on, /router-off, and /router-status.

OpenAI Codex

The router patches ~/.codex/config.toml (or the project-level config) by adding a managed [model_providers.weave] block.

  • Auth: Your OPENAI_API_KEY still goes to OpenAI for plan passthrough, but the router uses the X-Weave-Router-Key header.
  • Removal: Use npx @workweave/router --uninstall --codex to clean up the config.

opencode

The router merges a provider.weave entry into ~/.config/opencode/opencode.json.

  • Mechanism: It leverages the @ai-sdk/anthropic provider, as the router natively speaks the Anthropic Messages API.

Cursor

  1. Navigate to Settings \rightarrow Models.
  2. Override the OpenAI Base URL to http://localhost:8080/v1.
  3. Enter your rk_... key.

📊 Technical Reference

Endpoints

EndpointFormatDescription
POST /v1/messagesAnthropicRouted Messages API
POST /v1/chat/completionsOpenAIRouted Chat Completions
POST /v1beta/models/:actionGeminiRouted generateContent
POST /v1/routeInternalReturns routing decision only
GET /v1/modelsAnthropicPassthrough
POST /v1/messages/count_tokensAnthropicPassthrough
GET /healthSystemLiveness and key validation

Routing Logic

The routing decision can be conceptualized as a function of the input embedding EE and the cluster weights WW: Modelselected=argmax(Score(E,W))\text{Model}_{\text{selected}} = \text{argmax}(\text{Score}(E, W))

Documentation & Contribution

Scientific Basis: For more on the logic, see the paper: Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing (Avengers-Pro) on arXiv.


✅ Checklist for Users

  • Install Node 18\ge 18
  • Install jq (if using Claude Code/opencode)
  • Run npx @workweave/router
  • Configure Base URL in Cursor/Codex
  • Verify status via /router-status or Dashboard

Group 1(1)