Cohere's First Model for Developers
Introducing North Mini Code: Cohere’s Premier Developer Model

Published: June 09, 2026 | Read Time: ~3 minutes
Written By: The Cohere Team | Tags: #ProductLaunch #CompanyNews
"AI isn’t a shortcut."
Today, Cohere is proud to announce the open-source release of North Mini Code. As the first member of our next-generation model family, this is our inaugural agentic coding model specifically engineered for the sovereign developer community.
The Architecture of Efficiency
North Mini Code uses a Mixture-of-Experts (MoE) architecture, in which only a subset of the model's parameters is activated for each token. This lets the model retain a large knowledge capacity while remaining computationally lean.
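Mathematically, the active parameter ratio can be expressed as:
$$\text{active ratio} = \frac{N_{\text{active}}}{N_{\text{total}}} = \frac{3\text{B}}{30\text{B}} = 0.1$$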
By utilizing only 3B active parameters out of a 30B total, North Mini Code provides professional-grade software engineering capabilities without requiring massive hardware clusters. It is designed to be deployed exactly where the developer needs it.
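To make the idea of "active" parameters concrete, here is a minimal sketch of top-k expert routing, the general mechanism behind most MoE layers. It is an illustration only: the expert count, the `top_k_route` and `moe_forward` helpers, and the toy experts are all hypothetical, and nothing here describes North Mini Code's actual router or layer layout.

```python
import math
import random

def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the returned weights sum to 1.
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, gate_logits, k):
    """Combine the outputs of only the k routed experts; the others are never evaluated."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))

# Hypothetical toy layer: 10 experts with 1 active per token. The 10% fraction is
# chosen only to mirror the 3B / 30B ratio; it is not the model's real configuration.
NUM_EXPERTS, TOP_K = 10, 1
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]
rng = random.Random(0)
gate_logits = [rng.uniform(-1.0, 1.0) for _ in range(NUM_EXPERTS)]

routed = top_k_route(gate_logits, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS
output = moe_forward(2.0, experts, gate_logits, TOP_K)
print(f"routed experts: {[i for i, _ in routed]}, active fraction: {active_fraction:.0%}")
```

In a real model the active fraction is not simply experts-used over experts-total, since attention layers and embeddings are typically shared across all tokens. The sketch only shows why per-token compute tracks the routed experts rather than the full parameter count.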
📊 Model Snapshot: North-Mini-Code-1.0
| Feature | Specification |
|---|---|
| License | Apache 2.0 |
| Parameter Count | 30B Total / 3B Active |
| Context Window | 256K total / 64K max generation |
| Primary Optimizations | Code gen, agentic SE, terminal operations |
| Minimum Hardware | |
| Availability | Hugging Face, Cohere API, Model Vault, OpenRouter |
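The context figures in the table suggest a simple budgeting rule when planning long prompts. The sketch below assumes that prompt and generated tokens share the 256K window and that "K" means 1,000 tokens; neither detail is stated in this post, and `max_output_tokens` is a hypothetical helper rather than part of any Cohere SDK.

```python
CONTEXT_WINDOW = 256_000  # "256K total" from the table; assumes K = 1,000 tokens
MAX_GENERATION = 64_000   # "64K max generation"; same assumption

def max_output_tokens(prompt_tokens,
                      context_window=CONTEXT_WINDOW,
                      max_generation=MAX_GENERATION):
    """Largest output budget that fits beside a prompt of the given size.

    Assumes prompt and output tokens are counted against the same window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = context_window - prompt_tokens
    return max(0, min(max_generation, remaining))

for prompt in (10_000, 200_000, 300_000):
    print(prompt, "->", max_output_tokens(prompt))
```

Under these assumptions, the generation cap is the binding limit for short prompts, and the shared window becomes the limit once the prompt exceeds 192,000 tokens.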
Agentic Capabilities & Performance
North Mini Code isn't just for autocomplete; it is built for agency. It is designed to handle complex, multi-step software engineering workflows.
Benchmarking Success
The model holds a competitive position among similarly sized open-source models, achieving a score of 33.4 on the Artificial Analysis Coding Index.

Note: Performance data indicates strong proficiency in real-world software engineering and terminal-based tasks.
The Speed Advantage
Efficiency is central to the design of North Mini Code, focusing heavily on reducing the Total Cost of Ownership (TCO). In internal head-to-head tests against Devstral Small 2, the results were striking:
- Throughput: North Mini Code delivered up to 2.8x higher output throughput under identical hardware and concurrency.
- Latency: It showed a 30% improvement in inter-token latency, ensuring a smoother, more consistent generation pace.
- TTFT: Time-to-first-token was closely matched, though Devstral Small 2 maintained a slight lead.
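For readers who want to reproduce comparisons like these on their own hardware, the three metrics can be derived from per-token arrival timestamps. This is a rough sketch under common definitions, not the methodology behind the figures above. The `generation_metrics` and `relative_improvement` helpers are hypothetical, and the timestamps are made up for illustration.

```python
from statistics import mean

def generation_metrics(request_time, token_times):
    """Derive TTFT, mean inter-token latency, and output throughput (seconds-based)."""
    if not token_times:
        raise ValueError("need at least one token timestamp")
    ttft = token_times[0] - request_time
    # Gaps between consecutive tokens; empty when only one token was produced.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    itl = mean(gaps) if gaps else 0.0
    elapsed = token_times[-1] - request_time
    throughput = len(token_times) / elapsed if elapsed > 0 else float("inf")
    return {"ttft_s": ttft, "itl_s": itl, "tokens_per_s": throughput}

def relative_improvement(baseline, candidate):
    """Fractional reduction of a lower-is-better metric, e.g. 0.3 for 30%."""
    return 1.0 - candidate / baseline

# Made-up timestamps (seconds since the request was sent), not measured data.
slow = generation_metrics(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
fast = generation_metrics(0.0, [0.5, 0.625, 0.75, 0.875, 1.0])
print(slow)
print(fast)
print("ITL improvement:", relative_improvement(slow["itl_s"], fast["itl_s"]))
```

Throughput here counts output tokens over the whole request, including the wait for the first token. Benchmarks that measure decode speed alone, or aggregate across concurrent requests, will report different numbers.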

Empowering the Sovereign Developer
We believe the future of AI should be shaped by those who actually run and test the code. By releasing this model under the Apache 2.0 license, we are moving away from proprietary vendor lock-in and toward Sovereign AI.
Developers now have the flexibility to control their own agentic infrastructure, allowing them to:
- Map complex system architectures.
- Manage and orchestrate specialized sub-agents.
- Conduct automated, high-quality code reviews.
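As a rough illustration of what orchestrating sub-agents can look like in practice, the sketch below wires role-specific wrappers around a single model callable and runs a fixed plan. Every name in it (`make_sub_agent`, `orchestrate`, `echo_model`) is hypothetical, and the model is a stand-in stub so the example runs offline. It is one possible design, not a Cohere framework.

```python
from typing import Callable, Dict, List, Tuple

ModelFn = Callable[[str], str]  # hypothetical: takes a prompt, returns model text

def make_sub_agent(role_instructions: str, model: ModelFn) -> Callable[[str], str]:
    """Wrap a model call with fixed role instructions to form a specialized sub-agent."""
    def run(task: str) -> str:
        return model(f"{role_instructions}\n\nTask:\n{task}")
    return run

def orchestrate(plan: List[Tuple[str, str]],
                agents: Dict[str, Callable[[str], str]]) -> List[Tuple[str, str]]:
    """Run each (agent_name, task) step in order and collect (agent_name, result)."""
    results = []
    for name, task in plan:
        if name not in agents:
            raise KeyError(f"no sub-agent registered for {name!r}")
        results.append((name, agents[name](task)))
    return results

def echo_model(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs without network access."""
    return f"[model saw {len(prompt)} chars]"

agents = {
    "architect": make_sub_agent("Map the system architecture.", echo_model),
    "reviewer": make_sub_agent("Review the code for bugs and style.", echo_model),
}
plan = [
    ("architect", "Describe the module dependencies."),
    ("reviewer", "def add(a, b): return a - b"),
]
for name, result in orchestrate(plan, agents):
    print(name, "->", result)
```

To try this against a real model, `echo_model` could be replaced by any function that sends the prompt to your deployment and returns the response text.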
🚀 Getting Started
We invite the community to help us refine this ecosystem. You can integrate North Mini Code into your workflow today:
- Download weights via Hugging Face.
- Deploy in a managed environment using Cohere Model Vault.
- Test for free using an API key or via OpenCode.
- Join the conversation on X, Discord, or Reddit.
For those looking for technical implementation, refer to our documentation for deployment guides and cookbooks.
```python
# Example: Accessing North Mini Code via API
import cohere

co = cohere.Client('YOUR_API_KEY')
response = co.chat(
    model='north-mini-code-1.0',
    message='Refactor this function for better time complexity...'
)
print(response.text)
```

Footnotes:
¹ Competitor data sourced from original reports or the Artificial Analysis Intelligence Index. Where data was missing (marked with ), internal tests were conducted. Gemma 4 agentic scores were provided by the Qwen team.
² Evaluations utilized the "SWE-agent" harness for SWE-Bench (Verified/Pro), a ReAct harness for Terminal Bench v2, and the Terminus-2 harness for Terminal Bench Hard.