logo
Kurashizu Blogwhere ideas flow
  • Home
  • News
  • Blog
  • About
  • Home
  • News
  • Blog
  • About

© 2026 Kurashizu. All rights reserved.

Admin·New Post·Service Status
Powered by Cloudflare·23.06.26
← Back to blog

What Makes a Good Agent Harness?

May 19, 2026|Kurashizu
harness-engineering

And why the sandbox paradigm is more nuanced than it looks.


There is a question that keeps surfacing whenever engineers move from building LLM-powered prototypes to production agents: how much structure should the scaffolding impose on the model?

Get it wrong in one direction, and your "agent" is just a fancy state machine with an LLM classifier bolted on. Get it wrong in the other direction, and you have an autonomous process that is powerful, unpredictable, and nearly impossible to debug.

This post explores what a well-designed agent harness actually looks like, where the sandbox paradigm fits, and how to think about the tradeoffs.


The Core Tension

This represents a spectrum:

  • State Machine: fully pre-defined control flow
  • ReAct: mixed reasoning + tool use with structure
  • Function Calling: model chooses tools but within fixed contracts
  • Sandbox: model writes and executes code freely inside constraints

The key question is not “which is best”, but which matches the task and model capability.


What a Harness Actually Does

A harness has exactly three legitimate jobs:

  • execute what the LLM decided
  • return results faithfully (including errors)
  • enforce safety boundaries without replacing reasoning

In practice, this means the harness is just an execution layer. It should not “interpret” or “optimize” the model’s intent.


Four Ways Harnesses Go Wrong

1. Structural Lobotomy

Instead of letting the model reason, we pre-route everything into fixed branches like:

  • search
  • answer
  • clarify

This turns the model into a classifier. It cannot dynamically chain tools or revise plans.


2. Context Fragmentation

When we split an agent into many micro-nodes, each node only sees partial history.

The result is predictable: the system loses global coherence, and decisions become locally optimal but globally inconsistent.


3. Swallowed Errors

A common failure mode is silently handling tool failures:

  • retrying without exposing the error
  • substituting fallback outputs
  • hiding timeouts or exceptions

This removes a critical signal. If the model cannot see failures, it cannot adapt its strategy.


4. Harness-Owned Termination

There are two ways to stop an agent loop:

  • Bad default: stop after max_steps
  • Better design: stop when the model explicitly signals completion

The first forces arbitrary cutoffs; the second preserves the model’s agency.


The Ideal Harness Structure

The ideal design is simple:

  1. The user sends a task to the harness

  2. The harness forwards full context + tool definitions to the model

  3. The model decides:

    • what tool to call
    • how to interpret results
    • when to stop
  4. The harness only executes tools and returns raw outputs

A key principle: the model owns decision-making, the harness owns execution.


The Sandbox Paradigm

The sandbox is the extreme version of this design philosophy.

Instead of giving the model a fixed set of tools, we give it a runtime environment where it can write and execute code.

What this changes conceptually:

  • Tool selection becomes program writing
  • Multi-step reasoning becomes code execution flow
  • Intermediate steps are no longer externalized as tool calls, but internalized in code

Tradeoffs:

Advantages

  • extremely expressive
  • flexible for unknown workflows
  • easy to audit (code is explicit)

Limitations

  • requires strong coding ability from the model
  • harder integration with authenticated external systems
  • resource usage becomes less predictable
  • failures can be more complex to debug

Choosing the Right Paradigm

A practical decision rule:

ScenarioBest fit
Fully predictable workflowState machine
Multi-step reasoning with known toolsFunction calling
Open-ended tasks without fixed structureSandbox

A useful intuition:

  • If you already know the steps → don’t use a sandbox
  • If you don’t know the steps → sandbox may be appropriate
  • If you want controlled flexibility → function calling is usually enough

A Note on Model Capability

The most important factor is not architecture — it is model quality.

Weak models:

  • mis-select tools
  • fail to recover from errors
  • lose coherence in long tasks

Strong models:

  • self-correct when tools fail
  • maintain long-horizon intent
  • naturally choose appropriate abstraction levels

In other words, harness design amplifies capability, it does not replace it.


Summary

A good agent harness:

  • keeps the LLM in control of decisions
  • executes tools without interpretation
  • returns full error signals
  • preserves full context
  • avoids imposing artificial structure
  • lets the model decide when to stop

The sandbox paradigm is the most powerful expression of this idea — but also the most demanding.

It works well only when the model is strong enough to operate with that level of freedom. Otherwise, simpler function-calling systems are usually more reliable and easier to control.


Related reading: Does a Harness Lobotomize Your LLM? — on the structural failure modes of over-engineered agent scaffolding.