Formal Methods and the Future of Programming

For a quarter of a century, I have maintained a consistent stance: Jane Street, as an entity, simply wasn't interested in formal methods.

To be clear, we are not Luddites regarding software reliability. We believe deeply in utilizing tools that enable us to produce more robust code. In fact, we view type systems as a form of "lightweight" formal methods, and they have provided us with immense value.

Naturally, one might assume this would lead us toward more rigorous formal verification. However, except for niche applications like hardware synthesis, we concluded that the overhead of full-scale formal methods outweighed the benefits.

The Cost of Absolute Certainty

Consider the example of a formally verified microkernel—a landmark achievement in computer science.

It required 25 person-years of labor to verify a mere 8,700 lines of C. On average, every single line of code demanded roughly 23 lines of proof and half a day of human effort to verify.

While this level of rigor is justifiable for a security-critical microkernel—where the specifications are rigid and the stakes are astronomical—it is impractical for the vast majority of software. For us, it didn't even make sense for our most mission-critical systems.

The Paradigm Shift: Agentic Coding

The rise of agentic coding has fundamentally altered our perspective. We have moved from being ~~skeptical~~ excited about the potential of these tools. Consequently, we are now assembling a dedicated team to explore formal methods.

Our objective is simple: Make formal methods as ubiquitous and practical for software engineering as advanced type systems are today.

Why Agents Change the Equation

Agentic coding disrupts the traditional cost-benefit analysis of formal methods in several ways:

Lowering the Barrier to Entry: While AI agents cannot magically solve every complex proof, they act as powerful intermediaries, allowing a broader range of developers to use these tools effectively.
Addressing the "Verification Bottleneck": There is a widening chasm between the code an AI generates and the code we actually trust in production.

The Problem of "AI Slop"

Models are excellent at hitting a target goal, but they often fail to maintain the overall health of a codebase. Agentic code frequently suffers from "slop":

Unnecessary complexity.
Obscure bugs and edge cases.
Violation of critical codebase invariants.

This creates a massive burden on human reviewers. Formal methods can alleviate this by automating the verification of agent-produced code, making the review process significantly more efficient.

Feedback Loops and Universal Guarantees

Formal methods provide a superior feedback mechanism for both Reinforcement Learning (RL) and active coding agents. While we value testing—and have invested heavily in property-based testing and fuzzing—tests have a fundamental limitation: they cannot explore the entire state space.

In our work with OxCaml, we've noticed that agents thrive when given universal guarantees, represented mathematically as:

$\forall x \in \text{State}, P(x)$

Where $P$ is a property that must hold for all possible inputs. For example:

Data Races: A type system that prevents data races eliminates them $\forall$ instances.
Security: Types that make cross-site scripting (XSS) impossible provide a guarantee that testing alone cannot match.

Method	Scope	Guarantee	Effort (Traditional)	Effort (Agentic)
Testing	Sampled	Probabilistic	Low	Very Low
Type Systems	Structural	Partial/Lightweight	Medium	Low
Formal Methods	Exhaustive	Absolute	Extremely High	Decreasing

Why Jane Street?

Many startups are currently attempting to merge AI agents with formal methods. Why is Jane Street uniquely positioned to succeed?

1. Language Sovereignty

We possess deep control over our programming language. This allows us to modify the language itself to better support proof-oriented workflows. Potential avenues include:

Integrating modular property specifications into the type system.
Adding type-level constraints for ownership and mutability.
Embedding proof techniques directly into the language syntax.

2. A Prepared Community

We have a developer base that is more comfortable with complex abstractions than perhaps any other professional community. In the world of PL (Programming Language) research, the "hard part" isn't the idea—it's getting people to actually use it.

Our Roadmap

Integrate modular specifications.
Enhance ownership/mutability constraints.
Develop agent-driven proof generation.
Reduce the "proof-to-code" ratio: $\frac{\text{Lines of Proof}}{\text{Lines of Code}} \ll 23$

Our users are already accustomed to—and even demand—the "weird" and powerful type-system features we provide. This makes them the perfect partners for the next evolution of programming.