I taught a bucket to speak Git
By Xe Iaso | Senior Cloud Whisperer | June 23, 2026
What occurs when you simply direct a git server to target an object storage bucket?
The Genesis: Billy and the Sandbox
While migrating agent sandboxes to Go, I relied heavily on billy, a filesystem abstraction layer for Go.
The core strategy of that project involved manipulating a Tigris bucket to mimic a filesystem so convincingly that shell interpreters and their associated utilities couldn't detect the ruse. billy was the essential glue that allowed this illusion to function.
Once the system was operational, I realized I was using billy in a way it wasn't originally intended to be used. It was designed specifically for go-git, a library that implements git's data formats and protocols in pure Go. Essentially, every method in the billy interface exists to satisfy the requirements of go-git.
This sparked a "terrible" idea: I already possessed a bucket that could "quack" like a filesystem, and go-git speaks "filesystem" as its native tongue.
Git is Secretly an Object Store
If you peel back the "porcelain" (the user-facing commands), a git repository consists of four fundamental elements:
| Component | Description |
|---|---|
| Objects | Compressed blobs containing the actual data. |
| Commits | Objects that reference a specific tree and a parent commit. |
| Trees | The structure that maps files to their respective blobs. |
| Refs | Mutable pointers (like branches or tags) that point into the object pile. |
Correction on Mental Models:
Until I dove into this, I believed git only stored the incremental patches applied to an empty directory to reconstruct history. In reality, it stores entire files. This explains why the tooling struggles so much when dealing with massive binary blobs. While the "diff" model is great for daily use, it is inaccurate at the storage layer.
Anatomy of a .git Folder
If you initialize a repository and commit a README.md, the internal structure looks like this:
$ tree .git
.git
├── COMMIT_EDITMSG
├── config
├── HEAD
├── index
├── objects
│   ├── 5e
│   │   └── b8151eb669aa4467b6dea2c4bce19183cd0b41
│   ├── 6a
│   │   └── 6a8ecfcae2632152486aca3d9150ef83dedd66
│   ├── f4
│   │   └── d2487a1c6d742c8037c0296ddf80625190bd80
│   ├── info
│   └── pack
└── refs
    ├── heads
    │   └── main
    └── tags
In this example, we have three objects: the commit (5eb8...), the tree, and the README file itself. The main branch is simply a pointer:
$ cat .git/refs/heads/main
5eb8151eb669aa4467b6dea2c4bce19183cd0b41
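The mapping from an object ID to its file is mechanical: the first two hex characters name a directory under objects, and the remaining 38 name the file. A small sketch, assuming SHA-1 object IDs and loose (unpacked) objects; `looseObjectPath` is a hypothetical helper, not part of git or go-git:

```go
package main

import (
	"fmt"
	"path"
)

// looseObjectPath returns where a loose object with the given hex ID lives
// relative to the .git directory: the first two hex characters name the
// fan-out directory and the remaining characters name the file.
func looseObjectPath(id string) (string, error) {
	if len(id) != 40 {
		return "", fmt.Errorf("expected a 40-character SHA-1 hex ID, got %d characters", len(id))
	}
	return path.Join("objects", id[:2], id[2:]), nil
}

func main() {
	// The commit ID that refs/heads/main contains in the example above.
	p, err := looseObjectPath("5eb8151eb669aa4467b6dea2c4bce19183cd0b41")
	if err != nil {
		panic(err)
	}
	fmt.Println(p) // objects/5e/b8151eb669aa4467b6dea2c4bce19183cd0b41
}
```

On a bucket, that path is just an object key, so resolving a branch comes down to two reads: one for the ref and one for the object it names.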
The Synergy with Tigris
The beauty of this system is that it is largely content-addressed. Mathematically, we can view the object store as a function that maps the hash of an object to that object's contents.
Because content-addressed objects never change after they are committed, they are perfectly aligned with Tigris's internal append-only storage model. The only volatile parts are the refs, but since those are tiny files, Tigris handles the updates effortlessly.
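Content addressing is easy to see in code. The sketch below computes the ID git assigns to a blob in a SHA-1 repository, which is the hash of a short header followed by the file's bytes. `blobID` is a name chosen for illustration, and repositories using SHA-256 object IDs are out of scope here.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// blobID computes the SHA-1 object ID git assigns to a blob: the hash of
// the header "blob <size>\x00" followed by the raw content.
func blobID(content []byte) string {
	h := sha1.New()
	fmt.Fprintf(h, "blob %d\x00", len(content))
	h.Write(content)
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Matches `echo 'hello world' | git hash-object --stdin`.
	fmt.Println(blobID([]byte("hello world\n"))) // 3b18e512dba79e4c8300dd08aeb37f8e728b8dad

	// Changing a single byte produces an unrelated key, so an existing
	// object is never overwritten in place.
	fmt.Println(blobID([]byte("hello world\n")) == blobID([]byte("hello World\n"))) // false
}
```

Because the key is derived from the bytes, writing the same object twice is harmless, and a stored object never needs to be updated. Only refs need mutable writes.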
The "Stateful" Nightmare of Git Hosting
Despite the decentralized theory of Git, most of us rely on centralized hubs. This creates a massive architectural problem:
- Single Points of Failure: Repos are often hosted on single machines that are prone to crashing.
- The Binary Dependency: Even giants like GitHub often shell out to the actual git binary to handle storage.
- Cloud Paradox: We strive for stateless cloud-native environments, yet git hosting is one of the most stateful services in existence.
GitHub manages this at an unfathomable scale, but they are forced to use massive mounted filesystems because the git tooling provides no other alternative.
"A travesty of horrors beyond human comprehension."
Searching for a Better Library
If you want to build a git server without relying on a local filesystem, your options are bleak:
- Shelling out to the git binary:
  - Your "API" is just command-line arguments.
  - Error handling requires screen-scraping text.
  - The code is littered with die() calls that kill the entire process.
- Using libgit:
  - You inherit the same die() behavior, leading to random app crashes.
- Using libgit2:
  - Legal headaches regarding the GPL (despite linking exceptions).
  - Constant overhead from jumping between Go and C.
  - Stalled development and archived Go bindings.
  - Still assumes a local filesystem exists.
The Checklist for a Perfect Solution:
- Pure Go implementation
- No dependency on cgo
- No requirement for /usr/bin/git
- Agnostic to the underlying storage (not tied to local FS)
The Solution: go-git
This is where go-git shines. It is a from-scratch implementation of git internals in Go. Crucially, its storage interface is built on billy—the exact interface I had already adapted for Tigris.
Conclusion: Oh no, it works
By combining these pieces, I created objgit: a git server backed entirely by object storage. The transition was nearly seamless; I only had to implement one additional filesystem call to get the whole thing booting.