Dancing mad with sandboxing
Published on , 2808 words, 11 minutes to read
Kefka is a Go-native shell sandbox with coreutils, Python via WebAssembly, and more. Learn the works of madness that went into making this happen!

The definition of an operating system gets really fuzzy when you start looking at the edges of it, but let's say that an operating system is any part of a computer system that doesn't involve pure math. When you print to the screen, render 3D graphics, connect to the internet, or write to files, your code calls into the underlying system to do that work. These system calls are defined by your operating system and are exposed as functions*.
System calls are injected into each operating system process, kinda like how you inject dependencies into your applications for database sessions or object storage operations.
Bashing your head into the wall
A while ago a new JavaScript package got into the meme sphere at work: just-bash. It's a sandboxed environment with a shell interpreter that was originally intended for use with AI agents after its author observed that AI agents know how to use a tool called bash a lot better than a tool called search_documentation. This is backed by a "fake" shell with "fake" core utilities (cat, ls, etc, hereinafter coreutils) so that when an agent decides to rm -rf /, nothing important actually leaves the room. One of my coworkers made @tigrisdata/agent-shell on top of this that uses Tigris as its storage layer.
This is great for people in the JavaScript ecosystem, but I am not mainly a JavaScript developer. I really wanted to play with it, so I started thinking about what it would take to have something like this in Go. mvdan's shell package makes this a heck of a lot easier, meaning that this "fake" shell would be powered by a real shell instead of either porting half of bash to JavaScript or making up hopefully-compatible behaviour.
After a bunch of thought, hacking, and a spot of vibe coding while I did some Dawntrail extreme mount farms, I ended up with Kefka, a "fake" shell with coreutils implementations that lets you put your programs in clown jail. This package lets you add a sandboxed-in-userspace shell to your existing projects without shelling out to the actual implementations of coreutils on your machine.
So I did that
So after some thought, I came up with this interface for the "commands" to use: Execer. This takes process context and passes it as an argument to a function named Exec. Exec then does whatever the process needs it to (list files, write to stdout, etc.) and returns an error if things went wrong and no error if things didn't.
type ExecContext struct {
	Stdin          io.Reader
	Stdout, Stderr io.Writer
	Dir            string
	Environ        expand.Environ
	FS             billy.Filesystem
	// Runner is the active shell runner. Commands that need to dispatch a
	// child command (for example, `time CMD`) should call Runner.Subshell()
	// and re-enter the shell so the call goes through the same exec handler
	// chain instead of poking at the registry directly. May be nil in
	// embedders or tests that have not wired up a runner.
	Runner *interp.Runner
}
type Execer interface {
	Exec(ctx context.Context, ec *ExecContext, args []string) error
}
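As a rough sketch of what a command could look like against this interface, here are two toy commands. This is not code from Kefka: the ExecContext below is a pared-down stand-in that drops Environ, FS, and Runner so the example only needs the standard library, and Echo, CountLines, and runToString are hypothetical names. It also assumes args does not include the command name, which is what the python3 wrapper in this post suggests.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
	"os"
	"strings"
)

// ExecContext is a pared-down stand-in for Kefka's ExecContext. The real
// struct also carries Environ, FS, and Runner.
type ExecContext struct {
	Stdin          io.Reader
	Stdout, Stderr io.Writer
	Dir            string
}

// Execer has the same shape as the interface in the post.
type Execer interface {
	Exec(ctx context.Context, ec *ExecContext, args []string) error
}

// Echo is a hypothetical command that writes its arguments to stdout.
type Echo struct{}

func (Echo) Exec(ctx context.Context, ec *ExecContext, args []string) error {
	_, err := fmt.Fprintln(ec.Stdout, strings.Join(args, " "))
	return err
}

// CountLines is a hypothetical command that counts the lines on stdin.
type CountLines struct{}

func (CountLines) Exec(ctx context.Context, ec *ExecContext, args []string) error {
	n := 0
	sc := bufio.NewScanner(ec.Stdin)
	for sc.Scan() {
		// Stop early if the caller cancelled the command.
		if err := ctx.Err(); err != nil {
			return err
		}
		n++
	}
	if err := sc.Err(); err != nil {
		return err
	}
	_, err := fmt.Fprintln(ec.Stdout, n)
	return err
}

// runToString runs cmd with the given stdin and returns what it wrote to stdout.
func runToString(cmd Execer, stdin string, args ...string) (string, error) {
	var out strings.Builder
	ec := &ExecContext{
		Stdin:  strings.NewReader(stdin),
		Stdout: &out,
		Stderr: io.Discard,
		Dir:    "/",
	}
	err := cmd.Exec(context.Background(), ec, args)
	return out.String(), err
}

func main() {
	out, err := runToString(Echo{}, "", "hello", "world")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```

Because every bit of I/O goes through the context, a command like this can be exercised with in-memory readers and writers and never needs to touch the host.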
This is where I started vibe coding things, mostly via a skill that ports a just-bash command to the Execer interface and filesystem in Go. just-bash itself looks vibe coded from help output and manpages; I tried to go further and stay POSIX compatible, down to matching flag syntax (and in some cases output formats). If your muscle memory fails you, it's a bug in my book.
I made an operating system*
This is basically an operating system: it provides interfaces for programs (well, in this case functions) to get input from a user, send output to a user, interact with a filesystem, and more. Eventually I want to add networking via a network stack on ExecContext, probably with tsnet or wireguard-go's netstack package for the user-level side. Maybe there's room for adding CEL-based network filters there too.
Porting applications with WebAssembly
Once I got basic coreutils working, I thought it would be fun to get Python, jq, and ripgrep working. From previous experimentation back in the strawberry era of AI, I had already gotten Python running in WebAssembly via wazero. This used the stdlib io/fs#FS interface to allow me to inject virtual filesystems into the WebAssembly context. I used this to isolate my chatbot's filesystem state so that it (hopefully) wasn't able to delete anything important by accident.
io/fs#FS has methods for the important stuff, and runtime interface assertions let you bridge the gap for things like writes. But it was really designed for embedded filesystems, and writes get hairy fast once you're talking to object storage or anything that isn't a tree of bytes on disk.
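The runtime interface assertion trick can be sketched like this. WriteFileFS, ErrReadOnly, and memFS are hypothetical names made up for the example; the standard library defines no write-side interface, which is exactly the gap being bridged. The in-memory filesystem is fstest.MapFS with one extra method bolted on.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"testing/fstest"
)

// WriteFileFS is a hypothetical optional extension of fs.FS. io/fs only
// defines read-side interfaces, so write support is discovered at runtime.
type WriteFileFS interface {
	fs.FS
	WriteFile(name string, data []byte, perm fs.FileMode) error
}

// ErrReadOnly is returned when the filesystem has no write support.
var ErrReadOnly = errors.New("filesystem is read-only")

// WriteFile writes through fsys if it implements WriteFileFS.
func WriteFile(fsys fs.FS, name string, data []byte, perm fs.FileMode) error {
	w, ok := fsys.(WriteFileFS)
	if !ok {
		return &fs.PathError{Op: "write", Path: name, Err: ErrReadOnly}
	}
	return w.WriteFile(name, data, perm)
}

// memFS wraps fstest.MapFS and adds the write method.
type memFS struct{ fstest.MapFS }

func (m memFS) WriteFile(name string, data []byte, perm fs.FileMode) error {
	if !fs.ValidPath(name) {
		return &fs.PathError{Op: "write", Path: name, Err: fs.ErrInvalid}
	}
	// Copy the data so later changes by the caller don't leak in.
	m.MapFS[name] = &fstest.MapFile{Data: append([]byte(nil), data...), Mode: perm}
	return nil
}

func newMemFS() fs.FS      { return memFS{fstest.MapFS{}} }
func newReadOnlyFS() fs.FS { return fstest.MapFS{} }

// roundTrip writes content to name and reads it back through the fs.FS API.
func roundTrip(fsys fs.FS, name, content string) (string, error) {
	if err := WriteFile(fsys, name, []byte(content), 0o644); err != nil {
		return "", err
	}
	data, err := fs.ReadFile(fsys, name)
	return string(data), err
}

func main() {
	got, err := roundTrip(newMemFS(), "notes/hello.txt", "hi")
	fmt.Println(got, err)
	_, err = roundTrip(newReadOnlyFS(), "notes/hello.txt", "hi")
	fmt.Println(err)
}
```

This works for one method, but every extra operation (rename, chmod, mkdir, and so on) needs its own optional interface and its own fallback, which is roughly where the approach starts to hurt.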
At some point I hit a wall and had to switch from io/fs#FS to billy, another filesystem interface that I think predates the standard library one. This gives you a bunch more methods that map a lot closer to filesystem semantics in ways that coreutils crave. The interface was also mostly compatible with io/fs#FS so most of the hard part was really changing out the type and then chasing down compiler errors until I found enough of a pattern to have Opus automate the rest of it.
From there it was a matter of adapting billy's filesystem to wazero's experimental sys interface. Mostly glue code, except where I had to translate Go errors into POSIX errno values. I had to read the POSIX spec, the WASI spec, and the wazero source to figure out how to map errors between the two worlds. I think I'm at least 95% correct, which is likely within the margin of porting error.
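The general shape of that error translation might look like the sketch below. It is not Kefka's mapping table: it uses the standard library's syscall.Errno instead of wazero's own errno type, the choice of EACCES for permission errors and EIO as the catch-all are assumptions, and toErrno and errnoName are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"syscall"
)

// toErrno maps a Go error onto a POSIX errno.
func toErrno(err error) syscall.Errno {
	if err == nil {
		return 0
	}
	// If something in the chain is already an errno, keep it.
	var errno syscall.Errno
	if errors.As(err, &errno) {
		return errno
	}
	switch {
	case errors.Is(err, fs.ErrNotExist):
		return syscall.ENOENT
	case errors.Is(err, fs.ErrExist):
		return syscall.EEXIST
	case errors.Is(err, fs.ErrPermission):
		return syscall.EACCES
	case errors.Is(err, fs.ErrInvalid):
		return syscall.EINVAL
	case errors.Is(err, fs.ErrClosed):
		return syscall.EBADF
	default:
		// Anything unrecognised becomes a generic I/O error.
		return syscall.EIO
	}
}

// errnoName returns the symbolic name for the handful of errnos used here.
func errnoName(e syscall.Errno) string {
	switch e {
	case 0:
		return "OK"
	case syscall.ENOENT:
		return "ENOENT"
	case syscall.EEXIST:
		return "EEXIST"
	case syscall.EACCES:
		return "EACCES"
	case syscall.EINVAL:
		return "EINVAL"
	case syscall.EBADF:
		return "EBADF"
	case syscall.EIO:
		return "EIO"
	}
	return fmt.Sprintf("errno(%d)", int(e))
}

// Sample errors: one a filesystem would return, one it has no mapping for.
var (
	errMissing = &fs.PathError{Op: "open", Path: "/nope", Err: fs.ErrNotExist}
	errOpaque  = errors.New("bucket exploded")
)

func main() {
	for _, err := range []error{nil, errMissing, errOpaque} {
		fmt.Printf("%v -> %s\n", err, errnoName(toErrno(err)))
	}
}
```

Matching on sentinel errors with errors.Is keeps the mapping working when a filesystem wraps its errors, which an object storage backend is likely to do.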
Adapting that codeinterpreter/python library to the new interface was mostly straightforward, and I ended up with a flow like this:
// from https://tangled.org/xeiaso.net/kefka/blob/main/command/internal/python3/python3.go
func (Impl) Exec(ctx context.Context, ec *command.ExecContext, args []string) error {
	fsConfig := wazero.NewFSConfig().
		(sysfs.FSConfig).
		WithSysFSMount(billyfs.New(ec.FS), "/")
	config := wazero.NewModuleConfig().
		// Pipe ExecContext stdio
		WithStdin(ec.Stdin).WithStdout(ec.Stdout).WithStderr(ec.Stderr).
		// Pipe argv
		WithArgs(append([]string{"python3"}, args...)...).
		WithName("python3").
		// Pipe filesystem
		WithFSConfig(fsConfig).
		// Pipe system time
		WithSysNanosleep().WithSysNanotime().WithSysWalltime()
	mod, err := runtime.InstantiateModule(ctx, compiled, config)
	if err != nil {
		// Fit the square peg into the round hole
		if exitErr, ok := errors.AsType[*wsys.ExitError](err); ok {
			if code := exitErr.ExitCode(); code != 0 {
				return interp.ExitStatus(uint8(code))
			}
			return nil
		}
		return err
	}
	return mod.Close(ctx)
}
Same trick got me ripgrep and jq. jq was annoying because wasi-sdk doesn't love jq's (ab)use of cmake, but 30 or so minutes of tweaking compiler flags got me a binary that works well enough.
I could see it being pretty easy to port over arbitrary programs to Kefka using WebAssembly like this. There's just one small problem: WASI preview 1 (also known as WASI 0.1) doesn't allow you to open arbitrary network sockets. This has been a huge pain in practice: you can't make HTTP requests, open database connections, or do other common internet things from inside the WASM sandbox. Future work would probably include adapting wazero to use WASIX instead of WASI 0.1.
Using filesystems that don't exist
OK, that handles filesystems that (arguably) exist, like the btrfs volume on my dev box. What about filesystems that don't? For the sake of argument, let's say you want this fake shell to interact with object storage as its main filesystem. At some level all you need to do is adapt the billy interface to object storage using something like storage-go.
After finding a basic implementation of an S3 -> Billy adapter, I vendored it into the Kefka repo and swapped out the "real" filesystem in cmd/kefka for an s3fs implementation pointed at a sample Tigris bucket. From there it was down to an iterative process of running commands, finding feature gaps when errors showed up, implementing them, fuzzing, and making sure things work mostly the same against Tigris as they do against a local filesystem.
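One of the gaps an adapter like this has to paper over is that object storage has no directories, only flat keys. A directory listing is usually faked with a prefix plus a "/" delimiter. The sketch below shows the idea against an in-memory slice of keys; it is not the vendored adapter's code, and listDir is a hypothetical helper standing in for a real listing call.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// listDir emulates a prefix-and-delimiter listing over flat object keys.
// Files come back as plain names; "subdirectories" come back once each with
// a trailing slash, the way common prefixes would.
func listDir(keys []string, dir string) []string {
	prefix := strings.Trim(dir, "/")
	if prefix != "" {
		prefix += "/"
	}
	seen := map[string]bool{}
	var entries []string
	for _, key := range keys {
		rest, ok := strings.CutPrefix(key, prefix)
		if !ok || rest == "" {
			continue // outside this directory, or the directory marker itself
		}
		name := rest
		if i := strings.Index(rest, "/"); i >= 0 {
			name = rest[:i+1] // collapse everything deeper into one entry
		}
		if !seen[name] {
			seen[name] = true
			entries = append(entries, name)
		}
	}
	sort.Strings(entries)
	return entries
}

func main() {
	keys := []string{
		"README.md",
		"samples/hello.js",
		"samples/hello.py",
		"samples/lib/util.js",
	}
	fmt.Println(listDir(keys, "/"))
	fmt.Println(listDir(keys, "/samples"))
}
```

A real backend would push the prefix and delimiter to the server instead of filtering client-side, but the mapping from keys to directory entries is the same.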
WASI is cursed: it has no process-level "current working directory," which most programs assume exists. You patch around it by passing a CWD envvar, or just use absolute paths. I haven't hit anything broken in casual use, but expect rough edges. Here be dragons and this code may be known by the state of California to cause cancer.
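A minimal sketch of the "carry the working directory out of band" workaround: resolve every guest path against a directory string before touching the filesystem. The resolve helper is hypothetical, and where the directory string comes from (an environment variable, ExecContext.Dir, or something else) is left open here.

```go
package main

import (
	"fmt"
	"path"
)

// resolve turns a guest path into a clean absolute path. Absolute inputs are
// only cleaned; relative inputs are joined onto cwd. Rooting the join at "/"
// means ".." can never climb above the sandbox root.
func resolve(cwd, name string) string {
	if path.IsAbs(name) {
		return path.Clean(name)
	}
	return path.Join("/", cwd, name)
}

func main() {
	fmt.Println(resolve("/samples", "hello.js"))
	fmt.Println(resolve("/samples", "../README.md"))
	fmt.Println(resolve("/", "../../etc/passwd"))
}
```

The path package is used rather than path/filepath because guest paths are always slash-separated, whatever the host operating system is.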
Why does it have to use the command line?
Once everything got working with s3fs and a local shell, I wondered how hard it would be to make this work as an SSH server using the github.com/gliderlabs/ssh package. Hooking things up was pretty easy:
func HandleSSH(sess ssh.Session) error {
	// Convenience variables for SSH session values
	var stdout io.Writer = sess
	var stderr io.Writer = sess.Stderr()
	var stdin io.Reader = sess
	ctx := sess.Context() // cancelled when the user disconnects
	// Kefka command registry with coreutils/python/jq/etc
	commands := registry.New()
	coreutils.Register(commands)
	wasmprog.Register(commands)
	// Base envvars for all programs, needed by POSIX
	env := expand.ListEnviron(
		"HOME=/",
		"PWD=/",
		"IFS=\n",
		"HOSTNAME=localhost",
		"USER="+sess.User(),
		// not strictly required, but just-bash sets it
		"MACHTYPE=x86_64-pc-linux-gnu",
	)
	// Create shell engine
	sh, err := interp.New(
		// Set the "interactive" flag so the shell expands aliases
		interp.Interactive(true),
		// Forward our envvars
		interp.Env(env),
		// Wire up stdio
		interp.StdIO(stdin, stdout, stderr),
		// Change the shell exec handler such that it's constrained to the
		// Kefka registry.
		//
		// Strictly speaking you don't have to do this, but if you don't
		// then any time the registry doesn't have a command
		// implementation, interp falls back to its default ExecHandler that
		// executes the command as a subprocess. This is almost certainly
		// not what you want.
		interp.ExecHandlers(constrainToRegistry(commands)),
		// Wire up per-command pwd state to the filesystem implementation
		interp.CallHandler(billysh.CallHandler(commands, fsys, stdout, stderr)),
		// Handle shell-level filesystem I/O (redirects, glob expansion, etc)
		interp.StatHandler(billysh.FsysStatHandler(commands, fsys)),
		interp.FsysOpenHandler(billysh.FsysOpenHandler(commands, fsys)),
		interp.ReadDirHandler2(billysh.FsysReadDirHandler(commands, fsys)),
	)
	// Bail out if the shell engine could not be created
	if err != nil {
		return err
	}
	// Read shell commands
	parser := syntax.NewParser()
	fmt.Fprintf(stdout, "$ ")
	// Split input into commands
	for stmts, err := range parser.InteractiveSeq(stdin) {
		if err != nil {
			return err
		}
		if parser.Incomplete() {
			fmt.Fprintf(stdout, "> ")
			continue
		}
		for _, stmt := range stmts {
			err := sh.Run(ctx, stmt)
			if sh.Exited() {
				return err
			}
		}
		// Show prompt
		fmt.Fprintf(stdout, "$ ")
	}
	return nil
}
The real handler is much messier because Python's REPL needs careful buffering, Ctrl-C has to actually cancel things, and pty wiring is its own can of cans of worms. None of that shows up if it's working. Tab completion and readline polish are easy enough; I'll let you wire those up as an exercise for the reader.
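The constrainToRegistry middleware isn't shown in this post, so here is a guess at its general shape using only the standard library. ExecHandlerFunc mirrors the function type of the same name in mvdan's interp package, but Command, exitStatus, the map-based registry, and the explicit stdout/stderr parameters are all assumptions made for the sketch. The important part is that the handler never calls next, so an unknown command fails instead of falling through to the host.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"strings"
)

// ExecHandlerFunc mirrors the shape of interp.ExecHandlerFunc.
type ExecHandlerFunc func(ctx context.Context, args []string) error

// Command is a hypothetical registry entry.
type Command func(ctx context.Context, stdout io.Writer, args []string) error

// exitStatus is a stand-in for the shell's exit status error.
type exitStatus uint8

func (e exitStatus) Error() string { return fmt.Sprintf("exit status %d", uint8(e)) }

// constrainToRegistry returns middleware that only runs registered commands.
func constrainToRegistry(reg map[string]Command, stdout, stderr io.Writer) func(next ExecHandlerFunc) ExecHandlerFunc {
	return func(next ExecHandlerFunc) ExecHandlerFunc {
		return func(ctx context.Context, args []string) error {
			if len(args) == 0 {
				return nil
			}
			cmd, ok := reg[args[0]]
			if !ok {
				// Deliberately do not call next: that would reach the
				// default handler and spawn a real subprocess.
				fmt.Fprintf(stderr, "%s: command not found\n", args[0])
				return exitStatus(127)
			}
			return cmd(ctx, stdout, args[1:])
		}
	}
}

// demo runs one command line through the middleware with a host handler that
// should never be reached, and returns whatever was written to stdout.
func demo(args ...string) (string, error) {
	var out strings.Builder
	reg := map[string]Command{
		"echo": func(ctx context.Context, stdout io.Writer, args []string) error {
			_, err := fmt.Fprintln(stdout, strings.Join(args, " "))
			return err
		},
	}
	hostExec := func(ctx context.Context, args []string) error {
		return errors.New("fell through to the host")
	}
	handler := constrainToRegistry(reg, &out, io.Discard)(hostExec)
	err := handler(context.Background(), args)
	return out.String(), err
}

func main() {
	fmt.Println(demo("echo", "hi"))
	fmt.Println(demo("curl", "https://example.com"))
}
```

Exit status 127 is the conventional "command not found" code, so scripts that check for it behave the way they would in a real shell.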
If you want to try it today, you can ssh into sophia.xeiaso.net:
$ ssh sophia.xeiaso.net
You'll get an isolated sandbox in your own bucket fork/branch. Every ls is a ListObjectsV2 against the bucket. Every qjs or python3 runs WebAssembly on the server, wired to that same bucket.
$ cat ./samples/hello.js
console.log("Hello, world!");
$ qjs ./samples/hello.js
Hello, world!
The demo bucket is seeded with examples. You'll probably have to poke around to find everything. Worst case, run help.
I want more experimental WebAssembly hacks like this to exist. I'll keep poking at it.
Put your programs in clown jail
With some effort, yeet could use Kefka's shell utilities to run Anubis builds on Windows; and if management ever makes you babysit AI agents, clown jail is a decent answer.
The code lives on Tangled. I'm wiring it into an agent harness so I can automate small tools against a local model (I'm loving Qwen3-36B-A3B).
There's a sister post on the Tigris blog that goes deeper into the AI-agent angle and the porting work using Claude Code. If you want, you can check it out here:

Facts and circumstances may have changed since publication. Please contact me before jumping to conclusions if something seems wrong or unclear.