Every thread (agent turns + terminal + project commands) now execs into one
persistent per-user container (spoon-box-{username}) instead of ephemeral
docker run --rm — so the agent and terminal share the exact same running
environment, filesystem, and in-session installs.
- docker.ts: ensureUserContainer (persistent box) + streamExecInContainer/
runExecInContainer (docker exec, streaming) sharing a factored streamSubprocess
- user-container.ts: reference-counted box lifecycle (held while any thread
workspace is active or a terminal is connected; idle-reaped after
SPOON_AGENT_BOX_IDLE_MS, default 30m)
- worker.ts: runClaim acquires the box; codex turn + runProjectCommand exec into
it; release on stop/PR/failure
- terminal.ts: execs into the shared box (dockerode TTY) instead of a per-job
container; materializeUserHome runs the dotfiles setup in the box
- Verified: agent + terminal run in the same box, share fs, dotfiles + tmux load
- Each job/terminal now mounts a persistent per-user home (${workdir}/homes/
{username}) at /home/{username}; the thread checkout lives at
~/Code/{spoon}/{branch} so every thread shows up as a folder in one home and
dotfiles/tools/nvim plugins persist across sessions
- docker.ts helpers + git.ts cloneRepository take container home/cwd + dir name
(backward-compatible defaults); codex/opencode/terminal use the per-user paths
- new user-environment.ts: fetchUserEnvironment (worker-token Convex action) +
materializeUserHome — ensures ~/.bash_profile, applies the editable overlay
files, and (hashed/idempotent) clones the public dotfiles repo + runs the
setup command inside the job image
- stopWorkspace no longer deletes the home; only the container stops
- Verified: codex runs a real turn under the new /home/{user} + ~/Code layout;
overlay .bashrc loads in the interactive shell
- attachTerminalServer() upgrades /jobs/:id/terminal WS connections, verifying a
short-lived job-scoped HMAC token (verifyTerminalToken) so the browser never
holds the worker secret
- Bridges the socket to a bash PTY via dockerode exec (Tty) in a persistent
per-job shell container (spoon-agent-term-<id>) mounting the workspace; binary
frames = stdin, JSON text frames = resize; idle containers reaped after 30m
- New env: SPOON_AGENT_TERMINAL_IMAGE/SECRET/IDLE_MS (secret falls back to the
shared worker internal token)
Root cause of the prod empty-response: the spoon-agent-worker image shipped
without a docker CLI binary, so it could never launch the codex job container.
On Debian trixie (the bun base) 'docker.io' + --no-install-recommends installs
the daemon package but omits the client (split into 'docker-cli'), leaving no
'docker' on PATH. execa('docker', ...) hit ENOENT, and with reject:false that
resolves with exitCode undefined -> coerced to 0 -> looked like a successful
empty run -> 'Codex completed without producing an assistant response'.
- agent-worker.Dockerfile: drop docker.io, install the official static docker
CLI client pinned to 29.5.3 (matches the host daemon) to /usr/local/bin/docker
- runtime/docker.ts: normalizeRunResult() so a spawn failure (exitCode null) is
always a non-zero exit carrying the real reason, never a silent empty success
- tests: cover the spawn-failure and normal-result paths
- Pin codex@0.142.0 + opencode-ai@1.17.9 in the job image (was @latest,
causing dev/prod drift)
- Worker now s the job image once per process so prod stops
running a stale Codex
- Surface Codex error/turn.failed events instead of swallowing them, so the
real failure reason is reported rather than 'no assistant response'
- Harden the Codex JSON parser to also handle the legacy msg-wrapped shape
- Fix the docker-in-docker workdir: bind-mount identical host:container path
and set SPOON_AGENT_HOST_WORKDIR (named volume can't be mounted by sibling
job containers)
- Add docs/compose.prod.yml as a documented reference deployment