- Base fedora:41; reinstall toolchain (node 22, gcc/c++, neovim, tmux, git, etc.)
- Add QoL CLI from the user's Panama setup, all in default Fedora repos: zoxide,
eza, bat, fzf, fd, gh, gum, ripgrep, bash-completion; oh-my-posh via installer;
pnpm/yarn/bun via npm; keep codex@0.142.0 + opencode@1.17.9 pinned
- Ship neutral system-wide defaults that work even with an empty/mounted HOME:
/etc/profile.d/spoon.sh (zoxide/eza/fzf/oh-my-posh init + aliases),
/etc/tmux.conf (login-shell panes), /etc/spoon/omp.json (default prompt theme)
- .dockerignore: re-include docker/agent-job-rootfs into the build context
- Verified: codex runs a real turn on Fedora (exit 0); all tools present
- agent-job image: add neovim, tmux, less, unzip, wget, locales for the
interactive shell (tmux powers cross-reconnect session persistence)
- Wire NEXT_PUBLIC_SPOON_AGENT_WORKER_WS_URL as a build arg (Dockerfile +
compose.yml) since NEXT_PUBLIC vars are inlined at build time
- docs/agent-terminal.md: architecture, env, nginx WS exposure, dev testing,
security; note the build-time var in docs/compose.prod.yml
Root cause of the prod empty-response: the spoon-agent-worker image shipped
without a docker CLI binary, so it could never launch the codex job container.
On Debian trixie (the bun base) 'docker.io' + --no-install-recommends installs
the daemon package but omits the client (split into 'docker-cli'), leaving no
'docker' on PATH. execa('docker', ...) hit ENOENT, and with reject:false that
resolves with exitCode undefined -> coerced to 0 -> looked like a successful
empty run -> 'Codex completed without producing an assistant response'.
- agent-worker.Dockerfile: drop docker.io, install the official static docker
CLI client pinned to 29.5.3 (matches the host daemon) to /usr/local/bin/docker
- runtime/docker.ts: normalizeRunResult() so a spawn failure (exitCode null) is
always a non-zero exit carrying the real reason, never a silent empty success
- tests: cover the spawn-failure and normal-result paths
- Pin codex@0.142.0 + opencode-ai@1.17.9 in the job image (was @latest,
causing dev/prod drift)
- Worker now s the job image once per process so prod stops
running a stale Codex
- Surface Codex error/turn.failed events instead of swallowing them, so the
real failure reason is reported rather than 'no assistant response'
- Harden the Codex JSON parser to also handle the legacy msg-wrapped shape
- Fix the docker-in-docker workdir: bind-mount identical host:container path
and set SPOON_AGENT_HOST_WORKDIR (named volume can't be mounted by sibling
job containers)
- Add docs/compose.prod.yml as a documented reference deployment