roboco-bench-2026-06-19
3 HIGH findings: v1 role-header auth bypass, bash-guard substitution bypass, transcript-retention path-prefix collision
What was found
AntFleet's two-model consensus review (Claude Opus 4.7 + GPT-5) ran against 3 PRs on [AntFleet/bench-roboco](https://github.com/AntFleet/bench-roboco), covering the provider / LLM credential layer, the orchestrator + agent-spawn runtime, and the FastAPI auth / middleware / git-route surface.
7 unanimous findings — both models independently flagged the same defects. 3 HIGH, 3 MEDIUM, 1 LOW.
HIGH
1. v1 role guards trust the role header without authenticating the agent token (roboco/api/routes/v1/_role_dep.py). The role-gated v1 endpoints check a role header value but do not bind it to an authenticated agent context — a caller that sets the header to a privileged role passes the guard without proving they hold the matching agent token.
2. Bash guard strips executable command substitutions before checking denied git ops (docker/scripts/bash-guard-hook.sh). The pre-execution shell hook removes $(...) / backtick expansions and (unquoted) heredoc bodies before pattern-matching against the deny-list. A command that hides a denied git operation inside a command substitution — e.g. echo "$(git push --force ...)" — passes the guard and is then expanded by the shell when the command actually runs.
3. Transcript retention can prune non-agent sessions with matching path prefixes (roboco/runtime/transcript_retention.py). The cleanup routine matches Claude session paths by encoded-prefix comparison, so the workspace root /data/workspaces also matches siblings like /data/workspaces-old and /data/workspaces2. Transcripts unrelated to the agent (operator notes, other projects) get silently pruned.
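The prefix collision in finding 3 can be illustrated with a minimal sketch. It assumes the cleanup can compare decoded filesystem paths rather than encoded session directory names. The function names are hypothetical, and this is not the upstream fix. The naive check reproduces the string-prefix behaviour described above; the boundary-aware check compares whole path components.

```python
from pathlib import PurePosixPath

# Workspace root named in the finding; everything else here is illustrative.
WORKSPACE_ROOT = PurePosixPath("/data/workspaces")


def naive_is_agent_path(path: str, root: str = "/data/workspaces") -> bool:
    """String-prefix test: also matches siblings such as /data/workspaces-old."""
    return path.startswith(root)


def is_agent_path(path: str, root: PurePosixPath = WORKSPACE_ROOT) -> bool:
    """Component-wise test: true only for the root itself or paths beneath it.

    Note: PurePosixPath does not resolve '..' or symlinks, so callers would
    still need to normalise untrusted paths before relying on this check.
    """
    return PurePosixPath(path).is_relative_to(root)  # Python 3.9+


if __name__ == "__main__":
    for candidate in (
        "/data/workspaces/agent-1/session.jsonl",
        "/data/workspaces-old/notes.jsonl",
        "/data/workspaces2/other.jsonl",
    ):
        print(candidate, naive_is_agent_path(candidate), is_agent_path(candidate))
```

If only the encoded directory names are available, a boundary check on the encoded string may still be ambiguous, depending on how path separators are encoded, so comparing decoded paths is the safer assumption here.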
MEDIUM
- Orchestrator lifecycle routes have no visible authorization guard (roboco/api/routes/orchestrator.py). The start/stop/restart endpoints carry no local role dependency; if the global protection is ever moved or refactored, these routes become exposed to any authenticated caller.
- **`upsert_assignment` auto-enables the LOCAL provider on any assignment** (roboco/services/provider.py). Assigning an AGENT_SLUG or ROLE to the LOCAL provider silently flips the provider's enabled flag, which is surprising in mix-mode setups where LOCAL is deliberately disabled.
- **`request_validation_handler` echoes the raw rejected body to logs** (roboco/api/middleware.py). Bodies that fail Pydantic validation are logged verbatim, so a malformed request can leak partial API keys or tokens into the logs.
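One possible mitigation for the logging finding is to log only where validation failed and why, never the submitted values. The sketch below is not the project's handler. It assumes error entries shaped like Pydantic's error dictionaries (`type`, `loc`, `msg`, and an optional `input` that carries the rejected value), and the helper name is hypothetical.

```python
from typing import Any, Iterable

# Keys considered safe to log. 'input' and 'ctx' may carry request data,
# and 'msg' is dropped conservatively (assumption, not a Pydantic guarantee).
SAFE_KEYS = ("type", "loc")


def summarize_validation_errors(
    errors: Iterable[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Return a log-safe summary: the error type and field location only."""
    return [{key: err[key] for key in SAFE_KEYS if key in err} for err in errors]


# Example entry modelled on a Pydantic-style error for a misspelled field.
sample_errors = [
    {
        "type": "missing",
        "loc": ("body", "api_key"),
        "msg": "Field required",
        "input": {"api_kee": "sk-partial-123"},  # rejected body, must not be logged
    }
]

log_safe = summarize_validation_errors(sample_errors)
```

A handler built this way would pass `log_safe` to the logger instead of the request body, which keeps field names for debugging while leaving out any secret that arrived in a malformed payload.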
LOW
- **`set_ollama_api_key` with an empty string clears and disables**, but the route docstring promises only "clear" (roboco/api/routes/provider.py). The mismatch leaves an ambiguous tri-state between the documented contract and the implementation.
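The ambiguity can be removed by giving each input exactly one documented effect. The sketch below implements the docstring's reading ("clear" only) purely as an illustration. The `OllamaSettings` type and `apply_api_key` helper are hypothetical, and whether upstream intends clear-only or clear-and-disable is not stated in the finding.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class OllamaSettings:
    """Hypothetical stand-in for the stored provider settings."""
    api_key: Optional[str]
    enabled: bool


def apply_api_key(settings: OllamaSettings, value: Optional[str]) -> OllamaSettings:
    """Apply an API-key update with three explicit cases.

    None      -> leave the stored key untouched
    ""        -> clear the stored key
    otherwise -> store the new key

    The enabled flag is never modified here; disabling the provider would be
    a separate, explicit operation.
    """
    if value is None:
        return settings
    if value == "":
        return replace(settings, api_key=None)
    return replace(settings, api_key=value)
```

Keeping the enabled flag out of this function means the docstring and the behaviour cannot drift apart on that point; a caller that wants both effects has to request them separately.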
Methodology
All findings emerged from AntFleet's two-model consensus pipeline — both Claude Opus 4.7 and GPT-5 had to independently flag the same defect for it to land here. No synthetic diffs; PRs are real files from [rennf93/roboco](https://github.com/rennf93/roboco) HEAD, curated by security-relevant surface area.
Evidence
- Benchmark repo: AntFleet/bench-roboco
- Source repo: rennf93/roboco
- Provider/LLM PR: bench-roboco#1, 2 findings (1 MED, 1 LOW); review f612ad01-2ebe-4b50-9565-c08c4a759420
- Orchestrator/agent_sdk PR: bench-roboco#2, 3 findings (2 HIGH, 1 MED); review c2b87ca9-d457-4ba6-a489-8f362b008677
- API auth/middleware PR: bench-roboco#3, 2 findings (1 HIGH, 1 MED); review f9650e85-33e2-441b-b4e3-b2b20fef3046
- Upstream fix PRs (antfleet-ops):
  - rennf93/roboco#221: transcript-retention path-boundary fix
  - rennf93/roboco#222: v1 role guard HMAC binding
  - rennf93/roboco#223: bash-guard substitution pre-check
- Token: 0x4883…fba3 on Base