A gate turns a step’s exit code into a decision: did this step succeed, and if not, what should happen? Gates let you build bounded retry loops — run a step, check a condition, and restart from an earlier step until it passes — without any external scripting or orchestration. This is how Shipfox turns an agent into an iterating worker: the agent edits, a run step judges the result, and the gate sends the agent back until the check passes.

The gate schema

A gate is declared on a run step under the gate key:
steps:
  - run: npm test
    key: test
  - run: ./flaky-integration.sh
    gate:
      success_if: "exit_code == 0"
      on_failure:
        restart_from: test
        output: "Integration check failed; restarting from tests"
gate.success_if (string)
A CEL expression that decides success. The only variable available is exit_code, the step’s process exit code. Examples: exit_code == 0, exit_code < 5. Required unless on_failure is present.
gate.on_failure.restart_from (string)
The key of an earlier step in the same job to restart from when the gate fails (set key: on the target step). Execution resumes at that step.
gate.on_failure.output (string)
An optional message recorded with the restart decision, shown in the run timeline.
Gates are valid on both run steps and agent steps; success_if always evaluates the step’s exit_code. The most reliable loop still puts the gate on a run step that objectively verifies the agent’s work (tests, a linter), with restart_from pointing back at the agent.

How a gate is evaluated

After a gated step finishes, Shipfox evaluates the gate and reaches one of three outcomes:
| Outcome | When | Result |
| --- | --- | --- |
| Passed | success_if evaluates true | The step succeeds; the job continues |
| Failed | success_if is false | Restart from restart_from if the budget allows; otherwise the job fails |
| Uncheckable | No exit code, or the CEL expression errors | The job fails immediately and never restarts |
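
The three outcomes can be modeled in a few lines of Python. This is an illustrative sketch of the decision described above, not Shipfox's implementation: the names Outcome and evaluate_gate are hypothetical, and a plain Python callable stands in for the CEL expression in success_if.

```python
from enum import Enum
from typing import Callable, Optional


class Outcome(Enum):
    PASSED = "passed"
    FAILED = "failed"
    UNCHECKABLE = "uncheckable"


def evaluate_gate(
    exit_code: Optional[int],
    success_if: Callable[[int], bool],  # stand-in for the CEL expression (assumption)
) -> Outcome:
    """Classify a finished step as passed, failed, or uncheckable."""
    # No exit code means there is nothing to evaluate.
    if exit_code is None:
        return Outcome.UNCHECKABLE
    try:
        passed = success_if(exit_code)
    except Exception:
        # Models a CEL expression that errors during evaluation.
        return Outcome.UNCHECKABLE
    return Outcome.PASSED if passed else Outcome.FAILED


if __name__ == "__main__":
    print(evaluate_gate(0, lambda code: code == 0))     # gate passes
    print(evaluate_gate(1, lambda code: code == 0))     # gate fails, restart is considered
    print(evaluate_gate(None, lambda code: code == 0))  # uncheckable, job fails
```

Only the Failed outcome can lead to a restart; Uncheckable is kept separate so that a broken expression or a missing exit code is never retried.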

Loop bounds

Gates create loops, so Shipfox bounds them for you — there is no way to author an unbounded retry.
  • Per-step restart cap. A gating step is allowed 3 attempts by default (one initial run plus two restarts). When the cap is exhausted and the gate still fails, the step is marked failed and the whole job stops.
  • Job execution timeout. Independently, the job-level execution_timeout bounds total wall-clock time (default 6 hours). Whichever bound is reached first ends the loop.
Design loops to converge within a few attempts; the cap is a safety net, not a budget to spend.
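
To see how the two bounds interact, here is a small Python simulation. It is a sketch under stated assumptions rather than Shipfox's scheduler: run_bounded and attempt_fn are hypothetical names, each attempt reports its own duration, and the timeout is checked after an attempt finishes (a real timeout would presumably interrupt a running step). The defaults of 3 attempts and 6 hours come from the text above.

```python
from typing import Callable, Tuple

DEFAULT_MAX_ATTEMPTS = 3         # one initial run plus two restarts
DEFAULT_TIMEOUT_S = 6 * 60 * 60  # default execution_timeout of 6 hours


def run_bounded(
    attempt_fn: Callable[[int], Tuple[int, float]],
    max_attempts: int = DEFAULT_MAX_ATTEMPTS,
    timeout_s: float = DEFAULT_TIMEOUT_S,
) -> Tuple[str, int]:
    """Run attempts until the gate passes or a bound is hit.

    attempt_fn(n) returns (exit_code, duration_seconds) for attempt n.
    Returns (final status, number of attempts used).
    """
    elapsed = 0.0
    for attempt in range(1, max_attempts + 1):
        exit_code, duration = attempt_fn(attempt)
        elapsed += duration
        # Whichever bound is reached first ends the loop.
        if elapsed > timeout_s:
            return ("failed: timeout", attempt)
        if exit_code == 0:  # models success_if: "exit_code == 0"
            return ("passed", attempt)
    return ("failed: restart cap", max_attempts)


if __name__ == "__main__":
    # Passes on the second attempt, well within both bounds.
    print(run_bounded(lambda n: (0 if n == 2 else 1, 60.0)))
    # Never passes: the per-step cap ends the loop.
    print(run_bounded(lambda n: (1, 60.0)))
    # Each attempt takes 4 hours: the job timeout ends the loop first.
    print(run_bounded(lambda n: (1, 4 * 3600.0)))
```

In the last case the loop stops on the second attempt even though one restart remains, which is why slow steps make the timeout the effective bound.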

Gate and retry guide

A worked example that retries a step until it passes.

Jobs & Steps

Step kinds, job isolation, and execution_timeout.