feat: skip Claude step gracefully when Anthropic API is unavailable #29
Add a pre-flight availability check before running claude-code-action. When the Anthropic API returns 5xx or is unreachable, the step is skipped with a warning rather than failing the pipeline. A second check after any runtime failure distinguishes service outages from legitimate errors.
Fixes: dotCMS/core#35328
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6a0d3d308f
*)
  echo "::warning::Claude Code step failed due to API service degradation (HTTP $HTTP_CODE). Skipping gracefully."
Fail workflow on non-service Claude errors
The post-failure classifier treats every non-200 response as a service outage, so legitimate errors like invalid/missing API key (401/403) or rate limiting (429) will now be silently downgraded to a warning and the job will succeed. This is a regression introduced by continue-on-error: true plus this broad * branch: when Run Claude Code fails for auth/config issues, Handle Claude execution result no longer re-fails the job, which can hide broken CI configuration.
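One way to narrow the branch, as a minimal sketch rather than the workflow's actual code: treat only 5xx responses and curl's `000` (no response received) as an outage, and re-fail on everything else, including 200. The function names `classify_http_code` and `handle_claude_failure` are hypothetical and do not appear in the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify the HTTP status from the post-failure re-check.
# Prints "outage" when the service itself looks degraded, "error" otherwise.
classify_http_code() {
  case "$1" in
    000|5[0-9][0-9]) echo "outage" ;;  # 000 = curl received no HTTP response
    *)               echo "error"  ;;  # 200, 401, 403, 429, ...
  esac
}

# Hypothetical step body: warn and pass on an outage, otherwise re-fail the job.
handle_claude_failure() {
  local http_code="$1"
  if [ "$(classify_http_code "$http_code")" = "outage" ]; then
    echo "::warning::Claude Code step failed due to API service degradation (HTTP $http_code). Skipping gracefully."
    return 0
  fi
  echo "::error::Claude Code step failed while the API answered HTTP $http_code. Failing the job."
  return 1
}
```

With this shape, a 401, 403, or 429 seen after a failed Claude step makes the handler return non-zero, so the job fails even though the Claude step itself runs with continue-on-error: true.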
Summary
- Pre-flight availability check before running `claude-code-action`
- `continue-on-error: true` + post-execution re-check distinguishes service outages from legitimate errors
Problem
When the Anthropic API is down, the Claude step fails with a hard error, blocking the entire CI pipeline. Example: dotCMS/core run 24461196854
Solution
Two layers of protection in `claude-executor.yml`:
Layer 1: pre-flight check (catches most outages)
- `curl` the `/v1/models` endpoint with a 15s timeout before running Claude
- `available=false` → skip Claude step → warn and succeed
- `available=true` → proceed so action can surface the specific error
Layer 2: runtime protection (catches mid-execution degradation)
- `continue-on-error: true` on the Claude step
Test
Validated in `dotCMS/core-workflow-test#460`:
- `Handle Claude execution result` correctly re-fails for non-service errors (workflow validation failure in test PR)
Consumer repos to update after merge
- `dotCMS/core`: update `@v2.0.0` → new tag
- `dotCMS/core-workflow-test`: update `@v2.0.0` → new tag
Fixes: dotCMS/core#35328
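For reference, the Layer 1 pre-flight check could look roughly like the sketch below. It is an illustration under assumptions, not the code in `claude-executor.yml`: the host in `API_URL` and the function names `probe_status` and `availability_for` are made up for the example, and only the `/v1/models` path, the 15s timeout, and the `available` output come from the description above.

```shell
#!/usr/bin/env bash
# Illustrative pre-flight probe (names and URL are assumptions, see lead-in).
API_URL="${API_URL:-https://api.anthropic.com/v1/models}"

# Print only the HTTP status code of a GET against the given URL.
probe_status() {
  # -s: silent, -o /dev/null: discard body, -w: print the status code,
  # --max-time 15: give up after 15 seconds.
  # curl prints 000 when no HTTP response was received; `|| true` keeps a
  # non-zero curl exit from aborting a step that runs with `set -e`.
  curl -s -o /dev/null -w '%{http_code}' --max-time 15 "$1" || true
}

# Map a status code to the step output described in Layer 1.
availability_for() {
  case "$1" in
    000|5[0-9][0-9]) echo "available=false" ;;  # unreachable or server error
    *)               echo "available=true"  ;;  # let the action report 4xx itself
  esac
}

# In a workflow step, the result would be appended to "$GITHUB_OUTPUT":
#   availability_for "$(probe_status "$API_URL")" >> "$GITHUB_OUTPUT"
```

Keeping the probe and the mapping separate means the mapping can be checked without network access, and the Claude step can then be gated on the `available` output.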