Databricks expose task repair params by Beat-Nick · Pull Request #3 · Beat-Nick/airflow

Beat-Nick · 2026-06-29T19:52:35Z

Add Databricks-native retry settings to task operators

Summary

Adds first-class Databricks task retry settings to DatabricksNotebookOperator and DatabricksTaskOperator: max_retries, min_retry_interval_millis, and retry_on_timeout.

These are Databricks task-level retries, not Airflow task retries. Databricks reruns the failed task attempt inside the same job run; Airflow retries rerun the operator.

The payload shape change is gated on explicit retry configuration, so existing standalone tasks keep their current runs/submit payload unless users opt in by setting a Databricks retry field.

This follows the recovery-model discussion in apache/airflow#68358: native task retries handle transient task failures first, while workflow repair remains separate follow-up work for run-level recovery.

Details

The retry fields live on Databricks Jobs API tasks, so the implementation sits in DatabricksTaskBaseOperator and applies to both standalone submits and tasks inside DatabricksWorkflowTaskGroup.

For standalone DatabricksNotebookOperator and DatabricksTaskOperator, _get_run_json() switches to the tasks[] submit form only when a retry field is configured through operator arguments or, for DatabricksTaskOperator, task_config. This is required because Databricks ignores these fields at the top level of runs/submit; they must be placed on a SubmitTask.
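The gating rule above can be sketched as follows. This is a minimal illustration, not the operator's actual code: `build_run_json` is a hypothetical stand-in for `_get_run_json()`, and the sketch assumes unset fields are represented as `None`.

```python
from __future__ import annotations

from typing import Any

# Retry field names as listed in the PR summary.
RETRY_FIELDS = ("max_retries", "min_retry_interval_millis", "retry_on_timeout")


def build_run_json(task_key: str, task_spec: dict[str, Any], run_name: str) -> dict[str, Any]:
    """Hypothetical stand-in for _get_run_json(); illustrates the payload gating only."""
    # Drop unset values so they never reach the payload.
    settings = {k: v for k, v in task_spec.items() if v is not None}
    has_retry_config = any(field in settings for field in RETRY_FIELDS)
    if not has_retry_config:
        # Existing shape: task settings sit at the top level of the submit payload.
        return {"run_name": run_name, **settings}
    # Opt-in shape: retry fields are placed on a task entry inside tasks[].
    return {"run_name": run_name, "tasks": [{"task_key": task_key, **settings}]}
```

Note that an explicit `max_retries=0` still counts as configured here, so it switches the payload to the `tasks[]` form even though it does not enable retries.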

Monitoring becomes retry-aware only when the effective Databricks max_retries permits another native attempt, that is, when it is -1 (retry indefinitely) or a positive integer. In that mode:

Standalone operators wait on the submit run, whose terminal state includes all Databricks retry attempts.
Workflow task operators re-resolve the latest attempt for the same task_key and treat a failed attempt as final only after the parent workflow run is terminal.
Deferrable workflow monitoring passes workflow_run_id and databricks_task_key to DatabricksExecutionTrigger, so on_kill can cancel the latest retry attempt instead of a stale attempt id.
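The workflow-task rule (re-resolve the latest attempt, and treat a failure as final only once the parent run is terminal) might look roughly like this. The response shape is an assumption: the sketch presumes a run dictionary with a `tasks` list whose entries carry `task_key`, `attempt_number`, and `state`, and the helper names are hypothetical.

```python
from __future__ import annotations

from typing import Any

# Assumed set of terminal life cycle states for this sketch.
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def latest_attempt(run: dict[str, Any], task_key: str) -> dict[str, Any] | None:
    """Pick the highest-numbered attempt for task_key, or None if there is none yet."""
    attempts = [t for t in run.get("tasks", []) if t.get("task_key") == task_key]
    return max(attempts, key=lambda t: t.get("attempt_number", 0), default=None)


def task_outcome(run: dict[str, Any], task_key: str) -> str:
    """Return 'success', 'failed', or 'pending' for one task in a workflow run."""
    parent_terminal = run["state"]["life_cycle_state"] in TERMINAL_STATES
    attempt = latest_attempt(run, task_key)
    if attempt is None:
        return "failed" if parent_terminal else "pending"
    state = attempt["state"]
    if state.get("life_cycle_state") not in TERMINAL_STATES:
        return "pending"
    if state.get("result_state") == "SUCCESS":
        return "success"
    # A failed attempt may still be retried while the parent run is live.
    return "failed" if parent_terminal else "pending"
```

Resolving by `task_key` rather than a stored attempt id is what lets a monitor follow the task across retries.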

Explicit settings that do not enable retries, such as max_retries=0, retry_on_timeout=False, or min_retry_interval_millis alone, still land in the task payload but keep existing single-attempt monitoring behavior.
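As a rough sketch, the check that decides whether monitoring becomes retry-aware could be a single predicate on the effective max_retries. The function name is hypothetical and the real implementation may differ.

```python
from __future__ import annotations


def native_retries_enabled(max_retries: int | None) -> bool:
    """True only when Databricks may start another native attempt."""
    if max_retries is None:
        # Unset: retry_on_timeout or min_retry_interval_millis alone do not enable retries.
        return False
    # -1 means retry indefinitely; any positive value allows at least one retry.
    return max_retries == -1 or max_retries > 0
```

Under this reading, `max_retries=0` and an unset `max_retries` both keep single-attempt monitoring, regardless of the other two retry fields.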

Changes

Adds retry settings to DatabricksNotebookOperator and DatabricksTaskOperator.
Preserves DatabricksTaskOperator precedence: direct operator arguments override matching task_config fields, and the operator-managed task_key cannot be shadowed by task_config.
Updates sync and deferrable monitoring to wait for the final Databricks retry outcome.
Accepts WAITING_FOR_RETRY and BLOCKED as non-terminal RunState life cycle states.
Adds tests for payload generation, argument precedence, sync and deferrable monitoring, trigger serialization, and waiting through WAITING_FOR_RETRY.
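The precedence rule for DatabricksTaskOperator listed above can be illustrated with a small merge helper. This is a sketch under stated assumptions: `resolve_task_settings` is a hypothetical name, and unset operator arguments are assumed to be `None`.

```python
from __future__ import annotations

from typing import Any


def resolve_task_settings(
    task_key: str, task_config: dict[str, Any], **operator_args: Any
) -> dict[str, Any]:
    """Merge task_config with direct operator arguments (hypothetical helper)."""
    merged = dict(task_config)
    # Direct operator arguments override matching task_config fields when set.
    merged.update({k: v for k, v in operator_args.items() if v is not None})
    # The operator-managed task_key is applied last so task_config cannot shadow it.
    merged["task_key"] = task_key
    return merged
```

For example, a `task_config` containing `max_retries=1` combined with an operator argument `max_retries=3` resolves to 3, while a `retry_on_timeout` set only in `task_config` is kept.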

DatabricksSubmitRunOperator and DatabricksCreateJobsOperator remain raw payload pass-through operators; users can already set per-task retry fields in their task payloads.

Was generative AI tooling used to co-author this PR?

Yes - Codex (GPT-5)

Generated-by: Codex (GPT-5) following the guidelines

Add Databricks-native retry settings to task operators

c9908a4

Beat-Nick force-pushed the databricks-expose-task-repair-params branch from e93052b to c9908a4 on June 30, 2026 14:19

Fix Databricks task retry completion handling

98a6d6c

Beat-Nick wants to merge 2 commits into main from databricks-expose-task-repair-params
