
feat: job sdk support tracking #1175

Open
FangwenDave wants to merge 9 commits into alibaba:master from FangwenDave:feat/job-sdk-tracking

@FangwenDave

Collaborator

Resolves the linked issue (refs #1103).

FangwenDave and others added 4 commits June 26, 2026 03:37
- TrackingAdapter: pluggable protocol for job metrics reporting backends.
  Third-party packages register adapters via entry_points (zero coupling).
  resolve_tracking_adapter() discovers and loads the first available adapter.

- BashTrialResult: TrialResult subclass for Bash Jobs that extracts score
  from the '=== Score Summary ===' stdout block via regex.

- BashTrial now returns BashTrialResult instead of base TrialResult.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
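
The score extraction described for BashTrialResult could look roughly like the sketch below. This is an illustration, not the PR's code: the commit message only says the score is read from the '=== Score Summary ===' stdout block via regex, so the `score: <number>` line format, the `extract_score` name, and the pattern itself are assumptions.

```python
import re
from typing import Optional

# Assumed layout: the header line, followed somewhere later by a line
# such as "score: 0.75" or "Score = 1". The real block format may differ.
_SCORE_RE = re.compile(
    r"=== Score Summary ===.*?^\s*score\s*[:=]\s*(-?\d+(?:\.\d+)?)",
    re.DOTALL | re.MULTILINE | re.IGNORECASE,
)


def extract_score(stdout: str) -> Optional[float]:
    """Return the first score found after the summary header, or None."""
    match = _SCORE_RE.search(stdout)
    return float(match.group(1)) if match else None
```

Returning None when the block is missing lets the caller distinguish "no score reported" from a score of 0.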
Job._build_result() now calls _report_tracking() before returning.
The tracking adapter (if discovered) runs its full lifecycle:
init() → report() per trial → report() job summary → close().

All tracking calls are wrapped in exception isolation — adapter
failures log a warning but never break the job.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
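
The lifecycle and exception isolation described in this commit can be sketched as follows. The method names init(), report(), and close() come from the commit message; their parameters, the `report_tracking` helper, and its boolean return value are assumptions made for illustration. close() sits in a finally block so cleanup still runs when a report fails.

```python
import logging
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, Optional

logger = logging.getLogger(__name__)


class TrackingAdapter(ABC):
    """Stand-in for the PR's adapter base class; signatures are assumed."""

    @abstractmethod
    def init(self, **kwargs: Any) -> None: ...

    @abstractmethod
    def report(self, metrics: Dict[str, Any]) -> None: ...

    @abstractmethod
    def close(self) -> None: ...


def report_tracking(
    adapter: Optional[TrackingAdapter],
    trial_metrics: Iterable[Dict[str, Any]],
    job_summary: Dict[str, Any],
) -> bool:
    """Run init -> report per trial -> report summary -> close.

    Returns True on success, False when there is no adapter or it failed.
    Exceptions never propagate to the caller.
    """
    if adapter is None:
        return False
    try:
        adapter.init()
        try:
            for metrics in trial_metrics:
                adapter.report(metrics)
            adapter.report(job_summary)
        finally:
            adapter.close()  # always clean up once init() succeeded
        return True
    except Exception:
        logger.warning("tracking adapter failed; job result unaffected", exc_info=True)
        return False
```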
- namespace, experiment_id, job_id (replacing project, run_name)
- Align with ml_tracker SDK conventions
- Update api.py call site accordingly
- Fix test _ConcreteAdapter.init() signature to match ABC (namespace/experiment_id/job_id)
- Move adapter.close() to finally block to ensure cleanup on report failure
- Update TrackingAdapter.init() docstring to use correct parameter names
- Unify default values between init kwargs and init_config dict

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@FangwenDave FangwenDave force-pushed the feat/job-sdk-tracking branch from b7f9ce9 to a0b33d6 Compare June 29, 2026 06:33
@FangwenDave FangwenDave requested a review from dengwx2009 June 29, 2026 12:34
Breaking change to TrackingAdapter interface:
- init() now receives full JobConfig instead of pre-built dict
- Framework no longer extracts metadata; adapters do it themselves
- Enables adapters to access model_name, agents, labels, env vars

Simplifies _report_tracking:
- Removes init_config dict construction
- Passes self._config directly to adapter.init(config=...)

Adapter implementations must update to extract their own metadata
from JobConfig (e.g. model_name from config.agents or environment.env).
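
A hedged sketch of what "adapters extract their own metadata" might look like on the adapter side. The dataclasses below are hypothetical stand-ins, not the real JobConfig: only the attribute names `agents`, `environment.env`, `labels`, and `model_name` are taken from the commit message, and the `MODEL_NAME` environment key and the fallback order are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AgentConfig:  # hypothetical stand-in
    model_name: Optional[str] = None


@dataclass
class EnvironmentConfig:  # hypothetical stand-in
    env: Dict[str, str] = field(default_factory=dict)


@dataclass
class JobConfig:  # hypothetical stand-in for the SDK's JobConfig
    agents: List[AgentConfig] = field(default_factory=list)
    environment: EnvironmentConfig = field(default_factory=EnvironmentConfig)
    labels: Dict[str, str] = field(default_factory=dict)


class ExampleAdapter:
    """Adapter that derives its own metadata from the full config."""

    def init(self, config: JobConfig) -> None:
        self.model_name = self._model_name(config)
        self.labels = dict(config.labels)  # copy so the config is not mutated

    @staticmethod
    def _model_name(config: JobConfig) -> Optional[str]:
        # Prefer the first agent that declares a model, then fall back
        # to an (assumed) MODEL_NAME environment variable.
        for agent in config.agents:
            if agent.model_name:
                return agent.model_name
        return config.environment.env.get("MODEL_NAME")
```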
Comment thread rock/sdk/job/adapter.py Outdated
FangwenDave and others added 3 commits June 30, 2026 14:50
Add new function that discovers and returns all available tracking
adapters instead of just the first one. This enables multiple adapters
to work in parallel.

- resolve_tracking_adapters() returns list[TrackingAdapter]
- Keeps resolve_tracking_adapter() for backward compatibility
- Add comprehensive tests for multi-adapter scenarios

feat: add fan-out support for tracking adapters

- Update Job._report_tracking() to call all registered adapters
- Add error isolation: each adapter has independent try/except
- Add 3 test cases for fan-out behavior
- Update proposal document with progress
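
The fan-out with per-adapter error isolation could be sketched like this. The `fan_out` helper, its arguments, and the returned list of failed adapter names are assumptions for illustration; the commit message only states that all registered adapters are called and that each has an independent try/except.

```python
import logging
from typing import Any, Dict, Iterable, List

logger = logging.getLogger(__name__)


def fan_out(adapters: Iterable[Any], config: Any, reports: List[Dict[str, Any]]) -> List[str]:
    """Drive every adapter through its lifecycle; return names of those that failed."""
    failed: List[str] = []
    for adapter in adapters:
        name = type(adapter).__name__
        try:  # independent isolation: one adapter failing does not stop the rest
            adapter.init(config=config)
            try:
                for payload in reports:
                    adapter.report(payload)
            finally:
                adapter.close()
        except Exception:
            logger.warning("tracking adapter %s failed", name, exc_info=True)
            failed.append(name)
    return failed
```

Placing the try/except inside the loop, rather than around it, is what keeps a broken adapter from starving the adapters that come after it.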
…overy

- resolve_tracking_adapters() now scans ROCK_TRACKING_LOAD_PATHS for
  TrackingAdapter subclasses instead of importlib entry_points
- add ROCK_TRACKING_LOAD_PATHS env var (default: rock/sdk/tracking),
  comma-separated, mirroring ROCK_CLI_LOAD_PATHS
- drop single-adapter resolve_tracking_adapter() (unused)
- rewrite test_adapter.py for directory-scan semantics

Internal adapters are layered into rock/sdk/tracking via a symlink
(setup_xrl_link.sh), so no entry_points registration or pip install step
is required. The published open-source package ships an empty directory, so
resolve_tracking_adapters() returns [] (a clean no-op).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
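
One way the directory scan could work is sketched below. Only the function name, the ROCK_TRACKING_LOAD_PATHS variable, its comma-separated format, and the default path come from the commit message. The base class here is a stand-in, and the loading mechanics (importlib file specs, skipping underscore-prefixed files, the synthetic module names) are assumptions that may not match the PR's actual implementation.

```python
import importlib.util
import inspect
import os
from pathlib import Path
from typing import List

DEFAULT_LOAD_PATHS = "rock/sdk/tracking"  # default named in the commit message


class TrackingAdapter:
    """Stand-in base class for this sketch."""


def load_paths() -> List[Path]:
    raw = os.environ.get("ROCK_TRACKING_LOAD_PATHS", DEFAULT_LOAD_PATHS)
    return [Path(p.strip()) for p in raw.split(",") if p.strip()]


def resolve_tracking_adapters(base: type = TrackingAdapter) -> list:
    """Instantiate every subclass of `base` defined in the scanned directories."""
    adapters = []
    for directory in load_paths():
        if not directory.is_dir():
            continue  # missing directory: nothing to load
        for file in sorted(directory.glob("*.py")):
            if file.name.startswith("_"):
                continue  # skip __init__.py and private modules
            spec = importlib.util.spec_from_file_location(f"_tracking_scan_{file.stem}", file)
            if spec is None or spec.loader is None:
                continue
            module = importlib.util.module_from_spec(spec)
            try:
                spec.loader.exec_module(module)
            except Exception:
                continue  # a broken module must not break discovery
            for _, obj in inspect.getmembers(module, inspect.isclass):
                # keep only subclasses defined in this module, not imported ones
                if issubclass(obj, base) and obj is not base and obj.__module__ == module.__name__:
                    adapters.append(obj())
    return adapters
```

With an empty or missing directory the function returns an empty list, which matches the no-op behaviour described for the open-source package.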
…/job/tracking

Update ROCK_TRACKING_LOAD_PATHS default to scan rock/sdk/job/tracking instead
of rock/sdk/tracking, aligning with the open-source SDK layout at
rock/sdk/job/adapter.py. The internal tracking adapter is symlinked into
rock/sdk/job/tracking by setup_xrl_link.sh.
@@ -0,0 +1,120 @@
"""Pluggable tracking adapter protocol for Job SDK.

Collaborator


The file name and path do not make the purpose of this class clear. They should include "tracking".

- Move rock/sdk/job/adapter.py → rock/sdk/job/tracking/adapter.py
- Create rock/sdk/job/tracking/__init__.py package
- Fix docstring path: rock/sdk/tracking → rock/sdk/job/tracking
- Resolve adapters in Job.__init__ instead of on every report call
- Fix obj.__module__ check to use startswith (align with CommandLoader)
- Update all import paths in api.py, tests, and internal source
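
To illustrate why a startswith check on obj.__module__ differs from an equality check, here is a small sketch. The `is_local` helper and the module names are hypothetical; how CommandLoader actually performs the check is not shown in this PR page.

```python
def is_local(obj: type, scanned_module_name: str) -> bool:
    """True if the class was defined in the scanned module or one of its submodules."""
    return obj.__module__.startswith(scanned_module_name)


def make_class(name: str, module_name: str) -> type:
    # Build a class whose __module__ is set explicitly, for demonstration only.
    return type(name, (), {"__module__": module_name})


# A class defined in a submodule and re-exported by the package:
in_submodule = make_class("MyAdapter", "tracking_pkg.impl")
# A class imported from an unrelated library:
foreign = make_class("Other", "some_library.base")

print(is_local(in_submodule, "tracking_pkg"))                # True: prefix matches
print(in_submodule.__module__ == "tracking_pkg")             # False: equality would reject it
print(is_local(foreign, "tracking_pkg"))                     # False: still filtered out
```

A plain prefix test also accepts sibling names such as `tracking_pkg_extra`, so a stricter variant might compare against the name itself or the name followed by a dot.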
2 participants