Skip to content

feat: add notifications when intern needs user input#71

Open
yokunal wants to merge 1 commit intohuggingface:mainfrom
yokunal:add-notifications
Open

feat: add notifications when intern needs user input#71
yokunal wants to merge 1 commit intohuggingface:mainfrom
yokunal:add-notifications

Conversation

@yokunal
Copy link
Copy Markdown

@yokunal yokunal commented Apr 23, 2026

Summary

  • Add NotificationProvider config supporting email, pushbullet, telegram, slack, discord, and system notifications
  • Add NotificationService for async non-blocking notifications
  • Send notifications on: approval_required, job_complete, job_failed, error, session_saved

Changes

File Change
agent/config.py Added NotificationProvider model
agent/utils/notification_service.py New file with 6 notification providers
agent/core/session.py Added notification_service param and notify_* methods
agent/core/agent_loop.py Initialize NotificationService, emit on approval/error
agent/tools/jobs_tool.py Call notify_job_complete/failed after job finishes
configs/main_agent_config.example.json Example notification config

Test plan

  • All 6 modified files compile without errors
  • AST syntax tests pass (16/16)
  • NotificationService tested directly - empty config, providers, all notify methods work
  • Tested locally with actual notification providers (need credentials)

Closes #65

- Add NotificationProvider config supporting email, pushbullet, telegram,
  slack, discord, and system notifications
- Add NotificationService for async non-blocking notifications
- Integrate with agent_loop for approval_required and error events
- Add job completion/failure notifications in jobs_tool
- Add example config in configs/main_agent_config.example.json

Closes huggingface#65
@akseljoonas
Copy link
Copy Markdown
Collaborator

@claude please review this PR. Focus on correctness, security, LiteLLM/Bedrock routing impact, and whether it's safe to merge against current main. Keep it concise and prefer inline comments where it matters.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 24, 2026

Claude finished @akseljoonas's task in 2m 30s —— View job


PR Review: feat: add notifications when intern needs user input

Tasks

  • Gather context and read changed files
  • Review agent/config.py
  • Review agent/utils/notification_service.py
  • Review agent/core/session.py
  • Review agent/core/agent_loop.py
  • Review agent/tools/jobs_tool.py
  • Review configs/main_agent_config.example.json
  • Check for security issues, correctness, routing impact
  • Post final review

Summary

Feature works correctly end-to-end and has zero LiteLLM/Bedrock routing impact — the notification path is completely orthogonal to LLM routing. Two issues block safe merge: a command-injection vector in SystemProvider and aiohttp not declared as a direct dependency (though it's likely available transitively). One medium issue around async/sync boundary needs a comment at minimum.


Blocking Issues

1. Command injection in SystemProvidernotification_service.py:245-265

message and title are interpolated directly into AppleScript (macOS) and PowerShell (Windows) strings without any escaping:

# macOS — double-quote in message/title breaks out of the AppleScript string
script = f'display notification "{message}" with title "[ML Intern] {title}"'
subprocess.run(["osascript", "-e", script], ...)

# Windows — same for PowerShell CreateTextNode args
f'...CreateTextNode("[ML Intern] {title}")) | Out-Null; '
f'...CreateTextNode("{message}")) | Out-Null; '

message originates from notify_error (LLM traceback strings — can contain ", \n, backticks) and notify_job_failed (HF API error strings). A double-quote in any of these terminates the current string literal and allows injection of additional AppleScript/PowerShell commands.

Linux notify-send is fine — it uses a subprocess arg list, not a shell string.

Fix: Escape before interpolation:

# macOS
safe_msg = message.replace('\\', '\\\\').replace('"', '\\"')
safe_title = title.replace('\\', '\\\\').replace('"', '\\"')
script = f'display notification "{safe_msg}" with title "[ML Intern] {safe_title}"'

Fix this →

2. aiohttp not in pyproject.tomlnotification_service.py:99,135,172,205

Four providers (pushbullet, telegram, slack, discord) do import aiohttp at call time. aiohttp is not declared in pyproject.toml. It's almost certainly available transitively today (via litellm), but transitive deps aren't guaranteed to stay. httpx is already a declared direct dep and supports the same async patterns.

Either add "aiohttp>=3.9" to pyproject.toml, or replace with httpx (preferred, already available):

# Replace aiohttp pattern with httpx equivalent:
async with httpx.AsyncClient(timeout=30) as client:
    resp = await client.post(url, headers=headers, json=payload)
    return resp.status_code == 200

Fix this →


Medium Issues

3. asyncio.create_task() called from a def method — session.py:302

save_and_upload_detached is a sync def, but it calls asyncio.create_task():

def save_and_upload_detached(self, repo_id: str) -> Optional[str]:
    ...
    if self.notification_service:
        asyncio.create_task(self.notify_session_saved(repo_id))  # line 302

All current callers happen to be inside the running event loop (async contexts), so it works. But asyncio.create_task() from a sync function with no running loop raises RuntimeError. This is a footgun — if someone calls save_and_upload_detached from a CLI path, signal handler, or test without a running loop, the Popen upload will succeed but the notification will crash silently (swallowed by the except on line 303).

Minimal fix: Add a guard:

try:
    loop = asyncio.get_running_loop()
    loop.create_task(self.notify_session_saved(repo_id))
except RuntimeError:
    pass  # no running loop — skip notification

4. Secrets as plain str in NotificationProviderconfig.py:40-48

smtp_password, pushbullet_api_key, telegram_bot_token are plain str fields. When model_dump() is called in agent_loop.py:1226, they're emitted as plaintext strings into the dict passed to NotificationService. If config is ever debug-logged, these leak.

The example config correctly uses ${VAR} substitution which is the right user-facing pattern. But there's no technical guard. Using pydantic.SecretStr is the standard fix — note you'd need to call .get_secret_value() in the provider code where the values are read.

Note: Session trajectory does not serialize the config object, so no leak there.


Minor / Non-blocking

notify() is async without needing to be (notification_service.py:333): The method creates tasks and returns; it never awaits. It could be def + asyncio.get_running_loop().create_task(...). Minor style point, not blocking.

Metadata logged as raw Python dict (notification_service.py:72,139): f"\nMetadata: {metadata}" prints {'key': 'value'} not JSON. json.dumps(metadata) would be cleaner for anything user-visible (email body, Telegram messages).

Discord returns 204 on success (notification_service.py:224): Already handled correctly (if resp.status == 200 or resp.status == 204). Good catch.


Verdict

Not safe to merge as-is. Issues 1 and 2 need to be addressed. The injection risk is real given that error tracebacks flow into message. The dependency issue could silently break all webhook providers in a clean install.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

add notifications when the intern needs something from you

2 participants