Skip to content

Align skills with Agent Skills spec (agentskills.io)#36

Merged
shrey150 merged 21 commits intomainfrom
fix/skill-setup-frontmatter
Feb 26, 2026
Merged

Align skills with Agent Skills spec (agentskills.io)#36
shrey150 merged 21 commits intomainfrom
fix/skill-setup-frontmatter

Conversation

@shrey150
Copy link
Contributor

@shrey150 shrey150 commented Feb 22, 2026

Summary

Aligns both skills with the Agent Skills spec, rewrites the browser skill commands to match the actual CLI, and adds OpenClaw plugin metadata for discoverability.

Spec alignment

  • Remove setup.json — no agent reads it; only SKILL.md is loaded into context
  • Rename browser-automation/ to browser/ — spec requires directory name to match the name frontmatter field
  • Quote compatibility YAML value — the : set substring caused a strictyaml parse error
  • Rewrite functions description — moved trigger keywords into the frontmatter description
  • Split functions/SKILL.md into SKILL.md + REFERENCE.md for progressive disclosure
  • Remove redundant "When to Use" body section from functions
  • Add license: MIT and LICENSE.txt to both skills
  • Add table of contents to browser/REFERENCE.md (535 lines; spec recommends TOC for files >100 lines)
  • Condense browser/EXAMPLES.md from 8 repetitive examples to 4 diverse ones

CLI command rewrite (critical fix)

The SKILL.md documented commands that do not exist in the browse CLI (navigate, act, extract, observe). These were Stagehand SDK concepts, not actual CLI commands. Agents were running nonexistent commands and falling back to guessing.

  • Replace all command docs with actual CLI syntax from browse --help
  • Documented commands: open, snapshot, screenshot, click <ref>, type, fill, select, press, scroll, wait, get, stop
  • Removed fake commands: navigate, act, extract, observe, close
  • Added snapshot-first workflow — snapshot is fast/structured, screenshot is slow/expensive. Old best practices said "screenshot after every command" which wasted vision tokens
  • Added typical workflow section showing the recommended open → snapshot → click → snapshot loop
  • Fixed quick example to use real commands

Discoverability improvements

  • Expand browser skill description with anti-bot stealth, automatic CAPTCHA solving, residential proxy, and bot detection trigger phrases for ClawHub semantic search and agent tool selection
  • Expand mode comparison table — break out CAPTCHA solving, residential proxies, and session persistence as separate rows

OpenClaw plugin metadata

  • Add metadata.openclaw.requires.bins — gates the skill on the browse CLI being installed
  • Add metadata.openclaw.install — tells OpenClaw how to auto-install the CLI (npm install -g @browserbasehq/browse-cli)
  • Add homepage — improves trust score on ClawHub
  • Do NOT gate on env vars — local mode works without Browserbase API keys

Local vs remote mode guidance

  • Rewrite "Environment Selection" section with clear guidance on when to use each mode:
    • Local (default): no keys needed, for dev/simple pages/trusted sites
    • Remote (Browserbase): for protected sites, CAPTCHAs, bot detection, Cloudflare, geo-restricted content

Troubleshooting

  • Zombie daemon fix: browse stop doesn't always kill the daemon. Added pkill -f "browse.*daemon" fallback for "No active page" errors

Validation

Both skills pass skills-ref validate:

$ skills-ref validate skills/browser
Valid skill: skills/browser

$ skills-ref validate skills/functions
Valid skill: skills/functions

Test plan

  • Run skills-ref validate against both skill directories
  • Verify all documented commands match browse --help output
  • Test snapshot-first workflow: browse open <url>browse snapshotbrowse click <ref>browse stop
  • Verify zombie daemon recovery: kill daemon mid-session, confirm pkill fallback works
  • Confirm metadata.openclaw.requires.bins correctly gates the skill when browse is not on PATH
  • Publish to ClawHub (clawhub publish ./skills/browser/ --slug browse --version 2.0.0) and verify semantic search surfaces it for "captcha", "anti-bot", "protected website" queries

Note

Low Risk
Documentation/metadata-only changes that primarily rename/restructure skills and update CLI examples; low risk aside from potential broken links if consumers still reference the removed browser-automation paths.

Overview
Updates the marketplace and README to replace the old browser-automation skill with a new spec-compliant browser skill, including refreshed positioning around Browserbase remote sessions (stealth/CAPTCHA/proxies).

Replaces the previous Stagehand-style browser command documentation/examples with the actual browse CLI workflow (snapshot-first, daemon/session commands) and adds OpenClaw installation/requirements metadata plus MIT licensing.

Refactors the functions skill docs by moving trigger language into frontmatter, adding MIT licensing, and splitting deeper invocation/pattern/troubleshooting content into a new REFERENCE.md for progressive disclosure.

Written by Cursor Bugbot for commit acc6cc3. This will update automatically on new commits. Configure here.

openclaw and others added 4 commits February 21, 2026 17:10
No agent (Claude Code, OpenCode, OpenClaw, Cursor, Windsurf) reads
setup.json — only SKILL.md is loaded into context. The setup.json
pattern was invisible to all agents and therefore broken everywhere.

Replace with standard frontmatter fields:
- requires.bins: [browser] — agents skip/suppress the skill if the
  binary isn't installed, rather than always injecting it and hoping
  the agent notices setupComplete: false
- install.kind/pkg — agents that support auto-install can invoke
  npm install @browserbasehq/stagehand-cli automatically

Also remove the ".env file" reference for credentials — the env vars
are read from the environment, not specifically from a .env file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ed frontmatter

requires.bins and install.kind/pkg are not part of any agent skills spec.
Replace with:
- compatibility field (valid per agentskills.io spec) to surface requirements
- inline `which browser || npm install` check in the skill body so the
  agent can self-heal without relying on non-standard frontmatter fields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The agentskills.io spec requires the directory name to match the name
field in SKILL.md frontmatter. Since the skill is named "browser"
(invoked as /browser), the directory should be browser/ not browser-automation/.

Also fix inconsistent heading levels (### → ##).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Validated both skills against the skills-ref reference library and
fixed all issues found:

- Quote compatibility YAML value to fix strictyaml parse error
- Rewrite functions description with trigger keywords (schedule, webhook,
  cloud, cron, Browserbase Functions) -- the spec requires triggers in
  the description, not in the body
- Split functions SKILL.md into SKILL.md + REFERENCE.md for progressive
  disclosure (invocation examples, common patterns, troubleshooting)
- Remove "When to Use" body section from functions (redundant with
  description, invisible during skill discovery)
- Add license: Apache-2.0 and LICENSE.txt to both skills
- Add table of contents to browser REFERENCE.md (535 lines)
- Condense browser EXAMPLES.md from 8 repetitive examples to 4 diverse ones

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@shrey150 shrey150 changed the title Use standard skill frontmatter for setup instead of setup.json Align skills with Agent Skills spec (agentskills.io) Feb 22, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@shrey150 shrey150 requested a review from Kylejeong2 February 22, 2026 02:46
Copy link
Member

@Kylejeong2 Kylejeong2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits to resolve. Is the assumption for Functions that Claude Code or the Agent is going to generate code to run?

shrey150 and others added 3 commits February 23, 2026 13:35
- Update copyright year from 2025 to 2026 in both LICENSE.txt files
- Fix package name from @browserbasehq/stagehand-cli (doesn't exist)
  to @browserbasehq/browse-cli (actual npm package)
- Update CLI command from `browser` to `browse` to match the npm
  package binary name across SKILL.md, REFERENCE.md, and EXAMPLES.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add anti-bot stealth, CAPTCHA solving, residential proxy, and session
persistence details to the browser skill description and mode comparison
table. These trigger phrases help AI agents discover and select this
skill when users need to interact with protected or JavaScript-heavy
websites.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add requires.bins (browse CLI) and install spec so OpenClaw can gate
  the skill properly and auto-install the CLI
- Add homepage for ClawHub trust score
- Do NOT gate on env vars — local mode works without Browserbase keys
- Rewrite "Environment Selection" section with clear guidance on when
  to use local mode (simple pages) vs remote mode (protected sites,
  CAPTCHAs, bot detection, Cloudflare, geo-restricted content)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@shrey150 shrey150 requested a review from Kylejeong2 February 24, 2026 03:28
Session logs show the agent screenshots after every action (expensive,
slow) and ignores browse act/observe in favor of manual snapshot → click
ref loops. Update the skill to:

- Document browse snapshot and recommend it as default over screenshot
- Add guidance on when to use snapshot vs screenshot
- Steer toward browse act/observe over low-level ref-based commands
- Rewrite best practices to reflect snapshot-first, act-first workflow

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

shrey150 and others added 3 commits February 24, 2026 01:05
The SKILL.md documented commands (navigate, act, extract, observe) that
don't exist in the CLI. The actual commands are open, click, type, fill,
snapshot, etc. This caused agents to run nonexistent commands and fall
back to guessing.

- Replace all command docs with actual CLI syntax from browse --help
- Document snapshot-first workflow with element refs
- Add session management commands (stop, status, pages, tab_switch)
- Add "No active page" to troubleshooting
- Fix quick example to use real commands

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
browse stop doesn't always kill the daemon process. Add pkill fallback
for when the daemon is stuck with wsUrl: "unknown" after a SIGTERM.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… CLI commands

- Rename browser-automation → browser in README.md and marketplace.json
- Rewrite EXAMPLES.md: replace nonexistent commands (navigate, act,
  extract, observe, close) with real browse CLI commands (open, snapshot,
  click, type, fill, get, stop). 4 concrete examples including remote
  mode escalation.
- Rewrite REFERENCE.md: replace Stagehand/Playwright architecture with
  actual daemon-based CLI docs, all 20 real commands, env var config.
- SKILL.md: add "Activating Remote Mode" progressive disclosure section,
  fix get text to require selector argument.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@shrey150 shrey150 force-pushed the fix/skill-setup-frontmatter branch from b1b9eff to 3fb2f49 Compare February 24, 2026 16:38
Add browse mode to SKILL.md commands list and REFERENCE.md. Rewrite
"Activating Remote Mode" as "Switching Between Local and Remote Mode"
using browse mode as the primary mechanism.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Kylejeong2
Copy link
Member

@cursoragent can you review this PR

@cursor
Copy link

cursor bot commented Feb 24, 2026

Short answer: mostly good. I fixed the one real issue and pushed.

  • Re: “not on npm” — @browserbasehq/browse-cli isn’t resolvable on npm. I removed the auto-install from the OpenClaw metadata and made the skill gate purely on the browse binary being present. Also removed the explicit npm install -g @browserbasehq/browse-cli text from compatibility/setup. Commit: 5b74333.
  • Paths are fine now: .claude-plugin/marketplace.json points to ./skills/browser and README.md links to skills/browser/SKILL.md.
  • Year is already 2026 in both skills/*/LICENSE.txt.
  • Minor doc tweak: generalized one reference note that mentioned a specific “browse-cli v0.1.4”.

If/when the browse CLI is published, we can reintroduce the install stanza:

  • Add back metadata.openclaw.install with the correct npm package and bins: [browse].
  • Restore the setup hint to install via npm.

I’ve pushed the fixes to fix/skill-setup-frontmatter.

Open in Web Open in Cursor 

…rowse-cli; gate on 'browse' bin only\n\n- Drop OpenClaw install block to avoid failed installs\n- Make compatibility + setup check generic (no npm package name)\n- Generalize REFERENCE note (avoid browse-cli v0.1.4 mention)

Co-authored-by: Kyle Jeong <Kylejeong2@users.noreply.github.com>
@shrey150 shrey150 requested review from Kylejeong2 and removed request for Kylejeong2 February 25, 2026 01:13
Copy link
Member

@Kylejeong2 Kylejeong2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, would ask Peyton to test first before merging and breaking live version

cursoragent and others added 5 commits February 25, 2026 02:32
…basehq/browse-cli; gate on 'browse' bin only\n\n- Drop OpenClaw install block to avoid failed installs\n- Make compatibility + setup check generic (no npm package name)\n- Generalize REFERENCE note (avoid browse-cli v0.1.4 mention)"

This reverts commit 5b74333.
Remove: get html, drag, highlight, is, execute references.
Add docs for: hover, newpage, eval, viewport, network capture,
snapshot --compact, screenshot --full-page, type --delay/--mistakes,
fill --no-press-enter, stop --force. Replace get html with get value
in SKILL.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ked commands

These commands are functional again after PR browserbase/stagent-cli#11
fixed the daemon startup (EPIPE crash) and restored selector command surfaces.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…NCE.md

These CLI features were verified working in browse-cli v0.1.5 but were
missing from the reference documentation:
- `refs` command for cached ref map lookup
- `open --wait` flag (networkidle/domcontentloaded) for SPAs
- `--json` global flag for structured output
- `--session` global flag for concurrent browser sessions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
shrey150 and others added 2 commits February 25, 2026 23:23
Match the #### heading + description + code block pattern used
throughout the rest of REFERENCE.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hing

- Add Table of Contents to REFERENCE.md (matches functions/REFERENCE.md
  and top skills like terraform-skill, mcp-builder)
- Remove Typical Workflow and Local vs Remote Mode sections from
  REFERENCE.md — these were near-identical copies of SKILL.md content
- Condense "Switching Between Local and Remote Environment" in SKILL.md
  from 38 lines to 16 — keeps the signal detection list and credential
  setup, drops redundant env commands already shown in Commands section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@shrey150 shrey150 merged commit f780061 into main Feb 26, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants