Align skills with Agent Skills spec (agentskills.io) #36
Merged
No agent (Claude Code, OpenCode, OpenClaw, Cursor, Windsurf) reads `setup.json`; only `SKILL.md` is loaded into context. The `setup.json` pattern was invisible to all agents and therefore broken everywhere.

Replace with standard frontmatter fields:
- `requires.bins: [browser]`: agents skip/suppress the skill if the binary isn't installed, rather than always injecting it and hoping the agent notices `setupComplete: false`
- `install.kind/pkg`: agents that support auto-install can invoke `npm install @browserbasehq/stagehand-cli` automatically

Also remove the ".env file" reference for credentials; the env vars are read from the environment, not specifically from a `.env` file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ed frontmatter

`requires.bins` and `install.kind/pkg` are not part of any agent skills spec. Replace with:
- `compatibility` field (valid per agentskills.io spec) to surface requirements
- inline `which browser || npm install` check in the skill body so the agent can self-heal without relying on non-standard frontmatter fields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The agentskills.io spec requires the directory name to match the `name` field in the SKILL.md frontmatter. Since the skill is named "browser" (invoked as `/browser`), the directory should be `browser/`, not `browser-automation/`.

Also fix inconsistent heading levels (### → ##).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
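The directory/name rule in this commit can be spot-checked locally before running a full validator. This is a minimal sketch, assuming the `name` field sits on its own unquoted line in `SKILL.md`; `check_skill_name` is a hypothetical helper for illustration, not part of skills-ref or the spec.

```shell
# Hypothetical helper: succeeds when the skill directory name equals the
# `name` field in SKILL.md. Assumes an unquoted `name: value` line; the
# first matching line wins.
check_skill_name() {
  dir="$1"
  name=$(sed -n 's/^name:[[:space:]]*//p' "$dir/SKILL.md" | head -n 1)
  if [ "$name" = "$(basename "$dir")" ]; then
    return 0
  fi
  echo "mismatch: directory '$(basename "$dir")' vs name '$name'" >&2
  return 1
}

# Example: check_skill_name ./skills/browser
```

A skill named `browser` that lives in `browser-automation/` would fail this check, which is the situation the rename addresses.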
Validated both skills against the skills-ref reference library and fixed all issues found:
- Quote `compatibility` YAML value to fix strictyaml parse error
- Rewrite functions description with trigger keywords (schedule, webhook, cloud, cron, Browserbase Functions); the spec requires triggers in the description, not in the body
- Split functions SKILL.md into SKILL.md + REFERENCE.md for progressive disclosure (invocation examples, common patterns, troubleshooting)
- Remove "When to Use" body section from functions (redundant with description, invisible during skill discovery)
- Add `license: Apache-2.0` and LICENSE.txt to both skills
- Add table of contents to browser REFERENCE.md (535 lines)
- Condense browser EXAMPLES.md from 8 repetitive examples to 4 diverse ones

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Kylejeong2
requested changes
Feb 23, 2026
Member
Kylejeong2
left a comment
A few nits to resolve. Is the assumption for Functions that Claude Code or the Agent is going to generate code to run?
- Update copyright year from 2025 to 2026 in both LICENSE.txt files
- Fix package name from `@browserbasehq/stagehand-cli` (doesn't exist) to `@browserbasehq/browse-cli` (actual npm package)
- Update CLI command from `browser` to `browse` to match the npm package binary name across SKILL.md, REFERENCE.md, and EXAMPLES.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
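With the binary and package names corrected, the inline self-heal check mentioned earlier in this PR could look roughly like the following. This is a sketch under assumptions: `ensure_browse` is a hypothetical wrapper name, and the global install command is the one quoted in the PR description; whether the skill body uses exactly this form is not shown here.

```shell
# Hypothetical wrapper around the inline setup check: install the CLI only
# when no `browse` command is currently available.
ensure_browse() {
  if command -v browse >/dev/null 2>&1; then
    echo "browse is already on PATH"
  else
    # Package name taken from this PR; assumed to be published on npm.
    npm install -g @browserbasehq/browse-cli
  fi
}

# Example: ensure_browse && browse --help
```

`command -v` is used instead of `which` because it is specified by POSIX and behaves consistently across shells.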
Add anti-bot stealth, CAPTCHA solving, residential proxy, and session persistence details to the browser skill description and mode comparison table. These trigger phrases help AI agents discover and select this skill when users need to interact with protected or JavaScript-heavy websites.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add `requires.bins` (browse CLI) and install spec so OpenClaw can gate the skill properly and auto-install the CLI
- Add homepage for ClawHub trust score
- Do NOT gate on env vars: local mode works without Browserbase keys
- Rewrite "Environment Selection" section with clear guidance on when to use local mode (simple pages) vs remote mode (protected sites, CAPTCHAs, bot detection, Cloudflare, geo-restricted content)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Session logs show the agent screenshots after every action (expensive, slow) and ignores `browse act`/`observe` in favor of manual snapshot → click ref loops. Update the skill to:
- Document `browse snapshot` and recommend it as default over screenshot
- Add guidance on when to use snapshot vs screenshot
- Steer toward `browse act`/`observe` over low-level ref-based commands
- Rewrite best practices to reflect snapshot-first, act-first workflow

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
The SKILL.md documented commands (`navigate`, `act`, `extract`, `observe`) that don't exist in the CLI. The actual commands are `open`, `click`, `type`, `fill`, `snapshot`, etc. This caused agents to run nonexistent commands and fall back to guessing.
- Replace all command docs with actual CLI syntax from `browse --help`
- Document snapshot-first workflow with element refs
- Add session management commands (`stop`, `status`, `pages`, `tab_switch`)
- Add "No active page" to troubleshooting
- Fix quick example to use real commands

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
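The snapshot-first workflow described in this commit can be sketched as a small shell function. It uses only the subcommands named in this PR (`open`, `snapshot`, `click`, `stop`); `snapshot_then_click` is a hypothetical name, and the format of the element ref is not specified here, so the caller is assumed to copy it from the snapshot output.

```shell
# Hypothetical sketch of a snapshot-first loop: open a page, snapshot it,
# click an element ref taken from that snapshot, snapshot again to see the
# result, and always stop the session at the end.
snapshot_then_click() {
  url="$1"
  ref="$2"   # element ref as printed by `browse snapshot` (format assumed)
  browse open "$url" &&
    browse snapshot &&
    browse click "$ref" &&
    browse snapshot
  status=$?
  browse stop
  return "$status"
}

# Example: snapshot_then_click "https://example.com" "<ref-from-snapshot>"
```

In practice an agent would read the first snapshot before choosing a ref, so the two halves would usually run as separate steps rather than one call.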
`browse stop` doesn't always kill the daemon process. Add `pkill` fallback for when the daemon is stuck with `wsUrl: "unknown"` after a SIGTERM.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
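A minimal sketch of the fallback, using the `pkill` pattern quoted in the PR description. `stop_browse_daemon` is a hypothetical wrapper name; the real skill documents the two commands directly.

```shell
# Hypothetical wrapper: ask the CLI to stop, then remove any daemon process
# that survived. Both steps are allowed to fail so the wrapper is safe to
# call when nothing is running.
stop_browse_daemon() {
  browse stop || true
  # pkill exits non-zero when no process matches; treat that as success.
  pkill -f "browse.*daemon" || true
}

# Example: stop_browse_daemon
```

Because `pkill -f` matches against full command lines, the pattern can also hit unrelated processes whose arguments contain both words, so it is best kept as a last resort.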
… CLI commands

- Rename browser-automation → browser in README.md and marketplace.json
- Rewrite EXAMPLES.md: replace nonexistent commands (`navigate`, `act`, `extract`, `observe`, `close`) with real browse CLI commands (`open`, `snapshot`, `click`, `type`, `fill`, `get`, `stop`). 4 concrete examples including remote mode escalation.
- Rewrite REFERENCE.md: replace Stagehand/Playwright architecture with actual daemon-based CLI docs, all 20 real commands, env var config.
- SKILL.md: add "Activating Remote Mode" progressive disclosure section, fix `get text` to require selector argument.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
b1b9eff to 3fb2f49
Add `browse mode` to SKILL.md commands list and REFERENCE.md. Rewrite "Activating Remote Mode" as "Switching Between Local and Remote Mode" using `browse mode` as the primary mechanism.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Member
@cursoragent can you review this PR
Short answer: mostly good. I fixed the one real issue and pushed.
If/when the browse CLI is published, we can reintroduce the install stanza:
I’ve pushed the fixes to |
…rowse-cli; gate on 'browse' bin only

- Drop OpenClaw install block to avoid failed installs
- Make compatibility + setup check generic (no npm package name)
- Generalize REFERENCE note (avoid browse-cli v0.1.4 mention)

Co-authored-by: Kyle Jeong <Kylejeong2@users.noreply.github.com>
Kylejeong2
approved these changes
Feb 25, 2026
Member
Kylejeong2
left a comment
Approved, but I would ask Peyton to test first before merging, to avoid breaking the live version.
…basehq/browse-cli; gate on 'browse' bin only

- Drop OpenClaw install block to avoid failed installs
- Make compatibility + setup check generic (no npm package name)
- Generalize REFERENCE note (avoid browse-cli v0.1.4 mention)"

This reverts commit 5b74333.
Remove: `get html`, `drag`, `highlight`, `is`, `execute` references.

Add docs for: `hover`, `newpage`, `eval`, `viewport`, network capture, `snapshot --compact`, `screenshot --full-page`, `type --delay/--mistakes`, `fill --no-press-enter`, `stop --force`.

Replace `get html` with `get value` in SKILL.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ked commands

These commands are functional again after PR browserbase/stagent-cli#11 fixed the daemon startup (EPIPE crash) and restored selector command surfaces.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…NCE.md

These CLI features were verified working in browse-cli v0.1.5 but were missing from the reference documentation:
- `refs` command for cached ref map lookup
- `open --wait` flag (networkidle/domcontentloaded) for SPAs
- `--json` global flag for structured output
- `--session` global flag for concurrent browser sessions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Match the #### heading + description + code block pattern used throughout the rest of REFERENCE.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hing

- Add Table of Contents to REFERENCE.md (matches functions/REFERENCE.md and top skills like terraform-skill, mcp-builder)
- Remove Typical Workflow and Local vs Remote Mode sections from REFERENCE.md; these were near-identical copies of SKILL.md content
- Condense "Switching Between Local and Remote Environment" in SKILL.md from 38 lines to 16: keeps the signal detection list and credential setup, drops redundant env commands already shown in Commands section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>


Summary
Aligns both skills with the Agent Skills spec, rewrites the browser skill commands to match the actual CLI, and adds OpenClaw plugin metadata for discoverability.
Spec alignment
- Removed `setup.json`: no agent reads it; only `SKILL.md` is loaded into context
- Renamed `browser-automation/` to `browser/`: the spec requires the directory name to match the `name` frontmatter field
- Quoted the `compatibility` YAML value: the `: set` substring caused a `strictyaml` parse error
- Rewrote the `functions` description: moved trigger keywords into the frontmatter `description`
- Split `functions/SKILL.md` into `SKILL.md` + `REFERENCE.md` for progressive disclosure
- Removed the "When to Use" body section from `functions`
- Added `license: MIT` and `LICENSE.txt` to both skills
- Added a table of contents to `browser/REFERENCE.md` (535 lines; spec recommends TOC for files >100 lines)
- Condensed `browser/EXAMPLES.md` from 8 repetitive examples to 4 diverse ones
CLI command rewrite (critical fix)
The SKILL.md documented commands that do not exist in the browse CLI (`navigate`, `act`, `extract`, `observe`). These were Stagehand SDK concepts, not actual CLI commands. Agents were running nonexistent commands and falling back to guessing.
- Replaced all command docs with the actual CLI syntax from `browse --help`
- Documented commands: `open`, `snapshot`, `screenshot`, `click <ref>`, `type`, `fill`, `select`, `press`, `scroll`, `wait`, `get`, `stop`
- Removed nonexistent commands: `navigate`, `act`, `extract`, `observe`, `close`
Discoverability improvements
- Updated the `description` with anti-bot stealth, automatic CAPTCHA solving, residential proxy, and bot detection trigger phrases for ClawHub semantic search and agent tool selection
OpenClaw plugin metadata
- `metadata.openclaw.requires.bins`: gates the skill on the `browse` CLI being installed
- `metadata.openclaw.install`: tells OpenClaw how to auto-install the CLI (`npm install -g @browserbasehq/browse-cli`)
- `homepage`: improves trust score on ClawHub
Local vs remote mode guidance
Troubleshooting
- `browse stop` doesn't always kill the daemon. Added `pkill -f "browse.*daemon"` fallback for "No active page" errors
Validation
Both skills pass `skills-ref validate`.
Test plan
- Run `skills-ref validate` against both skill directories
- Check documented commands against `browse --help` output
- Test `browse open <url>` → `browse snapshot` → `browse click <ref>` → `browse stop`
- Verify the `pkill` fallback works
- Verify `metadata.openclaw.requires.bins` correctly gates the skill when `browse` is not on PATH
- Publish to ClawHub (`clawhub publish ./skills/browser/ --slug browse --version 2.0.0`) and verify semantic search surfaces it for "captcha", "anti-bot", "protected website" queries
Note
Low Risk
Documentation/metadata-only changes that primarily rename/restructure skills and update CLI examples; low risk aside from potential broken links if consumers still reference the removed `browser-automation` paths.
Overview
Updates the marketplace and README to replace the old `browser-automation` skill with a new spec-compliant `browser` skill, including refreshed positioning around Browserbase remote sessions (stealth/CAPTCHA/proxies).
Replaces the previous Stagehand-style `browser` command documentation/examples with the actual `browse` CLI workflow (snapshot-first, daemon/session commands) and adds OpenClaw installation/requirements metadata plus MIT licensing.
Refactors the `functions` skill docs by moving trigger language into frontmatter, adding MIT licensing, and splitting deeper invocation/pattern/troubleshooting content into a new `REFERENCE.md` for progressive disclosure.
Written by Cursor Bugbot for commit acc6cc3.