A CLI that lets AI agents control iOS and Android devices, influenced by Vercel’s agent-browser.
The project is in early development and considered experimental. Pull requests are welcome!
- Platforms: iOS (simulator + physical device core automation) and Android (emulator + device).
- Core commands: open, back, home, app-switcher, press, long-press, focus, type, fill, scroll, scrollintoview, wait, alert, screenshot, close, reinstall.
- Inspection commands: snapshot (accessibility tree), appstate, apps, devices.
- Device tooling: adb (Android), simctl/devicectl (iOS via Xcode).
- Minimal dependencies; TypeScript executed directly on Node 22+ (no build step).
npm install -g agent-device
Or use it without installing:
npx agent-device open SampleApp
Use refs for agent-driven exploration and normal automation flows.
agent-device open Contacts --platform ios # creates session on iOS Simulator
agent-device snapshot
agent-device click @e5
agent-device fill @e6 "John"
agent-device fill @e7 "Doe"
agent-device click @e3
agent-device close

agent-device <command> [args] [--json]
Basic flow:
agent-device open SampleApp
agent-device snapshot
agent-device click @e7
agent-device fill @e8 "hello"
agent-device close SampleApp
Debug flow:
agent-device trace start
agent-device snapshot -s "Sample App"
agent-device find label "Wi-Fi" click
agent-device trace stop ./trace.log
Coordinates:
- All coordinate-based commands (press, long-press, swipe, focus, fill) use device coordinates with origin at top-left.
- X increases to the right, Y increases downward.
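To make the coordinate convention concrete, here is a minimal TypeScript sketch that maps a fractional screen position to device coordinates. The helper `toDevicePoint` and the 1080x1920 screen size are hypothetical and chosen only for illustration; neither is part of agent-device.

```typescript
// Hypothetical helper (not part of agent-device): converts a fractional
// screen position into device coordinates with the origin at top-left.
interface Point {
  x: number;
  y: number;
}

function toDevicePoint(fx: number, fy: number, width: number, height: number): Point {
  if (fx < 0 || fx > 1 || fy < 0 || fy > 1) {
    throw new RangeError("fractions must be within 0..1");
  }
  // X grows to the right and Y grows downward, so (0, 0) is the top-left corner.
  return { x: Math.round(fx * width), y: Math.round(fy * height) };
}

// Assumed 1080x1920 screen: the centre, then a point near the bottom edge.
const centre = toDevicePoint(0.5, 0.5, 1080, 1920);
const nearBottom = toDevicePoint(0.5, 0.9, 1080, 1920);
console.log(`agent-device press ${centre.x} ${centre.y}`);
console.log(`agent-device press ${nearBottom.x} ${nearBottom.y}`);
```

A larger Y value always means a point lower on the screen, which is why a swipe from a high Y to a low Y moves the finger upward.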
Gesture series examples:
agent-device press 300 500 --count 12 --interval-ms 45
agent-device press 300 500 --count 6 --hold-ms 120 --interval-ms 30 --jitter-px 2
agent-device swipe 540 1500 540 500 120 --count 8 --pause-ms 30 --pattern ping-pong
Command index:
- boot, open, close, reinstall, home, back, app-switcher
- snapshot, find, get
- click, focus, type, fill, press, long-press, swipe, scroll, scrollintoview, pinch, is
- alert, wait, screenshot
- trace start, trace stop
- settings wifi|airplane|location on|off
- appstate, apps, devices, session list
Notes:
- iOS snapshots use XCTest on simulators and physical devices.
- Scope snapshots with -s "<label>" or -s @ref.
- If XCTest returns 0 nodes (e.g., foreground app changed), agent-device fails explicitly.
Flags:
- --version, -V print version and exit
- --platform ios|android
- --device <name>
- --udid <udid> (iOS)
- --serial <serial> (Android)
- --activity <component> (Android app launch only; package/Activity or package/.Activity; not for URL opens)
- --session <name>
- --count <n> repeat count for press/swipe
- --interval-ms <ms> delay between press iterations
- --hold-ms <ms> hold duration per press iteration
- --jitter-px <n> deterministic coordinate jitter for press
- --pause-ms <ms> delay between swipe iterations
- --pattern one-way|ping-pong repeat pattern for swipe
- --verbose for daemon and runner logs
- --json for structured output
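When a script or agent harness drives the CLI, it can help to assemble the argument list in one place. The following is a minimal sketch, assuming only the flags listed above; `buildArgs` and `CommonFlags` are hypothetical names and not an API shipped by agent-device.

```typescript
// Sketch only: assembles an argv array for the agent-device CLI from a subset
// of the documented flags. `buildArgs` and `CommonFlags` are hypothetical.
interface CommonFlags {
  platform?: "ios" | "android";
  device?: string;
  session?: string;
  json?: boolean;
  verbose?: boolean;
}

function buildArgs(command: string, args: string[], flags: CommonFlags = {}): string[] {
  const argv: string[] = [command, ...args];
  if (flags.platform) argv.push("--platform", flags.platform);
  if (flags.device) argv.push("--device", flags.device);
  if (flags.session) argv.push("--session", flags.session);
  if (flags.verbose) argv.push("--verbose");
  if (flags.json) argv.push("--json");
  return argv;
}

const openArgs = buildArgs("open", ["Contacts"], { platform: "ios", session: "demo", json: true });
console.log(["agent-device", ...openArgs].join(" "));
```

Keeping the arguments as an array means they can be handed to Node's `child_process.execFile` without any shell quoting.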
Pinch:
- pinch is supported on iOS simulators.
- On Android, pinch currently returns UNSUPPORTED_OPERATION in the adb backend.
Swipe timing:
- swipe accepts optional durationMs (default 250, range 16..10000).
- Android uses requested swipe duration directly.
- iOS uses a safe normalized duration to avoid long-press side effects.
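The documented default and range can be checked before a swipe is issued. This is a sketch under stated assumptions: `resolveSwipeDurationMs` is a hypothetical helper, and rejecting out-of-range values is a guess, since the text above does not say whether the CLI rejects or clamps them.

```typescript
// Documented bounds for the optional swipe duration.
const SWIPE_DEFAULT_MS = 250;
const SWIPE_MIN_MS = 16;
const SWIPE_MAX_MS = 10000;

// Hypothetical helper: returns the duration to request, or throws when the
// value is outside the documented range. Whether the real CLI rejects or
// clamps out-of-range values is not stated, so rejecting is an assumption.
function resolveSwipeDurationMs(durationMs?: number): number {
  if (durationMs === undefined) return SWIPE_DEFAULT_MS;
  // The negated comparison also rejects NaN.
  if (!(durationMs >= SWIPE_MIN_MS && durationMs <= SWIPE_MAX_MS)) {
    throw new RangeError(`durationMs must be in ${SWIPE_MIN_MS}..${SWIPE_MAX_MS}`);
  }
  return durationMs;
}

console.log(resolveSwipeDurationMs());    // falls back to the default
console.log(resolveSwipeDurationMs(120)); // explicit value inside the range
```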
Install the automation skills listed in SKILL.md.
npx skills add https://github.com/callstackincubator/agent-device --skill agent-device
Sessions:
- open starts a session. Without args, it boots/activates the target device/simulator without launching an app.
- All interaction commands require an open session.
- If a session is already open, open <app|url> switches the active app or opens a deep link URL.
- close stops the session and releases device resources. Pass an app to close it explicitly, or omit it to just close the session.
- Use --session <name> to manage multiple sessions.
- Session scripts are written to ~/.agent-device/sessions/<session>-<timestamp>.ad when recording is enabled with --save-script.
- --save-script accepts an optional path: --save-script ./workflows/my-flow.ad.
- For ambiguous bare values, use an explicit form: --save-script=workflow.ad or a path-like value such as ./workflow.ad.
- Deterministic replay is .ad-based; use replay --update (-u) to update selector drift and rewrite the replay file in place.
- On iOS, appstate is session-scoped and requires an active session on the target device.
Navigation helpers:
- boot --platform ios|android ensures the target is ready without launching an app.
- Use boot mainly when starting a new session and open fails because no booted simulator/emulator is available.
- open [app|url] [url] already boots/activates the selected target when needed.
- reinstall <app> <path> uninstalls and installs the app binary in one command (Android + iOS simulator).
- reinstall accepts package/bundle id style app names and supports ~ in paths.
Deep links:
- open <url> supports deep links of the form scheme://...
- open <app> <url> opens a deep link on iOS.
- Android opens deep links via VIEW intent.
- iOS simulator opens deep links via simctl openurl.
- iOS device opens deep links via devicectl --payload-url.
- On iOS devices, http(s):// URLs open in Safari when no app is active. Custom scheme URLs (myapp://) require an active app in the session.
- --activity cannot be combined with URL opens.
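The rules above can be pre-checked by a wrapper before it calls open. The sketch below is illustrative only: `classifyOpenTarget` and `checkOpenArgs` are hypothetical helpers, and treating anything shaped like scheme://... as a URL is an assumption about how targets are told apart.

```typescript
// Hypothetical classifier: distinguishes a deep link URL from an app name,
// and enforces the documented rule that --activity cannot accompany a URL.
type OpenTarget = { kind: "url"; value: string } | { kind: "app"; value: string };

function classifyOpenTarget(target: string): OpenTarget {
  // Assumption: anything shaped like scheme://... is treated as a URL.
  const isUrl = /^[A-Za-z][A-Za-z0-9+.-]*:\/\//.test(target);
  return isUrl ? { kind: "url", value: target } : { kind: "app", value: target };
}

function checkOpenArgs(target: string, activity?: string): void {
  if (activity !== undefined && classifyOpenTarget(target).kind === "url") {
    throw new Error("--activity cannot be combined with URL opens");
  }
}

console.log(classifyOpenTarget("myapp://home").kind);
console.log(classifyOpenTarget("com.apple.Preferences").kind);
// An app launch with an explicit Activity passes the check.
checkOpenArgs("com.example.app", "com.example.app/.MainActivity");
```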
agent-device open "myapp://home" --platform android
agent-device open "https://example.com" --platform ios # open link in web browser
agent-device open MyApp "myapp://screen/to" --platform ios # open deep link to MyApp
Find (semantic):
- find <text> <action> [value] finds by any text (label/value/identifier) using a scoped snapshot.
- find text|label|value|role|id <value> <action> [value] for specific locators.
- Actions: click (default), fill, type, focus, get text, get attrs, wait [timeout], exists.
Assertions:
- is predicates: visible, hidden, exists, editable, selected, text.
- is text uses exact equality.
Replay update:
- replay <path> runs deterministic replay from .ad scripts.
- replay -u <path> attempts selector updates on failures and atomically rewrites the same file.
- Refs are the default/core mechanism for interactive agent flows.
- Update targets: click, fill, get, is, wait.
- Selector matching is a replay-update internal: replay parses .ad lines into actions, tries them, snapshots on failure, resolves a better selector, then rewrites that failing line.
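The try, snapshot, resolve, rewrite loop described above can be sketched as follows. This is not the project's implementation: `ReplayDeps` and `replayWithUpdate` are hypothetical names, the device and snapshot logic are injected as plain functions, everything is synchronous for brevity, and the atomic file rewrite is left out.

```typescript
// Sketch of the replay-update loop. All names are hypothetical, and the real
// implementation may differ.
interface ReplayDeps {
  // Returns true when the action on this line succeeds.
  run(line: string): boolean;
  // Takes a fresh snapshot and proposes a rewritten line, or null if none is found.
  resolveSelector(line: string): string | null;
}

function replayWithUpdate(lines: string[], deps: ReplayDeps): string[] {
  return lines.map((line, index) => {
    if (deps.run(line)) return line; // line still works, keep it unchanged
    const rewritten = deps.resolveSelector(line);
    if (rewritten === null || !deps.run(rewritten)) {
      throw new Error(`replay failed at line ${index + 1}: ${line}`);
    }
    return rewritten; // only the failing line is rewritten
  });
}

// Fake dependencies standing in for a device whose button id has changed.
const stale = String.raw`click "id=\"old_continue\" || label=\"Continue\""`;
const fresh = String.raw`click "id=\"auth_continue\" || label=\"Continue\""`;
const fakeDeps: ReplayDeps = {
  run: (line) => !line.includes("old_continue"),
  resolveSelector: (line) => line.replace("old_continue", "auth_continue"),
};

const result = replayWithUpdate([String.raw`snapshot -i -c -s "Continue"`, stale], fakeDeps);
console.log(result.join("\n"));
```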
Update examples:
# Before (stale selector)
click "id=\"old_continue\" || label=\"Continue\""
# After replay -u (rewritten in place)
click "id=\"auth_continue\" || label=\"Continue\""

# Before (ref-based action from discovery)
snapshot -i -c -s "Continue"
click @e13 "Continue"
# After replay -u (upgraded to selector-based action)
snapshot -i -c -s "Continue"
click "id=\"auth_continue\" || label=\"Continue\""
Android fill reliability:
- fill clears the current value, then enters text.
- type enters text into the focused field without clearing.
- fill now verifies the entered value on Android.
- If the value does not match, agent-device clears the field and retries once with slower typing.
- This reduces IME-related character swaps on long strings (e.g. emails and IDs).
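The verify-and-retry behaviour described above can be illustrated with a small sketch. The `TextField` interface, `fillWithVerify`, and the flaky fake field are all hypothetical and synchronous for brevity; they show the shape of the logic, not how agent-device talks to adb.

```typescript
// Hypothetical field abstraction; a real device interaction would be async.
interface TextField {
  clear(): void;
  enterText(text: string, slow: boolean): void;
  readValue(): string;
}

// Clears, types, verifies, and retries once with slower typing on mismatch.
function fillWithVerify(field: TextField, text: string): boolean {
  field.clear();
  field.enterText(text, false);
  if (field.readValue() === text) return true;
  field.clear();
  field.enterText(text, true); // single retry with slower typing
  return field.readValue() === text;
}

// Fake field that swaps the first two characters unless typing is slow,
// imitating an IME-related character swap.
function makeFlakyField(): TextField {
  let value = "";
  return {
    clear: () => { value = ""; },
    enterText: (text, slow) => {
      const swapped = text.charAt(1) + text.charAt(0) + text.slice(2);
      value += slow || text.length < 2 ? text : swapped;
    },
    readValue: () => value,
  };
}

const flakyField = makeFlakyField();
const filled = fillWithVerify(flakyField, "john@example.com");
console.log(filled, flakyField.readValue());
```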
Settings helpers:
- settings wifi on|off
- settings airplane on|off
- settings location on|off (iOS uses per-app permission for the current session app)
Note: iOS supports these only on simulators. iOS wifi/airplane toggles status bar indicators, not actual network state. Airplane off clears status bar overrides.
App state:
- appstate shows the foreground app/activity (Android).
- On iOS, appstate returns the currently tracked session app (source: session) and requires an active session on the selected device.
- apps includes default/system apps by default (use --user-installed to filter).
- agent-device trace start
- agent-device trace stop ./trace.log
- The trace log includes snapshot logs and XCTest runner logs for the session.
- Built-in retries cover transient runner connection failures and Android UI dumps.
- For snapshot issues (missing elements), compare with the --raw flag for unaltered output and scope with -s "<label>".
- If startup fails with stale metadata hints, remove the stale ~/.agent-device/daemon.json and ~/.agent-device/daemon.lock files and retry.
Boot diagnostics:
- Boot failures include normalized reason codes in error.details.reason (JSON mode) and verbose logs.
- Reason codes: IOS_BOOT_TIMEOUT, IOS_RUNNER_CONNECT_TIMEOUT, ANDROID_BOOT_TIMEOUT, ADB_TRANSPORT_UNAVAILABLE, CI_RESOURCE_STARVATION_SUSPECTED, BOOT_COMMAND_FAILED, UNKNOWN.
- Android boot waits fail fast for permission/tooling issues and do not always collapse into timeout errors.
- Use agent-device boot --platform ios|android when starting a new session only if open cannot find/connect to an available target.
- Set AGENT_DEVICE_RETRY_LOGS=1 to print structured retry telemetry (attempt, phase, delay, elapsed/remaining deadline, reason).
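A caller that runs the CLI with --json can branch on the reason code. The sketch below assumes a failure payload is an object exposing error.details.reason; only that path is documented above, so the rest of the sample payload and the names `BOOT_REASONS` and `extractBootReason` are hypothetical.

```typescript
// Reason codes as listed above.
const BOOT_REASONS = [
  "IOS_BOOT_TIMEOUT",
  "IOS_RUNNER_CONNECT_TIMEOUT",
  "ANDROID_BOOT_TIMEOUT",
  "ADB_TRANSPORT_UNAVAILABLE",
  "CI_RESOURCE_STARVATION_SUSPECTED",
  "BOOT_COMMAND_FAILED",
  "UNKNOWN",
] as const;
type BootReason = (typeof BOOT_REASONS)[number];

// Assumption: in --json mode a failure is an object with error.details.reason.
// Returns null when the text is not JSON or carries no known reason code.
function extractBootReason(jsonText: string): BootReason | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonText);
  } catch {
    return null;
  }
  const reason = (parsed as { error?: { details?: { reason?: unknown } } } | null)
    ?.error?.details?.reason;
  if (typeof reason !== "string") return null;
  return (BOOT_REASONS as readonly string[]).indexOf(reason) !== -1
    ? (reason as BootReason)
    : null;
}

// Hypothetical payload, built here only to exercise the helper.
const samplePayload = JSON.stringify({ error: { details: { reason: "ANDROID_BOOT_TIMEOUT" } } });
console.log(extractBootReason(samplePayload));
console.log(extractBootReason("not json"));
```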
- Bundle/package identifiers are accepted directly (e.g., com.apple.Preferences).
- Human-readable names are resolved when possible (e.g., Settings).
- Built-in aliases include Settings for both platforms.
- Core runner commands: snapshot, wait, click, fill, get, is, find, press, long-press, focus, type, scroll, scrollintoview, back, home, app-switcher.
- Simulator-only commands: alert, pinch, record, reinstall, settings.
- iOS device runs require valid signing/provisioning (Automatic Signing recommended). Optional overrides: AGENT_DEVICE_IOS_TEAM_ID, AGENT_DEVICE_IOS_SIGNING_IDENTITY, AGENT_DEVICE_IOS_PROVISIONING_PROFILE.
pnpm test
Useful local checks:
pnpm typecheck
pnpm test:unit
pnpm test:smoke
pnpm build
Environment selectors:
- ANDROID_DEVICE=Pixel_9_Pro_XL or ANDROID_SERIAL=emulator-5554
- IOS_DEVICE="iPhone 17 Pro" or IOS_UDID=<udid>
- AGENT_DEVICE_IOS_BOOT_TIMEOUT_MS=<ms> to adjust iOS simulator boot timeout (default: 120000, minimum: 5000).
- AGENT_DEVICE_DAEMON_TIMEOUT_MS=<ms> to override daemon request timeout (default 90000). Increase for slow physical-device setup (for example 120000).
- AGENT_DEVICE_IOS_TEAM_ID=<team-id> optional Team ID override for iOS device runner signing.
- AGENT_DEVICE_IOS_SIGNING_IDENTITY=<identity> optional signing identity override.
- AGENT_DEVICE_IOS_PROVISIONING_PROFILE=<profile> optional provisioning profile specifier for iOS device runner signing.
- AGENT_DEVICE_IOS_RUNNER_DERIVED_PATH=<path> optional override for iOS runner derived data root. By default, simulator uses ~/.agent-device/ios-runner/derived and physical device uses ~/.agent-device/ios-runner/derived/device. If you set this override, use separate paths per kind to avoid simulator/device artifact collisions.
- AGENT_DEVICE_IOS_CLEAN_DERIVED=1 rebuild iOS runner artifacts from scratch. When AGENT_DEVICE_IOS_RUNNER_DERIVED_PATH is set, cleanup is blocked by default; set AGENT_DEVICE_IOS_ALLOW_OVERRIDE_DERIVED_CLEAN=1 only for trusted custom paths.
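The default and minimum for the iOS boot timeout can be modelled as below. This is a sketch, not the project's code: `resolveIosBootTimeoutMs` is a hypothetical name, and both clamping to the minimum and ignoring empty or unparsable values are assumptions about behaviour the list above does not spell out.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical resolver: the default and minimum come from the list above;
// clamping to the minimum and ignoring unparsable values are assumptions.
function resolveIosBootTimeoutMs(env: Env): number {
  const DEFAULT_MS = 120000;
  const MIN_MS = 5000;
  const raw = env["AGENT_DEVICE_IOS_BOOT_TIMEOUT_MS"];
  if (raw === undefined || raw.trim() === "") return DEFAULT_MS;
  const parsed = Number(raw);
  if (!isFinite(parsed)) return DEFAULT_MS;
  return Math.max(MIN_MS, Math.floor(parsed));
}

console.log(resolveIosBootTimeoutMs({}));
console.log(resolveIosBootTimeoutMs({ AGENT_DEVICE_IOS_BOOT_TIMEOUT_MS: "1000" }));
console.log(resolveIosBootTimeoutMs({ AGENT_DEVICE_IOS_BOOT_TIMEOUT_MS: "180000" }));
```

In Node, the same function can be called with `process.env`; taking the environment as a parameter keeps the sketch easy to test.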
Test screenshots are written to:
- test/screenshots/android-settings.png
- test/screenshots/ios-settings.png
See CONTRIBUTING.md.
agent-device is an open source project and will always remain free to use. Callstack is a group of React and React Native geeks. Contact us at hello@callstack.com if you need any help with these technologies or just want to say hi.