protoAgent

Template repository for building protoLabs A2A agents on LangGraph.

The purpose of this repo is to keep the boring parts — A2A spec handling, cost/extension emission, tracing, release pipeline — stable across every agent in the fleet, so forking an agent is close to a rewrite of SOUL.md, graph/prompts.py, and tools/lg_tools.py and not much else.

Canonical reference implementation: protoLabsAI/roxy. Roxy is a filled-in fork — an autonomous ProtoMaker portfolio manager with its own persona, A2A skills, and project registry — a good example of what a fork looks like end-to-end.

Try it in 5 minutes: clone, pip install -r requirements.txt, python -m server, open http://localhost:7870, and walk the setup wizard — no forking, no sed, no Docker required to get your first agent talking. See the first-agent tutorial.

When you're ready to ship your own: click "Use this template" at the top of the GitHub repo, then follow Customize & deploy for the fork / rename / release-pipeline wiring.

What you get out of the box

Concern	Where it lives	What it does
A2A server	`server/a2a.py`, `a2a_executor.py`	JSON-RPC 2.0 over `/a2a`, SSE streaming, `tasks/*` lifecycle, push notifications, well-known agent card, dual token-shape parsing
Agent runtime	`graph/agent.py`, `server/`	LangGraph `create_agent()` wired to the A2A handler, with streaming token capture for cost-v1
LLM gateway	`graph/llm.py`	OpenAI-compatible client pointed at LiteLLM — swap models by editing the gateway config, not the fork
Subagents	`graph/subagents/config.py`	DeerFlow-pattern delegation via a `task()` tool; one worked example ships — a `researcher` (web + memory, plan→search→synthesize→cite)
Starter tools	`tools/lg_tools.py`, `tools/github_tools.py`	Default-on set: 4 keyless general (`current_time`, `calculator` safe AST eval, `web_search` via DuckDuckGo, `fetch_url`) + 2 HITL (`ask_human`, `request_user_input`) + 4 GitHub read tools over the `gh` CLI + 4 notes + 5 memory + 3 scheduler + 4 beads + inbox/peer (conditional). Drop any via `tools.disabled`; add via a plugin. See Starter tools
Knowledge store	`knowledge/store.py`	sqlite + FTS5 (LIKE fallback). One `chunks` table for operator notes, daily-log entries, and conversation findings. Default-on; turn off with `middleware.knowledge: false`
Extensibility	`graph/skills/`, `tools/mcp_tools.py`, `graph/plugins/`, `plugins/`	Opt-in ways to extend a running agent without forking: `SKILL.md` skills (AgentSkills format, auto-retrieved), MCP servers (external tools over stdio/HTTP), and plugins (drop-in packages adding tools, skills, FastAPI routes, background surfaces, subagents, managed MCP servers, and their own config/secrets/Settings). The first-party Discord ingress (`plugins/discord`) and Google Gmail/Calendar (`plugins/google`) ship as plugins — disable with `plugins.disabled`. See Skills, MCP, Plugins, ADR 0001 / 0018 / 0019
Scheduler	`scheduler/`	`schedule_task` / `list_schedules` / `cancel_schedule` tools backed by either a bundled sqlite scheduler or a Workstacean adapter (env-selected). Multi-agent-safe — every job is namespaced by `AGENT_NAME`. See Schedule future work
Eval harness	`evals/`	Side-effect-verified A2A test harness — audit log + reply text + KB state. `python -m evals.runner` against a running agent. See Eval your fork
Tracing	`tracing.py`	Langfuse trace_session with distributed `a2a.trace` propagation and the OTel cross-context-detach filter
Observability	`metrics.py`, `audit.py`	Prometheus metrics with per-agent prefix, JSONL audit log with trace IDs
Output protocol	`graph/output_format.py`	`<scratch_pad>` / `<output>` parsing so the model can think without it leaking to users
UI	`apps/web/` (React console), `chat_ui.py` (Gradio)	React operator console (the default `--ui console` tier + the Tauri desktop app) over the REST/A2A API; legacy Gradio chat (`--ui full`) with PWA shell. See ADR 0010
Release pipeline	`.github/workflows/*.yml`	Autonomous semver bumps, GHCR image push, GitHub release with filtered notes, optional Discord post

Quickstart — from zero to chatting in 5 minutes

# 1. Get the code (no fork needed for a first run)
git clone https://github.com/protoLabsAI/protoAgent.git my-agent
cd my-agent

# 2. Install deps into a venv
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 3. Run the server — no env vars required
python -m server

# 4. Open the wizard — pick your endpoint, pick a model, name the
#    agent, pick a persona preset, hit Launch. The chat UI appears
#    on the same page.
open http://localhost:7870

First-agent tutorial walks through every wizard step with screenshots.

Once you're happy and want to ship it as your own image in your own GHCR: Customize & deploy.

Architecture

┌──────────────┐     A2A JSON-RPC + SSE      ┌─────────────────┐
│   Consumer   │ ──────────────────────────▶ │  A2A handler    │
│  (any A2A    │                             │  (FastAPI)      │
│   client)    │ ◀──── cost-v1 DataPart ─────│                 │
└──────────────┘                             └────────┬────────┘
                                                      │
                                                      ▼
                                            ┌─────────────────┐
                                            │  graph/agent.py │
                                            │  (LangGraph     │
                                            │   create_agent) │
                                            └────────┬────────┘
                                                      │
                                                      ▼
                                            ┌─────────────────┐
                                            │  LiteLLM        │  ← model selection
                                            │  gateway        │    lives here,
                                            └─────────────────┘    not in code

The A2A handler never talks to the LLM directly — it submits a message to the LangGraph runtime, which owns the tool loop, the subagent task() delegation, and the structured-output protocol.

A2A extensions shipped by default

URI	Declared on card	Emitted at runtime
`cost-v1` (`https://proto-labs.ai/a2a/ext/cost-v1`)	Yes	Yes — every terminal task carries a cost-v1 DataPart with token usage + `durationMs`
`confidence-v1` (`https://proto-labs.ai/a2a/ext/confidence-v1`)	Yes	When the model self-reports a `<confidence>` tag — a confidence-v1 DataPart with the score (`[0,1]`), optional explanation, and `success`
`a2a.trace` propagation	No (it's a protocol convention, not a card extension)	Yes — reads caller's Langfuse trace context from `params.metadata["a2a.trace"]` and nests this agent's trace under it

Declare additional extensions on the card in server/a2a.py::_build_agent_card_proto when your agent's skills actually mutate shared state (see effect-domain-v1 in the Workstacean docs for when this applies).

Push notification support

The A2A handler supports both token shapes the spec permits:

// Shape 1 — top-level (what @a2a-js/sdk serialises by default)
{ "url": "https://consumer/callback/abc", "token": "shared-secret" }

// Shape 2 — structured (RFC-8821 AuthenticationInfo)
{
  "url": "https://consumer/callback/abc",
  "authentication": { "schemes": ["Bearer"], "credentials": "shared-secret" }
}

Both produce Authorization: Bearer shared-secret on outgoing webhooks. If your fork is getting 401s on callbacks, check which shape the consumer is sending before changing anything — the dual-token parser in a2a_auth.py reads both and the test suite covers both.

Observability

What	Where	How to use
Prometheus metrics	`/metrics`	Scrape; metric prefix is `AGENT_NAME_*` (sanitised)
JSONL audit log	`/sandbox/audit/audit.jsonl`	`jq` for forensic replay; every entry has `trace_id`
Langfuse traces	`LANGFUSE_*` env vars	Trace tag is `AGENT_NAME`, so filter by tag to find this agent's runs
Container logs	`docker logs <container>`	INFO is the default — `LOG_LEVEL=DEBUG` for more

Release pipeline

The included GitHub Actions pipeline is optional but opinionated.

On every merge to main → docker-publish.yml builds and pushes ghcr.io/protolabsai/<image>:latest + sha-<short>. Watchtower (or similar) can poll latest for auto-deploy.
When a non-release PR merges → prepare-release.yml opens a "chore: release vX.Y.Z" bump PR, auto-merges it, and pushes a semver tag.
When a semver tag lands → release.yml builds and pushes the stable semver Docker tags, creates a GitHub release with filtered notes, and posts a Discord embed via the shared protoLabsAI/release-tools Action.
On every PR + push → checks.yml runs pytest and verify-workspace-config (the fleet .beads/.automaker/owned-runner standard), so drift is caught in CI rather than mid-run.

All workflows run on the org-owned namespace-profile-protolabs-linux runner. The three release workflows (docker-publish, prepare-release, release) gate on github.repository == 'protoLabsAI/<name>' so they no-op on clones that haven't updated the owner — avoids surprise releases on forks. Update the repo check in all three when forking.

Requirements

Python 3.12+
Docker (for the bundled deployment)
A LiteLLM-compatible OpenAI gateway somewhere on the network (see config/langgraph-config.yaml)
Optional: Langfuse, Prometheus, Discord webhook

Skill loop — agents that learn from experience

protoAgent includes an end-to-end skill loop where the agent learns from its own runs — successful subagent workflows are captured as reusable skills, retrieved automatically on future tasks, and periodically optimised by the skill curator. The same index also serves human-authored skills dropped in as SKILL.md folders, so authored and agent-emitted skills are retrieved together.

Component	Where it lives	What it does
`SKILL.md` skills	`config/skills/`, `<config>/skills/`, plugins	Human-authored skills (AgentSkills format) loaded into the index on boot (`source=disk`). See Skills
Skill emission	`graph/extensions/skills.py`	Captures `task()` results as `SkillV1Artifact` when `emit_skill=True`, persisted to the index (`source=emitted`)
Skill index	`/sandbox/skills.db` (→ `~/.protoagent`)	SQLite (FTS5) store of authored + emitted skills, queried by `KnowledgeMiddleware`
Knowledge injection	`graph/middleware/knowledge.py`	Queries index before each LLM call, injects top-k matching skills as a `<learned_skills>` block
Skill curator	`graph/skills/curator.py`	Periodic agent that deduplicates, decays, and prunes emitted skills (disk skills are pinned)

Running the curator

# Dry-run — see what would change without touching the index
python -m graph.skills.curator --dry-run

# Full curation pass (deduplicate, decay, prune; writes an audit trail)
python -m graph.skills.curator

The curator applies a 90-day confidence half-life (confidence halves for every 90 days a skill goes unused), clusters near-duplicate skills by similarity and keeps the highest-confidence copy, and prunes any skill whose confidence has fallen below 0.2.

See docs/tutorials/skill-loop.md for a complete end-to-end example and cron setup.

Contributing

This is a template repo — bugs and improvements to the shared runtime (the server/ package, graph/agent.py, extension support, release pipeline) land here. Domain-specific agent logic lives in the fork, not here.

Name		Name	Last commit message	Last commit date
Latest commit History 554 Commits
.automaker		.automaker
.beads		.beads
.github		.github
activity		activity
apps		apps
beads		beads
config		config
deploy/openshell		deploy/openshell
docs		docs
enforcement		enforcement
evals		evals
events		events
examples/mcp		examples/mcp
graph		graph
inbox		inbox
knowledge		knowledge
mcp_servers/google		mcp_servers/google
nanobot		nanobot
operator_api		operator_api
plugins		plugins
runtime		runtime
scheduler		scheduler
scripts		scripts
server		server
static		static
surfaces		surfaces
tests		tests
tools		tools
workflows		workflows
.dockerignore		.dockerignore
.gitattributes		.gitattributes
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
Dockerfile		Dockerfile
README.md		README.md
TEMPLATE.md		TEMPLATE.md
a2a_auth.py		a2a_auth.py
a2a_executor.py		a2a_executor.py
a2a_stores.py		a2a_stores.py
audit.py		audit.py
autostart.py		autostart.py
cache.py		cache.py
chat_ui.py		chat_ui.py
docker-compose.yml		docker-compose.yml
egress.py		egress.py
entrypoint.sh		entrypoint.sh
metrics.py		metrics.py
package-lock.json		package-lock.json
package.json		package.json
paths.py		paths.py
pricing.py		pricing.py
pyproject.toml		pyproject.toml
requirements-core.txt		requirements-core.txt
requirements-google.txt		requirements-google.txt
requirements-ui.txt		requirements-ui.txt
requirements.txt		requirements.txt
seccomp-profile.json		seccomp-profile.json
security.py		security.py
telemetry_store.py		telemetry_store.py
tracing.py		tracing.py
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

protoAgent

What you get out of the box

Quickstart — from zero to chatting in 5 minutes

Architecture

A2A extensions shipped by default

Push notification support

Observability

Release pipeline

Requirements

Skill loop — agents that learn from experience

Running the curator

Contributing

About

Uh oh!

Releases 120

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

protoAgent

What you get out of the box

Quickstart — from zero to chatting in 5 minutes

Architecture

A2A extensions shipped by default

Push notification support

Observability

Release pipeline

Requirements

Skill loop — agents that learn from experience

Running the curator

Contributing

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases 120

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages