- Status: active
- Source of truth: AGENTS.md, package.json, schemas/task-graph.schema.json, schemas/execution-state.schema.json, schemas/risk-policy.schema.json
- Verified with: npm run build, npm run test:unit, npm run validate:docs, npm run validate:synapse-example
- Last verified: 2026-03-25
Spec2Flow is an open-source AI workflow framework for turning product requirements and repository context into a repeatable engineering loop.
It is the control plane for an agent-friendly development workflow:
Requirements -> Implement -> Design Tests -> Execute -> Report -> Collaborate
Modern development is no longer just about writing code. Teams need a repeatable workflow that can:
- understand product and design documents
- read and reason about an existing codebase
- translate requirements into implementation tasks
- generate test plans and test cases
- run repository-native validation commands
- run browser automation when UI coverage is needed
- collect evidence and draft bug reports
- connect execution results back into collaboration workflows
Spec2Flow aims to provide a practical foundation for that loop.
Spec2Flow is designed to start simple and grow into a modular AI-driven development and testing workflow framework.
At the top level, the model is intentionally narrow:
- the CLI orchestrates work
- adapters connect external model runtimes
- schemas define contracts
- docs explain the system and remain part of the product
Spec2Flow is organized around a simple six-stage workflow:
- Requirements Analysis: read docs and repository context, then produce a scoped requirement summary, assumptions, and impacted modules.
- Code Implementation: turn approved requirements into implementation tasks, code changes, and reviewable outputs.
- Test Design: generate structured test scope, risk areas, smoke coverage, regression coverage, and edge cases.
- Automated Execution: run deterministic validation commands, start environments when needed, and use Playwright for browser validation and evidence capture.
- Defect Feedback: turn failed execution results into evidence-backed bug drafts.
- Collaboration Workflow: route results through GitHub Actions, GitHub Issues, and pull request review.
Spec2Flow should produce structured outputs for each stage:
- requirement summaries
- implementation tasks
- test plans
- test cases
- execution reports
- bug drafts
- collaboration updates
The first version is intentionally narrow. It should be able to:
- read product docs and repository context
- produce implementation tasks and test plans
- run canonical validation commands and browser checks when needed
- capture evidence and draft bug reports
- feed results back into a collaboration workflow
The current baseline is explicitly built around:
- Copilot-compatible adapters for requirements analysis, implementation support, and test design
- Repository-native command execution for deterministic validation
- Playwright for browser automation and evidence capture when needed
- GitHub Actions for repeatable CI execution and artifact upload
- GitHub Issues for defect tracking and workflow coordination
Spec2Flow follows a workflow-centered architecture with explicit orchestration boundaries:
Responsible for:
- generating task graphs
- persisting execution state
- claiming ready tasks
- recording task results and artifacts
Responsible for:
- mapping one claimed task into a provider-specific runtime
- managing task-scoped agent execution
- returning structured task results
Responsible for:
- understanding specs
- understanding code
- generating implementation tasks
- generating plans and test cases
- interpreting failures
- drafting bugs
Responsible for:
- starting services or test environments
- running validation commands
- running Playwright tests when needed
- collecting artifacts
- producing structured execution results
Responsible for:
- publishing CI results
- retaining artifacts
- routing failures into GitHub Issues
- supporting pull request validation and team visibility
- Simple first - prefer explicit, explainable workflow boundaries
- Execution over demos - reliable automation matters more than impressive prompts
- Docs and code stay in sync - contracts, examples, and docs should reflect real behavior
- Modular by default - orchestration, adapters, execution, and collaboration should evolve independently
- Verifiable by default - meaningful changes should have a concrete validation path
- align docs with the six-stage workflow
- define schemas for plans, cases, execution reports, and bug drafts
- add collaboration conventions
- bootstrap Playwright
- define local startup flow
- capture evidence and execution summaries
- add GitHub Actions workflows
- publish artifacts from CI
- map failed runs into GitHub Issues drafts
- run a sample spec through requirement analysis, implementation planning, execution, and defect feedback
Spec2Flow is still in the bootstrap stage.
The next implementation target is to establish:
- stable document structure
- workflow schemas
- a minimal execution baseline through canonical validation commands and Playwright where needed
- a GitHub Issues-based defect feedback loop
- Docs governance lives in two places: use docs/structure.md for active documentation layout rules and docs/plans/index.md for archived and plan-only placement rules.
- AGENTS.md
- llms.txt
- docs/index.md
- docs/copilot.md
- docs/Harness_engineering.md
- docs/architecture.md
- docs/structure.md
- docs/collaboration.md
- docs/usage-guide.md
- docs/plans/index.md
- docs/plans/historical/index.md
- docs/synapse-integration-automation-design.md
- schemas/project-adapter.schema.json
- schemas/system-topology.schema.json
- schemas/risk-policy.schema.json
- schemas/task-graph.schema.json
- schemas/environment-preparation-report.schema.json
- schemas/onboarding-validator-result.schema.json
- schemas/execution-state.schema.json
- schemas/model-adapter-capability.schema.json
- schemas/model-adapter-runtime.schema.json
- docs/examples/synapse-network/README.md
- docs/examples/synapse-network/project.yaml
- docs/examples/synapse-network/topology.yaml
- docs/examples/synapse-network/risk.yaml
- docs/examples/synapse-network/generated/onboarding-validator-result.json
- docs/examples/synapse-network/generated/task-graph.json
- docs/examples/synapse-network/generated/execution-state.json
- docs/examples/synapse-network/generated/task-graph-frontend-change.json
- docs/examples/synapse-network/generated/task-graph-withdrawal-change.json
Spec2Flow now includes a minimal CLI runtime for onboarding validation, task graph generation, and execution-state lifecycle management.
The default runtime now executes the compiled CLI under packages/cli/dist/cli/spec2flow-dist-entrypoint.js. npm install triggers prepare, so the dist entrypoint is built before the example scripts run.
Example commands:
npm install
npm run build
npm run test:unit
npm run migrate:platform-db -- --database-url postgresql://localhost:5432/spec2flow --database-schema spec2flow_platform
npm run validate:docs
npm run validate:synapse-example
npm run generate:synapse-task-graph
npm run generate:synapse-execution-state
npm run preflight:copilot-cli
npm run claim:synapse-next-task
npm run submit:synapse-task-result
npm run simulate:synapse-model-run
npm run run:synapse-task-with-adapter
npm run run:synapse-copilot-cli-loop
npm run run:synapse-workflow-loop
npm run init:platform-run -- --database-url postgresql://localhost:5432/spec2flow --database-schema spec2flow_platform --task-graph docs/examples/synapse-network/generated/task-graph.json --repository-id spec2flow --repository-name Spec2Flow --repo-root .
npm run lease:platform-task -- --database-url postgresql://localhost:5432/spec2flow --database-schema spec2flow_platform --run-id spec2flow-platform --worker-id worker-1
npm run heartbeat:platform-task -- --database-url postgresql://localhost:5432/spec2flow --database-schema spec2flow_platform --run-id spec2flow-platform --task-id some-task-id --worker-id worker-1
npm run start:platform-task -- --database-url postgresql://localhost:5432/spec2flow --database-schema spec2flow_platform --run-id spec2flow-platform --task-id some-task-id --worker-id worker-1
npm run expire:platform-leases -- --database-url postgresql://localhost:5432/spec2flow --database-schema spec2flow_platform --run-id spec2flow-platform
npm run get:platform-run-state -- --database-url postgresql://localhost:5432/spec2flow --database-schema spec2flow_platform --run-id spec2flow-platform
npm run spec2flow -- run-platform-worker-task --database-url postgresql://localhost:5432/spec2flow --database-schema spec2flow_platform --run-id spec2flow-platform --task-id environment-preparation --worker-id worker-1
npm run generate:synapse-task-graph:frontend-change
npm run generate:synapse-task-graph:withdrawal-change
run-platform-worker-task now includes execution-time lease protection:
- it starts a background heartbeat loop while the task is running
- it auto-renews the active lease on the configured heartbeat cadence
- it stops work if lease ownership is lost or if heartbeat transport failures hit the configured threshold
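The lease-protection rules above can be sketched as a small decision function that folds each heartbeat outcome into a guard state. This is only an illustration: the names `HeartbeatOutcome`, `LeaseGuardState`, `applyHeartbeat`, and `maxTransportFailures` are hypothetical, and counting failures as consecutive (reset on a successful renewal) is an assumption rather than documented behavior.

```typescript
// Hypothetical sketch of execution-time lease protection; not Spec2Flow's real code.
type HeartbeatOutcome = "renewed" | "lease-lost" | "transport-failure";

interface LeaseGuardState {
  consecutiveFailures: number; // assumption: failures are counted consecutively
  shouldStop: boolean;
  reason?: string;
}

// Fold one heartbeat outcome into the guard state and decide whether work must stop.
function applyHeartbeat(
  state: LeaseGuardState,
  outcome: HeartbeatOutcome,
  maxTransportFailures: number,
): LeaseGuardState {
  if (state.shouldStop) {
    return state; // once stopped, stay stopped
  }
  switch (outcome) {
    case "renewed":
      // A successful renewal keeps the lease and clears the failure counter.
      return { consecutiveFailures: 0, shouldStop: false };
    case "lease-lost":
      return {
        consecutiveFailures: state.consecutiveFailures,
        shouldStop: true,
        reason: "lease ownership lost",
      };
    case "transport-failure": {
      const failures = state.consecutiveFailures + 1;
      return failures >= maxTransportFailures
        ? { consecutiveFailures: failures, shouldStop: true, reason: "heartbeat failure threshold reached" }
        : { consecutiveFailures: failures, shouldStop: false };
    }
  }
}
```

A background loop would call `applyHeartbeat` once per heartbeat tick and cancel the running task as soon as `shouldStop` becomes true.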
Spec2Flow now supports two adapter integration modes:
- capability-only simulation through simulate-model-run
- real external adapter execution through run-task-with-adapter or run-workflow-loop --adapter-runtime <file>
The example adapter is now wired to GitHub Copilot CLI through gh copilot -p, using the programmatic prompt mode documented in the Copilot CLI command reference.
Copilot CLI integration assumptions:
- gh is installed
- gh copilot is available on the machine
- the user is authenticated for Copilot CLI
Recommended bootstrap commands:
gh copilot -- --help
gh copilot login
gh auth status
Optional environment variables for the example adapter:
- SPEC2FLOW_COPILOT_MODEL
- SPEC2FLOW_COPILOT_ADAPTER_NAME
- SPEC2FLOW_COPILOT_CWD
If SPEC2FLOW_COPILOT_MODEL is unset, the adapter will use the Copilot CLI account default model instead of forcing one.
The preferred place to pin a model is model-adapter-runtime.json through adapterRuntime.model. If that field is omitted, Spec2Flow leaves model selection to Copilot CLI.
The adapter uses these Copilot CLI best-practice choices from the docs:
- one focused non-interactive session per task claim with -p
- repository custom instructions via .github/copilot-instructions.md
- explicit model selection via --model
- autonomous execution via --no-ask-user
- constrained tool surface via --available-tools view,grep,glob
- no unnecessary remote tooling via --disable-builtin-mcps
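As an illustration of how the flag choices above combine, the sketch below assembles an argument list for `gh`. The function name `buildCopilotArgs` and the options shape are assumptions made for this example, not the adapter's actual code, and the argument order is arbitrary. Omitting `model` leaves model selection to Copilot CLI, as described earlier.

```typescript
// Illustrative only: builds the arguments that would follow the `gh` executable.
// The function name and options shape are hypothetical.
interface CopilotInvocationOptions {
  prompt: string;
  model?: string; // omit to let Copilot CLI use the account default model
}

function buildCopilotArgs(options: CopilotInvocationOptions): string[] {
  const args = [
    "copilot",
    "-p", options.prompt,                     // one non-interactive session per task claim
    "--no-ask-user",                          // autonomous execution
    "--available-tools", "view,grep,glob",    // constrained tool surface
    "--disable-builtin-mcps",                 // no unnecessary remote tooling
  ];
  if (options.model) {
    args.push("--model", options.model);      // explicit model selection only when pinned
  }
  return args;
}
```

Returning an array rather than a single string avoids shell-quoting problems when the prompt contains spaces or quotes, provided the caller spawns the process without a shell.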
Important boundary: this integration targets GitHub Copilot CLI, not the VS Code Copilot Chat session API. The adapter shells out to gh copilot -p because that is the documented programmatic entrypoint.
The external adapter contract is intentionally thin:
- Spec2Flow emits a claim payload for one taskId
- an external command receives that claim path and any environment variables defined in model-adapter-runtime.json
- the command returns JSON on stdout or writes JSON to a file
- Spec2Flow normalizes that result and persists it back into execution-state.json
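To make the normalization step in this contract more concrete, here is a minimal sketch that parses an adapter's stdout and coerces it into a predictable shape. The `AdapterTaskResult` fields (`status`, `summary`, `notes`) are assumptions loosely modeled on the CLI flags shown in this README; the real contract is whatever the Spec2Flow schemas define. Treating unparseable output as a `failed` result is also an assumption.

```typescript
// Hypothetical result shape; the real contract is defined by the Spec2Flow schemas.
interface AdapterTaskResult {
  taskId: string;
  status: string;
  summary: string;
  notes: string[];
}

// Parse adapter stdout and coerce it into a predictable shape.
function normalizeAdapterOutput(taskId: string, stdout: string): AdapterTaskResult {
  const failure = (summary: string): AdapterTaskResult => ({
    taskId,
    status: "failed",
    summary,
    notes: [],
  });

  let parsed: unknown;
  try {
    parsed = JSON.parse(stdout);
  } catch {
    return failure("adapter output was not valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return failure("adapter output was not a JSON object");
  }

  const record = parsed as Record<string, unknown>;
  const rawStatus = record["status"];
  const rawSummary = record["summary"];
  const rawNotes = record["notes"];
  return {
    taskId,
    // Assumption: a missing or non-string status is treated as a failure.
    status: typeof rawStatus === "string" ? rawStatus : "failed",
    summary: typeof rawSummary === "string" ? rawSummary : "",
    // Keep only string notes; drop anything else silently.
    notes: Array.isArray(rawNotes)
      ? rawNotes.filter((note): note is string => typeof note === "string")
      : [],
  };
}
```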
Before starting a real Copilot-backed run, you can probe the environment with:
npm run preflight:copilot-cli
That command checks:
- gh copilot is available
- gh auth status succeeds
- the configured model or Copilot default model works with a one-shot gh copilot -p JSON probe
When run-task-with-adapter or run-workflow-loop uses an adapter runtime whose provider is github-copilot-cli, Spec2Flow now runs this preflight automatically before execution.
If you need to bypass that check deliberately, pass --skip-preflight.
generate-task-graph also supports diff-aware risk matching:
npm run spec2flow -- generate-task-graph \
--project docs/examples/synapse-network/project.yaml \
--topology docs/examples/synapse-network/topology.yaml \
--risk docs/examples/synapse-network/risk.yaml \
--changed-files-from-git \
--git-base origin/main \
--git-head HEAD \
--output docs/examples/synapse-network/generated/task-graph.json
init-execution-state expands every task in a task graph into a persisted runtime state file:
npm run spec2flow -- init-execution-state \
--task-graph docs/examples/synapse-network/generated/task-graph.json \
--run-id synapse-example-run \
--adapter spec2flow-cli \
--model gpt-5.4 \
--session-id example-session \
--output docs/examples/synapse-network/generated/execution-state.json
update-execution-state advances one subtask, appends notes or artifacts, and automatically promotes newly unblocked tasks to ready:
npm run spec2flow -- update-execution-state \
--state docs/examples/synapse-network/generated/execution-state.json \
--task-graph docs/examples/synapse-network/generated/task-graph.json \
--task-id environment-preparation \
--task-status completed \
--notes bootstrap-ok
claim-next-task acts as the first controller primitive. It selects the next ready subtask, marks it in-progress, and emits the payload that a model adapter should consume:
npm run spec2flow -- claim-next-task \
--state docs/examples/synapse-network/generated/execution-state.json \
--task-graph docs/examples/synapse-network/generated/task-graph.json \
--adapter-capability docs/examples/synapse-network/model-adapter-capability.json \
--output docs/examples/synapse-network/generated/task-claim.json
submit-task-result closes the loop for one claimed subtask. It writes the outcome back into execution-state.json, appends artifacts or errors, and promotes any newly unblocked downstream subtasks:
npm run spec2flow -- submit-task-result \
--state docs/examples/synapse-network/generated/execution-state.json \
--task-graph docs/examples/synapse-network/generated/task-graph.json \
--claim docs/examples/synapse-network/generated/task-claim.json \
--result-status completed \
--summary requirements-ready \
--notes scope-confirmed \
--add-artifacts 'requirements-summary|report|spec2flow/outputs/execution/frontend-smoke/requirements-summary.json' \
--output docs/examples/synapse-network/generated/task-result.json
simulate-model-run is a provider-neutral reference adapter. It consumes a claim payload, produces a simulated adapter response, writes the result back into execution-state.json, and emits a combined execution record:
npm run spec2flow -- simulate-model-run \
--state docs/examples/synapse-network/generated/execution-state.json \
--task-graph docs/examples/synapse-network/generated/task-graph.json \
--claim docs/examples/synapse-network/generated/task-claim.json \
--adapter-capability docs/examples/synapse-network/model-adapter-capability.json \
--output docs/examples/synapse-network/generated/simulated-model-run.json
run-workflow-loop ties the pieces together. It repeatedly claims the next ready task, runs the simulated adapter, and persists each step until the workflow completes or reaches a step cap:
npm run spec2flow -- run-workflow-loop \
--state docs/examples/synapse-network/generated/execution-state.json \
--task-graph docs/examples/synapse-network/generated/task-graph.json \
--adapter-capability docs/examples/synapse-network/model-adapter-capability.json \
--max-steps 8 \
--output-base docs/examples/synapse-network/generated/loop \
--output docs/examples/synapse-network/generated/workflow-loop-summary.json
Execution model:
- A single user development request creates one workflow run identified by runId.
- The task graph expands that run into multiple stable taskId values, typically one per route-stage node such as frontend-smoke--requirements-analysis.
- execution-state.json is the persisted runtime ledger for that run. It stores overall workflow status, the current stage, every subtask status, attached artifacts, and structured errors.
- claim-next-task is the scheduler boundary between persisted state and a real model adapter. It produces the exact subtask payload that should be sent to Copilot or another provider.
- submit-task-result is the write-back boundary from a model adapter or executor into Spec2Flow state.
- simulate-model-run is a reference adapter loop for validating controller behavior before binding to a real provider API.
- run-workflow-loop is the first end-to-end controller loop for an entire workflow run.
- All later model invocations, validation runs, logs, bug drafts, and review handoffs should attach to runId + taskId, not to an implicit chat session.
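The claim and write-back boundaries in this execution model can be sketched with a small in-memory ledger. This is a simplified illustration, not the CLI's implementation: the `TaskState` shape, the `dependsOn` field, and the `pending` and `failed` status names are assumptions, while `ready`, `in-progress`, and `completed` follow the wording used in this README.

```typescript
// Hypothetical in-memory model of the claim / submit cycle; field names are assumptions.
type TaskStatus = "pending" | "ready" | "in-progress" | "completed" | "failed";

interface TaskState {
  taskId: string;
  status: TaskStatus;
  dependsOn: string[];
}

// Claim step: select the first ready task and mark it in-progress.
function claimNextTask(tasks: TaskState[]): TaskState | undefined {
  const next = tasks.filter((task) => task.status === "ready")[0];
  if (next) {
    next.status = "in-progress";
  }
  return next;
}

// Write-back step: record the outcome, then promote newly unblocked tasks to ready.
function submitTaskResult(
  tasks: TaskState[],
  taskId: string,
  status: "completed" | "failed",
): void {
  const target = tasks.filter((task) => task.taskId === taskId)[0];
  if (!target || target.status !== "in-progress") {
    throw new Error(`task ${taskId} is not currently claimed`);
  }
  target.status = status;

  const completedIds = tasks
    .filter((task) => task.status === "completed")
    .map((task) => task.taskId);
  for (const task of tasks) {
    const unblocked = task.dependsOn.every((dep) => completedIds.indexOf(dep) !== -1);
    if (task.status === "pending" && unblocked) {
      task.status = "ready";
    }
  }
}
```

A controller loop in the spirit of run-workflow-loop would alternate these two calls until `claimNextTask` returns `undefined` or a step cap is reached.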
generate-task-graph can also collect changed files directly from git diff in a selected repository:
npm run spec2flow -- generate-task-graph \
--project .spec2flow/project.yaml \
--topology .spec2flow/topology.yaml \
--risk .spec2flow/policies/risk.yaml \
--changed-files-from-git \
--git-diff-repo /path/to/target-repo \
--git-base origin/main \
--git-head HEAD
If no git refs are provided, Spec2Flow defaults to git diff --name-only HEAD in the selected repository. Use --git-staged to read only staged changes.
Risk escalation is scoped to routes whose declared target paths are actually touched by the changed files, so a frontend-only diff does not raise unrelated backend or settlement routes.
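The scoping rule above can be illustrated with a short matching function. This is a sketch under stated assumptions: the `RouteDefinition` shape and the example paths are hypothetical, and simple directory-prefix matching stands in for whatever path matching Spec2Flow actually applies to the targets declared in topology.yaml and risk.yaml.

```typescript
// Hypothetical route shape; real definitions live in topology.yaml and risk.yaml.
interface RouteDefinition {
  name: string;
  targetPaths: string[]; // assumption: directory prefixes such as "apps/frontend/"
}

// Return only the routes whose declared target paths contain at least one changed file.
function routesTouchedBy(routes: RouteDefinition[], changedFiles: string[]): string[] {
  return routes
    .filter((route) =>
      route.targetPaths.some((prefix) =>
        changedFiles.some((file) => file.indexOf(prefix) === 0),
      ),
    )
    .map((route) => route.name);
}
```

With a frontend route targeting `apps/frontend/` and a withdrawal route targeting `services/settlement/`, a diff that touches only `apps/frontend/src/App.tsx` selects the frontend route alone, so only that route would be a candidate for risk escalation.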
Contributions are welcome. Early contributions are especially valuable in:
- workflow design
- schema design
- Playwright integration
- GitHub Actions workflows
- GitHub Issues templates
- end-to-end examples
MIT