Parent Epic: #62
What
New skill: rust-observability covering production monitoring and observability for Rust services.
Skill Content
- Structured Tracing -- tracing crate setup, span context, #[instrument] attribute, field conventions
- OpenTelemetry Integration -- tracing-opentelemetry subscriber, span export to Jaeger/OTLP, context propagation across HTTP boundaries
- Prometheus Metrics -- metrics crate with metrics-exporter-prometheus, histogram/counter/gauge patterns, endpoint setup
- Per-Request Context -- Request ID propagation, correlation IDs in traces and logs, HTTP header extraction
- Error Reporting -- Structured error spans, error rate metrics, alerting thresholds
- Log Filtering -- EnvFilter configuration, per-crate log levels, production vs development configs
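To make the Per-Request Context item more concrete, here is a minimal, library-independent sketch of request ID extraction with a locally generated fallback. It uses only the standard library. The header name `x-request-id`, the 64-character limit, the allowed character set, and the `local-` prefix are all assumptions chosen for illustration; none of them come from the source reference. In a real service the header map would come from the HTTP framework, and the resulting ID would most likely be recorded as a field on the request span.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

/// Assumed header name; a common convention, not mandated by the source.
const REQUEST_ID_HEADER: &str = "x-request-id";

/// Process-local counter used only when the caller sends no usable ID.
static NEXT_ID: AtomicU64 = AtomicU64::new(1);

/// Returns the caller-supplied request ID when it is present and well-formed,
/// otherwise generates a local one so every request still has a correlation ID.
pub fn request_id(headers: &HashMap<String, String>) -> String {
    headers
        .iter()
        // HTTP header names are case-insensitive.
        .find(|(name, _)| name.eq_ignore_ascii_case(REQUEST_ID_HEADER))
        .map(|(_, value)| value.trim())
        // Reject untrusted values that could bloat or corrupt log lines.
        .filter(|value| is_valid_id(value))
        .map(str::to_owned)
        .unwrap_or_else(|| {
            format!("local-{:016x}", NEXT_ID.fetch_add(1, Ordering::Relaxed))
        })
}

/// Hypothetical validation rule: 1 to 64 characters of [A-Za-z0-9_-].
fn is_valid_id(value: &str) -> bool {
    !value.is_empty()
        && value.len() <= 64
        && value
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
}
```

Validating the incoming value before reusing it keeps attacker-controlled header content out of traces and logs, at the cost of occasionally replacing a legitimate upstream ID that uses a different format.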
Source Reference
Section 13 (Production Monitoring and Observability) from RUST_SYSTEM_PROGRAMMING_BEST_PRACTICES.md
Disciplined Engineering Alignment
| Phase | Skill | Observability Relevance |
|---|---|---|
| Research | disciplined-research | Identify observability gaps in current service; map existing log/metric/trace coverage; document SLI/SLO requirements |
| Design | disciplined-design | Specify tracing span hierarchy; define metric names/labels/buckets; design log level strategy per environment |
| Implementation | disciplined-implementation | Add tracing subscriber first, then instrument hot paths, then metrics, then OTel export -- each as separate step |
| Verification | disciplined-verification | Verify spans appear in collector; confirm metric cardinality within bounds; test log filtering at each level |
| Validation | disciplined-validation | Validate observability under production load; confirm alerts fire correctly; stakeholder sign-off on dashboard |
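The Verification row asks to confirm that metric cardinality stays within bounds. One way that check could be sketched, using only the standard library, is to count distinct label sets per metric name and report the names that exceed a limit. The `Series` type, the function name, and the idea of a single global limit are assumptions for illustration; a real check would more likely read the series from the Prometheus scrape endpoint or from the collector.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

/// One observed time series: a metric name plus its label pairs (hypothetical type).
pub struct Series<'a> {
    pub name: &'a str,
    pub labels: &'a [(&'a str, &'a str)],
}

/// Counts distinct label sets per metric name and returns the metrics whose
/// count exceeds `limit`, mapped to the observed count.
pub fn over_cardinality_limit(series: &[Series<'_>], limit: usize) -> BTreeMap<String, usize> {
    let mut seen: HashMap<&str, BTreeSet<Vec<(&str, &str)>>> = HashMap::new();
    for s in series {
        let mut labels = s.labels.to_vec();
        // Sort so that label order alone never counts as a new series.
        labels.sort_unstable();
        seen.entry(s.name).or_default().insert(labels);
    }
    seen.into_iter()
        .filter(|(_, sets)| sets.len() > limit)
        .map(|(name, sets)| (name.to_owned(), sets.len()))
        .collect()
}
```

A check like this is most useful when run against labels built from request data (routes, user IDs, error strings), since those are the usual source of unbounded series growth.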
Key integration point: The devops skill should reference rust-observability for Rust service deployment monitoring requirements.
The SKILL.md should include:
- A "Disciplined Observability Checklist" mapping each V-model phase to specific observability tasks
- Integration examples for axum middleware (tracing + metrics)
Why a New Skill (Not Extension)
Observability is a distinct cross-cutting concern for any production Rust service. It's not specific to development idioms or performance optimization.
Relevance to Terraphim
terraphim-server runs as a persistent service. Structured tracing and metrics are essential for production operation and debugging.
Acceptance Criteria
- skills/rust-observability/SKILL.md created with full content