governance: define security & quality finding triage process (candidate ADR-014); #276 = first CodeQL instance #277

@scottschreckengaust

Summary

We lack a durable, written process for triaging inbound security & quality findings (GitHub code scanning / CodeQL, OSV-Scanner, Semgrep, Grype, Retire.js, Dependabot, zizmor). Findings are currently handled ad hoc, per-alert. As the tool surface grows, we need an agreed posture for how findings are triaged, fixed, dismissed-with-reason, or tracked to closure — and a decision on whether this warrants a new ADR (proposed: ADR-014 — Security & quality finding triage).

This is a governance / process issue, not an implementation task. Its output is a team decision + (likely) an ADR, with #276 as the first worked instance.

Why this is governance, not a new Vision tenet

This was considered as a candidate Vision tenet ("fix security & quality issues as they arise") and rejected as miscategorized — capturing the reasoning here so we don't relitigate it:

  • Tenets are durable design preferences used for review judgment ("does this change fit where we're going?"). "Fix issues as they arise" is a development process commitment — you can't review a PR against it, and it's near-tautological.
  • The design stance on security is already covered by existing tenets: 4 (fail closed on risk), 5 (isolation & least privilege), 7 (observable/attributable — don't let signal rot), and 10 (sample, not shrink-wrapped — honest about gaps). Keeping findings triaged-and-tracked rather than hidden is an expression of tenet 10, not a new tenet.
  • The process is already partially anchored by ADR-008 (Definition of Done), ADR-013 (Tiered Validation Pyramid), and ADR-003 (Contribution Governance) — what's missing is the specific "inbound finding → triage → disposition" loop.

Conclusion: no Vision change. The gap is a process decision (ADR) and optionally a Roadmap maturity line.

Proposed decision to make (for the team / ADR-014)

Decide and document:

  1. Scope — which scanners/feeds this governs (code scanning/CodeQL, OSV, Semgrep, Grype, Retire.js, Dependabot, zizmor — i.e. everything under mise run security + GitHub-native scanning).
  2. Triage SLA / cadence — expected time-to-triage by severity (e.g. error/critical vs. warning/low); who owns first-pass triage.
  3. Disposition taxonomy — the allowed outcomes for any finding:
    • Fix (code change via the normal issue → branch → PR governance).
    • Dismiss with documented reasoning — false positive / won't fix / used-in-tests, with the justification recorded where it's discoverable.
    • Track — accepted-risk with an owning issue and review date.
  4. Where dismissal reasoning lives — code-scanning UI dismissal note vs. an in-repo record (e.g. a suppressions/triage log) so reasoning survives outside the GitHub alert UI.
  5. Default vs. advanced CodeQL posture: whether to convert from default setup to advanced setup (advanced setup can be re-run on demand via Actions, whereas default setup re-runs only on a commit/push and occasionally hits rate limits). See the options in #276 (fix(security): add CodeQL sanitizer model for redact_secrets to clear false-positive clear-text-logging alerts (e.g. #29)).
  6. ADR or not — confirm whether this rises to ADR-014 (we believe it does: it establishes a pattern others follow per docs/decisions/README.md test (b)).
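To make items 2 and 3 concrete, here is a minimal sketch of what a triage record could look like if the taxonomy were enforced mechanically. Everything in it is an assumption for discussion: the `TriageRecord` and `Disposition` names, the field names, and especially the SLA day counts are hypothetical placeholders, not values this issue proposes.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Disposition(Enum):
    """The three allowed outcomes from the proposed taxonomy."""
    FIX = "fix"
    DISMISS = "dismiss"
    TRACK = "track"


# Hypothetical time-to-triage targets in days, keyed by severity.
# The real numbers are exactly what ADR-014 would have to decide.
TRIAGE_SLA_DAYS = {"error": 2, "warning": 10, "note": 30}


@dataclass
class TriageRecord:
    alert_id: str
    severity: str
    opened: date
    disposition: Disposition
    justification: Optional[str] = None   # required for DISMISS
    owning_issue: Optional[str] = None    # required for TRACK
    review_date: Optional[date] = None    # required for TRACK

    def triage_due(self) -> date:
        """Date by which first-pass triage is expected."""
        return self.opened + timedelta(days=TRIAGE_SLA_DAYS[self.severity])

    def problems(self) -> list[str]:
        """Return the taxonomy rules this record violates (empty if none)."""
        found = []
        if self.disposition is Disposition.DISMISS and not self.justification:
            found.append("dismissal needs documented reasoning")
        if self.disposition is Disposition.TRACK:
            if not self.owning_issue:
                found.append("tracked finding needs an owning issue")
            if self.review_date is None:
                found.append("tracked finding needs a review date")
        return found
```

A record that dismisses a finding without a justification, or tracks one without an owning issue and review date, comes back with a non-empty `problems()` list. That is the kind of check a CI step or review checklist could apply once the taxonomy is agreed.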

First worked instance — #276 (CodeQL triage)

#276 is the concrete case that surfaced this need. It concerns two CodeQL error-severity "clear-text logging" findings:

#276 explores the disposition options this governance issue must standardize:

  • Fix at the analysis level: a generalized CodeQL sanitizer model pack (.github/codeql/extensions/) so that redacted flows stop alerting while unredacted flows are still detected. Optionally, this is also the path by which "won't fix" findings get resolved through CodeQL modeling itself rather than by source changes or manual dismissal.
  • Custom query / advanced setup — if a sanitizer can't be expressed as a data extension.
  • Dismiss with documented reasoning: the always-available fallback (especially for #30, feat(plugin): add Claude Code plugin with guided skills, agents, and hooks, which is by-design).

So #276 is the template for the second disposition in the taxonomy (dismiss with documented reasoning) and for the optional "fix via CodeQL modeling" path, which this governance issue should either bless or constrain.
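For readers unfamiliar with the alert class, the sketch below shows the two flow shapes that the "fix at the analysis level" option is meant to separate. It is illustrative only: this `redact_secrets` is a stand-in written for this example (the project's real function is not shown in this issue and may work differently), and the regex, logger name, and `log_request` helper are all hypothetical.

```python
import logging
import re

logger = logging.getLogger("triage-example")

# Hypothetical pattern: key=value pairs whose key looks secret-bearing.
_SECRET = re.compile(r"(?i)\b(token|password|api_key)=([^\s&]+)")


def redact_secrets(text: str) -> str:
    """Stand-in redactor: mask the value of secret-looking keys."""
    return _SECRET.sub(lambda m: f"{m.group(1)}=***", text)


def log_request(raw: str) -> str:
    """Log a request string only after it has passed through the redactor."""
    safe = redact_secrets(raw)
    # Redacted flow: the value reaching the logger has been sanitized, so an
    # alert here is a false positive unless the analysis knows about the
    # sanitizer. A sanitizer model would encode that knowledge.
    logger.info("request: %s", safe)
    return safe


# Unredacted flow, which should keep alerting even with a sanitizer model:
#     logger.info("request: %s", raw)
```

The point of modeling the sanitizer rather than dismissing each alert is that the first shape stops alerting everywhere it occurs, while the commented-out second shape still gets flagged.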

Acceptance criteria

  • Team decides whether to author ADR-014 — Security & quality finding triage (proposed: yes).
  • If yes, ADR-014 documents: scope, triage cadence/ownership, disposition taxonomy (fix / dismiss-with-reason / track), where dismissal reasoning is recorded, and the default-vs-advanced CodeQL posture.
  • Decision on whether a Roadmap maturity line ("continuous security-finding triage & burndown") is added under Observability and safe deploy.
  • #276 (fix(security): add CodeQL sanitizer model for redact_secrets to clear false-positive clear-text-logging alerts (e.g. #29)) is explicitly referenced from ADR-014 as the first worked instance (CodeQL triage; optional fix-via-modeling).
  • CONTRIBUTING.md cross-links the triage process (or ADR-014) so contributors know the loop.
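One way the "where dismissal reasoning is recorded" criterion could be met is to mirror each dismissed alert into an in-repo log line. The sketch below assumes the alert dictionary has the shape of a GitHub REST code scanning alert object (`number`, `state`, `dismissed_reason`, `dismissed_comment`, `dismissed_at`, `rule.id`, `html_url`); those field names should be verified against the API documentation before relying on them. The `triage_log_entry` function and the JSON-lines log format are hypothetical.

```python
import json

# Dismissal reasons offered by GitHub code scanning; these match the
# "false positive / won't fix / used-in-tests" outcomes listed above.
DISMISS_REASONS = {"false positive", "won't fix", "used in tests"}


def triage_log_entry(alert: dict) -> str:
    """Turn one dismissed alert into a JSON line for an in-repo triage log.

    Raises ValueError when the alert is not dismissed, uses an unknown
    reason, or carries no written justification.
    """
    if alert.get("state") != "dismissed":
        raise ValueError("only dismissed alerts belong in the dismissal log")
    reason = alert.get("dismissed_reason")
    if reason not in DISMISS_REASONS:
        raise ValueError(f"unexpected dismissal reason: {reason!r}")
    comment = (alert.get("dismissed_comment") or "").strip()
    if not comment:
        raise ValueError("dismissal has no documented reasoning")
    entry = {
        "alert": alert["number"],
        "rule": alert["rule"]["id"],
        "reason": reason,
        "justification": comment,
        "dismissed_at": alert.get("dismissed_at"),
        "url": alert.get("html_url"),
    }
    # sort_keys keeps the log diff-friendly across runs.
    return json.dumps(entry, sort_keys=True)
```

Because the function rejects dismissals that lack a comment, running it over exported alerts would also surface any finding that was dismissed in the UI without recorded reasoning.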

Out of scope

References
