
Fix legacy CA-signed certs not reissued on SAN mismatch #4493

Merged
caseydavenport merged 6 commits into tigera:master from caseydavenport:caseydavenport/fix-legacy-cert-san-check
Mar 5, 2026

Conversation

@caseydavenport
Member

@caseydavenport caseydavenport commented Mar 3, 2026

Fixes projectcalico/calico#11363

After the calico-apiserver namespace migration from calico-apiserver to
calico-system in v3.31, long-lived clusters with older operator-signed certs
hit TLS errors because the cert SANs still referenced the old namespace.

The root cause is in getKeyPair: the issuer identity check at line 504 used
exact match (==) against TigeraOperatorCAIssuerPrefix, but legacy operator
CAs use the format tigera-operator-signer@<timestamp>. This caused valid
legacy certs to be misidentified as BYO (Issuer=nil, so BYO() returns true),
which skips the SAN revalidation in GetOrCreateKeyPair.

The other issuer checks in the same function (lines 460, 489) already use
strings.HasPrefix correctly. This just makes the last check consistent.
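The difference between the two comparisons can be sketched as follows. The constant value is an assumption inferred from the legacy format quoted above, and the timestamp and the `my-corporate-ca` name are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed value of TigeraOperatorCAIssuerPrefix; the PR only states that
// legacy issuers look like "tigera-operator-signer@<timestamp>".
const issuerPrefix = "tigera-operator-signer"

// exactMatch mirrors the old check: only an issuer equal to the prefix counts.
func exactMatch(issuerCN string) bool {
	return issuerCN == issuerPrefix
}

// prefixMatch mirrors the fixed check: any issuer starting with the prefix counts.
func prefixMatch(issuerCN string) bool {
	return strings.HasPrefix(issuerCN, issuerPrefix)
}

func main() {
	issuers := []string{
		"tigera-operator-signer",            // issuer equal to the prefix itself
		"tigera-operator-signer@1700000000", // legacy-style issuer (timestamp is made up)
		"my-corporate-ca",                   // hypothetical user-supplied CA
	}
	for _, cn := range issuers {
		fmt.Printf("%-36s exact=%-5t prefix=%t\n", cn, exactMatch(cn), prefixMatch(cn))
	}
}
```

Only the legacy-style issuer is classified differently by the two checks, which is why the bug surfaced only on long-lived clusters.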

With the fix, old certs enter the HasPrefix block, hit the authority key ID
mismatch (old CA key != current CA key), and getKeyPair returns nil,
triggering GetOrCreateKeyPair to issue a new cert with correct SANs.

Fix calico-apiserver TLS errors on upgrade to v3.31 for long-lived clusters.
The operator now correctly reissues certificates with updated SANs when the
apiserver namespace changes, instead of treating legacy operator-signed certs
as user-provided.

The issuer identity check at the end of getKeyPair used exact match
against TigeraOperatorCAIssuerPrefix, but legacy operator CAs use
the format "tigera-operator-signer@<timestamp>". This caused valid
legacy certs to be misidentified as BYO, skipping SAN revalidation.

Fixes the calico-apiserver TLS failure after the namespace migration
from calico-apiserver to calico-system in v3.31, where the cert had
SANs for the old namespace but was never reissued.

Use 365-day cert durations instead of 1-hour so tests actually exercise
the code paths past the 30-day grace period check. Also use
legacyWithClientKeyUsage (with legacySecretName) in the existing "does
replace a legacy secret" test so it hits line 504 instead of bailing
out early on invalid key usage. Remove the separate validLegacyCASecret
test since the existing test now covers the scenario properly.

The test intended to simulate a user-supplied cert but passed nil as
the CA, which creates an operator-signed cert via DefaultOperatorCASignerName.
With the HasPrefix fix, this cert is now correctly identified as
operator-signed and reissued instead of preserved. Use a real non-operator
CA (test.MakeTestCA) to properly simulate BYO behavior.
@caseydavenport caseydavenport merged commit fd1603e into tigera:master Mar 5, 2026
6 checks passed
@caseydavenport caseydavenport deleted the caseydavenport/fix-legacy-cert-san-check branch March 5, 2026 00:40
Development

Successfully merging this pull request may close these issues.

calico-apiserver TLS errors due to cert not being reissued after 3.31 upgrade namespace change

3 participants