feat(srv6): SRv6-native (DSR) ClusterIP Services #1043

Open
ryskn wants to merge 1 commit into projectcalico:master from ryskn:srv6-dsr-upstream

ryskn commented Jun 8, 2026

Contributor

Summary

Adds an opt-in SRv6-native data path for pod-backed ClusterIP Services that
uses Direct Server Return (DSR) instead of cnat.

In an IPv6 single-stack + SRv6 setup, cross-node pod-backed ClusterIP Services
break: cnat's reverse session lives in the main FIB, and the cnat-lookup input
arc never runs on the SRv6-decapped inner packet, so the reply is never
un-DNAT'd. The SYN-ACK keeps the backend Pod IP as its source, and the client
answers with an RST. This is the pod-backed counterpart of the host-network
case fixed in #1028.

Approach

Rather than make cnat compose with SRv6, steer the VIP over SRv6 and let the
backend accept it directly (DSR): the inner packet keeps dst=VIP end to end
and the backend replies with src=VIP, so no reverse translation is needed.
Source IP is preserved.

  • Opt-in: feature gate srv6NativeServicesEnabled + Service annotation
    cni.projectcalico.org/vppSRv6Native: "true". A Service is eligible when it
    is ClusterIP, pod-backed, and every port has port==targetPort. host-network,
    port-remapped, and External-LB Services keep the cnat path.
  • Per-service ECMP SR policy over the backend nodes' End.DT6 SIDs + steering;
    backends bind the VIP and are allowed through uRPF; PodVRF delivery route.
    Reconcile-based, no NAT on the path.
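
The eligibility rule in the first bullet can be sketched roughly as follows.
This is an illustration only, not the code in this PR: the `Service` and
`ServicePort` types, the `HostNetworkBacked` field, and the function name
`eligibleForSRv6Native` are hypothetical simplifications. Only the annotation
key and the conditions themselves come from the description above.

```go
package main

import "fmt"

// ServicePort and Service are hypothetical, simplified stand-ins for the
// Kubernetes Service fields named in the description. They are assumptions
// made for this sketch, not types from the PR.
type ServicePort struct {
	Port       int32
	TargetPort int32
}

type Service struct {
	Type              string // "ClusterIP", "NodePort", "LoadBalancer", ...
	Annotations       map[string]string
	Ports             []ServicePort
	HostNetworkBacked bool // assumed flag: true if backends use host networking
}

// Annotation key taken from the PR description.
const srv6NativeAnnotation = "cni.projectcalico.org/vppSRv6Native"

// eligibleForSRv6Native reports whether a Service would take the DSR path:
// gate on, annotation "true", ClusterIP, pod-backed, port==targetPort for
// every port. Anything else stays on the cnat path.
func eligibleForSRv6Native(gateEnabled bool, svc Service) bool {
	if !gateEnabled || svc.Annotations[srv6NativeAnnotation] != "true" {
		return false
	}
	if svc.Type != "ClusterIP" || svc.HostNetworkBacked {
		return false
	}
	for _, p := range svc.Ports {
		if p.Port != p.TargetPort {
			return false // port-remapped Services keep cnat
		}
	}
	return true
}

func main() {
	svc := Service{
		Type:        "ClusterIP",
		Annotations: map[string]string{srv6NativeAnnotation: "true"},
		Ports:       []ServicePort{{Port: 80, TargetPort: 80}},
	}
	fmt.Println(eligibleForSRv6Native(true, svc))

	// Remapping the port makes the same Service ineligible.
	svc.Ports[0].TargetPort = 8080
	fmt.Println(eligibleForSRv6Native(true, svc))
}
```

Checking every condition up front keeps the fallback simple: a Service that
fails any one of them is left untouched on the existing cnat path.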

Verified end to end on an IPv6 single-stack cluster: cross-node client reaches
the VIP with no cnat, no RST, and the client source IP preserved.

ryskn force-pushed the srv6-dsr-upstream branch 2 times, most recently from f52ded1 to 29cbaed, June 8, 2026 01:40
@aritrbas

Collaborator

/codebuild_run(29cbaed)

@aritrbas

Collaborator

CI complains with the following errors:

golangci-lint run --color=never
0 issues.
markdownlint --dot \
    --ignore vpp-manager/vpp_build \
    --ignore vendor .
docs/services/README.md:35:81 error MD013/line-length Line length [Expected: 80; Actual: 87]
make: *** [Makefile:323: lint] Error 1
make: Leaving directory '/vpp-dataplane'
make: *** [Makefile:423: ci-lint] Error 2

[Container] 2026/06/10 17:04:18.760532 Command did not exit successfully make ci-lint exit status 2
[Container] 2026/06/10 17:04:18.767188 Phase complete: BUILD State: FAILED
[Container] 2026/06/10 17:04:18.767205 Phase context status code: COMMAND_EXECUTION_ERROR Message: Error while executing command: make ci-lint. Reason: exit status 2

@ryskn Can you please run make lint to validate it passes? Thanks.

Replace cnat DNAT/un-DNAT with Direct Server Return for pod-backed
ClusterIP services in the SRv6 data path. Traffic keeps dst=VIP end to
end; backend pods accept the VIP on lo and reply from it, so no reverse
translation is needed and the cross-node SRv6 un-DNAT failure disappears.
Source IP is preserved (no SNAT).

- config: SRv6NativeServicesEnabled feature gate + vppSRv6Native annotation
- services: classify eligible ClusterIP services (gate + annotation +
  ClusterIP + port==targetPort), skip cnat, publish DSRService events,
  reconcile-based diff/withdraw
- connectivity/SRv6: per-service ECMP SR policy over backend nodes'
  End.DT6 SIDs + steering; decap into PodVRF via sw_if_index; reconcile
  with retain-on-failure retry and periodic re-reconcile
- cni: bind VIP on local backend pods (lo) + uRPF allow + PodVRF ECMP
  delivery route; per-pod cleanup tracking; reconcile on pod add/del

Experimental; requires SRv6Enabled. Other services (NodePort/External,
port-remapped, host-network backed) keep the cnat path.

Signed-off-by: Ryosuke Nakayama <ryosuke.nakayama@ryskn.com>
ryskn force-pushed the srv6-dsr-upstream branch from 29cbaed to c146809, June 11, 2026 07:20
@ryskn

ryskn commented Jun 11, 2026

Contributor Author

Thanks!
Fixed the markdownlint MD013 (line-length) error in docs/services/README.md.

@aritrbas

Collaborator

/codebuild_run(c146809)
