Skip to content

Gateway Remote Debug#1233

Merged
renuka-fernando merged 13 commits intowso2:mainfrom
renuka-fernando:gateway
Feb 20, 2026
Merged

Gateway Remote Debug#1233
renuka-fernando merged 13 commits intowso2:mainfrom
renuka-fernando:gateway

Conversation

@renuka-fernando
Copy link
Contributor

@renuka-fernando renuka-fernando commented Feb 18, 2026

Purpose

fix #1232

Summary

  • Enable remote debugging of gateway components (controller, runtime, builder) via Delve, with Docker Compose and VS Code launch configurations
  • Make the control plane host optional so the gateway runs in standalone mode when no platform API is configured
  • Remove redundant build-local Makefile targets across all gateway components

Changes

Remote Debug Infrastructure

  • Add docker-compose.debug.yaml with debug variants of gateway-controller and gateway-runtime services that expose Delve ports (2345/2346) and disable compiler optimizations
  • Add gateway/gateway-runtime/docker-entrypoint-debug.sh — entrypoint for debug runtime image; starts the policy engine under Delve, then launches Envoy
  • Update gateway-controller/Dockerfile and gateway-runtime/Dockerfile with a debug build stage that compiles binaries with -gcflags="all=-N -l" and embeds Delve
  • Add make build-debug and make run-debug targets to root gateway/Makefile, gateway-controller/Makefile, and gateway-runtime/Makefile
  • Add VS Code launch configurations (.vscode/launch.json) for attaching to all three gateway components via remote Delve
  • Extend gateway-builder compilation options to pass through build flags (-gcflags, -ldflags) for debug policy-engine builds

Standalone Gateway Mode

  • Make controlplane.host optional in gateway-controller config — an empty host now means standalone mode (no platform API connection); a token without a host is treated as a misconfiguration
  • Update docker-compose.yaml to omit the control plane host by default, allowing the gateway to start without a running platform API

Build Cleanup

  • Remove build-local targets from root gateway/Makefile and from gateway-controller, gateway-builder, and gateway-runtime component Makefiles — these targets used DOCKER_BUILDKIT=1 docker build but invoked the same BuildKit daemon as docker buildx build, making them redundant

Documentation

  • Expand gateway/DEBUG_GUIDE.md with a full remote debug walkthrough: building debug images, starting the debug Compose stack, and attaching VS Code to each component

Summary by CodeRabbit

  • New Features

    • Remote debugging for Controller and Policy Engine (Delve-enabled debug images, attachable debug flow and ports).
    • New debug-focused build targets producing debug image variants.
  • Documentation

    • Expanded Gateway Debug Guide with remote/local options, step-by-step setup, path/port mapping, and troubleshooting.
  • Chores

    • Consolidated runtime port labeling and compose overlays; added build-lock manifest and updated local/build command guidance.
  • Bug Fixes

    • Control-plane validation updated: token now requires a host (improved config validation message).

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 18, 2026

Walkthrough

Adds Delve-based remote debugging: VS Code launch configs, debug Docker images and compose overlay, debug build targets and Dockerfile stages, compilation flags to preserve debug info, a debug entrypoint coordinating Policy Engine and Envoy, control-plane standalone-mode handling, and a policies build-lock manifest.

Changes

Cohort / File(s) Summary
VS Code & Debug Docs
\.vscode/launch.json, gateway/DEBUG_GUIDE.md
Added remote attach configs for Controller (port 2345) and Policy Engine (2346) with substitutePath mappings; replaced single local guide with a dual-option Gateway Debug Guide (Remote & Local Process).
Docker Compose (debug overlay)
gateway/docker-compose.debug.yaml, gateway/docker-compose.yaml
Added docker-compose.debug.yaml with debug services exposing dlv ports and SYS_PTRACE/seccomp adjustments; expanded/runtime port comments and kept compose files aligned.
Makefiles & CI refs
gateway/Makefile, gateway/gateway-controller/Makefile, gateway/gateway-runtime/Makefile, gateway/gateway-builder/Makefile, gateway/it/Makefile, .github/workflows/copilot-setup-steps.yml
Removed build-local targets and references; added build-debug targets and updated PHONY/help text and CI/workflow to use make build where applicable.
Controller image & Dockerfile
gateway/gateway-controller/Dockerfile, gateway/gateway-controller/Makefile
Added ARG ENABLE_DEBUG, conditional -gcflags for debug, new debug build stage installing dlv, and entrypoint to run dlv headless (2345).
Runtime image, entrypoint & Dockerfile
gateway/gateway-runtime/Dockerfile, gateway/gateway-runtime/docker-entrypoint-debug.sh, gateway/gateway-runtime/docker-entrypoint.sh, gateway/gateway-runtime/Makefile
Added ENABLE_DEBUG flow, debug build/runtime stages, exposes dlv (2346), new docker-entrypoint-debug.sh to run Policy Engine under dlv and coordinate Envoy; added sync comment to main entrypoint; new build-debug target.
Compilation & build flags
gateway/gateway-builder/internal/compilation/compiler.go, gateway/gateway-builder/internal/compilation/options.go, gateway/gateway-builder/internal/compilation/options_test.go, gateway/gateway-builder/pkg/types/policy.go
Introduced EnableDebug on CompilationOptions; propagate DEBUG env; generateLDFlags accepts enableDebug; runGoBuild appends -gcflags "all=-N -l" when debug enabled; tests updated.
Controller config & tests
gateway/gateway-controller/pkg/config/config.go, gateway/gateway-controller/pkg/config/config_test.go
Default controlplane.host set to empty (standalone mode allowed); validation permits empty host unless controlplane.token is set (token requires host); tests updated to cover token-without-host case.
Entry scripts & rollout
gateway/gateway-runtime/docker-entrypoint-debug.sh, gateway/gateway-runtime/docker-entrypoint.sh, gateway/gateway-builder/.../options_test.go
Added robust debug entrypoint script with process coordination, socket wait, signal handling, and logging prefixes; minor comment sync in main entrypoint; tests adjusted for LD flags signature.
Integration compose docs
gateway/it/docker-compose.test.yaml, gateway/it/docker-compose.test.postgres.yaml
Updated inline comments and top-level notes clarifying port semantics and duplication; no functional port changes.
Build lock / policies manifest
gateway/build-lock.yaml
Added static build-lock manifest listing gateway policy modules and pinned versions.
Misc docs
gateway/README.md, gateway/dev-policies/README.md, .github/copilot-instructions.md
Replaced references to make build-local with make build and updated build guidance/examples to reflect new debug/build flow.
Kubernetes debug image
kubernetes/gateway-operator/Dockerfile.debug
Pinned Delve install to fixed version v1.26.0 instead of latest tag.

Sequence Diagram

sequenceDiagram
    participant VSCode as VS Code (Debugger Client)
    participant Dlv as Delve (dlv)
    participant Container as Docker Container
    participant Entrypoint as docker-entrypoint-debug.sh
    participant Policy as Policy Engine (debug)
    participant Envoy as Envoy Router

    VSCode->>Dlv: Attach to port 2345/2346
    Container->>Entrypoint: Start container (entrypoint-debug)
    Entrypoint->>Entrypoint: Parse args & env, generate Envoy config
    Entrypoint->>Policy: Start Policy Engine under dlv (headless)
    Policy->>Dlv: Initialize debugger (listen)
    Entrypoint->>Entrypoint: Wait for Policy socket (timeout ~60s)
    Entrypoint->>Envoy: Start Envoy with generated config
    VSCode->>Dlv: Set breakpoints / control execution
    Dlv->>Policy: Pause / resume / inspect
    rect rgba(255, 100, 100, 0.5)
    Entrypoint->>Policy: On SIGTERM/SIGINT -> forward termination
    Entrypoint->>Envoy: On SIGTERM/SIGINT -> forward termination
    Entrypoint->>Entrypoint: Cleanup socket & exit
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰
I nibble logs beneath moonlight gleam,
dlv and ports align my dream,
Breakpoints bloom on two-three-four-six,
Envoy hums while policies mix,
Hopping through code — the gateway's seen!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Gateway Remote Debug' clearly and specifically describes the main change in the changeset: enabling remote debugging infrastructure for gateway components.
Description check ✅ Passed The PR description provides Purpose, Summary, Changes (organized by Remote Debug Infrastructure, Standalone Gateway Mode, Build Cleanup, and Documentation), and explains both the what and how of the implementation without following the exact template structure.
Linked Issues check ✅ Passed The PR successfully implements all coding objectives from issue #1232: adds remote debugging capability via Docker/Compose/VS Code configurations, enables standalone gateway mode by making controlplane.host optional, and removes redundant build-local targets.
Out of Scope Changes check ✅ Passed All changes are directly aligned with the remote debug implementation, standalone mode support, and build cleanup objectives. Minor scope adjustments like fixed Delve version in kubernetes/gateway-operator/Dockerfile.debug and documentation updates are integral to the feature.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
gateway/gateway-builder/internal/compilation/options_test.go (1)

123-166: 🛠️ Refactor suggestion | 🟠 Major

Missing test coverage for enableDebug=true path in generateLDFlags and BuildOptions.

Every updated generateLDFlags call passes false for the third argument. There is no test that exercises enableDebug=true, so:

  • It is not verified that -s -w is absent when debug is enabled (required for dlv to have DWARF info).
  • BuildOptions is not tested for a ENABLE_DEBUG env var or the resulting EnableDebug field.
  • TestBuildOptions_Default never asserts opts.EnableDebug == false.
🧪 Suggested test additions
+func TestGenerateLDFlags_WithDebug(t *testing.T) {
+    metadata := &types.BuildMetadata{
+        Version:   "v1.0.0",
+        GitCommit: "abc123",
+        Timestamp: time.Date(2025, 6, 15, 14, 30, 0, 0, time.UTC),
+    }
+
+    ldflags := generateLDFlags(metadata, false, true)
+
+    // Debug builds must NOT strip symbols/DWARF (dlv requires them)
+    assert.NotContains(t, ldflags, "-s -w")
+    assert.Contains(t, ldflags, "-X main.Version=v1.0.0")
+}
+
+func TestBuildOptions_WithDebugEnabled(t *testing.T) {
+    os.Setenv("ENABLE_DEBUG", "true")
+    defer os.Unsetenv("ENABLE_DEBUG")
+
+    metadata := &types.BuildMetadata{Version: "v1.0.0", GitCommit: "abc", Timestamp: time.Now()}
+    opts := BuildOptions("/output/binary", metadata)
+
+    assert.True(t, opts.EnableDebug)
+    assert.NotContains(t, opts.LDFlags, "-s -w")
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/compilation/options_test.go` around lines
123 - 166, Add tests that exercise the enableDebug=true path: create a
TestGenerateLDFlags_DebugEnabled that calls generateLDFlags(metadata, false,
true) and assert that "-s -w" is NOT present while the -X
main.Version/GitCommit/BuildDate entries are present; add
TestBuildOptions_EnableDebug which sets the ENABLE_DEBUG env var (true/1) and
verifies BuildOptions()/opts.EnableDebug==true (and unset/restore env
afterwards); and update TestBuildOptions_Default to explicitly assert
opts.EnableDebug==false to cover the default behavior.
🧹 Nitpick comments (8)
gateway/gateway-controller/pkg/config/config.go (1)

519-526: Breaking-change risk: default controlplane.host changed to empty string.

Any deployment that previously relied on the compiled-in default (former "localhost:9243") without explicitly overriding controlplane.host in config/env will now silently enter standalone mode after upgrading — no control-plane connection, no error. Operators need to explicitly set APIP_GW_CONTROLPLANE_HOST (or equivalent) to preserve previous connected-mode behaviour.

Consider adding a prominent note in the migration guide / DEBUG_GUIDE.md or CHANGELOG about this default change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/pkg/config/config.go` around lines 519 - 526, The
default for ControlPlane in the config was changed to an empty Host which
silently forces standalone mode; restore the previous compiled-in default host
(set ControlPlane.Host back to "localhost:9243") in the ControlPlaneConfig
default block (the ControlPlane literal in config.go) or, alternatively, add a
startup warning/error when ControlPlane.Host is empty so operators are informed;
update the ControlPlaneConfig default initialization (refer to
ControlPlaneConfig and the ControlPlane default struct) to implement one of
these fixes and ensure tests/config docs are updated accordingly.
gateway/gateway-controller/pkg/config/config_test.go (1)

643-665: Consider adding a test case for the valid "token + host" combination.

The test suite covers:

  • ✅ host set, no token (line 643)
  • ✅ no host, no token — standalone (line 652)
  • ✅ no host, token set — misconfiguration (line 659)
  • ❌ host set, token set — should be valid, but not explicitly tested

Without this case, a future regression that accidentally rejects host + token would go undetected.

🧪 Suggested additional test case
 {
     name:             "Missing host (standalone mode)",
     ...
 },
+{
+    name:             "Token set with host (valid)",
+    host:             "localhost",
+    token:            "some-token",
+    reconnectInitial: 1 * time.Second,
+    reconnectMax:     30 * time.Second,
+    pollingInterval:  5 * time.Second,
+    wantErr:          false,
+},
 {
     name:        "Token set without host",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/pkg/config/config_test.go` around lines 643 - 665,
Add a table-driven test case that verifies the valid "host + token" combination
is accepted: insert an entry in the existing test cases with name: "Host and
token set (valid)", host: "localhost", token: "some-token", reconnectInitial:
1*time.Second, reconnectMax: 30*time.Second, pollingInterval: 5*time.Second,
wantErr: false (no errContains). Ensure the new case follows the same struct
shape used by the other entries (fields name, host, token, reconnectInitial,
reconnectMax, pollingInterval, wantErr, errContains) so the existing test runner
(the Test... function that iterates over the table) will execute it.
gateway/gateway-runtime/Makefile (1)

124-129: clean target doesn't remove debug images.

The clean target removes the regular and coverage images but not the debug variants, leaving stale debug images around.

Proposed fix
 clean: ## Clean Docker images
 	`@echo` "Cleaning Docker images..."
 	`@docker` rmi $(IMAGE_NAME):$(VERSION) 2>/dev/null || true
 	`@docker` rmi $(IMAGE_NAME):latest 2>/dev/null || true
 	`@docker` rmi $(IMAGE_NAME)-coverage:$(VERSION) 2>/dev/null || true
+	`@docker` rmi $(IMAGE_NAME)-debug:$(VERSION) 2>/dev/null || true
+	`@docker` rmi $(IMAGE_NAME)-debug:latest 2>/dev/null || true
 	`@rm` -rf target
 	`@echo` "Clean completed"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Makefile` around lines 124 - 129, The clean target in
the Makefile currently removes normal and coverage images but not the debug
variants; update the clean recipe (target "clean") to also remove the debug
image tags using the same variables (e.g. docker rmi
$(IMAGE_NAME)-debug:$(VERSION) and docker rmi $(IMAGE_NAME)-debug:latest),
ensuring you silence failures with "2>/dev/null || true" just like the other rmi
lines so stale debug images are removed safely; keep IMAGE_NAME and VERSION
variable usage consistent with the existing lines.
gateway/DEBUG_GUIDE.md (1)

121-148: Option 2 instructions reference the new port layout but ask users to hand-edit docker-compose.yaml.

The instructions tell users to modify docker-compose.yaml to comment out Policy Engine ports and change GATEWAY_CONTROLLER_HOST. Consider providing a dedicated docker-compose.local-debug.yaml override file (or an env-file) instead, so users don't have to edit — and potentially forget to revert — the checked-in compose file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/DEBUG_GUIDE.md` around lines 121 - 148, The current guide asks users
to edit gateway/docker-compose.yaml directly (service gateway-runtime) to change
GATEWAY_CONTROLLER_HOST and comment out Policy Engine ports, which is
error-prone; instead add a local override file (e.g.,
docker-compose.local-debug.yaml) or an env-file that sets
GATEWAY_CONTROLLER_HOST=host.docker.internal and redefines the gateway-runtime
ports block to omit the Policy Engine ports (9002, 9003), then document using
docker-compose -f docker-compose.yaml -f docker-compose.local-debug.yaml up (or
exporting the env-file) so users can apply local debug changes without modifying
the checked-in compose file.
gateway/gateway-runtime/docker-entrypoint-debug.sh (1)

121-144: Shutdown handler has no timeout — could hang if a process ignores SIGTERM.

The bare wait on line 137 blocks indefinitely. In a debug context this is usually fine, but a kill -9 fallback after a grace period would make the script more robust.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/docker-entrypoint-debug.sh` around lines 121 - 144,
The shutdown() handler can hang because the bare wait may block forever if a
process ignores SIGTERM; modify shutdown to implement a configurable grace
period (e.g., GRACE_PERIOD or a hardcoded few seconds) after sending SIGTERM to
PE_PID and ENVOY_PID, poll/wait for their termination during that grace window,
and if either process still exists after the timeout send SIGKILL (kill -KILL)
as a fallback; ensure you still remove the socket (${POLICY_ENGINE_SOCKET}) and
exit with appropriate status after cleanup. Reference symbols: shutdown(),
PE_PID, ENVOY_PID, POLICY_ENGINE_SOCKET.
gateway/docker-compose.debug.yaml (1)

29-59: Consider adding the sample-backend service or a note about it.

The main docker-compose.yaml includes sample-backend which the debug guide's test steps rely on (curl http://localhost:8080/petstore/v1/pets). Without it, developers using only this file will need to start the backend separately. A brief comment or inclusion of the sample-backend service would improve the out-of-box debug experience.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/docker-compose.debug.yaml` around lines 29 - 59, The debug compose is
missing the sample-backend used in the guide; add a minimal service named
sample-backend (matching the main compose's service name) that exposes port 8080
and uses the same image/ports used by the main docker-compose (or include a
clear comment near the gateway-controller service mentioning developers must
start the sample-backend separately) so the guide's test command (curl
http://localhost:8080/petstore/v1/pets) works out-of-the-box; if you add the
service ensure it joins gateway-network and any required volumes/envs match the
main setup.
gateway/gateway-controller/Dockerfile (1)

79-80: Pin the Delve version for reproducible debug builds.

Using @latest means each image build may pull a different dlv version, potentially introducing debugging inconsistencies or breaking changes. The current stable version is v1.26.0.

Suggested fix
-RUN go install github.com/go-delve/delve/cmd/dlv@latest
+RUN go install github.com/go-delve/delve/cmd/dlv@v1.26.0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/Dockerfile` around lines 79 - 80, Replace the
non-deterministic Delve install that uses "@latest" with a pinned stable release
to ensure reproducible builds: update the Dockerfile RUN that calls "go install
github.com/go-delve/delve/cmd/dlv@latest" to use the known stable tag (e.g.,
"@v1.26.0") so the "dlv" binary version is fixed across image builds.
gateway/gateway-runtime/Dockerfile (1)

129-163: Consider extracting a shared base stage to reduce duplication with the production stage.

Lines 130-144 (base image, apt packages) and 146 (mkdir) are nearly identical to the production stage (lines 166-185). A shared intermediate stage would keep both in sync and reduce maintenance drift.

♻️ Sketch of a shared base stage
+# Stage 3: Common runtime base
+FROM envoyproxy/envoy:v1.35.3 AS runtime-base
+USER root
+RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
+    --mount=type=cache,target=/var/lib/apt/lists,sharing=locked \
+    apt-get update && apt-get install -y --no-install-recommends \
+    tini gettext-base ca-certificates curl \
+    netcat-openbsd dnsutils iproute2 iputils-ping net-tools
+RUN mkdir -p /app /var/run/api-platform /coverage
+
 # Stage 3a: dlv installer
 FROM golang:1.25.7-bookworm AS debug-dlv-builder
 RUN go install github.com/go-delve/delve/cmd/dlv@latest
 
-# Stage 3b: Debug Runtime
-FROM envoyproxy/envoy:v1.35.3 AS debug
-USER root
-RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
-    ...
-RUN mkdir -p /app /var/run/api-platform /coverage
+# Stage 3b: Debug Runtime
+FROM runtime-base AS debug
 ...
 
-# Stage 4: Runtime (production)
-FROM envoyproxy/envoy:v1.35.3
-USER root
-RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
-    ...
-RUN mkdir -p /app /var/run/api-platform /coverage
+# Stage 4: Runtime (production)
+FROM runtime-base
 ...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Dockerfile` around lines 129 - 163, Extract the
repeated base setup (the FROM envoyproxy/envoy:v1.35.3 image, apt-get install of
packages, and the mkdir /app /var/run/api-platform /coverage) into a single
intermediate stage (e.g., a runtime-base stage) and have both the debug stage
and the production/runtime stage use COPY --from=runtime-base instead of
duplicating the same RUN/USER/ENV setup; keep stage-specific actions (like COPY
--from=debug-dlv-builder /go/bin/dlv and adding dlv, or any debug-specific
EXPOSE/ENTRYPOINT changes) in the debug stage (the stage currently named debug)
and keep policy-engine and other runtime-only copies (COPY
--from=policy-compiler /workspace/output/gateway-runtime/policy-engine) in the
final stages, ensuring file permissions (chmod +x ...) and ENV/EXPOSE/ENTRYPOINT
are applied in the appropriate final stages rather than the shared base.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/DEBUG_GUIDE.md`:
- Around line 22-25: The header comment inside docker-compose.debug.yaml
currently suggests using it as an overlay ("Usage: docker compose -f
docker-compose.yaml -f docker-compose.debug.yaml up"), which conflicts with the
README command in DEBUG_GUIDE.md that runs docker-compose.debug.yaml standalone;
update the comment in docker-compose.debug.yaml (the "Usage: docker compose -f
docker-compose.yaml -f docker-compose.debug.yaml up" line) to reflect standalone
usage (e.g., instruct to run "docker compose -f docker-compose.debug.yaml up")
or otherwise clarify that this file contains full service definitions and must
not be used as an override, so the two docs are consistent.

In `@gateway/docker-compose.debug.yaml`:
- Around line 19-20: The header comment ("Compose overlay for remote debugging
gateway components with dlv.") and the usage line are incorrect — this YAML is a
standalone docker-compose file, not an overlay; update the top comment and the
usage example to state it is a standalone compose configuration for remote
debugging and show the correct command (e.g., "docker compose -f
docker-compose.debug.yaml up") so users don't mistakenly combine it with
docker-compose.yaml and cause duplicate bindings; keep the description
mentioning dlv and debugging gateway components but remove or replace "overlay"
language and the "-f docker-compose.yaml -f" usage pattern.

In `@gateway/gateway-builder/internal/compilation/compiler.go`:
- Around line 130-136: Add a unit test that mirrors
TestGenerateLDFlags_WithCoverage but sets enableDebug=true to assert that
generateLDFlags does not include the stripped flags "-s -w"; specifically create
TestGenerateLDFlags_WithDebug (or TestBuildOptions_WithDebugEnabled) which calls
generateLDFlags with enableDebug=true (and enableCoverage=false) and asserts the
returned ldflags string/array does not contain "-s -w" while preserving other
expected flags — reference generateLDFlags and the existing
TestGenerateLDFlags_WithCoverage for test structure and assertions.

In `@gateway/gateway-builder/internal/compilation/options.go`:
- Around line 39-43: Replace the generic DEBUG env var lookup with ENABLE_DEBUG:
change the call that assigns debugEnv := os.Getenv("DEBUG") to use
os.Getenv("ENABLE_DEBUG") and keep the strings.EqualFold check and enableDebug
variable logic intact (symbols: enableDebug, debugEnv, os.Getenv,
strings.EqualFold). Also audit sibling coverage handling and ensure any similar
lookup uses "ENABLE_COVERAGE" instead of "COVERAGE" so env names align with the
Dockerfile (symbols to check: any COVERAGE variable, os.Getenv calls that
reference "COVERAGE", and the corresponding enableCoverage variable).

In `@gateway/gateway-controller/Makefile`:
- Line 31: The Copilot instructions reference a removed Makefile target
(`build-local`) so update .github/copilot-instructions.md to instruct users to
run the current target (`make build`) or `make build-debug` for debug workflows
instead; search for the text "rebuild Docker images using `cd gateway && make
build-local`" and replace it with a short, explicit recommendation to use `cd
gateway && make build` (or `make build-debug`), and ensure the guidance still
references the same file globs (`gateway/**/*.{go,yaml,yml,Dockerfile}`) so
automated prompts remain accurate.

In `@gateway/gateway-runtime/docker-entrypoint-debug.sh`:
- Line 32: The script uses set -e which causes the monitor block using wait -n
to abort before cleanup (exit code capture, sibling termination, socket cleanup)
when a child exits non-zero; either disable errexit before the monitor block
(use set +e before the wait -n loop and restore set -e after) or change the wait
invocation to capture the child exit status inline (e.g., run wait -n and assign
its exit code to a variable without letting it trigger errexit), ensuring the
subsequent logic that sets rc, kills siblings, and removes sockets (the
monitor/cleanup block around wait -n and the exit code capture) always runs.

In `@gateway/gateway-runtime/Dockerfile`:
- Around line 125-127: The Dockerfile uses the build stage named
debug-dlv-builder and runs "go install
github.com/go-delve/delve/cmd/dlv@latest", which pulls a moving target; replace
the `@latest` suffix with a specific Delve release tag to ensure reproducible
debug builds (e.g., use a chosen semver tag like `@vX.Y.Z`) so the
debug-dlv-builder stage always installs the same dlv binary.

---

Outside diff comments:
In `@gateway/gateway-builder/internal/compilation/options_test.go`:
- Around line 123-166: Add tests that exercise the enableDebug=true path: create
a TestGenerateLDFlags_DebugEnabled that calls generateLDFlags(metadata, false,
true) and assert that "-s -w" is NOT present while the -X
main.Version/GitCommit/BuildDate entries are present; add
TestBuildOptions_EnableDebug which sets the ENABLE_DEBUG env var (true/1) and
verifies BuildOptions()/opts.EnableDebug==true (and unset/restore env
afterwards); and update TestBuildOptions_Default to explicitly assert
opts.EnableDebug==false to cover the default behavior.

---

Nitpick comments:
In `@gateway/DEBUG_GUIDE.md`:
- Around line 121-148: The current guide asks users to edit
gateway/docker-compose.yaml directly (service gateway-runtime) to change
GATEWAY_CONTROLLER_HOST and comment out Policy Engine ports, which is
error-prone; instead add a local override file (e.g.,
docker-compose.local-debug.yaml) or an env-file that sets
GATEWAY_CONTROLLER_HOST=host.docker.internal and redefines the gateway-runtime
ports block to omit the Policy Engine ports (9002, 9003), then document using
docker-compose -f docker-compose.yaml -f docker-compose.local-debug.yaml up (or
exporting the env-file) so users can apply local debug changes without modifying
the checked-in compose file.

In `@gateway/docker-compose.debug.yaml`:
- Around line 29-59: The debug compose is missing the sample-backend used in the
guide; add a minimal service named sample-backend (matching the main compose's
service name) that exposes port 8080 and uses the same image/ports used by the
main docker-compose (or include a clear comment near the gateway-controller
service mentioning developers must start the sample-backend separately) so the
guide's test command (curl http://localhost:8080/petstore/v1/pets) works
out-of-the-box; if you add the service ensure it joins gateway-network and any
required volumes/envs match the main setup.

In `@gateway/gateway-controller/Dockerfile`:
- Around line 79-80: Replace the non-deterministic Delve install that uses
"@latest" with a pinned stable release to ensure reproducible builds: update the
Dockerfile RUN that calls "go install github.com/go-delve/delve/cmd/dlv@latest"
to use the known stable tag (e.g., "@v1.26.0") so the "dlv" binary version is
fixed across image builds.

In `@gateway/gateway-controller/pkg/config/config_test.go`:
- Around line 643-665: Add a table-driven test case that verifies the valid
"host + token" combination is accepted: insert an entry in the existing test
cases with name: "Host and token set (valid)", host: "localhost", token:
"some-token", reconnectInitial: 1*time.Second, reconnectMax: 30*time.Second,
pollingInterval: 5*time.Second, wantErr: false (no errContains). Ensure the new
case follows the same struct shape used by the other entries (fields name, host,
token, reconnectInitial, reconnectMax, pollingInterval, wantErr, errContains) so
the existing test runner (the Test... function that iterates over the table)
will execute it.

In `@gateway/gateway-controller/pkg/config/config.go`:
- Around line 519-526: The default for ControlPlane in the config was changed to
an empty Host which silently forces standalone mode; restore the previous
compiled-in default host (set ControlPlane.Host back to "localhost:9243") in the
ControlPlaneConfig default block (the ControlPlane literal in config.go) or,
alternatively, add a startup warning/error when ControlPlane.Host is empty so
operators are informed; update the ControlPlaneConfig default initialization
(refer to ControlPlaneConfig and the ControlPlane default struct) to implement
one of these fixes and ensure tests/config docs are updated accordingly.

In `@gateway/gateway-runtime/docker-entrypoint-debug.sh`:
- Around line 121-144: The shutdown() handler can hang because the bare wait may
block forever if a process ignores SIGTERM; modify shutdown to implement a
configurable grace period (e.g., GRACE_PERIOD or a hardcoded few seconds) after
sending SIGTERM to PE_PID and ENVOY_PID, poll/wait for their termination during
that grace window, and if either process still exists after the timeout send
SIGKILL (kill -KILL) as a fallback; ensure you still remove the socket
(${POLICY_ENGINE_SOCKET}) and exit with appropriate status after cleanup.
Reference symbols: shutdown(), PE_PID, ENVOY_PID, POLICY_ENGINE_SOCKET.

In `@gateway/gateway-runtime/Dockerfile`:
- Around line 129-163: Extract the repeated base setup (the FROM
envoyproxy/envoy:v1.35.3 image, apt-get install of packages, and the mkdir /app
/var/run/api-platform /coverage) into a single intermediate stage (e.g., a
runtime-base stage) and have both the debug stage and the production/runtime
stage use COPY --from=runtime-base instead of duplicating the same RUN/USER/ENV
setup; keep stage-specific actions (like COPY --from=debug-dlv-builder
/go/bin/dlv and adding dlv, or any debug-specific EXPOSE/ENTRYPOINT changes) in
the debug stage (the stage currently named debug) and keep policy-engine and
other runtime-only copies (COPY --from=policy-compiler
/workspace/output/gateway-runtime/policy-engine) in the final stages, ensuring
file permissions (chmod +x ...) and ENV/EXPOSE/ENTRYPOINT are applied in the
appropriate final stages rather than the shared base.

In `@gateway/gateway-runtime/Makefile`:
- Around line 124-129: The clean target in the Makefile currently removes normal
and coverage images but not the debug variants; update the clean recipe (target
"clean") to also remove the debug image tags using the same variables (e.g.
docker rmi $(IMAGE_NAME)-debug:$(VERSION) and docker rmi
$(IMAGE_NAME)-debug:latest), ensuring you silence failures with "2>/dev/null ||
true" just like the other rmi lines so stale debug images are removed safely;
keep IMAGE_NAME and VERSION variable usage consistent with the existing lines.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
gateway/docker-compose.yaml (1)

39-40: ⚠️ Potential issue | 🟠 Major

Remove the fallback value for APIP_GW_CONTROLPLANE_HOST to allow the gateway to start without the control plane.

The PR objective to omit the control plane host by default is not implemented. Line 40 still includes the fallback value host.docker.internal:9243, which means an out-of-the-box docker compose up without environment variables will still attempt to reach the control plane. Remove the fallback value (e.g., ${GATEWAY_CONTROLPLANE_HOST} without :-host.docker.internal:9243) so the gateway can start independently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/docker-compose.yaml` around lines 39 - 40, Remove the fallback
default for APIP_GW_CONTROLPLANE_HOST so the container uses the raw env var;
replace the current value that includes ":-host.docker.internal:9243" with just
${GATEWAY_CONTROLPLANE_HOST} (leave APIP_GW_GATEWAY_REGISTRATION_TOKEN as-is) to
allow the gateway to start without a control plane set.
gateway/gateway-runtime/Dockerfile (1)

165-216: ⚠️ Potential issue | 🟠 Major

Update .github/copilot-instructions.md and gateway/README.md to remove references to non-existent make build-local target.

The copilot instructions at .github/copilot-instructions.md (line 23) and the gateway README.md (lines 28, 31-34) prescribe cd gateway && make build-local and its variants (build-local-controller, build-local-policy-engine, etc.) as the rebuild command for developers. However, the gateway/Makefile contains no such targets—only build, build-controller, build-gateway-builder, and build-gateway-runtime exist. Any developer following these guidelines will immediately encounter a make: *** No rule to make target 'build-local' error.

Update the documentation to reference the correct target names, or restore the build-local aliases in the Makefile for backward compatibility.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Dockerfile` around lines 165 - 216, Docs reference a
non-existent Makefile target build-local causing "No rule to make target
'build-local'"; either update .github/copilot-instructions.md and
gateway/README.md to use the actual Makefile targets (build, build-controller,
build-gateway-builder, build-gateway-runtime) or add compatibility aliases in
the gateway/Makefile (phony targets build-local, build-local-controller,
build-local-policy-engine that invoke the corresponding real targets) so the
documentation and build commands match; update the specific mentions of
build-local in the two docs to the chosen solution.
🧹 Nitpick comments (2)
gateway/gateway-runtime/Dockerfile (1)

129-163: Consider asserting ENABLE_DEBUG=true inside the debug stage to prevent silent symbol-less debug builds.

The ENABLE_DEBUG ARG is only declared and consumed in Stage 2 (policy-compiler). Stage 3b always copies the policy-engine binary from Stage 2, regardless of whether it was compiled with debug symbols. If make build-debug (or a manual docker build) omits --build-arg ENABLE_DEBUG=true, the binary lands in the debug image stripped of debug info — dlv attaches but source-level stepping is broken, with no warning.

♻️ Proposed guard
 # Stage 3b: Debug Runtime (policy-engine wrapped in dlv, port 2346)
 FROM envoyproxy/envoy:v1.35.3 AS debug
+
+# Fail fast if the debug stage was targeted without enabling debug symbols
+ARG ENABLE_DEBUG=false
+RUN if [ "$ENABLE_DEBUG" != "true" ]; then \
+      echo "ERROR: debug stage requires --build-arg ENABLE_DEBUG=true" && exit 1; \
+    fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Dockerfile` around lines 129 - 163, Add an explicit
guard in the debug stage to ensure the policy-engine was built with debug
symbols: when in the "debug" stage (stage name debug / COPY from
policy-compiler), assert ENABLE_DEBUG is set to true (e.g., declare ARG
ENABLE_DEBUG and then set ENV ENABLE_DEBUG=${ENABLE_DEBUG:-} and fail the image
build if empty/false) or perform a quick runtime check that the copied
/app/policy-engine contains debug symbols (and exit non‑zero otherwise). Update
the Dockerfile's debug stage to declare ARG ENABLE_DEBUG, convert it to an ENV,
and produce a failing command (so the build fails) if ENABLE_DEBUG is not "true"
to prevent producing a symbol-less debug image.
.vscode/launch.json (1)

117-119: Hardcoded SDK version @v0.3.9 will silently break source mapping when the SDK is bumped.

When sdk is upgraded in policy-engine/go.mod, this substitutePath will stop resolving and debugger stepping into SDK code will quietly fail. The inline comment already acknowledges this, but there's no mechanism to enforce the update.

Consider adding an explicit reminder in DEBUG_GUIDE.md (or a CONTRIBUTING note) to synchronize this version whenever policy-engine/go.mod's sdk dependency is bumped, e.g.:

-                    "to": "/go/pkg/mod/github.com/wso2/api-platform/sdk@v0.3.9" // find the exact version in policy-engine/go.mod
+                    "to": "/go/pkg/mod/github.com/wso2/api-platform/sdk@v0.3.9" // KEEP IN SYNC with sdk version in policy-engine/go.mod
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.vscode/launch.json around lines 117 - 119, The "to" mapping in launch.json
hardcodes the SDK version string "@v0.3.9" which will break source mapping when
the sdk dependency in policy-engine/go.mod is bumped; update DEBUG_GUIDE.md (or
CONTRIBUTING) with a clear step telling contributors to synchronize the "to"
value in launch.json (the JSON keys "from": "${workspaceFolder}/sdk" and "to":
"/go/pkg/mod/github.com/wso2/api-platform/sdk@<version>") with the sdk version
found in policy-engine/go.mod, and update the inline comment next to the "to"
mapping to explicitly reference where to find the correct version and to remind
maintainers to update it whenever policy-engine/go.mod changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.vscode/launch.json:
- Around line 121-124: The committed launch.json contains a developer-specific
"from" substitutePath entry ("~/git/gateway-controllers/policies/api-key-auth")
that must be removed; edit the JSON block containing the "from"/"to" object and
replace the hardcoded "from" value with either a commented-out template line or
an empty placeholder string so no local absolute path is committed, and keep the
"to" target ("/go/pkg/mod/.../api-key-auth@v0.8.0") intact as the reference;
ensure the object remains valid JSON (no dangling paths) and add a short
commented template placeholder developers can copy/adjust locally.

---

Outside diff comments:
In `@gateway/docker-compose.yaml`:
- Around line 39-40: Remove the fallback default for APIP_GW_CONTROLPLANE_HOST
so the container uses the raw env var; replace the current value that includes
":-host.docker.internal:9243" with just ${GATEWAY_CONTROLPLANE_HOST} (leave
APIP_GW_GATEWAY_REGISTRATION_TOKEN as-is) to allow the gateway to start without
a control plane set.

In `@gateway/gateway-runtime/Dockerfile`:
- Around line 165-216: Docs reference a non-existent Makefile target build-local
causing "No rule to make target 'build-local'"; either update
.github/copilot-instructions.md and gateway/README.md to use the actual Makefile
targets (build, build-controller, build-gateway-builder, build-gateway-runtime)
or add compatibility aliases in the gateway/Makefile (phony targets build-local,
build-local-controller, build-local-policy-engine that invoke the corresponding
real targets) so the documentation and build commands match; update the specific
mentions of build-local in the two docs to the chosen solution.

---

Duplicate comments:
In `@gateway/DEBUG_GUIDE.md`:
- Around line 22-25: The README/guide shows running docker compose -f
docker-compose.debug.yaml up as a standalone command but the header comment
inside docker-compose.debug.yaml still tells users to use it as an overlay,
which is misleading and causes port-binding conflicts; open
docker-compose.debug.yaml, update its top-of-file usage comment to state it is
intended to be run standalone with docker compose -f docker-compose.debug.yaml
up (or explain how to combine safely if overlay behavior is supported), remove
or correct the “overlay” instruction, and ensure the usage example matches the
DEBUG_GUIDE.md instructions (referencing docker-compose.debug.yaml in the
comment and any example commands in the file).

In `@gateway/gateway-runtime/docker-entrypoint-debug.sh`:
- Around line 202-204: The script currently uses set -e so wait -n "$PE_PID"
"$ENVOY_PID" will abort the script on a non-zero child exit and skip cleanup;
before calling wait -n, disable errexit (e.g., run set +e) or use wait -n
"$PE_PID" "$ENVOY_PID" || true so the shell does not exit, then capture the exit
code in EXIT_CODE=$? and proceed to terminate the sibling, perform socket
cleanup and finally exit with exit $EXIT_CODE; if you disabled errexit,
re-enable it (set -e) afterwards if the script relies on it elsewhere.

In `@gateway/gateway-runtime/Dockerfile`:
- Around line 125-127: The Dockerfile stage debug-dlv-builder installs delve
using a moving target "github.com/go-delve/delve/cmd/dlv@latest"; change the RUN
in that stage to pin a specific released tag (e.g., use "@v1.26.0" or another
chosen stable version) so builds are reproducible and deterministic (update the
RUN go install github.com/go-delve/delve/cmd/dlv@... in the debug-dlv-builder
stage accordingly).

---

Nitpick comments:
In @.vscode/launch.json:
- Around line 117-119: The "to" mapping in launch.json hardcodes the SDK version
string "@v0.3.9" which will break source mapping when the sdk dependency in
policy-engine/go.mod is bumped; update DEBUG_GUIDE.md (or CONTRIBUTING) with a
clear step telling contributors to synchronize the "to" value in launch.json
(the JSON keys "from": "${workspaceFolder}/sdk" and "to":
"/go/pkg/mod/github.com/wso2/api-platform/sdk@<version>") with the sdk version
found in policy-engine/go.mod, and update the inline comment next to the "to"
mapping to explicitly reference where to find the correct version and to remind
maintainers to update it whenever policy-engine/go.mod changes.

In `@gateway/gateway-runtime/Dockerfile`:
- Around line 129-163: Add an explicit guard in the debug stage to ensure the
policy-engine was built with debug symbols: when in the "debug" stage (stage
name debug / COPY from policy-compiler), assert ENABLE_DEBUG is set to true
(e.g., declare ARG ENABLE_DEBUG and then set ENV ENABLE_DEBUG=${ENABLE_DEBUG:-}
and fail the image build if empty/false) or perform a quick runtime check that
the copied /app/policy-engine contains debug symbols (and exit non‑zero
otherwise). Update the Dockerfile's debug stage to declare ARG ENABLE_DEBUG,
convert it to an ENV, and produce a failing command (so the build fails) if
ENABLE_DEBUG is not "true" to prevent producing a symbol-less debug image.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
gateway/gateway-runtime/Makefile (1)

131-136: clean does not remove the debug image variants.

$(IMAGE_NAME)-debug:$(VERSION) and $(IMAGE_NAME)-debug:latest are never cleaned, leaving debug images behind after make clean.

🛠️ Proposed fix
 clean: ## Clean Docker images
 	`@echo` "Cleaning Docker images..."
 	`@docker` rmi $(IMAGE_NAME):$(VERSION) 2>/dev/null || true
 	`@docker` rmi $(IMAGE_NAME):latest 2>/dev/null || true
 	`@docker` rmi $(IMAGE_NAME)-coverage:$(VERSION) 2>/dev/null || true
+	`@docker` rmi $(IMAGE_NAME)-debug:$(VERSION) 2>/dev/null || true
+	`@docker` rmi $(IMAGE_NAME)-debug:latest 2>/dev/null || true
 	`@rm` -rf target
 	`@echo` "Clean completed"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Makefile` around lines 131 - 136, The clean target
currently removes normal and coverage images but not debug variants; update the
Makefile's clean recipe (target named clean) to also remove
$(IMAGE_NAME)-debug:$(VERSION) and $(IMAGE_NAME)-debug:latest by adding docker
rmi commands for those two tags (following the same 2>/dev/null || true pattern
used for the other images) so debug images are removed when running make clean.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-runtime/docker-entrypoint-debug.sh`:
- Around line 166-180: The timeout branch currently logs that the Policy Engine
(PE_PID) is still alive and then exits without cleanup; modify the timeout
handling inside the while loop (the branch checking SOCKET_WAIT_COUNT >=
SOCKET_WAIT_TIMEOUT and using POLICY_ENGINE_SOCKET, SOCKET_WAIT_COUNT,
SOCKET_WAIT_TIMEOUT, and PE_PID) to perform orderly shutdown: if PE_PID is
alive, first call the same shutdown logic invoked by your trap (invoke
shutdown() or replicate its steps), or send a termination signal to PE_PID
(e.g., kill -TERM "$PE_PID") and wait for it to exit (wait "$PE_PID" or loop on
kill -0) to ensure dlv/debugger and PE are stopped, then exit 1; ensure the code
path mirrors the trap behavior so resources are cleaned up even though EXIT is
not trapped.

---

Duplicate comments:
In @.vscode/launch.json:
- Around line 122-125: The committed launch.json contains a developer-specific
mount object using the "from" and "to" keys with a hard-coded
`${env:HOME}/git/gateway-controllers/policies/api-key-auth` path; replace that
live object with a commented-out template or remove it so no machine-specific
path is active: locate the JSON object containing "from" and "to" (the mount
entry shown in the diff) and either comment the entire object out as an example
template or change the "from" value to a neutral placeholder (e.g.,
`<LOCAL_PATH_TO_GATEWAY_CONTROLLERS>`) and keep a brief inline comment
explaining it is optional local configuration, ensuring no developer-specific
absolute paths remain in the committed JSON.

In `@gateway/DEBUG_GUIDE.md`:
- Around line 22-25: Update the header "Usage" comment inside
docker-compose.debug.yaml to show standalone usage (docker compose -f
docker-compose.debug.yaml up) instead of the overlay form (docker compose -f
docker-compose.yaml -f docker-compose.debug.yaml up); locate the "Usage" or
header comment near the top of docker-compose.debug.yaml and replace the overlay
example text with the single-file command to prevent confusing readers and avoid
port-binding conflicts.

In `@gateway/gateway-runtime/docker-entrypoint-debug.sh`:
- Around line 202-204: The script's set -e causes the script to abort when wait
-n returns a non-zero exit, skipping the cleanup; change the logic around wait
-n so we temporarily disable errexit, run wait -n for PE_PID and ENVOY_PID,
capture its exit status into EXIT_CODE, then restore errexit so the subsequent
cleanup always runs. Concretely, around the current wait -n "$PE_PID"
"$ENVOY_PID" line use a temporary disable (e.g., set +e), run wait -n "$PE_PID"
"$ENVOY_PID", assign EXIT_CODE=$?, then restore errexit (e.g., set -e) before
proceeding with the sibling termination and socket removal code that follows.

In `@gateway/gateway-runtime/Dockerfile`:
- Around line 125-127: The Dockerfile currently installs dlv with an unstable
reference `go install github.com/go-delve/delve/cmd/dlv@latest` in the
debug-dlv-builder stage; change this to a pinned version and make it
configurable: add an ARG (e.g., DELVE_VERSION) at the top of the
debug-dlv-builder stage and replace the RUN line with `go install
github.com/go-delve/delve/cmd/dlv@${DELVE_VERSION}` (set a concrete default
DELVE_VERSION value matching the project-approved delve tag), so the
debug-dlv-builder stage and its RUN command no longer rely on `@latest` and the
version can be audited/locked.

---

Nitpick comments:
In `@gateway/gateway-runtime/Makefile`:
- Around line 131-136: The clean target currently removes normal and coverage
images but not debug variants; update the Makefile's clean recipe (target named
clean) to also remove $(IMAGE_NAME)-debug:$(VERSION) and
$(IMAGE_NAME)-debug:latest by adding docker rmi commands for those two tags
(following the same 2>/dev/null || true pattern used for the other images) so
debug images are removed when running make clean.

The build-local targets used `DOCKER_BUILDKIT=1 docker build` with the
claim of being faster due to "no buildx", but both invocations use the
same underlying BuildKit daemon. The distinction was misleading and the
targets were redundant with the standard `build` targets.

Removes build-local, build-local-controller, build-local-gateway-builder,
and build-local-gateway-runtime from the root gateway Makefile and from
the gateway-controller, gateway-builder, and gateway-runtime component
Makefiles. Updates the gateway/it/Makefile error message to reference
`make build` instead of `make build-local`.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
gateway/gateway-builder/internal/compilation/options_test.go (1)

31-53: ⚠️ Potential issue | 🟡 Minor

TestBuildOptions_Default and TestBuildOptions_CoverageEnvFalse are missing os.Unsetenv("DEBUG")

Both tests assert that -s -w is present in LDFlags, but neither resets the DEBUG env var. Since BuildOptions now reads DEBUG, if a developer has DEBUG=true in their environment (very plausible in a debug-session context), both tests will fail. Add os.Unsetenv("DEBUG") to both.

🛠️ Proposed fix
 func TestBuildOptions_Default(t *testing.T) {
     // Ensure COVERAGE and TARGETARCH env are not set
     os.Unsetenv("COVERAGE")
     os.Unsetenv("TARGETARCH")
+    os.Unsetenv("DEBUG")
 func TestBuildOptions_CoverageEnvFalse(t *testing.T) {
     os.Setenv("COVERAGE", "false")
     defer os.Unsetenv("COVERAGE")
+    os.Unsetenv("DEBUG")

Also applies to: 107-121

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/compilation/options_test.go` around lines 31
- 53, The tests TestBuildOptions_Default and TestBuildOptions_CoverageEnvFalse
must explicitly unset the DEBUG environment variable before calling BuildOptions
so LDFlags assertions aren't influenced by a developer's environment; add
os.Unsetenv("DEBUG") alongside the existing os.Unsetenv("COVERAGE") and
os.Unsetenv("TARGETARCH") calls at the start of each test (referencing the
TestBuildOptions_Default and TestBuildOptions_CoverageEnvFalse functions) to
ensure deterministic LDFlags (e.g., the presence of "-s -w").
gateway/gateway-runtime/Makefile (1)

131-136: ⚠️ Potential issue | 🟡 Minor

clean does not remove debug images.

make build-debug produces $(IMAGE_NAME)-debug:$(VERSION) and $(IMAGE_NAME)-debug:latest, but the clean target only removes the standard and coverage variants. Debug images will accumulate until manually removed.

🧹 Proposed fix
 clean: ## Clean Docker images
 	`@echo` "Cleaning Docker images..."
 	`@docker` rmi $(IMAGE_NAME):$(VERSION) 2>/dev/null || true
 	`@docker` rmi $(IMAGE_NAME):latest 2>/dev/null || true
 	`@docker` rmi $(IMAGE_NAME)-coverage:$(VERSION) 2>/dev/null || true
+	`@docker` rmi $(IMAGE_NAME)-debug:$(VERSION) 2>/dev/null || true
+	`@docker` rmi $(IMAGE_NAME)-debug:latest 2>/dev/null || true
 	`@rm` -rf target
 	`@echo` "Clean completed"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Makefile` around lines 131 - 136, The clean target in
the Makefile doesn't remove debug images produced by make build-debug, so add
removal of $(IMAGE_NAME)-debug:$(VERSION) and $(IMAGE_NAME)-debug:latest to the
clean recipe; update the clean target (referencing the clean target and
variables IMAGE_NAME and VERSION) to docker rmi those debug tags (with the same
2>/dev/null || true pattern) so debug images are deleted alongside the standard
and coverage variants and target cleanup remains idempotent.
🧹 Nitpick comments (9)
kubernetes/gateway-operator/Dockerfile.debug (3)

4-4: Builder base image is not on the latest 1.25.x patch.

golang:1.25.7-alpine is available, so golang:1.25.5-alpine is two patch releases behind. Trivial for a debug image, but worth a bump for any security fixes included in those patches.

♻️ Update to latest 1.25 patch
-FROM --platform=$BUILDPLATFORM golang:1.25.5-alpine AS builder
+FROM --platform=$BUILDPLATFORM golang:1.25.7-alpine AS builder
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@kubernetes/gateway-operator/Dockerfile.debug` at line 4, The Dockerfile's
builder stage uses an outdated Go patch image in the FROM line (FROM
--platform=$BUILDPLATFORM golang:1.25.5-alpine); update that tag to the latest
1.25 patch (e.g., golang:1.25.7-alpine) so the builder base image is current and
includes any security/bug fixes.

20-20: Delve version pinning looks good.

Delve v1.26.0 is a confirmed release (December 19, 2025), and pinning to an explicit tag instead of @latest is the right call for reproducible debug images.

One minor inconsistency: GOOS=linux is hardcoded here while the manager binary build above uses GOOS=${TARGETOS:-linux} (line 17) to respect cross-compile targets. For a debug-only tool this is acceptable, but aligning them would be cleaner.

♻️ Align GOOS with the manager build
-RUN CGO_ENABLED=0 GOOS=linux go install github.com/go-delve/delve/cmd/dlv@v1.26.0
+RUN CGO_ENABLED=0 GOOS=${TARGETOS:-linux} go install github.com/go-delve/delve/cmd/dlv@v1.26.0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@kubernetes/gateway-operator/Dockerfile.debug` at line 20, Update the debug
image Delve install step so it respects the same target OS variable as the
manager build: replace the hardcoded GOOS=linux in the RUN line that installs
Delve (the line with "RUN CGO_ENABLED=0 GOOS=linux go install
github.com/go-delve/delve/cmd/dlv@v1.26.0") to use GOOS=${TARGETOS:-linux} so
both builds honor the TARGETOS cross-compile variable used earlier (the manager
build uses GOOS=${TARGETOS:-linux}).

31-33: Redundant COPY on line 33.

/workspace/internal/controller/resources/ is already fully covered by the preceding COPY --from=builder /workspace/internal /workspace/internal on line 31. The second copy creates a redundant image layer without adding anything.

♻️ Remove redundant COPY
 COPY --from=builder /workspace/internal /workspace/internal
-COPY --from=builder /workspace/internal/controller/resources/ /workspace/internal/controller/resources/
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@kubernetes/gateway-operator/Dockerfile.debug` around lines 31 - 33, Remove
the redundant COPY that duplicates the internal resources layer: delete the line
containing "COPY --from=builder /workspace/internal/controller/resources/
/workspace/internal/controller/resources/" since "/workspace/internal" is
already copied by "COPY --from=builder /workspace/internal /workspace/internal";
keep the COPY of /workspace/internal and /workspace/api only to avoid creating
an unnecessary image layer.
gateway/gateway-builder/internal/compilation/options_test.go (1)

31-53: TestBuildOptions_Default doesn't assert opts.EnableDebug is false

TestBuildOptions_Default asserts opts.EnableCoverage is false but omits the symmetric assert.False(t, opts.EnableDebug). Also consider adding a TestBuildOptions_WithDebugEnabled (mirroring TestBuildOptions_WithCoverageEnabled) to cover the BuildOptions path end-to-end with DEBUG=true.

♻️ Suggested additions
     assert.False(t, opts.EnableCoverage)
+    assert.False(t, opts.EnableDebug)
     assert.Contains(t, opts.LDFlags, "-s -w")
func TestBuildOptions_WithDebugEnabled(t *testing.T) {
    os.Setenv("DEBUG", "true")
    defer os.Unsetenv("DEBUG")

    metadata := &types.BuildMetadata{
        Version:   "v2.0.0",
        GitCommit: "def456",
        Timestamp: time.Now(),
    }

    opts := BuildOptions("/output/binary", metadata)

    assert.True(t, opts.EnableDebug)
    assert.NotContains(t, opts.LDFlags, "-s -w")
    assert.Contains(t, opts.LDFlags, "-X main.Version=v2.0.0")
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/compilation/options_test.go` around lines 31
- 53, The test TestBuildOptions_Default is missing an assertion that
opts.EnableDebug is false; update TestBuildOptions_Default to assert.False(t,
opts.EnableDebug) after constructing opts to cover the default debug flag
behavior, and add a new test TestBuildOptions_WithDebugEnabled that sets
DEBUG=true (and defers Unsetenv), calls BuildOptions("/output/binary",
metadata), asserts opts.EnableDebug is true, and verifies LDFlags reflect debug
mode (e.g., no "-s -w" and that version/linker flags still exist) so
BuildOptions and its debug path are exercised end-to-end.
gateway/gateway-controller/Makefile (1)

90-101: build-debug skips tests unlike build and build-coverage-image.

Both build (via generate test) and build-coverage-image (via test) run unit tests before building. build-debug has no prerequisites, so it can silently create a debug image from broken code. This is likely intentional for fast iteration, but worth documenting in the target comment.

♻️ Optional: add note to target comment
-build-debug: ## Build debug image for remote debugging with dlv (VS Code attach on port 2345)
+build-debug: ## Build debug image for remote debugging with dlv (VS Code attach on port 2345) — skips tests for fast iteration
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/Makefile` around lines 90 - 101, The build-debug
target currently builds a debug image without running tests; update the Makefile
by adding a clear comment for the build-debug target (referencing the
build-debug target) that it intentionally skips unit tests (unlike build and
build-coverage-image which run tests via generate test / test), so maintainers
know this is for fast iteration and not for CI-quality images; alternatively, if
you prefer enforcement, add the same prerequisite used by build (generate test
or test) to the build-debug target to run tests before building.
.vscode/launch.json (1)

116-119: Hardcoded SDK version v0.3.9 will silently break source-mapping after an SDK bump.

When policy-engine/go.mod upgrades the SDK, this substitutePath mapping won't match the actual module cache path, and breakpoints inside SDK code will stop resolving — with no error from VS Code.

Consider adding a comment that this value must be kept in sync with the version in policy-engine/go.mod, or document the update step in DEBUG_GUIDE.md.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.vscode/launch.json around lines 116 - 119, The launch.json substitutePath
entry hardcodes the SDK module version in the "to" value
("/go/pkg/mod/github.com/wso2/api-platform/sdk@v0.3.9"), which will break
source-mapping when policy-engine/go.mod updates the SDK; update the
substitutePath mapping logic by adding a clear inline comment next to the "to"
field (or add a note in DEBUG_GUIDE.md) instructing developers to keep this
version in sync with policy-engine/go.mod and include the exact file reference
("from": "${workspaceFolder}/sdk", "to": ".../sdk@<version>") so future SDK
bumps are remembered and the mapping is updated.
gateway/gateway-controller/Dockerfile (1)

73-94: Security settings for dlv debugging are already properly configured in docker-compose.debug.yaml.

The cap_add: [SYS_PTRACE] and security_opt: [seccomp:unconfined] are correctly set for both gateway-controller and gateway-runtime services (lines 48–51 and 76–79). Consider adding explicit documentation to DEBUG_GUIDE.md explaining these capabilities—currently it mentions containers run as root for ptrace (line 104) but doesn't detail why SYS_PTRACE capability and seccomp exemption are specifically required for dlv operation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/Dockerfile` around lines 73 - 94, Add a short
section to DEBUG_GUIDE.md describing why dlv requires the SYS_PTRACE capability
and seccomp:unconfined (used in docker-compose.debug.yaml for gateway-controller
and gateway-runtime) and how those settings interact with running dlv as root:
explain that dlv uses ptrace to attach to processes, that SYS_PTRACE must be
granted and seccomp must be relaxed to allow ptrace syscalls, show the exact
docker-compose keys to enable (cap_add: [SYS_PTRACE], security_opt:
[seccomp:unconfined]) and include a brief security note advising these are for
local/dev debugging only and should not be used in production.
gateway/DEBUG_GUIDE.md (2)

160-169: Clarify that these are temporary local edits — not to be committed.

Option 2 asks the developer to modify gateway/docker-compose.yaml (set GATEWAY_CONTROLLER_HOST to host.docker.internal, comment out PE ports). A developer might accidentally commit these changes. Consider adding a prominent note (e.g., a > ⚠️ callout) reminding readers to revert the edits before committing, similar to the existing warning pattern used for docker-compose.test.yaml.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/DEBUG_GUIDE.md` around lines 160 - 169, Add a prominent
temporary-edit warning to the gateway/docker-compose.yaml instructions: update
the section that instructs changing GATEWAY_CONTROLLER_HOST under the
gateway-runtime service (and commenting out PE ports) to include a clear callout
(e.g., "⚠️ Temporary local edit — do not commit; revert before committing")
analogous to the existing docker-compose.test.yaml warning so developers know
these changes are local-only and must be reverted.

49-63: Hardcoded SDK version @v0.3.9 will silently drift.

The substitutePath table (line 49) and the note (line 63) both embed a specific SDK version. When policy-engine/go.mod bumps the SDK, debugging will silently fail to map sources until someone remembers to update both places. Consider adding a shell snippet or Makefile helper that extracts the current version from go.mod, e.g.:

grep 'github.com/wso2/api-platform/sdk' gateway/gateway-runtime/policy-engine/go.mod

This way developers can quickly find the right value rather than relying on the doc staying in sync.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/DEBUG_GUIDE.md` around lines 49 - 63, The docs hardcode the SDK
version (`@v0.3.9`) in the substitutePath entry in `.vscode/launch.json` which
will drift when `policy-engine/go.mod` updates; add a small helper to derive the
sdk module version from `policy-engine/go.mod` and use it to populate the
substitutePath value (or document this command), e.g., create a Makefile target
or shell snippet that greps the `github.com/wso2/api-platform/sdk` version from
`policy-engine/go.mod` and instruct developers to run it before debugging;
update the `DEBUG_GUIDE.md` note to reference the helper and the symbols
`substitutePath` and `policy-engine/go.mod` so maintainers know where to run the
helper.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@gateway/gateway-builder/internal/compilation/options_test.go`:
- Around line 31-53: The tests TestBuildOptions_Default and
TestBuildOptions_CoverageEnvFalse must explicitly unset the DEBUG environment
variable before calling BuildOptions so LDFlags assertions aren't influenced by
a developer's environment; add os.Unsetenv("DEBUG") alongside the existing
os.Unsetenv("COVERAGE") and os.Unsetenv("TARGETARCH") calls at the start of each
test (referencing the TestBuildOptions_Default and
TestBuildOptions_CoverageEnvFalse functions) to ensure deterministic LDFlags
(e.g., the presence of "-s -w").

In `@gateway/gateway-runtime/Makefile`:
- Around line 131-136: The clean target in the Makefile doesn't remove debug
images produced by make build-debug, so add removal of
$(IMAGE_NAME)-debug:$(VERSION) and $(IMAGE_NAME)-debug:latest to the clean
recipe; update the clean target (referencing the clean target and variables
IMAGE_NAME and VERSION) to docker rmi those debug tags (with the same
2>/dev/null || true pattern) so debug images are deleted alongside the standard
and coverage variants and target cleanup remains idempotent.

---

Duplicate comments:
In `@gateway/gateway-controller/Makefile`:
- Line 31: The copilot-instructions.md reference to the removed Makefile target
is stale; update .github/copilot-instructions.md to replace any instructions
that say "make build-local" with the current build targets such as "make build"
(or "make build-debug" where debug workflows are intended) and ensure any
examples or README steps that mention build-local are similarly updated; search
for occurrences of "build-local" and replace them with the appropriate target
names so they match the Makefile's .PHONY entries like "build" and
"build-debug".

In `@gateway/gateway-runtime/Dockerfile`:
- Around line 125-127: The Dockerfile's debug-dlv-builder stage installs Delve
with a floating tag; update the RUN instruction in the debug-dlv-builder stage
to pin Delve to a specific release (e.g., change the go install line that
currently references github.com/go-delve/delve/cmd/dlv@latest to use a fixed tag
like `@v1.26.0`) so debug builds are reproducible and deterministic.

---

Nitpick comments:
In @.vscode/launch.json:
- Around line 116-119: The launch.json substitutePath entry hardcodes the SDK
module version in the "to" value
("/go/pkg/mod/github.com/wso2/api-platform/sdk@v0.3.9"), which will break
source-mapping when policy-engine/go.mod updates the SDK; update the
substitutePath mapping logic by adding a clear inline comment next to the "to"
field (or add a note in DEBUG_GUIDE.md) instructing developers to keep this
version in sync with policy-engine/go.mod and include the exact file reference
("from": "${workspaceFolder}/sdk", "to": ".../sdk@<version>") so future SDK
bumps are remembered and the mapping is updated.

In `@gateway/DEBUG_GUIDE.md`:
- Around line 160-169: Add a prominent temporary-edit warning to the
gateway/docker-compose.yaml instructions: update the section that instructs
changing GATEWAY_CONTROLLER_HOST under the gateway-runtime service (and
commenting out PE ports) to include a clear callout (e.g., "⚠️ Temporary local
edit — do not commit; revert before committing") analogous to the existing
docker-compose.test.yaml warning so developers know these changes are local-only
and must be reverted.
- Around line 49-63: The docs hardcode the SDK version (`@v0.3.9`) in the
substitutePath entry in `.vscode/launch.json` which will drift when
`policy-engine/go.mod` updates; add a small helper to derive the sdk module
version from `policy-engine/go.mod` and use it to populate the substitutePath
value (or document this command), e.g., create a Makefile target or shell
snippet that greps the `github.com/wso2/api-platform/sdk` version from
`policy-engine/go.mod` and instruct developers to run it before debugging;
update the `DEBUG_GUIDE.md` note to reference the helper and the symbols
`substitutePath` and `policy-engine/go.mod` so maintainers know where to run the
helper.

In `@gateway/gateway-builder/internal/compilation/options_test.go`:
- Around line 31-53: The test TestBuildOptions_Default is missing an assertion
that opts.EnableDebug is false; update TestBuildOptions_Default to
assert.False(t, opts.EnableDebug) after constructing opts to cover the default
debug flag behavior, and add a new test TestBuildOptions_WithDebugEnabled that
sets DEBUG=true (and defers Unsetenv), calls BuildOptions("/output/binary",
metadata), asserts opts.EnableDebug is true, and verifies LDFlags reflect debug
mode (e.g., no "-s -w" and that version/linker flags still exist) so
BuildOptions and its debug path are exercised end-to-end.

In `@gateway/gateway-controller/Dockerfile`:
- Around line 73-94: Add a short section to DEBUG_GUIDE.md describing why dlv
requires the SYS_PTRACE capability and seccomp:unconfined (used in
docker-compose.debug.yaml for gateway-controller and gateway-runtime) and how
those settings interact with running dlv as root: explain that dlv uses ptrace
to attach to processes, that SYS_PTRACE must be granted and seccomp must be
relaxed to allow ptrace syscalls, show the exact docker-compose keys to enable
(cap_add: [SYS_PTRACE], security_opt: [seccomp:unconfined]) and include a brief
security note advising these are for local/dev debugging only and should not be
used in production.

In `@gateway/gateway-controller/Makefile`:
- Around line 90-101: The build-debug target currently builds a debug image
without running tests; update the Makefile by adding a clear comment for the
build-debug target (referencing the build-debug target) that it intentionally
skips unit tests (unlike build and build-coverage-image which run tests via
generate test / test), so maintainers know this is for fast iteration and not
for CI-quality images; alternatively, if you prefer enforcement, add the same
prerequisite used by build (generate test or test) to the build-debug target to
run tests before building.

In `@kubernetes/gateway-operator/Dockerfile.debug`:
- Line 4: The Dockerfile's builder stage uses an outdated Go patch image in the
FROM line (FROM --platform=$BUILDPLATFORM golang:1.25.5-alpine); update that tag
to the latest 1.25 patch (e.g., golang:1.25.7-alpine) so the builder base image
is current and includes any security/bug fixes.
- Line 20: Update the debug image Delve install step so it respects the same
target OS variable as the manager build: replace the hardcoded GOOS=linux in the
RUN line that installs Delve (the line with "RUN CGO_ENABLED=0 GOOS=linux go
install github.com/go-delve/delve/cmd/dlv@v1.26.0") to use
GOOS=${TARGETOS:-linux} so both builds honor the TARGETOS cross-compile variable
used earlier (the manager build uses GOOS=${TARGETOS:-linux}).
- Around line 31-33: Remove the redundant COPY that duplicates the internal
resources layer: delete the line containing "COPY --from=builder
/workspace/internal/controller/resources/
/workspace/internal/controller/resources/" since "/workspace/internal" is
already copied by "COPY --from=builder /workspace/internal /workspace/internal";
keep the COPY of /workspace/internal and /workspace/api only to avoid creating
an unnecessary image layer.

@renuka-fernando renuka-fernando merged commit 6aa2908 into wso2:main Feb 20, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Gateway Remote Debug

2 participants

Comments