
Conversation

@tadasant tadasant commented Dec 9, 2025

Written by Claude Code, reviewed by me. Once this lands in main, I'll test it against staging to make sure it's behaving as intended.

Summary

  • Implements IP-based rate limiting middleware to address service disruptions caused by excessive requests (Implement rate limits to address mis-use causing downtime #826)
  • Enforces dual rate limits: 60 requests/minute and 1000 requests/hour per IP address
  • Fully configurable via environment variables (MCP_REGISTRY_RATE_LIMIT_*)
  • Excludes health/ping/metrics endpoints from rate limiting
  • Returns RFC 7807 problem+json responses with a Retry-After header when rate limited (see the sketch below)
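
For illustration, a minimal sketch of that 429 response path; the helper name and the exact problem-details fields are assumptions, not the PR's literal code:

```go
package ratelimit

import (
	"encoding/json"
	"net/http"
	"strconv"
	"time"
)

// writeRateLimited is a hypothetical helper showing the shape of the
// RFC 7807 problem+json response with a Retry-After header.
func writeRateLimited(w http.ResponseWriter, retryAfter time.Duration) {
	w.Header().Set("Content-Type", "application/problem+json")
	w.Header().Set("Retry-After", strconv.Itoa(int(retryAfter.Seconds())))
	w.WriteHeader(http.StatusTooManyRequests)
	_ = json.NewEncoder(w).Encode(map[string]any{
		"title":  "Too Many Requests",
		"status": http.StatusTooManyRequests,
		"detail": "rate limit exceeded, retry later",
	})
}
```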

Configuration Options

| Variable | Default | Description |
| --- | --- | --- |
| `MCP_REGISTRY_RATE_LIMIT_ENABLED` | `true` | Enable/disable rate limiting |
| `MCP_REGISTRY_RATE_LIMIT_REQUESTS_PER_MINUTE` | `60` | Max requests per minute per IP |
| `MCP_REGISTRY_RATE_LIMIT_REQUESTS_PER_HOUR` | `1000` | Max requests per hour per IP |
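
For example, to relax the limits in an environment file (the values here are illustrative, not recommendations):

```env
MCP_REGISTRY_RATE_LIMIT_ENABLED=true
MCP_REGISTRY_RATE_LIMIT_REQUESTS_PER_MINUTE=120
MCP_REGISTRY_RATE_LIMIT_REQUESTS_PER_HOUR=2000
```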

Implementation Details

  • Uses a token bucket algorithm via golang.org/x/time/rate (already a dependency); a simplified sketch follows this list
  • Properly handles X-Forwarded-For and X-Real-IP headers (since registry is deployed behind NGINX ingress with use-forwarded-headers=true)
  • All IPs are validated using net.ParseIP() to handle malformed input
  • Memory protection via MaxVisitors limit (100k) with LRU eviction
  • Background cleanup of stale visitor entries every 10 minutes
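
A minimal sketch of how the dual buckets compose, assuming the defaults above. Type and field names are illustrative, and the locking is simplified relative to the PR's read-lock fast path:

```go
package ratelimit

import (
	"sync"
	"time"

	"golang.org/x/time/rate"
)

// visitor pairs the two token buckets tracked per IP.
type visitor struct {
	perMinute *rate.Limiter
	perHour   *rate.Limiter
	lastSeen  time.Time
}

type rateLimiter struct {
	mu       sync.Mutex
	visitors map[string]*visitor
}

// allow admits a request only when both buckets have a token, which is
// how the 60/minute and 1000/hour limits compose.
func (rl *rateLimiter) allow(ip string) bool {
	rl.mu.Lock()
	v, ok := rl.visitors[ip]
	if !ok {
		v = &visitor{
			perMinute: rate.NewLimiter(rate.Every(time.Minute/60), 60),
			perHour:   rate.NewLimiter(rate.Every(time.Hour/1000), 1000),
		}
		rl.visitors[ip] = v
	}
	v.lastSeen = time.Now()
	rl.mu.Unlock()
	// Consume from both buckets so the counts stay consistent.
	okMinute := v.perMinute.Allow()
	okHour := v.perHour.Allow()
	return okMinute && okHour
}
```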

Test plan

  • Unit tests cover rate limiting logic, middleware behavior, and IP extraction
  • Tests verify both minute and hourly limits work correctly
  • Tests verify skip paths are not rate limited
  • Tests verify X-Forwarded-For and X-Real-IP header handling
  • Tests verify invalid IP handling (empty, malformed) and IPv6 support
  • Tests verify memory limit enforcement and eviction
  • Tests verify concurrent access safety
  • golangci-lint passes with no issues
  • All tests pass locally
  • CI pipeline passes

Known Limitations

  • Multi-pod deployment: In-memory rate limiting means each pod has independent limits. For strict per-IP limits across the cluster, consider Redis-based rate limiting in a future iteration.

Closes #826

🤖 Generated with Claude Code

tadasant and others added 5 commits December 9, 2025 07:20
Implements rate limiting middleware to address service disruptions caused by
excessive requests. The rate limiter enforces per-IP limits of 60 requests
per minute and 1000 requests per hour (configurable via environment variables).

Features:
- Dual rate limiting (per-minute and per-hour) using token bucket algorithm
- Proper handling of X-Forwarded-For and X-Real-IP headers for proxied requests
- Skip paths for health, ping, and metrics endpoints
- Background cleanup of stale visitor entries
- Configurable via MCP_REGISTRY_RATE_LIMIT_* environment variables
- Returns RFC 7807 problem+json responses with Retry-After header

Closes #826

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Add TrustProxy config option (defaults to false for security)
  - When false, only RemoteAddr is used (prevents IP spoofing)
  - When true, X-Forwarded-For and X-Real-IP headers are trusted
- Add IP validation using net.ParseIP() to handle invalid/malformed IPs
- Add MaxVisitors config to prevent memory exhaustion attacks
- Implement LRU-style eviction when MaxVisitors limit is reached
- Optimize lock granularity with read locks for existing visitors
- Add comprehensive tests for:
  - TrustProxy enabled/disabled behavior
  - IP spoofing prevention
  - Invalid IP handling
  - IPv6 support
  - Memory limit enforcement
  - Concurrent access

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Since the registry is always deployed behind NGINX ingress with
use-forwarded-headers=true, there's no need for a TrustProxy config option.
The NGINX ingress is trusted infrastructure that correctly sets
X-Forwarded-For headers.

Removed:
- TrustProxy config option and environment variable
- Tests for TrustProxy enabled/disabled behavior

The implementation now always checks X-Forwarded-For/X-Real-IP headers
first, then falls back to RemoteAddr. This matches the deployment setup.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
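A minimal sketch of the lookup order described in the commit above; the helper name clientIP is hypothetical:

```go
package ratelimit

import (
	"net"
	"net/http"
	"strings"
)

// clientIP shows the order: X-Forwarded-For first, then X-Real-IP,
// then RemoteAddr, with net.ParseIP guarding against malformed input.
func clientIP(r *http.Request) string {
	// NGINX puts the original client first in the comma-separated chain.
	if xff := r.Header.Get("X-Forwarded-For"); xff != "" {
		first := strings.TrimSpace(strings.Split(xff, ",")[0])
		if net.ParseIP(first) != nil {
			return first
		}
	}
	if xrip := r.Header.Get("X-Real-IP"); xrip != "" && net.ParseIP(xrip) != nil {
		return xrip
	}
	// Fall back to the transport-level address.
	if host, _, err := net.SplitHostPort(r.RemoteAddr); err == nil {
		return host
	}
	return r.RemoteAddr
}
```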
Add mcp_registry.http.rate_limited counter to track when requests are
blocked by rate limiting. This enables monitoring rate limit activity
over time via Prometheus.

- Add RateLimitedRequests counter to telemetry.Metrics
- Add OnRateLimited callback to rate limiter config
- Wire up metrics in server initialization
- Add test for callback behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
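A sketch of the wiring in the commit above, assuming an OpenTelemetry meter behind telemetry.Metrics; the counter name comes from the commit, while the hook signature and surrounding names are assumptions:

```go
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel"
)

func main() {
	meter := otel.Meter("mcp-registry")
	rateLimited, err := meter.Int64Counter("mcp_registry.http.rate_limited")
	if err != nil {
		log.Fatal(err)
	}
	// OnRateLimited is the hook named in the commit; its signature here
	// is assumed for illustration.
	onRateLimited := func(ctx context.Context) {
		rateLimited.Add(ctx, 1) // increment the counter per blocked request
	}
	onRateLimited(context.Background())
}
```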
- Add rate limit section to official-registry-api.md documenting the
  429 response format, limits, and client guidance
- Add note in .env.example about per-pod rate limit behavior
- Add code comment in server.go explaining multi-replica approximation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@tadasant tadasant marked this pull request as ready for review December 9, 2025 19:10
@tadasant tadasant requested a review from a team December 9, 2025 19:10
return v
}

// Enforce max visitors limit (memory protection)
@pree-dew pree-dew Dec 10, 2025

@tadasant if requests keep coming from unique IPs, it will iterate over all existing entries to find one IP to remove while holding the lock. Once we're past 100K, every request will take a good amount of time.

Please feel free to correct me if I am missing something here.

Contributor

We could also implement an LRU-based structure here, or maybe remove some percentage of entries when the limit is hit instead of removing one on every new request.
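
Roughly what I mean (hypothetical names based on my reading of the visitor map; assumes the caller holds the write lock, and needs `sort` and `time` imports):

```go
// evictOldest drops the ~10% least-recently-seen entries in one pass,
// so the scan cost is paid once per overflow instead of per request.
func (rl *rateLimiter) evictOldest() {
	type entry struct {
		ip       string
		lastSeen time.Time
	}
	all := make([]entry, 0, len(rl.visitors))
	for ip, v := range rl.visitors {
		all = append(all, entry{ip, v.lastSeen})
	}
	sort.Slice(all, func(i, j int) bool {
		return all[i].lastSeen.Before(all[j].lastSeen)
	})
	n := len(all) / 10
	if n == 0 {
		n = 1 // always free at least one slot
	}
	for _, e := range all[:n] {
		delete(rl.visitors, e.ip)
	}
}
```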

Contributor

I will be more than happy to help in any way here.

}

// Need to create new visitor - acquire write lock
rl.mu.Lock()
Contributor

This might be very slow when throughput is high. Per the stats on Grafana, per-hour throughput is always above 15K and has peaked at ~35K. Cleanup won't happen until an entry is older than an hour.

Contributor

Maybe a shard-based rate limiter would suit better if we are expecting per-minute throughput to increase.
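
Something like this (illustrative names, reusing the visitor type; needs `sync` and `hash/fnv`):

```go
// Shard the visitor map by a hash of the IP so each shard has its own
// lock; contention drops roughly in proportion to the shard count.
const numShards = 32

type shard struct {
	mu       sync.Mutex
	visitors map[string]*visitor
}

type shardedLimiter struct {
	shards [numShards]*shard
}

func (s *shardedLimiter) shardFor(ip string) *shard {
	h := fnv.New32a()
	h.Write([]byte(ip))
	return s.shards[h.Sum32()%numShards]
}
```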

Member

@rdimitrov rdimitrov left a comment

I think we would want to solve this in a different way and handle this in the infrastructure layer rather than inside the registry 👍

