Skip to content

[Enhancement] Prohibit CallerRunsPolicy in NettyRemotingServer to prevent blocking IO threads #9983

@QiuYucheng2003

Description

@QiuYucheng2003

Before Creating the Enhancement Request

  • I have confirmed that this should be classified as an enhancement rather than a bug/feature.

Summary

Add defensive checks in NettyRemotingServer#registerProcessor to detect and prohibit/warn against the usage of CallerRunsPolicy in custom user executors. Using this policy triggers blocking operations on Netty EventLoop threads when the thread pool is exhausted, leading to severe performance degradation.

Motivation

In NettyRemotingServer.java, the registerProcessor method allows users (e.g., Broker developers) to register a custom ExecutorService for specific request codes.

Currently, there is no validation on the RejectedExecutionHandler of the provided executor.
In high-concurrency scenarios (common in Cloud/K8s environments), if a user mistakenly configures a ThreadPoolExecutor with CallerRunsPolicy and the pool becomes full:

  1. The task execution is rejected.
  2. CallerRunsPolicy forces the task to run in the caller's thread.
  3. In this context, the caller thread is the Netty IO Thread (EventLoop) (invoked from NettyServerHandler.channelRead0).

Consequence:
The Netty IO thread gets blocked processing business logic. This stops the server from reading/writing other requests or handling heartbeats on that channel, potentially causing the node to be marked as "down" by the cluster or causing a "Stop-the-World" like pause in network throughput (DPSE - Death by Poorly Scheduled Executor).

Adding a check will prevent this silent misconfiguration.

Describe the Solution You'd Like

Modify org.apache.rocketmq.remoting.netty.NettyRemotingServer#registerProcessor to include a validation step.

Logic:

  1. Check if the passed executor is an instance of ThreadPoolExecutor.
  2. If yes, check if ((ThreadPoolExecutor) executor).getRejectedExecutionHandler() is an instance of CallerRunsPolicy.
  3. If detected, throw an IllegalArgumentException (fast fail) or at least log a generic ERROR/WARN to alert the user that this configuration is dangerous.

Sample Code Logic:

if (executor instanceof ThreadPoolExecutor) {
    if (((ThreadPoolExecutor) executor).getRejectedExecutionHandler() instanceof ThreadPoolExecutor.CallerRunsPolicy) {
        throw new IllegalArgumentException("CallerRunsPolicy is strictly forbidden in RocketMQ Remoting as it blocks Netty IO threads.");
    }
}

### Describe Alternatives You've Considered

### 6. Describe Alternatives You've Considered 
```text
1. **Documentation warning only:** We could just add a warning in Javadoc, but users often miss documentation details when configuring custom thread pools.
2. **Wrapper Executor:** Wrapping the user executor to intercept the rejection. This is overly complex and might introduce overhead.
3. **Log only:** Just logging a WARN message. This is a softer approach but might be ignored during startup, leading to runtime failures later. Fast-fail is preferred for such critical configuration errors.

### Additional Context

Relevant Code Location:
- Registration: `org.apache.rocketmq.remoting.netty.NettyRemotingServer#registerProcessor`
- Execution path: `org.apache.rocketmq.remoting.netty.NettyRemotingServer.NettyServerHandler#channelRead0` calling `remotingAbstract.processMessageReceived`

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions