Bug #120470 Add REJECT mode to Connection Control Component to prevent connection exhaustion attacks
Submitted: 13 May 20:04
Reporter: Chelluru Vidyadhar
Status: Open
Category: MySQL Server: Security: Firewall
Severity: S4 (Feature request)
Version: 8.0.46, 8.4, 9.7
OS: Any
Assigned to:
CPU Architecture: Any

[13 May 20:04] Chelluru Vidyadhar
Description:
The component_connection_control component (and the legacy connection_control plugin) currently implements only a "delay" (DETER) strategy when the failed connection threshold is exceeded. The delayed connection thread remains in the processlist, consumes a max_connections slot, and holds memory for the entire duration of the sleep. This allows an attacker to trivially exhaust all available connections by triggering the delay mechanism across many threads simultaneously, effectively turning the anti-brute-force feature into a denial-of-service amplification vector.

In `connection_delay.cc`, the `Connection_delay_action::notify_event()` method calls `conditional_wait(wait_time)` which performs a `mysql_cond_timedwait()`. During this wait:

- The OS thread remains allocated
- The THD (thread handle) remains in the processlist
- The connection counts against `max_connections`
- Memory buffers (thread stack, net buffer, etc.) remain allocated

The component was designed to deter brute-force password guessing by making each attempt slow. However, an attacker does not need to succeed — they only need to trigger enough sleeping threads to exhaust the connection pool. The "security" feature becomes the attack surface.
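
To put rough numbers on this, the following is a minimal back-of-the-envelope model, not MySQL code. It assumes a constant rate of over-threshold attempts and a fixed delay per attempt, and applies Little's law (concurrent sessions = arrival rate × time held). The function and parameter names are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model (not part of the MySQL sources): estimates how many
// connection slots are held by delayed threads at steady state.
//   attempts_per_sec : over-threshold connection attempts per second (assumed constant)
//   delay_ms         : sleep applied to each attempt, e.g. min_connection_delay
//   max_connections  : size of the connection pool
long long slots_held_by_delayed(long long attempts_per_sec, long long delay_ms,
                                long long max_connections) {
  assert(attempts_per_sec >= 0 && delay_ms >= 0 && max_connections >= 0);
  // Little's law: concurrent sessions = arrival rate * time each one is held.
  const long long concurrent = attempts_per_sec * delay_ms / 1000;
  // The pool cannot hold more than max_connections sessions.
  return std::min(concurrent, max_connections);
}

// Smallest sustained attempt rate (rounded up) that keeps every slot occupied.
long long attempts_per_sec_to_exhaust(long long delay_ms, long long max_connections) {
  assert(delay_ms > 0 && max_connections >= 0);
  return (max_connections * 1000 + delay_ms - 1) / delay_ms;
}
```

Under these assumptions, with a 5000 ms delay and a pool of 20 slots, `attempts_per_sec_to_exhaust(5000, 20)` gives 4: about four failed attempts per second would be enough to keep the whole pool asleep. Longer delays lower that rate further.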

This feature request proposes adding a configurable throttle_action system variable with a new REJECT mode that immediately refuses connections that exceed the threshold — without sleeping or holding the thread.

How to repeat:
1. Configure the connection control component:
```
INSTALL COMPONENT 'file://component_connection_control';
SET GLOBAL component_connection_control.failed_connections_threshold = 3;
SET GLOBAL component_connection_control.min_connection_delay = 5000;
SET GLOBAL component_connection_control.max_connection_delay = 60000;
```

2. Set a low `max_connections` to simulate resource exhaustion:
```
SET GLOBAL max_connections = 20;
```

3. From an external script, fire rapid failed login attempts from multiple hosts/users to exceed the threshold, then continue sending connections:
```
for i in $(seq 1 50); do
  mysql -u victim_db_user -pwrong_password -h 127.0.0.1 &
done
```

4. Observe that once the threshold is crossed, each subsequent connection attempt sleeps for `min_connection_delay` (5+ seconds) while **holding a connection slot**. Within seconds, all 20 `max_connections` slots are occupied by sleeping threads.

5. Legitimate users now receive `ERROR 1040 (HY000): Too many connections` and cannot connect — even through `admin_address`.

6. Run `SHOW PROCESSLIST` (if you can connect) — all slots show sleeping connection_control threads.

**This is documented in MySQL Bug #89155 but was never addressed with a solution.**

Suggested fix:
Add a new system variable `component_connection_control.throttle_action` with two modes:

| Value | Mode | Behavior |
|-------|------|----------|
| 0 | DETER (default) | Current behavior — sleep for delay duration, hold thread |
| 1 | REJECT | Immediately return error, free the thread/connection slot |

**In REJECT mode:**
- When a connection exceeds the failed threshold, the component returns `true` from the event tracking callback
- The server aborts the connection immediately (same mechanism as other audit plugin rejections)
- No sleep occurs — the thread is freed instantly
- Failed attempt tracking continues (penalty window persists)
- A new status variable `Component_connection_control_rejected_connections` tracks rejections

**Key design points:**
- Default is DETER (0) — fully backward compatible, no behavior change for existing users
- REJECT (1) is opt-in for deployments facing connection exhaustion attacks
- The failed attempts counter still increments in REJECT mode, so the penalty window persists
- A successful login from the same user@host still resets the counter (once the penalty expires or threshold is reset)
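
The intended branching can be sketched in isolation as follows. This is a simplified, self-contained illustration and not the actual component code: `ThrottleAction`, `ThrottleState`, and `on_connection_attempt` are hypothetical names, and treating "exceeds the threshold" as `failed_count >= failed_threshold` is an assumption of the sketch.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical stand-ins; none of these names come from the MySQL sources.
enum class ThrottleAction : std::uint8_t { DETER = 0, REJECT = 1 };

struct ThrottleState {
  ThrottleAction action = ThrottleAction::DETER;  // proposed throttle_action
  std::int64_t failed_threshold = 3;              // failed_connections_threshold
  // Models the proposed rejected-connections status counter; atomic because
  // many connection threads would update it concurrently.
  std::atomic<std::uint64_t> rejected_connections{0};
};

// Returns true when the caller should abort the connection.
// failed_count is the number of consecutive failures recorded for this
// user@host; delay is the wait DETER mode would apply.
bool on_connection_attempt(ThrottleState &state, std::int64_t failed_count,
                           std::chrono::milliseconds delay) {
  // Below the threshold: no throttling in either mode.
  if (failed_count < state.failed_threshold) return false;

  if (state.action == ThrottleAction::REJECT) {
    // REJECT: count the rejection and return at once, so no slot is held.
    state.rejected_connections.fetch_add(1, std::memory_order_relaxed);
    return true;
  }

  // DETER: current behavior, the thread sleeps while holding its slot.
  std::this_thread::sleep_for(delay);
  return false;
}
```

In this sketch the failed-attempt bookkeeping is left to the caller, which matches the point above that the counter keeps incrementing regardless of mode.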

**Implementation requires changes to:**
1. `connection_control_data.h` — new enum values for `OPT_THROTTLE_ACTION` and `STAT_CONNECTION_REJECTED`
2. `connection_delay.h/.cc` — add `m_throttle_action` member, branch in `notify_event()` 
3. `connection_control_coordinator.h/.cc` — change `notify_event()` from `void` to `bool` return to propagate rejection
4. `connection_control.cc` — capture and propagate return value from coordinator in the event tracking callback; register new sysvar/status var
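
The `void` to `bool` change in the coordinator might look roughly like the sketch below. It is a hypothetical, stripped-down model rather than the real `connection_control_coordinator` class: the `EventCoordinator` type, the `Subscriber` signature, and the choice to keep notifying remaining subscribers after one requests rejection are all assumptions made for illustration.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, simplified coordinator; not the MySQL class of similar name.
class EventCoordinator {
 public:
  // A subscriber inspects the event and returns true to request rejection.
  using Subscriber = std::function<bool(int failed_count)>;

  void subscribe(Subscriber s) { subscribers_.push_back(std::move(s)); }

  // Previously this would return void; here it reports whether any
  // subscriber asked for the connection to be rejected.
  bool notify_event(int failed_count) const {
    bool reject = false;
    for (const auto &s : subscribers_) {
      // Every subscriber is called, even after a rejection request, so that
      // each one can still update its own bookkeeping (a choice of this sketch).
      reject = s(failed_count) || reject;
    }
    return reject;
  }

 private:
  std::vector<Subscriber> subscribers_;
};
```

The event tracking callback would then simply return what `notify_event()` reports.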

**The event tracking callback change is critical:**
```
// Before (current code - always allows connection):
g_connection_event_coordinator->notify_event(thd, data);
return false;

// After (propagate rejection signal from REJECT mode):
bool result = g_connection_event_coordinator->notify_event(thd, data);
return result;
```

This addresses the issue because `mysql_event_tracking_connection_notify()` in `sql_connect.cc` already checks the return value and aborts the connection if it is non-zero:
```
if (mysql_event_tracking_connection_notify(thd, ...)) {
    return 1;  // connection aborted
}
```