Add per-host connection limits to DestinationRule #3614

base: master

Conversation
🤔 🐛 You appear to be fixing a bug in Go code, yet your PR doesn't include updates to any test files. Did you forget to add a test? Courtesy of your friendly test nag.
Add a perHostLimits field to ConnectionPoolSettings to enable per-host connection limits for individual endpoints in a cluster. This aligns with Envoy's per_host_thresholds capability and is useful for preventing endpoint overload in autoscaling scenarios. Currently only the maxConnections field is supported, per Envoy's implementation. Fixes #57697
```proto
message PerHostLimits {
  // Maximum number of connections to each individual host in the upstream cluster.
  // If not specified, there is no per-host limit.
  int32 max_connections = 1;
}
```
This uses a simplified structure with a single max_connections field rather than reusing TCPSettings to avoid exceeding Kubernetes CRD validation cost limits. The TCPSettings message contains multiple Duration fields with CEL validation rules that would double the validation cost when reused here, causing the total schema validation cost to exceed Kubernetes limits by ~9%. Since Envoy only supports max_connections for per-host limits currently, this simplified structure provides the same functionality while staying within CRD validation budgets.
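For context, the new field is intended to map onto Envoy's circuit-breaker configuration roughly like this (a hand-written sketch with illustrative values, not generated output):

```yaml
circuit_breakers:
  thresholds:               # existing cluster-wide limits
  - max_connections: 1024
  per_host_thresholds:      # what perHostLimits maps to; Envoy currently
  - max_connections: 16     # honors only max_connections here
```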
@ramaraochavali, @howardjohn, @therealmitchconnors, @keithmattix can you please share your thoughts when you get some time.
keithmattix left a comment
I have no concerns with this in principle, but we might want to discuss how we expect DestinationRule to evolve generally, especially as Backend evolves in Kubernetes.
IMO PerHostThresholds make sense for DFP-type clusters, where each host can point to a different upstream service/external endpoint, but they are not very useful for regular clusters. The cluster load-balancing algorithm (Least Request) will almost always send requests to hosts that can handle them with low latency, while cluster-level circuit breakers are still honored to prevent cascading failures. Adaptive concurrency is a better choice for handling autoscaling behaviors. Also, PerHostThresholds only supports max connections; even with connections under the limit, the backend host can be overwhelmed with a lot of requests in protocols like HTTP/2.
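For reference, the cluster-level mechanisms mentioned above are already configurable through DestinationRule today; a minimal sketch with illustrative names and thresholds:

```yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews-reactive-controls
spec:
  host: reviews.default.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      simple: LEAST_REQUEST    # steer traffic toward the least-loaded hosts
    outlierDetection:          # eject hosts that are already failing
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
```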
Thank you for the thoughtful review; I agree with several of the points raised and appreciate the perspective. I agree that per-host thresholds are most obviously useful for DFP-style clusters, where individual endpoints may represent very different upstream services, and that for regular service clusters, Envoy's Least Request load balancing and cluster-level circuit breakers already do a good job of steering traffic away from overloaded endpoints and preventing cascading failures. That said, I believe per-host connection limits still provide complementary value, even for regular Kubernetes service clusters, particularly in the following cases:

Proactive protection vs. reactive steering

Least Request and outlier detection are reactive mechanisms; they rely on latency or failures that have already occurred. Per-host connection limits provide a hard, proactive guardrail that prevents an endpoint from being overwhelmed in the first place, especially during sudden traffic spikes or cold-start scenarios.

Autoscaling lag and uneven capacity

In autoscaling environments, new pods often start with reduced effective capacity. During these windows, per-host connection limits help ensure that a newly added or temporarily degraded pod is not immediately saturated, even if it technically passes readiness checks.

Connection pressure still matters, even with HTTP/2

I agree that HTTP/2 multiplexing means a connection cap alone cannot bound request-level load. While not a complete concurrency control solution, per-host limits act as a coarse but effective safety mechanism.

Parity with Envoy and ecosystem tooling

Since Envoy already supports per_host_thresholds, and other Envoy-based tools have exposed it, adding this to Istio keeps configuration parity with the broader ecosystem.

I fully agree that adaptive concurrency control is a superior long-term solution for handling autoscaling and dynamic load, and I see per-host connection limits as complementary rather than competing with that direction. Adaptive concurrency addresses request-level saturation, while per-host connection limits provide a simpler, deterministic control at the connection level. If there are specific remaining concerns, I would be happy to address them, for example by clearly documenting the feature's limitations.

I am happy to go in either direction; I just wanted to add my point of view.
I somewhat agree @ramaraochavali, even though I am really keen on (at least) getting some per-host connection limit in (I opened the issue istio/istio#57697 after all). This kind of "adaptive concurrency" is suggested in istio/istio#25991 and is supported by Envoy via https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/adaptive_concurrency_filter. But in the end it all comes down to managing the effects of Little's Law. Higher latency very quickly causes a concurrency explosion, driving the latency per request higher and higher until there are timeouts or the backend cannot take it anymore (if it does not implement some limit itself). Relying only on outlier detection makes managing concurrency a reactive task (there have to be traffic-affecting issues already for Envoy to take action), even though Envoy has capabilities for maintaining a healthy flow of requests without static configuration of connection counts or rate limits (which never hit the right number). To me these capabilities are a) adaptive LB algorithms like PEWMA (envoyproxy/envoy#20907 -> istio/proxy#6690) or Prequal (envoyproxy/envoy#42091), which send requests to the backend likely to respond with the lowest latency and, looking again at Little's Law, serve the user best (low latency) while also maximizing overall rps (at the same concurrency); and b) adaptive concurrency to not overwhelm individual backends with connections/requests, as suggested in istio/istio#25991.
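To make the Little's Law effect concrete, here is a quick illustrative calculation (the numbers are made up):

```
L = λ · W                      # in-flight requests = arrival rate × latency
λ = 200 req/s, W = 0.05 s  →   L = 200 × 0.05 = 10  in-flight requests
λ = 200 req/s, W = 0.50 s  →   L = 200 × 0.50 = 100 in-flight requests
```

A 10× latency degradation at unchanged load means 10× the concurrency, which is why purely reactive controls tend to kick in late.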
These types of issues are better solved by warm-up configuration than by circuit breakers.

This is true whether you do it at the cluster level or at the host level. But practically, for HTTP/2, Envoy would not create more than one connection per worker thread unless max_concurrent_streams is reached. Even high-throughput apps won't create more connections with HTTP/2.
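For illustration only, the knob in question lives in Envoy's upstream HTTP/2 protocol options (the value here is made up, and this is not something the PR touches):

```yaml
# Envoy cluster config sketch: streams are multiplexed onto one connection
# per worker until max_concurrent_streams is reached; only then does Envoy
# open additional connections to the same host.
typed_extension_protocol_options:
  envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
    "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
    explicit_http_config:
      http2_protocol_options:
        max_concurrent_streams: 100
```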
I am not advocating outlier detection to handle this. I am just saying static per-host connection limits won't help much.
@ramaraochavali I agree that, in practice, this change is unlikely to have a significant impact for most standard Kubernetes service clusters, especially given Envoy's existing load-balancing behavior and cluster-level circuit breakers. That said, I also don't see any real downside or risk in exposing this capability, since it is already supported by Envoy and aligns Istio with the broader Envoy ecosystem. While per-host connection limits may not materially improve concurrency management in many HTTP/2-heavy workloads, they can still serve as a simple, deterministic guardrail in certain scenarios and provide configuration parity with tools like Envoy Gateway and Contour. As long as the feature is clearly documented with its limitations and positioned as an advanced or niche control rather than a primary concurrency mechanism, I don't think it introduces confusion or harm. Given that, I'm comfortable with this moving forward and would defer to the broader perspective and judgment of @ramaraochavali, @howardjohn, @therealmitchconnors, and @keithmattix for the final decision. Please do let me know.
One additional point worth calling out is that per-host connection limits are significantly more valuable for pure TCP workloads (e.g., Redis, DBs, Kafka) than for HTTP/2 services. For these protocols, connections map directly to backend resource consumption (threads/event loops, memory, file descriptors). There is no request multiplexing or adaptive concurrency signal, so limiting connections at the per-host level is one of the few effective and generic guardrails available to prevent individual endpoints from being overwhelmed, especially during bursts, restarts, or scaling events. Given that Envoy already supports per_host_thresholds, and similar tooling has exposed this capability, adding this to Istio seems reasonable as a complementary, opt-in mechanism, particularly for TCP services, provided the limitations are clearly documented.
Description
This PR adds support for per-host connection limits in the API by introducing a new perHostLimits field to ConnectionPoolSettings. This feature aligns with Envoy's per_host_thresholds capability for circuit breakers.

Motivation
Currently, Istio's DestinationRule only allows setting a global connection limit for the entire cluster, independent of the number of endpoints. This makes it difficult to properly manage concurrency for destination services, especially in autoscaling scenarios where the number of replicas changes dynamically.
Per-host connection limits allow controlling connections to each individual endpoint, which:

- Provides a proactive guardrail against overloading individual endpoints during traffic spikes or cold starts
- Keeps limits meaningful in autoscaling scenarios where the number of replicas changes dynamically
The underlying Envoy cluster circuit breaker already has this per_host_thresholds capability. Other Envoy-based tools (Envoy Gateway, Contour) have already added this support.

Example Usage
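A minimal sketch of the proposed usage; the resource name, host, and values are illustrative, and the field placement follows this PR's change to ConnectionPoolSettings:

```yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: redis-per-host-limits
spec:
  host: redis.prod.svc.cluster.local   # illustrative TCP service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100    # existing cluster-wide cap
      perHostLimits:
        maxConnections: 10     # proposed: cap per individual endpoint
```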
Limitations
Currently only the maxConnections field is supported for per-host limits, as per Envoy's circuit breaker implementation. Other fields in TCPSettings will be ignored.
Testing
- Generated CRD files successfully with make gen
- Test case not included due to CRD validation cost limits (will be validated in istio/istio integration)
Fixes #57697