Envoy as an Inter DC Traffic Manager (Part 2)
This is Part 2 of a multipart blog describing how to use Envoy as an “Inter DC Traffic Manager”. The previous part sets up the context and walks us through how to set up a mock environment to test several use cases around managing traffic between multiple DCs (Data Centers). So if you haven’t already gone through Part 1, please skim through it to gain context before proceeding. Assuming that you have context from Part 1, let us proceed.
When it comes to HTTP traffic management, the main tool that Envoy provides is the HttpConnectionManager network filter. Most traffic management use cases can be dealt with by using a list of Route configs nested within this filter. Each Route can be configured with a combination of the match config and any one of the route, redirect or direct_response configs to produce various conditional HTTP traffic routing strategies.
I will be continuing with the scenario from the previous part of the blog, where we are managing flow of incoming traffic between two separate clusters called dc-1 and dc-2.
In Part 1, we saw how to split traffic between DCs based on the API path of the request. Now let’s explore a few other strategies for traffic management with Envoy:
- Routing based on the upstream service
- Routing based on percentage split
- Routing based on regex matches on request properties
- Directly responding without forwarding the request
- Redirecting to another route
Assuming that you have a local Kubernetes cluster, kubectl and Envoy installed, you can simply run the following commands to get started.
Each strategy mentioned below has a corresponding Envoy bootstrap configuration file under the config directory in the blog’s repository.
1. Routing based on the Upstream Service
When there is a need to route traffic to a different cluster based only on the upstream service to which the request is made, Envoy’s prefix route matching can be employed simply by specifying the common API prefix for that service. This can be used in simple scenarios such as when a service exists only in one cluster or when you want to start routing all traffic to a service after migrating/replicating it to another cluster.
Let us take a look at how a sample route config will be in this scenario.
The above configuration will replace the routes config of our current Envoy configuration. There will be no apparent change in Envoy’s behavior. This is because we had earlier defined only a single path/method in our dummy services. But in a real-world scenario, where a single service supports multiple paths/methods, this config is more convenient as it avoids the need to have one entry per API path. It must also be noted that the order of the routes matters in the route config. The first match is used and the rest of the routes are ignored. In a scenario where there are multiple services that share a common prefix, ensure that the service with the longest prefix is added first. For example, if a park management system has both a /parking/ API and a /park/ API, the /parking/ prefix must be added first.
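The effect of route order can be illustrated with a small Python sketch of first-match prefix routing. This is only a model of the matching rule described above, not Envoy code. The cluster names are hypothetical, and the prefixes are written without a trailing slash, which is an assumption made here because that is the form in which the shorter prefix is also a prefix of the longer one.

```python
from typing import Optional

def first_match(path: str, routes: list[tuple[str, str]]) -> Optional[str]:
    """Return the cluster of the first route whose prefix matches the path."""
    for prefix, cluster in routes:
        if path.startswith(prefix):
            return cluster
    return None  # no route matched

# Hypothetical prefixes and cluster names, for illustration only.
short_first = [("/park", "park-cluster"), ("/parking", "parking-cluster")]
long_first = [("/parking", "parking-cluster"), ("/park", "park-cluster")]

# With the shorter prefix first, requests for the parking API are shadowed.
print(first_match("/parking/slots", short_first))  # park-cluster
# With the longer prefix first, each request reaches the intended cluster.
print(first_match("/parking/slots", long_first))   # parking-cluster
print(first_match("/park/rides", long_first))      # park-cluster
```

Since the first match wins, the second entry in short_first can never be reached by a /parking request, which is why the longer prefix goes first.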
As seen below, Envoy now routes all requests with the /hello-rest-service/ prefix to dc-1 and requests with
2. Routing based on Percentage Split
In a scenario where there are multiple instances of the same service running on different clusters, you may choose to divide the traffic between all the different instances. For reasons such as a difference in the capacity of each cluster, this division of traffic may need to be skewed. The weighted_clusters config of route allows us to distribute all the traffic for a matched route by weight across multiple upstream clusters. The route matching can be based on any of Envoy’s HTTP route matching criteria. Shown below is such a config, which splits traffic between multiple instances of the same service.
By now it should be clear that the distribution is not exactly percentage based but rather weight based. The weights specified per cluster should add up to the total_weight configured. The fraction of traffic going to a cluster is therefore the weight of that cluster divided by the total_weight. So one third of the traffic goes to dc-1 and two thirds of the traffic goes to dc-2.
It may be easier to understand these fractions if we use weights that add up to 100, which is, as a matter of fact, the default value of total_weight. Then the weights can be read directly as the percentage of traffic going to each cluster. 33:67 would be a good rounded-off alternative to 1:2. This is of course merely a convention that I personally find preferable.
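The arithmetic above can be checked with a short Python sketch. The helper name traffic_shares is hypothetical. It only models the rule stated in the text (share = weight / total_weight, with the weights adding up to total_weight) and is not part of Envoy.

```python
from fractions import Fraction

def traffic_shares(weights: dict[str, int], total_weight: int) -> dict[str, Fraction]:
    """Return the fraction of traffic each cluster receives."""
    if sum(weights.values()) != total_weight:
        raise ValueError("cluster weights must add up to total_weight")
    return {name: Fraction(w, total_weight) for name, w in weights.items()}

# The 1:2 split from the text.
for name, share in traffic_shares({"dc-1": 1, "dc-2": 2}, total_weight=3).items():
    print(name, share, f"{float(share):.1%}")
# dc-1 1/3 33.3%
# dc-2 2/3 66.7%

# The rounded-off 33:67 alternative, readable directly as percentages.
for name, share in traffic_shares({"dc-1": 33, "dc-2": 67}, total_weight=100).items():
    print(name, f"{float(share):.0%}")
# dc-1 33%
# dc-2 67%
```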
There can be any number of weighted clusters, provided their weights add up exactly to the total_weight and the clusters specified by name are added to the clusters config. The runtime_key_prefix mentioned in the config is explained in detail here.
Above is a sample response from one test run. As you can see, after the first 3 requests the responses did not yet seem to respect the weights. But after repeating the requests 6 more times, the proportion of responses from dc-1 and dc-2 has converged towards the expected proportion set by their weights.
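The same convergence can be reproduced with a rough simulation. This sketch assumes that each request is assigned to a cluster independently and at random in proportion to the weights, which is a simplification and not a description of Envoy’s internals. The function name simulate and the seed are arbitrary choices.

```python
import random
from collections import Counter

def simulate(weights: dict[str, int], requests: int, seed: int = 7) -> Counter:
    """Pick a cluster for each request at random, in proportion to its weight."""
    rng = random.Random(seed)  # fixed seed so that repeated runs are comparable
    names = list(weights)
    picks = rng.choices(names, weights=[weights[n] for n in names], k=requests)
    return Counter(picks)

# Small samples can look skewed; larger ones approach the 1:2 split.
for n in (3, 9, 30000):
    counts = simulate({"dc-1": 1, "dc-2": 2}, n)
    print(n, {name: round(counts[name] / n, 3) for name in ("dc-1", "dc-2")})
```

With only a handful of requests the observed split can be far from one third and two thirds, so a quick manual test is not a reliable check of the configured weights.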
3. Routing based on regex matches on request properties
Another useful strategy supported by Envoy for routing traffic is to choose an upstream server based on a regex match on one of the incoming request properties, namely the headers, path and query parameters.
Let us consider a scenario where we want requests from all users with an even userID to be routed to dc-1 and the rest of the requests (from odd userIDs) to dc-2. Thus the traffic will be split uniformly between the two clusters. This might be particularly useful when the concerned service is stateful and you must ensure that subsequent requests from the same user go to the same server instance.
The route config is based on the normal prefix-based routing config, but we also have an additional config called headers to list the request headers and their corresponding match criteria. Each item will be a HeaderMatcher config. The HeaderMatcher config supports a handful of match criteria, of which we will be choosing the safe_regex_match. safe_regex_match is a production-friendly regex matcher that pre-compiles the regex and sets a safe upper limit on its runtime program size using the max_program_size config. You can learn more about it here.
The first and the second regexes used in the above config match even and odd numbers respectively. The config expects the requests to have a header with the key user-id, which we assume will be populated with the corresponding user’s userID by the downstream service making the requests.
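The even/odd matching can be tried out in Python before putting it into a route config. The patterns below are hypothetical stand-ins, since they are not the exact regexes from the config, and the sketch assumes that the matcher must match the whole header value, which is modelled here with fullmatch.

```python
import re
from typing import Optional

# Hypothetical patterns: any run of digits ending in an even or an odd digit.
EVEN_USER_ID = re.compile(r"[0-9]*[02468]")
ODD_USER_ID = re.compile(r"[0-9]*[13579]")

def pick_cluster(user_id: str) -> Optional[str]:
    """Map a user-id header value to a cluster, or None if neither regex matches."""
    if EVEN_USER_ID.fullmatch(user_id):
        return "dc-1"
    if ODD_USER_ID.fullmatch(user_id):
        return "dc-2"
    return None

print(pick_cluster("1234"))  # dc-1
print(pick_cluster("77"))    # dc-2
print(pick_cluster("abc"))   # None
```

Only the last digit decides the cluster, so a given userID always maps to the same cluster.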
After bootstrapping the local Envoy instance with the envoy-regex.yaml config, which has the above-mentioned route config, we can see the expected results from the following curl requests. The -H flag is used in the curl command to add the user-id header to the requests. All requests with userIDs ending with an even digit are routed to dc-1 and the rest (ending with odd digits) to dc-2.
4. Directly responding without forwarding the request
In the previous scenario, if the downstream client forgets to add a user-id header to the request, Envoy would return a 404 status code as it cannot find a suitable route match. We could tweak the Envoy config to default such requests to one of the servers, but that would probably result in a bad request response from the server because the expected header is missing. To handle such cases, where we already know how the server will respond to a particular request, or to filter out unnecessary or bad traffic to the upstream servers, Envoy provides the direct_response config. With direct_response we can configure an immediate response code and body for any of the available request match criteria. Here, this config is used to give an immediate 400 (Bad Request) response code and a string response body that says "user-id header not found." when the user-id header is absent or nil.
In the route config below, we have route match items similar to those in the previous section. Additionally, there is a catch-all route that matches the /hello-rest-service/ prefix. This third route will be hit only when the user-id is neither even nor odd, i.e. when its value is empty or invalid. It has a direct_response with a 400 status code and the string response body.
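The fallback behaviour described above can be modelled roughly in Python as an ordered list of three routes, where the last one responds directly instead of forwarding. This is a sketch of the decision logic only. The regex patterns, the action tuples and the handle function are hypothetical and do not mirror Envoy’s config schema.

```python
import re

# Ordered routes: (header pattern or None for the catch-all, action).
ROUTES = [
    (re.compile(r"[0-9]*[02468]"), ("forward", "dc-1")),  # even user-id
    (re.compile(r"[0-9]*[13579]"), ("forward", "dc-2")),  # odd user-id
    (None, ("respond", 400, "user-id header not found")),  # direct response
]

def handle(headers: dict[str, str]) -> tuple:
    """Return the action of the first route that matches the request headers."""
    user_id = headers.get("user-id")
    for pattern, action in ROUTES:
        if pattern is None:
            return action  # catch-all: reached only if nothing above matched
        if user_id is not None and pattern.fullmatch(user_id):
            return action
    raise LookupError("no route matched")  # unreachable while a catch-all exists

print(handle({"user-id": "42"}))  # ('forward', 'dc-1')
print(handle({"user-id": "7"}))   # ('forward', 'dc-2')
print(handle({}))                 # ('respond', 400, 'user-id header not found')
```

Because the catch-all route sits last, it only sees requests that neither regex route accepted, which keeps bad requests away from the upstream servers.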
Envoy can be run with the above route config added to the bootstrap config using the following command.
Executing a curl request with the -v flag reveals that the request to /hello-rest-service/ without a user-id header results in the expected 400 Bad Request response code and the user-id header not found response body.
5. Redirecting to another route
Envoy supports different types of redirects on matched routes. The RedirectAction config can be used to specify the type of redirect needed for a particular route match. Some of the redirects supported are
Let’s use the path redirection to redirect requests to /hello-rest-service/hello as our imaginary downstream clients are constantly making this typo in their requests.
The route config is very straightforward. We add a separate route match for the new /helo path and then set its path_redirect to the original /hello-rest-service/hello path.
Since we are using a redirect, Envoy will first respond with a 301 status code to indicate that the path we are requesting has been Moved Permanently. If the downstream client chooses to follow the redirect, it will finally receive the expected response.
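The exchange seen by the client can be pictured with a small Python model of a path redirect. Everything here is an illustrative assumption: the misspelt path /hello-rest-service/helo, the host localhost:8080 and the function redirect_for are made up for this sketch, and the shape of the location header is a simplification.

```python
from http import HTTPStatus
from typing import Optional

# Hypothetical mapping from the misspelt path to the intended one.
PATH_REDIRECTS = {"/hello-rest-service/helo": "/hello-rest-service/hello"}

def redirect_for(path: str, host: str = "localhost:8080",
                 scheme: str = "http") -> Optional[tuple[HTTPStatus, dict[str, str]]]:
    """Return a 301 response with a location header, or None if no redirect applies."""
    target = PATH_REDIRECTS.get(path)
    if target is None:
        return None  # not a redirected path; the request would be routed as usual
    return HTTPStatus.MOVED_PERMANENTLY, {"location": f"{scheme}://{host}{target}"}

status, headers = redirect_for("/hello-rest-service/helo")
print(int(status), status.phrase)  # 301 Moved Permanently
print(headers["location"])         # http://localhost:8080/hello-rest-service/hello
```

A client that follows redirects then issues a second request to the URL in the location header, and that second request is the one that reaches the upstream service.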
Let’s run Envoy with the following command and test the redirection.
Below is the verbose response after executing the curl request with the -L (location) flag so that it follows any redirects. It clearly shows the redirection and the expected response from the redirected route.
Other Routing Strategies
There are still a lot more HTTP routing options available in Envoy, allowing request matches based on a combination of criteria, and then selecting various kinds of routes, to forward those requests to. As already mentioned at the start of this blog, the majority of these options are configured using the Route config. However, there are also some strategies that require manipulation of the upstream Cluster Configuration as well.
I hope this blog is useful to those who are exploring Envoy’s capabilities in traffic management. In the next part of the blog, I will cover an advanced routing strategy with practical applications, and the ways of implementing it using Envoy. I will add a link here once the next part is published. Thanks for reading!