gzip on AWS Lambda and API Gateway

Christoph Neijenhuis
Published in commercetools tech
Nov 16, 2017


gzip is a compression format widely used over HTTP for text-based files like JSON, HTML and CSS. Depending on the repetition in your data, the gzipped file is usually 2 to 10 times smaller than the original. For a user on a slow (e.g. mobile) connection this can make a huge difference.

I’ll show how to generate gzip responses on AWS Lambda. In a previous post, I discussed how an existing API that we don’t control can be customized by proxying the traffic through the API Gateway and (selectively) through Lambda, so I’ll also show how we can pass already gzipped responses through.

API Gateway

Unfortunately, the API Gateway is currently oblivious to gzip. If we’re using an HTTP proxy, and the other HTTP endpoint returns a gzipped response, it’ll try to re-encode it, garbling the response.

We’ll have to tell the API Gateway to treat our responses as binary data and not touch them in any way.

Since all my API responses will be gzipped, I’ve done so for all content-types (using */*) in the screenshot on the left. But you can also limit it (e.g. application/json if you only want to gzip JSON files).

With that change, the HTTP proxy will successfully pass the gzip response from the origin server on to the client.

Lambda: Working with Binary Support enabled

Now that we’ve enabled binary support on the API Gateway, our Lambda function will receive its input differently. The API Gateway treats the request body as binary data, but it can only pass a String to the Lambda function. Therefore, the Lambda will receive the body as a base64-encoded String. If the client is POSTing a JSON file, we can’t parse it directly anymore, but first have to decode it.
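The decoding step could look roughly like the following. This is a minimal sketch, assuming the Lambda proxy integration, where the event carries the body in event.body and marks it with event.isBase64Encoded; the helper name parseJsonBody is made up for illustration.

```javascript
// Hypothetical helper (not from the original gists): parse the JSON body
// of an API Gateway proxy event, whether or not it arrived base64-encoded.
function parseJsonBody(event) {
  const raw = event.isBase64Encoded
    ? Buffer.from(event.body, 'base64').toString('utf8') // binary support: decode first
    : event.body;                                        // plain string body
  return JSON.parse(raw);
}
```

Checking the flag instead of always decoding keeps the function usable in tests that pass a plain JSON string.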

Passing on a gzipped response

In a previous post, I had a Lambda function modify the JSON of a request, send the modified request to an API and pass on the response to the client.

What if the API returns a gzipped response? First, we have to collect the binary data. Again, the API Gateway does not take binary data directly, but needs it base64-encoded. We’ll also have to set a special flag (isBase64Encoded) when we call it back with a base64-encoded response.

This will work for both gzipped and regular responses. A small optimization is to check if the Content-Encoding header is set in the response. If not, we don’t have to base64-encode the response.
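As a rough sketch of this step, assuming the upstream response was fetched with Node’s http or https module (which lower-cases incoming header names) and that its chunks were collected as Buffers. The helper name toProxyResponse is hypothetical, and in practice some upstream headers may need to be filtered before passing them on.

```javascript
// Hypothetical helper: turn an upstream response into a Lambda proxy result.
// `chunks` is the array of Buffers collected from the upstream response.
function toProxyResponse(statusCode, headers, chunks) {
  const body = Buffer.concat(chunks);
  // Only compressed bodies need base64; uncompressed bodies are assumed
  // to be UTF-8 text here and are passed on as a plain string.
  const isCompressed = Boolean(headers['content-encoding']);
  return {
    statusCode: statusCode,
    headers: headers,
    body: isCompressed ? body.toString('base64') : body.toString('utf8'),
    isBase64Encoded: isCompressed,
  };
}

// Usage inside a handler (sketch):
//   const chunks = [];
//   res.on('data', (chunk) => chunks.push(chunk));
//   res.on('end', () =>
//     callback(null, toProxyResponse(res.statusCode, res.headers, chunks)));
```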

Creating our own gzipped response

First, we should check if the client can decompress gzip. If so, the Accept-Encoding header should contain “gzip”. If it’s present, we’ll use zlib to compress our response, and use base64-encoding as above. We’ll also have to set the Content-Encoding header, so that the client knows what to expect.

The above gist is kept short for clarity; you probably want to set additional headers. Header names are case-insensitive, so you may want to normalize them first.
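One way to normalize the header names is to copy them into a new object with lower-cased keys before looking anything up. This is only a sketch, and normalizeHeaders is a hypothetical helper name.

```javascript
// Hypothetical helper: return a copy of the headers with lower-cased names,
// so that lookups like headers['accept-encoding'] work for any casing.
function normalizeHeaders(headers) {
  const normalized = {};
  Object.keys(headers || {}).forEach((name) => {
    normalized[name.toLowerCase()] = headers[name];
  });
  return normalized;
}
```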

Disabling gzip when proxying requests

Binary support has a downside: it becomes a bit cumbersome to work with, especially when writing tests for the Lambda functions, as one has to base64-encode the JSON request first.

Also, the performance gains will be limited if the request originates from within AWS (e.g. from another webserver) and not from an outside client (like a mobile app).

To disable it on the API Gateway, you can overwrite the Accept-Encoding header. Unfortunately, you have to do so on every HTTP proxy that you create.

Similarly, the Lambda function should unset the header on the requests it sends to the origin server.
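On the Lambda side, this could be sketched as follows, assuming the function forwards the client’s headers to the origin with Node’s http or https module, which does not add an Accept-Encoding header on its own. The helper name upstreamHeaders is hypothetical.

```javascript
// Hypothetical helper: copy the client's headers for the upstream request,
// dropping Accept-Encoding (in any casing) so the origin answers uncompressed.
function upstreamHeaders(clientHeaders) {
  const headers = {};
  Object.keys(clientHeaders || {}).forEach((name) => {
    if (name.toLowerCase() !== 'accept-encoding') {
      headers[name] = clientHeaders[name];
    }
  });
  return headers;
}
```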

Conclusion

Proxying or returning gzipped responses with the API Gateway and Lambda is not as straightforward as one would expect. Hopefully in the future, the API Gateway will honor the Content-Encoding header when proxying requests, and gzip (or otherwise compress) responses on its own.

In the meantime, with a few lines of code, we can make it work. It’s also not hard to add other compression formats, such as deflate. You can find complete Gists for the proxy scenario here: Gateway, Lambda.
