My organization runs a Pyramid web application that we deploy to AWS Elastic Container Service (Fargate). Waitress serves the Pyramid application, with nginx as a reverse proxy, on a Debian buster image. After upgrading to waitress 2.1.0, we saw an immediate performance degradation. Specifically, for response sizes above ~50 KB (compressed by nginx), responses are ~5-10x slower, resulting in numerous 504s across our various APIs. Rolling back to version 2.0.0 restores performance. This is on Python 3.7.12.
I use locust to run performance tests on the web application, which allows me to quickly isolate and identify the root cause of such dramatic performance changes. All system-level dependencies (those installed by apt) are identical, as are all other Python libraries.
Pre-update (waitress 2.0.0):
Response time percentiles (approximated in ms)
Type     Name            50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100%
--------|------------|------|------|------|------|------|------|------|------|------|------|------|
All      Aggregated      750    980   1200   1400   1800   2300   4000   4600   6500   6500   6500
Post-update (waitress 2.1.0):
Response time percentiles (approximated in ms)
Type     Name            50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100%
--------|------------|------|------|------|------|------|------|------|------|------|------|------|
All      Aggregated      780   1500   2600   5300  21000  57000  60000  61000  61000  61000  61000
This test was run in the same environment; the only difference is the waitress version. Most routes that return small responses are not affected, but as soon as the response size goes beyond a certain threshold, performance falls off a cliff.
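To help narrow this down outside of our application, here is a minimal sketch of a WSGI app that returns a body of a requested size, so the same locust run could be pointed at it under each waitress version. This is not our production code: the module name `repro`, the `size` query parameter, and `DEFAULT_SIZE` are all hypothetical names chosen for illustration, and the body is a run of identical bytes, so it compresses far better than real responses would if nginx sits in front of it.

```python
# repro.py (hypothetical module name): a size-parameterized WSGI app.
from urllib.parse import parse_qs

DEFAULT_SIZE = 1024  # bytes; arbitrary default chosen for this sketch


def app(environ, start_response):
    """Return a body of exactly `size` bytes, read from the ?size=N query."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    try:
        # Negative values are clamped to zero.
        size = max(0, int(query.get("size", [DEFAULT_SIZE])[0]))
    except ValueError:
        # Fall back to the default when size is not an integer.
        size = DEFAULT_SIZE
    body = b"x" * size
    start_response(
        "200 OK",
        [
            ("Content-Type", "application/octet-stream"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

Assuming the file is saved as `repro.py`, it could be served with `waitress-serve --port=8080 repro:app` and requested as `/?size=60000`, once with waitress 2.0.0 and once with 2.1.0. Hitting waitress directly and then through nginx should show whether the slowdown needs the proxy to appear.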
Any thoughts on why this might be? Is there something obvious I am missing? It is entirely possible there is a mistake or misconfiguration on my end, but such a slowdown is very suspicious, and I believe it is isolated to this version.
Dockerfile
nginx.conf