Summary

  • As a starting point, the number of workers for Gunicorn, Uvicorn and Hypercorn should be (CPU cores * 2) + 1
  • The number of threads for WSGI apps should be at least WORKERS * 2, unless memory is tight. If you have memory to spare you can increase this number
  • The number of threads for ASGI apps is irrelevant because ASGI servers use coroutines to dispatch requests
  • ASGI can be faster and scale better than WSGI, but only if your application is async end to end, from the framework to the backend
  • Test your setup using Locust or a similar tool to verify these assumptions
  • Don’t forget about modelling static content in your load tests
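
The two sizing rules above can be sketched as a few lines of Python. This is a rough illustration, not a definitive configuration: the helper names `suggested_workers` and `suggested_threads` are hypothetical, and the results are starting points to verify under load. Note that Gunicorn's `threads` setting is applied per worker, so treating the WORKERS * 2 figure as that setting is an assumption.

```python
import multiprocessing


def suggested_workers(cpu_count: int) -> int:
    """Starting point for the worker count: (CPU cores * 2) + 1."""
    return cpu_count * 2 + 1


def suggested_threads(workers: int) -> int:
    """Minimum thread count for a WSGI app: WORKERS * 2.

    Assumption: lower this if memory is tight, raise it if there is
    memory to spare.
    """
    return workers * 2


# Module-level names matching Gunicorn's `workers` and `threads` settings.
workers = suggested_workers(multiprocessing.cpu_count())
threads = suggested_threads(workers)
```

On a 4-core machine this gives 9 workers and a minimum of 18 threads. Saved as a Gunicorn configuration file (for example `gunicorn.conf.py`), the `workers` and `threads` names would be picked up as settings; for ASGI apps only the `workers` value is relevant.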