FastAPI performance optimization


Nginx in front of FastAPI

This is another topic that is not strictly FastAPI performance optimization, but it brings a great benefit to the overall service if used. Nginx is a versatile piece of software: web server, reverse proxy, WAF and so on. By putting it in front of the FastAPI application, several functions can be decoupled from Gunicorn (request sanity checks, timeout handling, backlogging, extra logging in case of errors, app latency measurement and so on).

A typical reverse proxy configuration uses TCP ports between the app and the web server / reverse proxy, but this has a negative impact on concurrency, because the web server - app communication requires an ephemeral port to be allocated for each connection. There are only 65535 ports altogether, and some ranges are reserved for other purposes, so concurrency is limited; furthermore, a port does not become free immediately after a connection is closed (it lingers in TIME_WAIT). The alternative, mentioned in the Uvicorn documentation, is to use Unix domain sockets, which is indeed the better solution, although it comes with some challenges of its own.
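In practice, the only difference between the two setups is the bind target. A rough sketch of the two variants as a Gunicorn config file (paths and worker count are illustrative, not taken from the tests below):

# gunicorn_conf.py (illustrative)
# TCP port variant: nginx proxies to a local port
# bind = "127.0.0.1:8000"
# Unix socket variant, as suggested in the Uvicorn deployment docs
bind = "unix:/tmp/gunicorn.sock"
workers = 4
worker_class = "uvicorn.workers.UvicornWorker"  # run the FastAPI (ASGI) app under Gunicorn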

FastAPI as non-root user

If you run your application as a non-root user, you need to make sure the nginx user can read and write the socket. Fortunately, Gunicorn supports setting a umask. The most secure option is to dedicate a group to this communication, make both nginx's and the app's user members of that group, and limit the socket to group read and write (umask 717).
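A minimal sketch of the corresponding Gunicorn settings (a Gunicorn config file is plain Python; the user and group names are made up for illustration):

# gunicorn_conf.py (illustrative) -- socket ownership and permissions
bind = "unix:/tmp/gunicorn.sock"  # same path the nginx upstream points to
user = "fastapi"                  # non-root user the workers run as
group = "webshared"               # dedicated group shared with the nginx user
umask = 0o717                     # socket is created group read/write only

Start it with gunicorn -c gunicorn_conf.py main:app (assuming the application object lives in main.py), and add the nginx user to the shared group, e.g. usermod -aG webshared nginx.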

Measurements
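The exact handlers and load-generator command are not shown in this section; for orientation only, endpoints of the benchmarked shapes could look roughly like this (an assumed sketch, not the original test code):

from fastapi import FastAPI

app = FastAPI()

PAYLOAD_1MB = "x" * (1024 * 1024)  # illustrative ~1 MB response body

@app.get("/sync-small")
def sync_small():
    # synchronous (def) endpoint: FastAPI runs it in a threadpool
    return {"status": "ok"}

@app.get("/async-small")
async def async_small():
    # asynchronous (async def) endpoint: runs directly on the event loop
    return {"status": "ok"}

@app.get("/sync-large")
def sync_large():
    return PAYLOAD_1MB

@app.get("/async-large")
async def async_large():
    return PAYLOAD_1MB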

Synchronous API endpoint with small request / response

Nginx - APP via port

| Test attribute | Test run 1 | Test run 2 | Test run 3 | Average |
| --- | --- | --- | --- | --- |
| Requests per second | 1375.63 | 1349.77 | 1305.75 | 1343.72 |
| Time per request [ms] | 72.694 | 74.087 | 76.584 | 74.455 |

Nginx - APP via socket

| Test attribute | Test run 1 | Test run 2 | Test run 3 | Average | Difference to baseline |
| --- | --- | --- | --- | --- | --- |
| Requests per second | 1321.34 | 1357.39 | 1367.66 | 1348.8 | 0.38 % |
| Time per request [ms] | 75.681 | 73.671 | 73.118 | 74.1567 | 0.3 ms |

Asynchronous API endpoint with small request / response

Nginx - APP via port

| Test attribute | Test run 1 | Test run 2 | Test run 3 | Average |
| --- | --- | --- | --- | --- |
| Requests per second | 1852.67 | 1918.59 | 1817.32 | 1862.86 |
| Time per request [ms] | 53.976 | 52.122 | 55.026 | 53.708 |

Nginx - APP via socket

| Test attribute | Test run 1 | Test run 2 | Test run 3 | Average | Difference to baseline |
| --- | --- | --- | --- | --- | --- |
| Requests per second | 1943.55 | 1919.56 | 1923.68 | 1928.93 | 3.55 % |
| Time per request [ms] | 51.452 | 52.095 | 51.984 | 51.8437 | 1.86 ms |

Observations

Synchronous API endpoint with 1MB response

Nginx - APP via port

| Test attribute | Test run 1 | Test run 2 | Test run 3 | Average |
| --- | --- | --- | --- | --- |
| Requests per second | 1863.81 | 818.1 | 1980.61 | 1554.17 |
| Time per request [ms] | 53.654 | 122.235 | 50.489 | 75.4593 |

Nginx - APP via socket

| Test attribute | Test run 1 | Test run 2 | Test run 3 | Average | Difference to baseline |
| --- | --- | --- | --- | --- | --- |
| Requests per second | 1941.33 | 2197.9 | 1884.56 | 2007.93 | 29.2 % |
| Time per request [ms] | 51.511 | 45.498 | 53.063 | 50.024 | 25.44 ms |

Observations

Asynchronous API endpoint with 1MB response

Nginx - APP via port

| Test attribute | Test run 1 | Test run 2 | Test run 3 | Average |
| --- | --- | --- | --- | --- |
| Requests per second | 803.7 | 1927.06 | 967.89 | 1232.88 |
| Time per request [ms] | 124.425 | 51.893 | 103.317 | 93.2117 |

Nginx - APP via socket

| Test attribute | Test run 1 | Test run 2 | Test run 3 | Average | Difference to baseline |
| --- | --- | --- | --- | --- | --- |
| Requests per second | 1804.45 | 1829.6 | 1799.19 | 1811.08 | 46.9 % |
| Time per request [ms] | 55.418 | 54.657 | 55.581 | 55.2187 | 37.99 ms |

Verdict

The numbers speak for themselves: in almost every case it is well worth switching to socket communication. A sample config is shown below.

Pro tip:

user nobody nogroup;
pid /var/run/nginx.pid;
worker_processes 1;  # usually one per CPU core; tune as needed, but a single Nginx worker can easily handle 1-2k QPS
events {
  worker_connections 4096; # increase if you expect many concurrent clients
  accept_mutex off; # set to 'on' if nginx worker_processes > 1
  use epoll; # for Linux 2.6+
}

http {
  include mime.types;
  # fallback in case we can't determine a type
  default_type application/octet-stream;
  tcp_nodelay on; # disable Nagle's algorithm so small packets are sent immediately instead of being buffered
  access_log off;
  error_log stderr;
  upstream gunicorn {
    # fail_timeout=0 means we always retry an upstream even if it failed
    # to return a good HTTP response
    server unix:/tmp/gunicorn.sock fail_timeout=0;
    keepalive 8;
  }

  server {
    listen                                80;
    server_tokens                         off;
    client_max_body_size                  20M;

    gzip                                  on;
    gzip_proxied                          any;
    gzip_disable                          "msie6";
    gzip_comp_level                       6;
    gzip_min_length                       200; # check your average response size and configure accordingly

    location / {
      proxy_pass                          http://gunicorn;
      proxy_set_header Host               $host;
      proxy_set_header X-Real-IP          $remote_addr;
      proxy_set_header X-Forwarded-For    $proxy_add_x_forwarded_for;
      proxy_set_header X-Forwarded-Proto  $http_x_forwarded_proto;
      proxy_pass_header                   Server;
      proxy_ignore_client_abort           on;
      proxy_connect_timeout               65s; # 65s here vs. 60s in the Gunicorn config, so the timeout fires on the app side first
      proxy_read_timeout                  65s;
      proxy_send_timeout                  65s;
      proxy_redirect                      off;
      proxy_http_version                  1.1;
      proxy_set_header Connection         "";
      proxy_buffering                     off;
    }
    keepalive_requests                    5000;  
    keepalive_timeout                     120;
    set_real_ip_from                      10.0.0.0/8;
    set_real_ip_from                      172.16.0.0/12;
    set_real_ip_from                      192.168.0.0/16;
    real_ip_header                        X-Forwarded-For;
    real_ip_recursive                     on;
  }
}