HAProxy supports many balancing algorithms, which can be used in many different types of scenarios.
We've collected the best tips and tricks to optimize HAProxy.

High MySQL request rate and TCP source port exhaustion

Increasing source port range

By default, on a Linux box, you have around 28K source ports available (for a single destination IP:port):

sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768    61000

In order to get 64K source ports, just run:

sudo sysctl net.ipv4.ip_local_port_range="1025 65000"

And don’t forget to update your /etc/sysctl.conf file so that the setting persists across reboots.

Note: this should definitely also be applied on the web servers.
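
To make the numbers above concrete, here is a rough back-of-the-envelope sketch. It assumes the port range is inclusive at both ends and that a closed connection keeps its source port in TIME_WAIT for 60 seconds (the usual Linux value) with no TIME_WAIT reuse; the function names are purely illustrative.

```python
TIME_WAIT_SECONDS = 60  # assumption: Linux default TIME_WAIT duration


def port_count(low: int, high: int) -> int:
    """Number of source ports in the inclusive range [low, high]."""
    return high - low + 1


def max_new_connections_per_second(low: int, high: int,
                                   time_wait: int = TIME_WAIT_SECONDS) -> int:
    """Rough ceiling on new connections per second towards ONE destination
    IP:port from ONE source IP, if every port sits in TIME_WAIT after use."""
    return port_count(low, high) // time_wait


default_ports = port_count(32768, 61000)  # 28233, the "around 28K" above
tuned_ports = port_count(1025, 65000)     # 63976, the "64K" above

print(default_ports, max_new_connections_per_second(32768, 61000))
print(tuned_ports, max_new_connections_per_second(1025, 65000))
```

Under these assumptions the default range tops out at a few hundred new connections per second to a single MySQL server, which is why a high request rate exhausts it so quickly.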

Allow usage of source port in TIME_WAIT

A few sysctls tell the kernel to reuse connections in TIME_WAIT faster:

net.ipv4.tcp_tw_reuse
net.ipv4.tcp_tw_recycle

tw_reuse can be used safely, but be careful with tw_recycle.
It can have side effects: clients behind the same NAT might not be able to connect to the same device. So only use it if your HAProxy is fully dedicated to your MySQL setup. Note that tw_recycle was removed from Linux in kernel 4.12.

Anyway, in our case these sysctls were already set properly (value = 1) on both the HAProxy and the web servers.

  • Note: the source port range should definitely also be raised on the web servers.
  • Note 2: tw_reuse should definitely also be enabled on the web servers.

Using multiple IPs to get connected to a single server

In the HAProxy configuration, you can specify on the server line the source IP address to use when connecting to a server, so just add more server lines with different source IPs.
In the example below, the IPs 10.0.0.100 and 10.0.0.101 are configured on the HAProxy box:

[...]
  server mysql1     10.0.0.1:3306 check source 10.0.0.100
  server mysql1_bis 10.0.0.1:3306 check source 10.0.0.101
[...]

This allows us to open up to 128K source TCP ports (two source IPs with 64K ports each).
The kernel is responsible for assigning a new TCP port when HAProxy requests one. Despite improving things a bit, we still hit source port exhaustion: we could not get over 80K connections in TIME_WAIT with 4 source IPs.
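
To check figures like these on your own box, you can count sockets in TIME_WAIT per source IP. The sketch below parses text in the /proc/net/tcp format, where state 06 means TIME_WAIT and addresses are printed as hex. It assumes IPv4 and a little-endian machine; the function names and the sample data are made up for illustration.

```python
import socket
import struct
from collections import Counter

TIME_WAIT = "06"  # TCP state code used in /proc/net/tcp


def decode_ipv4(hex_addr: str) -> str:
    """Turn '6400000A' into '10.0.0.100' (assumes a little-endian host)."""
    return socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))


def time_wait_by_source(proc_net_tcp: str, dest_port: int) -> Counter:
    """Count TIME_WAIT sockets towards dest_port, grouped by local IP."""
    counts = Counter()
    for line in proc_net_tcp.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 4:
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        remote_port = int(remote.split(":")[1], 16)
        if state == TIME_WAIT and remote_port == dest_port:
            counts[decode_ipv4(local.split(":")[0])] += 1
    return counts


# Made-up sample: two source IPs talking to 10.0.0.1:3306 (0x0CEA).
SAMPLE = """  sl  local_address rem_address   st
   0: 6400000A:C350 0100000A:0CEA 06
   1: 6400000A:C351 0100000A:0CEA 01
   2: 6500000A:C352 0100000A:0CEA 06
   3: 6400000A:C353 0100000A:0050 06
"""

print(dict(time_wait_by_source(SAMPLE, 3306)))
```

On a real system you would pass the content of /proc/net/tcp instead of SAMPLE.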

Let HAProxy manage TCP source ports

You can let HAProxy decide which source port to use when opening a new TCP connection, instead of the kernel. HAProxy has built-in functions for this, which make it more efficient than the kernel at this task.
Let’s update the configuration above:

[...]
  server mysql1     10.0.0.1:3306 check source 10.0.0.100:1025-65000
  server mysql1_bis 10.0.0.1:3306 check source 10.0.0.101:1025-65000
[...]

We managed to get 170K+ connections in TIME_WAIT with 4 source IPs, with no more source port exhaustion!
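
With more than a couple of source IPs, writing the server lines by hand gets tedious. Here is a minimal sketch that generates them; the helper name and the "_1", "_2" naming scheme are assumptions for illustration, not an HAProxy convention. The only real constraint is that each server name is unique within the backend.

```python
def server_lines(name, target, source_ips, port_range="1025-65000"):
    """Build one HAProxy server line per source IP, all pointing at the
    same target, with HAProxy managing the given source port range."""
    lines = []
    for index, ip in enumerate(source_ips, start=1):
        lines.append(
            f"  server {name}_{index} {target} check source {ip}:{port_range}"
        )
    return lines


for line in server_lines("mysql1", "10.0.0.1:3306",
                         ["10.0.0.100", "10.0.0.101"]):
    print(line)
```

Every source IP passed in must already be configured on the HAProxy box, as in the example above.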

HAProxy and gzip compression

Compilation

Get the latest HAProxy git version by running “git pull” in your HAProxy git directory.
If you don’t have such a directory yet, clone the repository:

git clone http://git.1wt.eu/git/haproxy.git

Once your HAProxy sources are updated, then you can compile HAProxy:

make TARGET=linux26 USE_ZLIB=yes

Configuration

This is a very simple test configuration:

listen ft_web
 option http-server-close
 mode http
 bind 127.0.0.1:8090 name http
 default_backend bk_web
 
backend bk_web
 option http-server-close
 mode http
 compression algo gzip
 compression type text/html text/plain text/css
 server localhost 127.0.0.1:80
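
To get a feel for what this configuration does, the sketch below models a simplified version of the decision: a response is a compression candidate when the client advertises gzip and the response type is one of those listed in "compression type". HAProxy applies further conditions that are not modelled here, and the function name is hypothetical. The sketch also shows the kind of saving gzip gives on repetitive text.

```python
import gzip

# Types taken from the "compression type" line above.
COMPRESSED_TYPES = ("text/html", "text/plain", "text/css")


def is_candidate(content_type: str, accept_encoding: str) -> bool:
    """Simplified model: the client accepts gzip and the type is listed."""
    mime = content_type.split(";", 1)[0].strip().lower()
    encodings = [e.split(";", 1)[0].strip().lower()
                 for e in accept_encoding.split(",")]
    return mime in COMPRESSED_TYPES and "gzip" in encodings


body = b"<p>hello haproxy</p>\n" * 200  # made-up, highly repetitive body
compressed = gzip.compress(body)

print(is_candidate("text/html; charset=utf-8", "gzip, deflate"))
print(len(body), "bytes before,", len(compressed), "bytes after gzip")
```

Already-compressed formats such as images are deliberately left out of the type list, since gzip gains little or nothing on them.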

Web traffic limitation

Basically, we’ll manage two web farms: one with as much capacity as we need, and another one where we’ll send the traffic we want to slow down.
The routing decision can be made using a header, a cookie, a part of the URL, the source IP address, etc.

Configuration

The configuration below would do the job.
There are only two web servers in the farm, but we want to slow down some virtual hosts or old and almost never used applications, in order to protect the regular traffic and leave it more capacity.

You can play with the inspect-delay time to be more or less aggressive.

frontend www
  bind :80
  mode http
  acl spiderbots hdr_cnt(User-Agent) eq 0
  acl personnal hdr(Host) www.personnalwebsite.tld www.oldname.tld
  acl oldies path_beg /old /foo /bar
  use_backend limited_www if spiderbots or personnal or oldies
  default_backend www
 
backend www
 mode http
 server be1  192.168.0.1:80 check maxconn 100
 server be2  192.168.0.2:80 check maxconn 100
 
backend limited_www
 mode http
 acl too_fast be_sess_rate gt 10
 acl too_many be_conn gt 10
 tcp-request inspect-delay 3s
 tcp-request content accept if ! too_fast or ! too_many
 tcp-request content accept if WAIT_END
 server be1  192.168.0.1:80 check maxconn 100
 server be2  192.168.0.2:80 check maxconn 100

Results

With the configuration above, an Apache bench run could go up to 3600 req/s on the regular farm and only 9 req/s on the limited one.
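
The two "tcp-request content" rules in the limited_www backend can be read as follows: a connection is accepted at once unless the backend is both too fast and too busy, in which case it is held until the inspect delay expires. Here is a small model of that logic, as a sketch only. The function name is hypothetical, and it returns an upper bound, since HAProxy may re-evaluate the rules before the delay ends.

```python
def max_hold_time(sess_rate: int, conn: int,
                  inspect_delay: float = 3.0,
                  rate_limit: int = 10, conn_limit: int = 10) -> float:
    """Upper bound, in seconds, on how long a connection is held back."""
    too_fast = sess_rate > rate_limit   # acl too_fast be_sess_rate gt 10
    too_many = conn > conn_limit        # acl too_many be_conn gt 10
    if not too_fast or not too_many:    # accept if ! too_fast or ! too_many
        return 0.0
    return inspect_delay                # accept if WAIT_END


print(max_hold_time(sess_rate=5, conn=20))   # busy but slow: no delay
print(max_hold_time(sess_rate=20, conn=20))  # fast and busy: held
```

Raising inspect_delay makes the throttling more aggressive, while raising the two limits lets more traffic through untouched.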

Sysctls tuning

The most important sysctls are:

net.ipv4.ip_local_port_range = 1025 65534
net.ipv4.tcp_max_syn_backlog = 100000
net.core.netdev_max_backlog = 100000
net.core.somaxconn = 65534
net.ipv4.tcp_rmem = 4096 16060 64060
net.ipv4.tcp_wmem = 4096 16384 262144

Depending on the workload:

net.ipv4.tcp_slow_start_after_idle = 0

iptables tuning:

net.netfilter.nf_conntrack_max = 131072

When improperly configured, conntrack will prevent HAProxy from reaching high performance.
NOTE: just enabling iptables with connection tracking takes 20% of CPU, even with no rules.

HAProxy multi-process

DON'T RUN IN PRODUCTION, THERE ARE NO TIMEOUTS

global
	nbproc 2
	cpu-map 1 1
	cpu-map 2 2
	stats socket /var/run/haproxy/socket_web process 1
	stats socket /var/run/haproxy/socket_mysql process 2
defaults HTTP
	bind-process 1
	mode http

frontend f_web
	bind 192.168.10.1:9000
	default_backend b_web

backend b_web
	server w1 192.168.10.21:8000 check

defaults MYSQL
	bind-process 2
	mode tcp

frontend f_mysql
	bind 192.168.10.1:3306
	default_backend b_mysql

backend b_mysql
	server m1 192.168.10.11:3306 check

Logging

  • HAProxy logs are very verbose.
  • When the traffic workload allows it, they should be enabled all the time! Otherwise, we must have the ability to enable them on demand.
  • HAProxy can be configured to selectively log part of the traffic.
  • The log line can be customized to your needs (be careful not to break compatibility with halog).
  • Logging can be enabled either in the global section or in the defaults/frontend sections.
  • The log format is configured per frontend, but the log level must also be declared in the backend.
  • To set up your own log format, use the log-format directive.

Example in global section

global
	log 127.0.0.1:514 local1
defaults
	log global # 'pointer' to the global section
	option httplog

Example in frontend/backend section

frontend fe
	log 127.0.0.1:514 local1
	option httplog
	default_backend be
backend be
	option httplog

Split traffic and events logs:

global
 log 127.0.0.1:514 local1 # traffic logs
 log 127.0.0.1:514 local2 notice # event logs

Log only errors:

defaults
	option dontlog-normal

Don't log empty connections or browsers' pre-connects:

defaults
	option dontlognull
	option http-ignore-probes

Log only dynamic traffic:

frontend fe
	http-request set-log-level silent unless { path_end .php }

Timeouts

Bear this in mind: timeouts are not the problem!
Without any timeouts, a public-facing HAProxy won't last long: it will quickly run out of connections.
You must set at least the following timeouts:
  • timeout client: client side inactivity
  • timeout connect: time to establish the TCP connection to the server
  • timeout server:
      • in TCP mode: server side inactivity
      • in HTTP mode: time for the server to process the request and start responding (a 504 is returned on expiry)
Other important, but optional, timeouts:
  • timeout client-fin: maximum time to wait in FIN_WAIT state on the client side
  • timeout server-fin: maximum time to wait in FIN_WAIT state on the server side

In HTTP mode, the following timeouts are important too:
  • timeout http-request: time allowed for the client to send a whole request (protection against slowloris-like attacks)
  • timeout http-keep-alive: maximum time to wait for the next request when doing HTTP keep-alive
  • timeout tunnel: inactivity timeout for tunnel mode and websockets
Other timeouts:
  • timeout queue: how long a request can remain in the queue
  • timeout tarpit: how long a tarpitted connection is maintained
Configuration example for an HTTP service

defaults HTTP
	mode http
	timeout http-request 10s
	timeout client 20s
	timeout connect 4s
	timeout server 30s
	timeout http-keep-alive 4s
	# for websockets:
	timeout tunnel 2m
	timeout client-fin 1s
	timeout server-fin 1s

Configuration example for a TCP service with long-lived connections (POP, IMAP, etc.)

defaults TCP
	mode tcp
	timeout client 1m
	timeout connect 4s
	timeout server 1m
	timeout client-fin 1s
	timeout server-fin 1s