What is HAProxy?
HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads while needing persistence or Layer7 processing. Supporting tens of thousands of connections is clearly realistic with today's hardware. Its mode of operation makes its integration into existing architectures very easy and riskless, while still offering the possibility not to expose fragile web servers to the Net.
Debian Install
HAProxy is available in the default Debian repository, so you can install it with apt-get. Run the following command as root on the load balancer server.
apt-get install haproxy
Now that HAProxy is installed, it needs to be configured. The configuration file is located at:
/etc/haproxy/haproxy.cfg
First, make a backup of the default configuration file:
cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak
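If you expect to repeat this step, a small helper can make the backup a little safer. The sketch below is only an illustration: backup_cfg is a hypothetical function name, and it assumes you want the same .bak suffix as above. It preserves file attributes with cp -p and refuses to overwrite an existing backup.

```shell
# backup_cfg is a hypothetical helper name, not part of HAProxy or Debian.
# Usage: backup_cfg /etc/haproxy/haproxy.cfg
backup_cfg() {
    cfg=$1
    bak="$cfg.bak"
    # Refuse to clobber an earlier backup of the pristine file.
    if [ -e "$bak" ]; then
        echo "backup already exists: $bak" >&2
        return 1
    fi
    # -p preserves mode, ownership and timestamps where possible.
    cp -p "$cfg" "$bak" || return 1
    # Succeed only if the copy is byte-for-byte identical.
    cmp -s "$cfg" "$bak"
}
```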
Then replace the entire content of that file with the configuration below, substituting your own values for the upper-case placeholders.
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    #log loghost local0 info
    maxconn 4096
    #debug
    #quiet
    user haproxy
    group haproxy
defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    # "option redispatch" and the "timeout" keywords replace the deprecated
    # redispatch, contimeout, clitimeout and srvtimeout. Timeouts are in milliseconds.
    option redispatch
    maxconn 2000
    timeout connect 5000
    timeout client 50000
    timeout server 50000
listen NAME_OF_YOUR_WEBFARM
    bind LOAD_BALANCER_IP_HERE:80
    mode http
    # The stats page is served at /haproxy?stats by default.
    stats enable
    balance roundrobin
    cookie JSESSIONID prefix
    option httpclose
    option forwardfor
    option httpchk HEAD /check.txt HTTP/1.0
    server webA FIRST_WEB_SERVER_IP_HERE:80 cookie A check
    server webB SECOND_WEB_SERVER_IP_HERE:80 cookie B check
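The upper-case placeholders have to be replaced with real values before the file is usable. One possible way to do that with sed is sketched below. The function name fill_placeholders, the template file name and the example addresses (from the documentation-only 192.0.2.0/24 range) are all assumptions for illustration.

```shell
# fill_placeholders is a hypothetical helper: it reads a config template on
# stdin and writes the filled-in config on stdout.
# Usage: fill_placeholders FARM_NAME LB_IP WEB_A_IP WEB_B_IP < template > config
fill_placeholders() {
    # BA*LANCER tolerates a missing "A" in the load balancer placeholder name.
    sed -e "s/NAME_OF_YOUR_WEBFARM/$1/" \
        -e "s/LOAD_BA*LANCER_IP_HERE/$2/" \
        -e "s/FIRST_WEB_SERVER_IP_HERE/$3/" \
        -e "s/SECOND_WEB_SERVER_IP_HERE/$4/"
}

# Example with made-up values:
#   fill_placeholders webfarm 192.0.2.10 192.0.2.21 192.0.2.22 \
#       < haproxy.cfg.template > /etc/haproxy/haproxy.cfg
```

Afterwards, running haproxy -c -f /etc/haproxy/haproxy.cfg checks the file for errors without starting the proxy.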
If Apache is running on the load balancer machine, shut it down now: HAProxy will listen on port 80, which Apache is currently using. Stop it as follows:
cd /etc/init.d
./apache2 stop
The init script should confirm that the service has stopped.
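Before starting HAProxy it can be worth confirming that nothing is still listening on port 80. The following is a minimal sketch, assuming the ss tool from iproute2 is installed. port_is_listening is a hypothetical helper that reads the output of ss -ltn on stdin.

```shell
# port_is_listening PORT: succeeds if the `ss -ltn` output on stdin contains
# a listening socket on PORT. Column 4 of that output is "Local Address:Port".
port_is_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            n = split($4, parts, ":")      # the port is the part after the last ":"
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Typical use on the load balancer:
#   if ss -ltn | port_is_listening 80; then echo "port 80 is still in use"; fi
```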
Now start HAProxy with the following command:
./haproxy start
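On some older Debian releases the packaged init script exits without starting anything unless /etc/default/haproxy contains ENABLED=1. If the start command seems to do nothing, that setting is worth checking. The sketch below assumes that file layout (verify it on your system). enable_in_defaults is a hypothetical helper name.

```shell
# enable_in_defaults FILE: set ENABLED=1 in a defaults file such as
# /etc/default/haproxy (assumed path; check that your package ships it).
enable_in_defaults() {
    # Rewrite an existing ENABLED=... line in place, keeping a .orig copy.
    sed -i.orig 's/^ENABLED=.*/ENABLED=1/' "$1"
}

# Typical use, as root:
#   enable_in_defaults /etc/default/haproxy
```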
You can confirm that the proxy is up and running by opening the following URL, which points at the load balancer:
http://yourloadbalancerdomain/haproxy?stats
At this point the stats page will show your servers as down. This is because the check file has not been installed on the web servers yet. The following steps describe how to prepare the web servers that HAProxy will balance.
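The same status information is also available in machine-readable form: appending ;csv to the stats URI returns CSV, whose first line is a header naming the columns. The sketch below is a hypothetical helper, list_server_status, that picks out the pxname, svname and status columns by name rather than by position.

```shell
# list_server_status: read HAProxy stats CSV on stdin and print
# "proxy/server status" for every data row.
list_server_status() {
    awk -F, '
        NR == 1 {
            sub(/^# /, "")                       # header starts with "# pxname"
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        NF > 1 { print $(col["pxname"]) "/" $(col["svname"]), $(col["status"]) }
    '
}

# Typical use (the URL is the placeholder from this guide):
#   curl -s 'http://yourloadbalancerdomain/haproxy?stats;csv' | list_server_status
```

Until the check file is in place, the server rows should report DOWN here as well.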
First, configure logging on the web servers to account for requests arriving through the load balancer. If you skip this step you will not get useful visitor logs: every entry in the web server logs will show the load balancer's address rather than the true source of the request.
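Edit your apache2.conf file and replace the combined LogFormat line with the following (the original line is kept as a comment). The X-Forwarded-For header it logs is the one added by option forwardfor in the HAProxy configuration.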
#LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
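The same edit can be scripted. The sketch below is one way to do it with sed, assuming the stock combined LogFormat line starts with %h as in the commented-out line above. switch_logformat is a hypothetical name, and the function prints the result rather than editing the file in place, so you can review it first.

```shell
# switch_logformat FILE: print FILE with %h replaced by the
# X-Forwarded-For request header in the "combined" LogFormat line only.
switch_logformat() {
    sed '/^LogFormat .* combined$/ s/%h/%{X-Forwarded-For}i/' "$1"
}

# Typical use on a web server:
#   switch_logformat /etc/apache2/apache2.conf > /tmp/apache2.conf.new
```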
HAProxy requests a check file as a server heartbeat. You do not want these requests in your log files, as they will bloat the access logs. Modify the configuration file for the default site on each web server you are targeting:
vi /etc/apache2/sites-available/default
Insert the following lines. You must also comment out all other CustomLog directives in your vhost configuration.
SetEnvIf Request_URI "^/check\.txt$" dontlog
CustomLog /var/log/apache2/access.log combined env=!dontlog
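To confirm the filter is working, you can count health-check entries in the access log before and after the change. This is a small sketch that assumes the log path used above. count_check_hits is a hypothetical helper name.

```shell
# count_check_hits LOGFILE: print how many logged requests were for /check.txt.
count_check_hits() {
    # The request line appears in the log as "METHOD /path PROTOCOL".
    # grep -c exits non-zero when the count is 0, so mask the exit status.
    grep -c '"[A-Z]* /check\.txt HTTP/' "$1" || true
}

# Typical use: run it twice, a minute apart, after restarting Apache.
#   count_check_hits /var/log/apache2/access.log
```

If the number keeps growing after Apache has been restarted, another CustomLog directive is probably still active.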
Now create the check.txt file and place it in the document root of the default site on each web server, then restart Apache to bring the servers online behind the load balancer. The HAProxy status page should now show them as up. If so, test the site using either the load balancer IP or its domain name, if one is assigned.
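Creating the check file can be as simple as the following sketch. The document root is an assumption (/var/www was a common default on Debian; adjust it to your site), and install_check_file is a hypothetical helper name. The file's content does not matter, because the health check only issues a HEAD request.

```shell
# install_check_file DOCROOT: create the file that HAProxy's
# "option httpchk HEAD /check.txt" request will look for.
install_check_file() {
    docroot=$1
    [ -d "$docroot" ] || { echo "no such directory: $docroot" >&2; return 1; }
    echo "OK" > "$docroot/check.txt"
    # Make sure the web server user can read it.
    chmod 644 "$docroot/check.txt"
}

# Typical use on each web server, followed by a manual check:
#   install_check_file /var/www
#   curl -I http://localhost/check.txt
```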
CentOS Install
First download the HAProxy source, as you will need to compile it for this distribution.
wget -P /download http://haproxy.1wt.eu/download/1.4/src/haproxy-1.4.20.tar.gz
Once done, untar it in your download location:
tar -zxvf haproxy-1.4.20.tar.gz
Before you compile, make sure the build tools are installed and up to date; for example, use yum to find and install gcc.
Then switch to the unpacked folder and run make:
cd haproxy-1.4.20
make TARGET=linux26
Now copy the binary into /usr/sbin, and then edit your HAProxy configuration file (after backing up any existing one, of course).
cp haproxy /usr/sbin/haproxy
vi /etc/haproxy.cfg
Now replace the entire file with the configuration below:
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    #log loghost local0 info
    maxconn 4096
    #debug
    #quiet
    user haproxy
    group haproxy
defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    # "option redispatch" and the "timeout" keywords replace the deprecated
    # redispatch, contimeout, clitimeout and srvtimeout. Timeouts are in milliseconds.
    option redispatch
    maxconn 2000
    timeout connect 5000
    timeout client 50000
    timeout server 50000
listen NAME_OF_YOUR_WEBFARM
    bind LOAD_BALANCER_IP_HERE:80
    mode http
    # The stats page is served at /haproxy?stats by default.
    stats enable
    balance roundrobin
    cookie JSESSIONID prefix
    option httpclose
    option forwardfor
    option httpchk HEAD /check.txt HTTP/1.0
    server webA FIRST_WEB_SERVER_IP_HERE:80 cookie A check
    server webB SECOND_WEB_SERVER_IP_HERE:80 cookie B check
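The global section drops privileges to the haproxy user and group. A package install normally creates these, but a source build does not, so it is worth checking for them before starting the proxy. The following is a minimal sketch using getent. account_exists is a hypothetical helper name, and the useradd line in the comment is only a suggestion.

```shell
# account_exists KIND NAME: KIND is "passwd" (user) or "group".
account_exists() {
    getent "$1" "$2" >/dev/null
}

for kind in passwd group; do
    if account_exists "$kind" haproxy; then
        echo "$kind entry for haproxy: present"
    else
        echo "$kind entry for haproxy: missing"
    fi
done

# If either is missing, a system account can be created as root, e.g.:
#   useradd -r -s /sbin/nologin haproxy
```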
If Apache is running on the load balancer machine, shut it down now: HAProxy will listen on port 80, which Apache is currently using. Stop it as follows:
cd /etc/init.d
./httpd stop
The init script should confirm that the service has stopped.
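Now start HAProxy. Only the binary was copied, so there is no init script to call. Run the binary directly, passing the configuration file with -f and adding -D to run it as a daemon:
/usr/sbin/haproxy -f /etc/haproxy.cfg -D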
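It is safer to validate the configuration before launching the daemon. HAProxy's -c flag checks a configuration file and exits, -f names the file, and -D starts in daemon mode. The wrapper below is a sketch only: check_then_start is a hypothetical name, and the binary and configuration paths are passed in as parameters.

```shell
# check_then_start HAPROXY_BIN CFG: start the daemon only if the config parses.
check_then_start() {
    bin=$1
    cfg=$2
    if "$bin" -c -f "$cfg"; then
        "$bin" -f "$cfg" -D
    else
        echo "configuration check failed; not starting" >&2
        return 1
    fi
}

# Typical use with the paths from this section:
#   check_then_start /usr/sbin/haproxy /etc/haproxy.cfg
```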
You can confirm that the proxy is up and running by opening the following URL, which points at the load balancer:
http://yourloadbalancerdomain/haproxy?stats
At this point the stats page will show your servers as down. This is because the check file has not been installed on the web servers yet. The following steps describe how to prepare the web servers that HAProxy will balance.
First, configure logging on the web servers to account for requests arriving through the load balancer. If you skip this step you will not get useful visitor logs: every entry in the web server logs will show the load balancer's address rather than the true source of the request.
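The remaining steps must be carried out on the web servers you are targeting. The logging and check-file steps from the Debian section apply, adjusted for your distribution's Apache paths.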