Working With Haproxy

Although I have worked with enterprise environments running Oracle and SQL Server for quite a few years, I’ve yet to be involved in a real high-availability deployment. This has been for a variety of technical reasons in our company’s application, and a lack of interest from most customers.
Recently, I had the opportunity to explore and test some load balancing setups for our application servers at a customer site. I was impressed with the reliability and reputation of HAproxy, so had a look at that to begin with. In the past, this software was unable to terminate SSL connections itself, and had to rely on other applications to achieve that. The latest versions have SSL support though, and aside from a bit of compiling from source, are easy to install.
Here’s how I did it, on CentOS 6.4.

Intro

This guide covers how to configure Haproxy in a Java/web infrastructure to provide a redundant load balancer solution.

The advantages of this setup are:

  • Protection against failure of the load balancer
  • Protection against failure of an application server or web server
  • Protection against the failure of a JVM process

This scenario does not protect against database failure. However, that is the rarest of downtime scenarios, and can be mitigated with a standby database or database clustering.

Prerequisites

This guide covers installation on CentOS 6.4, and as such requires that version or greater.

You will need two identical CentOS servers or VMs. It is possible to complete the setup on one node and then clone it, but it’s quite easy to do on each, so this guide assumes you just have two separate minimal CentOS installations.

As part of the guide, you will need to download and compile the haproxy source in order to use the SSL features, which are not yet available in the packaged versions that ship with CentOS.

You will also need two application/web servers configured as the backends. This setup can also be configured with one web server, and have another added later if required.

All instructions below should be run as the root user, unless otherwise stated, so it is best to start two root shells, one on each haproxy node, before beginning.

Installing Required Packages

First, install the required CentOS packages:

yum -y install gcc make pcre-devel openssl-static keepalived setools-console unzip
chkconfig --add keepalived
chkconfig keepalived on

Run this on both haproxy nodes.

Allow IP Bind

Run the following on both nodes to allow the virtual IP to bind to a real interface:

echo "net.ipv4.ip_nonlocal_bind = 1" >> /etc/sysctl.conf
sysctl -p

Allow Firewall Ports

Run the following on both nodes to open the required firewall ports:

iptables -I INPUT -m tcp -p tcp --dport 80 -j ACCEPT
iptables -I INPUT -m tcp -p tcp --dport 443 -j ACCEPT
# open the stats page port too, and save so the rules survive a reboot
iptables -I INPUT -m tcp -p tcp --dport 8080 -j ACCEPT
service iptables save

Creating Config Files

The following creates the config files, and adds the appropriate IPs.

Firstly, set the IPs as required in the shell environment like so:

# IP addresses for the web servers
APACHE1=x.x.x.x
APACHE2=x.x.x.x
# Password for the haproxy stats interface
STATSPWD=mypassword
# The virtual IP across the two nodes
VIP=x.x.x.x

You need the IP addresses of both backend web servers, which are assumed to be listening on port 80; the virtual IP you wish to use, which can be any statically assigned IP in the same subnet as the load balancers; and a password for the haproxy stats web page. The stats login defaults to username admin, password mypassword as above.

Run this on the shells on both haproxy nodes. You will need to rerun it if you create a new shell or switch users.
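Since these variables are lost when a shell exits, one way to avoid retyping them is to keep them in a small file and source it from each new shell. A minimal sketch, using placeholder addresses from the 192.0.2.0/24 documentation range:

```shell
# Placeholder values from the 192.0.2.0/24 documentation range -
# substitute your real addresses and password.
cat > /tmp/lb-env.sh << 'EOF'
APACHE1=192.0.2.11
APACHE2=192.0.2.12
STATSPWD=mypassword
VIP=192.0.2.10
EOF

# Source it in any new root shell before generating the config files:
. /tmp/lb-env.sh
echo "VIP is $VIP"
```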

Next, generate the files:

cat > /etc/keepalived/keepalived.conf << EOF
vrrp_script chk_haproxy {
   script "killall -0 haproxy"   # verify the pid existence
   interval 2                    # check every 2 seconds
   weight 2                      # add 2 points of prio if OK
}

vrrp_instance VI_1 {
   interface eth0                # interface to monitor
   state MASTER
   virtual_router_id 51          # Assign one ID for this route
   priority 101                  # 101 on master, 100 on backup
   virtual_ipaddress {
       $VIP	             # the virtual IP
   }
   track_script {
       chk_haproxy
   }
}
EOF

# Setup haproxy
mkdir /etc/haproxy
cat > /etc/haproxy/haproxy.cfg << EOF
global
    daemon
    nbproc 1
    maxconn 100000
    node $HOSTNAME

defaults
    option http-server-close
    mode http
    timeout http-request 5s
    timeout connect 5s
    timeout server 10s
    timeout client 10s
    balance roundrobin

listen stats *:8080
    mode http
    stats enable
    stats hide-version
    stats realm Haproxy\\ Statistics
    stats uri /
    stats auth admin:$STATSPWD

frontend httpFrontEnd
    bind $VIP:80
    maxconn 100000
    option forwardfor header x-forwarded-for
    default_backend web_server

frontend https_frontend
    bind $VIP:443 ssl crt /etc/ssl/certs/ssl-combined.pem
    mode http
    option httpclose
    option forwardfor
    reqadd X-Forwarded-Proto:\\ https
    default_backend web_server

backend web_server
    mode http
    server apache1 $APACHE1:80 check maxconn 500
    server apache2 $APACHE2:80 check maxconn 500
    appsession JSESSIONID len 52 timeout 12h

EOF

The default configuration creates two haproxy listeners on ports 80 and 443. It looks for a combined SSL certificate at /etc/ssl/certs/ssl-combined.pem. This should contain the SSL cert, the private key, and any intermediate certificates.
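If you don’t yet have a real certificate, you can sketch the combined file with a throwaway self-signed one (the /tmp paths and CN below are placeholders; for production, concatenate your real certificate, intermediate chain and key into /etc/ssl/certs/ssl-combined.pem instead):

```shell
# Generate a throwaway self-signed cert and key (testing only).
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout /tmp/server.key -out /tmp/server.crt \
    -subj "/CN=www.example.com"

# haproxy wants everything in a single PEM: cert first, then the key
# (insert any intermediate certs between the two for a real chain).
cat /tmp/server.crt /tmp/server.key > /tmp/ssl-combined.pem
```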

Compiling HAProxy

Next, you need to download and compile haproxy. Please check the webpage and download the latest available version of the 1.5 branch.

# Compile haproxy
wget http://haproxy.1wt.eu/download/1.5/src/devel/haproxy-1.5-dev19.tar.gz
tar xzf haproxy*
cd haproxy*
make TARGET=linux2628 USE_EPOLL=1 USE_OPENSSL=1 ARCH=native USE_ZLIB=1 USE_PCRE=1 && make install

Assuming you installed the above packages and nothing else has changed, it should compile and install the haproxy binaries.

Generate init script

Run the following to generate and install the haproxy init script, to allow it to start when the server reboots:

cat > /etc/init.d/haproxy << EOF
#!/bin/sh
#
# chkconfig: 235 90 36
# description: starts and stops haproxy
#
#
. /etc/rc.d/init.d/functions

RETVAL=0
SERVICE="haproxy"
HAPROXY_HOME="/etc/haproxy/"
VAR_RUN="/var/run"

start() {
	echo "Starting HAProxy"
	/usr/local/sbin/haproxy -f \${HAPROXY_HOME}/\${SERVICE}.cfg -p \${VAR_RUN}/\${SERVICE}.pid
	RETVAL=\$?
	[ \$RETVAL -eq 0 ] && success || failure
	echo
	[ \$RETVAL -eq 0 ] && touch /var/lock/subsys/\${SERVICE}
	return \$RETVAL
} 

stop() {
	echo "Stopping HAProxy"
	kill \`cat \${VAR_RUN}/\${SERVICE}.pid\`
	RETVAL=\$?
	[ \$RETVAL -eq 0 ] && success || failure
	echo
	[ \$RETVAL -eq 0 ] && rm -f /var/lock/subsys/\${SERVICE}
	[ \$RETVAL -eq 0 ] && rm -f \${VAR_RUN}/\${SERVICE}.pid
	return \$RETVAL
} 

restart() {
	stop
	start
}

check() {
	/usr/local/sbin/haproxy -c -q -V -f \${HAPROXY_HOME}/\${SERVICE}.cfg
	status \${SERVICE}
	netstat -anp|grep \${SERVICE}
} 

case "\$1" in
	start)
		start
		;;
	stop)
		stop
		;;
	restart)
		restart
		;;
	check)
		check
		;;
	*)
		echo \$"Usage: \$0 {start|stop|restart|check}"
		exit 1
esac 

exit \$?
EOF

chmod 755 /etc/init.d/haproxy
chkconfig --add haproxy
chkconfig haproxy on

Alter The Config on the Slave Node

There’s a minor change to make on the slave node now: lower the keepalived priority so that it will only ever take over when the master goes down. Run the following:

# Run on the slave ONLY
sed -i 's/priority 101/priority 100/' /etc/keepalived/keepalived.conf

Check and Test

The final step is to test config, and bring up the daemons:

# Test the haproxy config
haproxy -f /etc/haproxy/haproxy.cfg -c
service keepalived start
service haproxy start

Testing Keepalived Failover

You need to verify that the VIP is properly assigned on the master node.

ip addr sh eth0

N.B. The ifconfig command will NOT show this virtual IP address, so use the command above instead.

You should see two IPs assigned to the interface, the real and the virtual IP. Running the same command on the slave should show only the real IP.
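If you want to script this check rather than eyeball the output, a small sketch (192.0.2.10 stands in for your real VIP):

```shell
# has_vip: succeeds if this host currently has the given IPv4 address.
has_vip() {
    ip -4 addr show | grep -q "inet $1/"
}

VIP=${VIP:-192.0.2.10}   # placeholder; set to your real virtual IP
if has_vip "$VIP"; then
    echo "this node holds the VIP (master)"
else
    echo "this node does not hold the VIP (backup)"
fi
```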

To review what keepalived is doing, you can do:

grep keepalived /var/log/messages

To test the VIP failover, you can take down the interface on the master node. The IP should failover onto the slave node, as shown by the commands above. Of course to test this, you need to have a console open either through a VM or directly on the server!

To take down the interface you can do this (see caveat above):

ifdown eth0

To bring it back up:

ifup eth0

Testing Haproxy Failover

You can look at the stats interface at http://virtual-ip:8080 – this shows haproxy’s view of the backends. Username and password are as specified earlier.

A normal setup with two healthy web servers will look something like below:

[Image: haproxy stats page showing both backend servers green (healthy)]

If a server goes down, you’ll see something like this instead:

[Image: haproxy stats page with the apache1 row shown red (down)]

You can see that the apache1 web_server line has turned red.

You can test that haproxy is seeing this correctly, by taking down an apache service on one of your web servers:

service httpd stop

Refreshing that page should then show the server going yellow, and then red. By default, haproxy marks a server down if it fails to respond, or responds with an error, to three successive checks run two seconds apart.
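These timings can be tuned per server with the inter, fall and rise options. A sketch of the backend with the behaviour spelled out explicitly (the values here are illustrative, not the ones we used):

```
backend web_server
    mode http
    # check every 2s; mark down after 2 failures, back up after 2 successes
    server apache1 $APACHE1:80 check inter 2s fall 2 rise 2 maxconn 500
    server apache2 $APACHE2:80 check inter 2s fall 2 rise 2 maxconn 500
```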

Improving Haproxy Failover for the Application Server

By default in the above configuration, haproxy just checks for a working web server on port 80. Therefore, it can’t detect whether the web application itself is running correctly. This may not be a problem if you are using a cluster-aware configuration, using the weblogic plugins for instance, as apache will be able to reroute requests itself, so long as it is running. That means, should a JVM fail on one application server, apache will automatically send requests to the other application server in the cluster.

However, when there is no application server cluster, haproxy can ensure that users don’t get sent to a dead JVM accidentally. To do this, add something like this to the configuration under the backend section:

option httpchk GET /yourapp

This will check the /yourapp URL on the web server, and will assume it is working correctly if it gets a normal HTTP response. If it receives a 404 or 500 error, this will trigger a failure, and after several of these, HAproxy will stop routing traffic to the host.
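On 1.5 builds with http-check support, you can go further and require a specific status code, so that a redirect or a custom error page is not mistaken for a healthy response. A sketch of the amended backend (the /yourapp URI is a placeholder for your application’s path):

```
backend web_server
    mode http
    option httpchk GET /yourapp
    http-check expect status 200
    server apache1 $APACHE1:80 check maxconn 500
    server apache2 $APACHE2:80 check maxconn 500
    appsession JSESSIONID len 52 timeout 12h
```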

2 thoughts on “Working With Haproxy”

  1. Excellent article. Do you think this setup will work to load balance/failover several non-clustered weblogic instances? In my current setup, I have an apache reverse proxy listening on port 443 with SSL termination, forwarding everything to the weblogic servers listening on port 80. Unfortunately the failover is not very smooth. HAProxy sounds like a more robust solution. Can this be accomplished with HAproxy?

    1. Hi Andy.

      Yes, HAproxy will work in front of a couple of standalone managed servers. In the case of a non-clustered environment though, you might find the problem is to do with the session failover – that won’t work without clustering features enabled. If you aren’t worried about session failover, and have a mostly public web application, then it should work fine.

      Say you are running two standalone weblogic servers and you don’t care about session state; then HAproxy can do a decent job of detecting when one is down and immediately routing traffic to the other. It’s actually a general-purpose TCP proxy as well, so it can handle more than just HTTP.

      If users do have sessions though, then they will have to log in again on the second node. That’s always going to be the case without a cluster though, as the Java servlets hold the session state, and if you don’t replicate them, then the user will lose anything in the session when they switch nodes (this could be shopping cart contents, etc).

      However, you might also want to look at the Weblogic plugins for Apache, which could help you if you are not already using them. These help route requests to only active managed servers, and have the added advantage that they can automatically resubmit a failed request to another managed server. I have not tried this set up without a cluster, but I think it possibly works OK.

      Chris
