

How to Enable High Availability for Web ADC

LiteSpeed Web ADC's High Availability (HA) configuration provides a failover setup for two ADC nodes. When one node becomes temporarily unavailable, the other automatically detects the failure and takes over the traffic.

Two ADC nodes will need to be set up individually.

Once they are set up, LiteSpeed Web ADC HA uses Keepalived to detect failure and perform the failover.

We will set up two nodes as an example:

Node1: 10.10.30.96

Node2: 10.10.30.97

Virtual IP: 10.10.31.31

Before you configure ADC HA, you should install Keepalived on both node 1 and node 2. On CentOS, install it with yum:

yum install keepalived

On Ubuntu/Debian, install it with apt-get:

apt-get install keepalived

Then start Keepalived:

service keepalived start

You also need to enable Keepalived to start automatically when the system reboots:

systemctl enable keepalived

or

chkconfig keepalived on
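
On a systemd-based system, you can verify that Keepalived is running and enabled to start at boot:

systemctl status keepalived
systemctl is-enabled keepalived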

The Keepalived configuration file is located at /etc/keepalived/keepalived.conf, but you should not edit it directly. Instead, use the ADC Web Admin GUI > HA config to add or configure a Virtual IP. If you manually add a VIP to the Keepalived config, it won't be picked up by ADC HA. The VIP configuration under the ADC's HA tab is simply a GUI for updating the Keepalived config file.

You should always use the WebAdmin GUI to manage VIPs if you want them to appear in the status.

Node 1

Log in to the node 1 ADC Web Admin Console. Sample configuration:

Server Address               10.10.30.96:11122
Replication Cluster          10.10.30.96:11122,10.10.30.97:11122
Heart Beat Interval (secs)   10
Heart Beat Timeout (secs)    30
Is Gzip Stream               Yes
Enable incremental sync      Yes
Is File Cache Enabled        Yes
File Cache Server Address    10.10.30.96:1447

Click Add in HA Interfaces to add a Virtual IP:

After a VIP has been added through the GUI, the configuration will be added to the Keepalived configuration and it will look like this:

vi /etc/keepalived/keepalived.conf
###### start of VI_5 ######
vrrp_instance VI_5 {
  state BACKUP
  interface ens160
  lvs_sync_daemon_interface ens160
  garp_master_delay 2
  virtual_router_id 110
  priority 170
  advert_int 1
  authentication {
      auth_type PASS
      auth_pass test123
  }
  virtual_ipaddress {
      10.10.31.31
  }
}
###### end of VI_5 ######
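
If you want to confirm that the generated Keepalived configuration is valid, recent Keepalived releases can test the file, and the service can be reloaded to pick up changes (this assumes your packaged Keepalived systemd unit supports reload, which is typical):

keepalived -t                 # test the configuration for syntax errors (recent Keepalived versions)
systemctl reload keepalived   # reload without dropping the running VRRP instance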

Node 2

Log in to the node 2 ADC Web Admin Console. Sample configuration:

Server Address               10.10.30.97:11122
Replication Cluster          10.10.30.96:11122,10.10.30.97:11122
Heart Beat Interval (secs)   10
Heart Beat Timeout (secs)    30
Is Gzip Stream               Yes
Enable incremental sync      Yes
Is File Cache Enabled        Yes
File Cache Server Address    10.10.30.97:1447

Click Add in HA Interfaces to add a Virtual IP:

After a VIP has been added through the GUI, the configuration will be added to the Keepalived configuration, and it will look like this:

###### start of VI_5 ######

vrrp_instance VI_5 {
  state BACKUP
  interface ens160
  lvs_sync_daemon_interface ens160
  garp_master_delay 2
  virtual_router_id 110
  priority 150
  advert_int 1
  authentication {
      auth_type PASS
      auth_pass test123
  }
  virtual_ipaddress {
      10.10.31.31
  }
}
###### end of VI_5 ######

Note:

  1. Node 1's virtual_router_id must be the same as node 2's.
  2. The state setting (MASTER/BACKUP) doesn't really matter, since the node with the higher priority will always become MASTER.

IP failover is completely managed by Keepalived; the ADC just adds a configuration management interface on top of it. IP failover only happens when one server is completely down, in which case the other server takes over the IP. Shutting down LS ADC alone won't trigger an IP failover.
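
Because failover is handled entirely by Keepalived's VRRP protocol, you can watch the VRRP advertisements directly to see which node is currently master. This is a quick sketch assuming the ens160 interface from the example above; VRRP is IP protocol 112:

# the node sending the periodic VRRP advertisements for virtual_router_id 110 is the current master
tcpdump -i ens160 -n 'ip proto 112'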

It's a good idea to test IP failover.

1. Check the master node, which is currently node 1, 10.10.30.96

root@ha1-ubuntu:~# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  inet 127.0.0.1/8 scope host lo
     valid_lft forever preferred_lft forever
  inet6 ::1/128 scope host 
     valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
  link/ether 00:0c:29:c4:09:80 brd ff:ff:ff:ff:ff:ff
  inet 10.10.30.96/16 brd 10.10.255.255 scope global ens160
     valid_lft forever preferred_lft forever
  inet 10.10.31.31/32 scope global ens160
     valid_lft forever preferred_lft forever
  inet6 fe80::20c:29ff:fec4:980/64 scope link 
     valid_lft forever preferred_lft forever

You can see the VIP 10.10.31.31.

2. Check the backup node, node 2, 10.10.30.97

root@ha2-ubuntu:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  inet 127.0.0.1/8 scope host lo
     valid_lft forever preferred_lft forever
  inet6 ::1/128 scope host 
     valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
  link/ether 00:0c:29:95:67:6d brd ff:ff:ff:ff:ff:ff
  inet 10.10.30.97/16 brd 10.10.255.255 scope global ens160
     valid_lft forever preferred_lft forever
  inet6 fe80::20c:29ff:fe95:676d/64 scope link 
     valid_lft forever preferred_lft forever

You don't see the VIP on node 2 while it is active on node 1, which is correct.

3. Shut down the master node 1.

The VIP 10.10.31.31 should migrate to the backup server, node 2. You can check:

root@ha2-ubuntu:~# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000                                                                            
  link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00                                                                                                              
  inet 127.0.0.1/8 scope host lo                                                                                                                                     
     valid_lft forever preferred_lft forever
  inet6 ::1/128 scope host 
     valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
  link/ether 00:0c:29:95:67:6d brd ff:ff:ff:ff:ff:ff
  inet 10.10.30.97/16 brd 10.10.255.255 scope global ens160
     valid_lft forever preferred_lft forever
  inet 10.10.31.31/32 scope global ens160
     valid_lft forever preferred_lft forever
  inet6 fe80::20c:29ff:fe95:676d/64 scope link 
     valid_lft forever preferred_lft forever

You can see that VIP 10.10.31.31 is now assigned to node 2.
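
If you prefer not to power off the whole server, you can also simulate a failure by stopping Keepalived on the master. The backup node should pick up the VIP within a few seconds, and the VIP should move back once Keepalived is started again, because node 1 has the higher priority:

# on node 1 (current master): release the VIP
systemctl stop keepalived
# on node 2: confirm the VIP 10.10.31.31 has appeared
ip a show dev ens160
# on node 1: bring Keepalived back; the VIP should return due to its higher priority
systemctl start keepalived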

The HA Status page will look like the following when running:

On Node 1:

On Node 2:

Sometimes you may see replication is out of sync:

Check the following:

  1. If one ADC instance is down, replication will be out of sync. That's expected; the ADC will try to restore synchronization shortly.
  2. Make sure node 1 and node 2 are configured the same way. If they are configured differently, you cannot expect HA/replication to work.

We assume you have configured the listener, virtual host, and backend ClusterHTTP on both node 1 and node 2 separately. They should look something like this:

Listener: With IP failover, we recommend listening on *:<port> instead of on individual <IP>:<port> addresses.
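
After the listener is set up, you can confirm from the shell that the ADC is bound to the wildcard address rather than a single IP (port 443 here is just an example; use whatever port your listener uses):

ss -ltn | grep ':443'
# a wildcard listener shows up as *:443 or 0.0.0.0:443 rather than 10.10.30.96:443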

Virtual Host:

ClusterHTTP setup:

Try accessing 10.10.31.31 (the VIP) from a browser. You will see the backend server page. Disable one node, and you can still see the webpage. Check the ADC HA status; the live node will become Master when the other one is down.

Problems After Configuration Changes

When making changes to the configuration, such as changing the listener from <IP>:443 to *:443, a full stop/start is required.
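
A full stop followed by a start is different from a graceful restart. A minimal sketch, assuming the ADC is managed as a systemd unit (the unit name lsadc here is an assumption; substitute your installation's actual service name or control script):

systemctl stop lsadc    # assumed unit name; adjust for your installation
systemctl start lsadc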

Inconsistent HA Configuration Between Boxes

When you see an error similar to the following:

2018-08-03 16:24:20.099467 [WARN] [REPL] HA configures are inconsistent between boxes, full replicaton can't proceed
2018-08-03 16:24:20.099520 [ERROR] [REPL] peer HA replication config is inconsistent, it must be fixed!

This is because the configuration is out of sync between the two LS ADC instances. Replication only works if both ADCs serve exactly the same sites, so you need to keep the ADC configuration in sync. A temporary mismatch will break synchronization; once the configurations match again, the ADC will restore replication synchronization.
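
A quick way to spot configuration drift is to diff the relevant configuration between the two nodes. The path below is only a placeholder, not the actual ADC config location; point it at whatever files define your listeners, virtual hosts, and clusters:

diff <(ssh root@10.10.30.96 'cat /path/to/adc/conf/file') \
     <(ssh root@10.10.30.97 'cat /path/to/adc/conf/file')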

Keepalived Indicates Multiple Masters

When the configured VIP is shown on multiple nodes at the same time, it usually indicates a split-brain issue with Keepalived.

Keepalived defaults to using multicast packets. Please verify that multicast packets are not filtered/blocked by your firewall.
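
VRRP uses IP protocol 112 and the multicast address 224.0.0.18, so the firewall on both nodes must allow that traffic. A sketch for iptables and firewalld (adjust to whichever firewall tooling you use):

# iptables: accept VRRP (IP protocol 112) to the VRRP multicast group
iptables -A INPUT -p 112 -d 224.0.0.18 -j ACCEPT

# firewalld: allow the vrrp protocol permanently, then reload
firewall-cmd --permanent --add-rich-rule='rule protocol value="vrrp" accept'
firewall-cmd --reload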

For a more sophisticated failover method, you may want to try BGP, as Cloudflare does, but that is not controlled by the ADC. https://blog.cloudflare.com/cloudflares-architecture-eliminating-single-p/
