

How to enable HA for Web ADC?

LiteSpeed Web ADC HA setup provides a failover setup for two ADC nodes. When one node becomes temporarily unavailable, the other node will automatically detect the failure and take over the traffic.

LiteSpeed Web ADC HA uses keepalived to detect failure and perform failover.

We will set up two nodes as an example:

Node1: 10.10.30.96

Node2: 10.10.30.97

Virtual IP: 10.10.31.31

Before you configure ADC HA, you should install keepalived on both node 1 and node 2. On CentOS, you can use yum:

yum install keepalived

or on Ubuntu/Debian, you can use apt-get:

apt-get install keepalived

Then start keepalived:

service keepalived start

You also need to enable keepalived to start automatically on system reboot:

systemctl enable keepalived

or

chkconfig keepalived on
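A quick way to sanity-check the installation on each node (a sketch; exact service-status commands vary by distro):

```shell
# Confirm the keepalived binary is on PATH; prints one line either way.
if command -v keepalived >/dev/null 2>&1; then
  echo "keepalived installed"
else
  echo "keepalived not found"
fi
```

On systemd distros, `systemctl status keepalived` also shows whether the service is active and enabled.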

The keepalived configuration file is located at /etc/keepalived/keepalived.conf, but you should not edit it directly. Instead, use the ADC WebAdmin GUI → HA config to add and configure VIPs. A VIP added manually to the keepalived config won't be picked up by the ADC HA config; the VIP configuration under the ADC HA tab is simply a GUI front end that updates the keepalived config file. So use the WebAdmin GUI to manage VIPs if you want to see them in the HA status. We will explain how to add a VIP in the GUI in later steps.
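To verify that the GUI-managed VIPs actually landed in the keepalived config, you can list the contents of its virtual_ipaddress blocks. A sketch; it parses a small sample config written to /tmp for illustration, but in practice you would point it at /etc/keepalived/keepalived.conf:

```shell
# list_vips: print every address inside virtual_ipaddress { ... } blocks.
list_vips() {
  awk '/virtual_ipaddress[ \t]*{/ {in_block=1; next}
       in_block && /}/            {in_block=0}
       in_block                   {gsub(/^[ \t]+|[ \t]+$/, ""); print}' "$1"
}

# Sample config for demonstration only.
cat > /tmp/keepalived-sample.conf <<'EOF'
vrrp_instance VI_5 {
  virtual_ipaddress {
      10.10.31.31
  }
}
EOF

list_vips /tmp/keepalived-sample.conf   # prints 10.10.31.31
```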

Node 1

Log in to the node 1 ADC WebAdmin Console. Sample configuration:

Server Address: 10.10.30.96:11122
Replication Cluster: 10.10.30.96:11122,10.10.30.97:11122
Heart Beat Interval (secs): 10
Heart Beat Timeout (secs): 30
Is Gzip Stream: Yes
Enable Incremental Sync: Yes
Is File Cache Enabled: Yes
File Cache Server Address: 10.10.30.96:1447

Then click “Add” to add an HA interface:

After the VIP has been added through the GUI, it will be written to the keepalived configuration, which will look like this:

vi /etc/keepalived/keepalived.conf

###### start of VI_5 ######
vrrp_instance VI_5 {
  state BACKUP
  interface ens160
  lvs_sync_daemon_interface ens160
  garp_master_delay 2
  virtual_router_id 110
  priority 170
  advert_int 1
  authentication {
      auth_type PASS
      auth_pass test123
  }
  virtual_ipaddress {
      10.10.31.31
  }
}
###### end of VI_5 ######

Node 2

Log in to the node 2 ADC WebAdmin Console. Sample configuration:

Server Address: 10.10.30.97:11122
Replication Cluster: 10.10.30.96:11122,10.10.30.97:11122
Heart Beat Interval (secs): 10
Heart Beat Timeout (secs): 30
Is Gzip Stream: Yes
Enable Incremental Sync: Yes
Is File Cache Enabled: Yes
File Cache Server Address: 10.10.30.97:1447

Then click “Add” to add an HA interface:

After the VIP has been added through the GUI, it will be written to the keepalived configuration, which will look like this:

###### start of VI_5 ######
vrrp_instance VI_5 {
  state BACKUP
  interface ens160
  lvs_sync_daemon_interface ens160
  garp_master_delay 2
  virtual_router_id 110
  priority 150
  advert_int 1
  authentication {
      auth_type PASS
      auth_pass test123
  }
  virtual_ipaddress {
      10.10.31.31
  }
}
###### end of VI_5 ######

Note:

  1. node 1's virtual_router_id must be the same as node 2's;
  2. The configured “state MASTER/BACKUP” doesn't really matter, since the node with the higher priority will become MASTER.
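To illustrate the second note, a trivial sketch of the VRRP election rule, using the priorities from this example:

```shell
# VRRP in a nutshell: the instance advertising the higher priority wins,
# regardless of the configured "state" keyword. Node 1 = 170, node 2 = 150.
node1_priority=170
node2_priority=150
if [ "$node1_priority" -gt "$node2_priority" ]; then
  echo "node 1 becomes MASTER"   # this branch runs with the values above
else
  echo "node 2 becomes MASTER"
fi
```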

IP failover is completely managed by keepalived; ADC just adds a configuration management interface on top of it. So you should test IP failover:

1. Check the master node, which is currently node 1, 10.10.30.96:

root@ha1-ubuntu:~# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  inet 127.0.0.1/8 scope host lo
     valid_lft forever preferred_lft forever
  inet6 ::1/128 scope host 
     valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
  link/ether 00:0c:29:c4:09:80 brd ff:ff:ff:ff:ff:ff
  inet 10.10.30.96/16 brd 10.10.255.255 scope global ens160
     valid_lft forever preferred_lft forever
  inet 10.10.31.31/32 scope global ens160
     valid_lft forever preferred_lft forever
  inet6 fe80::20c:29ff:fec4:980/64 scope link 
     valid_lft forever preferred_lft forever

You can see the VIP 10.10.31.31.

2. Then check the backup node, node 2, 10.10.30.97:

root@ha2-ubuntu:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  inet 127.0.0.1/8 scope host lo
     valid_lft forever preferred_lft forever
  inet6 ::1/128 scope host 
     valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
  link/ether 00:0c:29:95:67:6d brd ff:ff:ff:ff:ff:ff
  inet 10.10.30.97/16 brd 10.10.255.255 scope global ens160
     valid_lft forever preferred_lft forever
  inet6 fe80::20c:29ff:fe95:676d/64 scope link 
     valid_lft forever preferred_lft forever

The VIP does not appear on node 2 while it is active on node 1, which is correct.

3. Shut down the master, node 1. The VIP 10.10.31.31 should migrate to the backup server, node 2. You can check:

root@ha2-ubuntu:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  inet 127.0.0.1/8 scope host lo
     valid_lft forever preferred_lft forever
  inet6 ::1/128 scope host 
     valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
  link/ether 00:0c:29:95:67:6d brd ff:ff:ff:ff:ff:ff
  inet 10.10.30.97/16 brd 10.10.255.255 scope global ens160
     valid_lft forever preferred_lft forever
  inet 10.10.31.31/32 scope global ens160
     valid_lft forever preferred_lft forever
  inet6 fe80::20c:29ff:fe95:676d/64 scope link 
     valid_lft forever preferred_lft forever

You can see VIP 10.10.31.31 is assigned to node 2 now.
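The manual `ip a` checks above can be scripted for monitoring. A minimal sketch, assuming the interface name (ens160) and VIP from this example; adjust both for your environment:

```shell
# vip_is_local: succeed if this host currently holds the given VIP on IFACE.
vip_is_local() {
  ip -o addr show dev "$2" 2>/dev/null | grep -qF " inet $1/"
}

VIP=10.10.31.31
IFACE=ens160
if vip_is_local "$VIP" "$IFACE"; then
  echo "this node holds the VIP"
else
  echo "VIP is on the other node (or down)"
fi
```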

IP failover is handled entirely by keepalived, and it only happens when one server is completely down; the other server will then take over the IP. Shutting down LS ADC alone won't trigger an IP failover.

For a more sophisticated failover method, you may want to try BGP, as Cloudflare does, but that is not controlled by ADC: https://blog.cloudflare.com/cloudflares-architecture-eliminating-single-p/

The HA Status page will look like the following when running:

On Node 1:

On Node 2:

Sometimes you may see replication go out of sync:

  1. Make sure node 1 and node 2 are configured the same way. If they are configured differently, you cannot expect HA/replication to work.
  2. If one ADC instance is down, replication will be out of sync; that's expected, and ADC will try to restore synchronization within a short time.

This assumes you have configured the listener, virtual host, and backend ClusterHTTP on both node 1 and node 2 separately.

Listener: With IP failover, we recommend configuring listeners on *:<port> instead of an individual <IP>:<port>, since a listener bound to the VIP cannot start on the node that does not currently hold that address (unless net.ipv4.ip_nonlocal_bind is enabled).

Virtual Host:

ClusterHTTP setup:

Try accessing 10.10.31.31 (the VIP) from a browser; you will see the backend server page. Disable one node, and you can still see the web page. Check the ADC HA status: the live node becomes MASTER when the other goes down.
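The browser test can be automated with curl. A sketch, assuming plain HTTP is served on the VIP; run it from a third machine while you take nodes down:

```shell
# check_vip: report whether anything answers on the VIP within 2 seconds.
check_vip() {
  if curl -s -o /dev/null --max-time 2 "http://$1/"; then
    echo "VIP $1 is serving traffic"
  else
    echo "VIP $1 is unreachable"
  fi
}

check_vip 10.10.31.31
```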

When making configuration changes, a full stop/start may be needed

When making certain changes to the configuration, such as changing the listener from <IP>:443 to “*:443”, a full stop/start is required.

HA configures are inconsistent between boxes

When you see an error similar to the following:

2018-08-03 16:24:20.099467 [WARN] [REPL] HA configures are inconsistent between boxes, full replicaton can't proceed
2018-08-03 16:24:20.099520 [ERROR] [REPL] peer HA replication config is inconsistent, it must be fixed!

It is because the configuration is out of sync between the two LS ADC instances. Replication only works if the two ADCs are serving exactly the same sites, so you need to keep the ADC configuration in sync. If it goes out of sync temporarily, synchronization will break; once the configs are back in sync, ADC will restore replication synchronization.
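One way to catch drift before replication breaks is to diff the configs from both nodes. A sketch using two small sample files written to /tmp for illustration; in practice, fetch node 2's config over scp and diff against the local copy (the exact ADC config path depends on your install directory):

```shell
# Simulate a drifted config pair; replace these with the real files
# from node 1 and node 2.
cat > /tmp/node1.conf <<'EOF'
listener *:443
cluster backend1
EOF
cat > /tmp/node2.conf <<'EOF'
listener 10.10.30.97:443
cluster backend1
EOF

if diff -q /tmp/node1.conf /tmp/node2.conf >/dev/null; then
  echo "configs match"
else
  echo "configs differ - replication may go out of sync"
fi
```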

  • Admin
  • Last modified: 2018/08/03 18:41
  • by Jackson Zhang