Differences

This shows you the differences between two versions of the page.

--- litespeed_wiki:lslb:ha [2018/08/03 17:57]
Jackson Zhang
+++ litespeed_wiki:lslb:ha [2018/08/03 18:28]
Jackson Zhang [Test IP failver]
@@ Line 117: / Line 117: @@
   - node1 virtual_router_id should be the same as node2;
   - **"state MASTER/BACKUP"** doesn't really matter, since the node with the higher priority will become MASTER.
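The priority rule above can be sketched as a small election function. This is an illustration of the general VRRP rule (highest priority wins, and a tie goes to the higher primary IP address), not keepalived's actual code. The priority values 170 and 150 are hypothetical and do not come from this page.

```python
import ipaddress


def elect_master(nodes):
    """Pick the MASTER from (name, priority, primary_ip) tuples.

    The configured "state" keyword is deliberately not an input:
    only the priority (and, on a tie, the primary IP) decides.
    """
    return max(
        nodes,
        key=lambda node: (node[1], int(ipaddress.ip_address(node[2]))),
    )[0]


# Hypothetical priorities, chosen only for illustration.
cluster = [
    ("node1", 170, "10.10.30.96"),
    ("node2", 150, "10.10.30.97"),
]
print(elect_master(cluster))
```

With these assumed values node1 is elected, whichever `state` each node was started with.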
+===== Test IP failover =====
+IP failover is managed entirely by keepalived; ADC only adds a configuration management interface on top of it. You should therefore test IP failover on its own:
+1. Check the master node, which is currently node 1 (10.10.30.96):
+ root@ha1-ubuntu:~# ip a
+1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
+    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
+    inet 127.0.0.1/8 scope host lo
+       valid_lft forever preferred_lft forever
+    inet6 ::1/128 scope host
+       valid_lft forever preferred_lft forever
+2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
+    link/ether 00:0c:29:c4:09:80 brd ff:ff:ff:ff:ff:ff
+    inet 10.10.30.96/16 brd 10.10.255.255 scope global ens160
+       valid_lft forever preferred_lft forever
+    inet 10.10.31.31/32 scope global ens160
+       valid_lft forever preferred_lft forever
+    inet6 fe80::20c:29ff:fec4:980/64 scope link
+       valid_lft forever preferred_lft forever
+You can see the VIP 10.10.31.31.
+2. Then check the backup node, node 2 (10.10.30.97):
+  root@ha2-ubuntu:~# ip a
+1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
+    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
+    inet 127.0.0.1/8 scope host lo
+       valid_lft forever preferred_lft forever
+    inet6 ::1/128 scope host
+       valid_lft forever preferred_lft forever
+2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
+    link/ether 00:0c:29:95:67:6d brd ff:ff:ff:ff:ff:ff
+    inet 10.10.30.97/16 brd 10.10.255.255 scope global ens160
+       valid_lft forever preferred_lft forever
+    inet6 fe80::20c:29ff:fe95:676d/64 scope link
+       valid_lft forever preferred_lft forever
+Node 2 does not show the VIP while the VIP is active on node 1, which is the expected behavior.
+3. Shut down the master, node 1. The VIP 10.10.31.31 should migrate to the backup server, node 2, which you can verify:
+  root@ha2-ubuntu:~# ip a
+1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
+    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
+    inet 127.0.0.1/8 scope host lo
+       valid_lft forever preferred_lft forever
+    inet6 ::1/128 scope host
+       valid_lft forever preferred_lft forever
+2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
+    link/ether 00:0c:29:95:67:6d brd ff:ff:ff:ff:ff:ff
+    inet 10.10.30.97/16 brd 10.10.255.255 scope global ens160
+       valid_lft forever preferred_lft forever
+    inet 10.10.31.31/32 scope global ens160
+       valid_lft forever preferred_lft forever
+    inet6 fe80::20c:29ff:fe95:676d/64 scope link
+       valid_lft forever preferred_lft forever
+You can see VIP 10.10.31.31 is assigned to node 2 now.
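The manual checks above can also be scripted. The following is a minimal sketch that scans the text printed by `ip a` for the VIP on a given interface. The function names are made up for this example, and the sample text is a shortened copy of the node 1 output shown earlier.

```python
def ipv4_addresses(ip_a_output, interface):
    """Collect the IPv4 addresses that `ip a` output lists for one interface."""
    addresses = set()
    for line in ip_a_output.splitlines():
        tokens = line.split()
        # IPv4 lines look like: inet 10.10.31.31/32 scope global ens160
        # The last token is the interface label (possibly "ens160:1").
        if len(tokens) >= 3 and tokens[0] == "inet":
            if tokens[-1].split(":")[0] == interface:
                addresses.add(tokens[1].split("/")[0])
    return addresses


def holds_vip(ip_a_output, vip, interface="ens160"):
    """Return True if the VIP is currently assigned to the interface."""
    return vip in ipv4_addresses(ip_a_output, interface)


# Shortened sample taken from the node 1 output above.
NODE1_SAMPLE = """\
    inet 127.0.0.1/8 scope host lo
    inet 10.10.30.96/16 brd 10.10.255.255 scope global ens160
    inet 10.10.31.31/32 scope global ens160
"""

print(holds_vip(NODE1_SAMPLE, "10.10.31.31"))
```

On a live node, the input text could come from `subprocess.run(["ip", "a"], capture_output=True, text=True).stdout`. Exactly one node should report the VIP at any time.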
+IP failover is handled entirely by keepalived. It only happens when one server is completely down, at which point the other server takes over the IP. Shutting down LS ADC alone won't trigger an IP failover.
+For a more sophisticated failover method, you may want to try BGP, as Cloudflare does, but that is not controlled by ADC:
+https://blog.cloudflare.com/cloudflares-architecture-eliminating-single-p/
@@ Line 128: / Line 194: @@
 On Node 2:
 {{ :litespeed_wiki:lslb:adc-ha-configuration-ha2-status.png?800 |}}
+===== Replication out of sync? What is required? =====
+Sometimes you may see replication go out of sync.
+Make sure node 1 and node 2 are configured the same way. If they are configured differently, you cannot expect HA/replication to work.
 ===== Verify your listener, virtualhost, and ClusterHTTP are set up correctly =====
@@ Line 134: / Line 207: @@
 Listener:
+With IP failover, we recommend configuring the listener to listen on *:<port> instead of an individual <IP>:<port>.
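One likely reason for this recommendation is general socket behavior rather than anything specific to ADC: a wildcard bind succeeds no matter which addresses the host holds at that moment, while binding to a VIP that currently lives on the other node normally fails unless the kernel allows non-local binds (`net.ipv4.ip_nonlocal_bind`). The sketch below probes this with a plain TCP socket. `can_bind` is a helper invented for this example and is not part of ADC.

```python
import socket


def can_bind(host, port):
    """Return True if a TCP socket can be bound to host:port on this machine."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
        except OSError:
            # Typical causes: address not assigned locally, port in use,
            # or insufficient privileges for a low port.
            return False
    return True


# Port 0 asks the kernel for any free port, so this only tests the address.
print(can_bind("0.0.0.0", 0))
```

Running `can_bind("10.10.31.31", 0)` on the backup node while the VIP sits on the master would be the comparable check for an `<IP>:<port>` listener.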
 {{ :litespeed_wiki:lslb:adc-ha-configuration-listener.png?800 |}}
@@ Line 154: / Line 229: @@
 Try accessing 10.10.31.31 (the VIP) from a browser; you should see the backend server page. Disable one node and you should still see the page. Check the ADC HA status: the live node becomes Master when the other one is down.
+===== Troubleshooting =====
+==== Configuration changes may need a full stop/start ====
+Some configuration changes, such as changing the listener from <IP>:443 to "*:443", require a full stop/start.
+==== HA configures are inconsistent between boxes  ====
+If you see errors similar to the following:
+2018-08-03 16:24:20.099467 [WARN] [REPL] HA configures are inconsistent between boxes, full replicaton can't proceed
+2018-08-03 16:24:20.099520 [ERROR] [REPL] peer HA replication config is inconsistent, it must be fixed!
+This is because the configuration is out of sync between the two LS ADC instances. Replication only works if both ADCs are serving exactly the same sites, so you need to keep the ADC configurations in sync. If they drift apart, even temporarily, replication synchronization breaks. Once the configurations are back in sync, ADC restores replication synchronization.
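One way to catch this drift before ADC reports it is to compare a fingerprint of the configuration text from each node. The sketch below is only an illustration. It assumes you have already copied the parts of the configuration that should be identical (for example the listener, virtual host, and cluster definitions) into two strings. It does not reproduce the check that ADC itself performs, and the function names are invented here.

```python
import hashlib


def config_fingerprint(config_text):
    """Hash configuration text, ignoring indentation and blank lines."""
    stripped = (line.strip() for line in config_text.splitlines())
    normalized = "\n".join(line for line in stripped if line)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def configs_match(node1_text, node2_text):
    """Return True if both nodes carry the same effective configuration text."""
    return config_fingerprint(node1_text) == config_fingerprint(node2_text)


# Whitespace-only differences are ignored; a changed value is not.
print(configs_match("listener *:443\n", "  listener *:443\n\n"))
print(configs_match("listener *:443\n", "listener 10.10.30.96:443\n"))
```

Settings that legitimately differ per node, such as each node's own address, have to be left out of the compared text.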