
TOPIC: VM's failed to switch when one host failed with HA enabled

VM's failed to switch when one host failed with HA enabled 2 months 3 weeks ago #1767

  • Salvatore Costantino
  • Posts: 549
OK. I see the situation now, and why an HA event may not have been triggered: the master may have remained partially responsive, even with no O/S.

Regarding the old master: if you manage to restore the 7.5 image, it will rejoin the pool as a slave. When XAPI starts at boot time, it checks the other pool members for an existing pool master, and it will demote itself to slave when it discovers that xen08 is now the master. So there is no risk of data corruption.
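If you want to confirm the outcome once the old master has booted, something like the following could be run from the console of any pool member. It is only a sketch: the function name pool_master_name is made up for illustration, while xe pool-list and xe host-param-get are standard xe CLI calls.

```shell
# Hypothetical helper: print the name-label of the current pool master.
# Run from the control domain of any pool member.
pool_master_name() {
    # Ask the pool record for the UUID of its master host.
    master_uuid=$(xe pool-list params=master --minimal) || return 1
    # Resolve that UUID to a human-readable host name.
    xe host-param-get uuid="$master_uuid" param-name=name-label
}

# Example usage:
#   pool_master_name
```

In your case the name printed should still be that of xen08 after the restored host has rejoined.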

To be totally safe, you should consider backing up all of your VMs before reintroducing the old master.

Restoring the image would be the preferred approach. If you decide to start from scratch instead, you could follow the procedure linked below (only half of it applies in your case). It requires that you upgrade to our latest iscsi-ha release, 2.2, which stores ALL settings in a shared DB so that a host can be more easily introduced into a cluster.

halizard.com/forum/general-discussion/31...h-xenserver-7-6#1744


VM's failed to switch when one host failed with HA enabled 2 months 3 weeks ago #1768

.


Last edit: by ray detwiler. Reason: answered my own question

VM's failed to switch when one host failed with HA enabled 2 months 3 weeks ago #1769

  • Sherbin George
  • Topic Author
  • Posts: 5
Hi,

We have reinstalled Xen07 (the slave server) and followed your documentation for adding the slave back to HA-Lizard. It went fine. HA-Lizard is also now running the latest version, 2.2.

We are now seeing alerts within XenCenter. Can you please advise whether anything more needs to be done? Let me know if you need the output of any commands or logs attached.

*****************************
"HA-Lizard - check_xapi","check_xapi: Pool Host on Server: 10.200.2.121 not responding to ICMP ping - manual intervention may be required","45AIR-C03","Jan 5, 2019 11:18 AM",""
"HA-Lizard - Core","[ 1] Disk errors detected [Found device /dev/sda","45AIR-C03","Jan 5, 2019 11:30 AM",""
"HA-Lizard - Core","[ 1] Disk errors detected [Found device /dev/sda","45AIR-C03","Jan 5, 2019 11:30 AM",""
*****************************


VM's failed to switch when one host failed with HA enabled 2 months 3 weeks ago #1770

  • Salvatore Costantino
  • Posts: 549
The latest release performs an hourly disk health check and reports any SMART errors. If you feel that this is not a real HW problem, you can disable the disk alerts with:
ha-cfg set disk_monitor 0
You can run this from either of the 2 hosts and it will disable the disk alerts for the entire pool.
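Before turning the alerts off, it may be worth checking the disk by hand. Below is a minimal sketch, assuming smartmontools (smartctl) is installed on the host and that /dev/sda is the device named in the alert. The helper name disk_health_ok is hypothetical, and this is not necessarily the same check HA-Lizard performs internally.

```shell
# Hypothetical helper: succeeds only if smartctl reports a healthy disk.
# Assumes smartmontools is installed; the device defaults to /dev/sda.
disk_health_ok() {
    dev=${1:-/dev/sda}
    # ATA disks report "test result: PASSED"; SCSI/SAS disks report
    # "SMART Health Status: OK".
    smartctl -H "$dev" | grep -Eq 'test result: PASSED|SMART Health Status: OK'
}

# Example usage (run as root on each host):
#   if disk_health_ok /dev/sda; then ha-cfg set disk_monitor 0; fi
```

If the helper fails on either host, the alert is probably worth treating as a real hardware warning rather than silencing it.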


VM's failed to switch when one host failed with HA enabled 2 months 3 weeks ago #1771

  • Salvatore Costantino
  • Posts: 549
I just noticed the ping error alert too. Any chance you are running a pre-2.2 version on one of your hosts? That was a problem specific to xcp-ha 7.6 when running halizard < 2.2.

