
TOPIC:

One more iscsi-ha question 6 years 3 months ago #1469

  • Rob Hall (Topic Author)
So, I have ha-lizard and iscsi-ha configured and working in my test environment, and so far it seems to be working well. I have simulated full power failures on each host individually (and recovered them), and survived a full power outage earlier today (that one wasn't simulated - it actually happened, haha).
I do have another question, though. I know the reason for bonded replication links is to avoid a failure in the replication network. But my question is, "what if?" Consider this:
- 2 hosts, ha-lizard and iscsi-ha running.
- Both replication links are on the same chipset (such as a 2-port Intel X520 10GbE card).
- The card, or chipset on the card, fails, taking both replication links down.

Now we have a split-brain scenario, since, from my understanding, both ha-lizard and iscsi-ha use the same external IP as a witness. In this scenario the witness is still reachable by both hosts because the management networks are still active, but replication can't happen because the replication links are down on one host.
Is there any way to prevent this scenario? I admit it would be rare for something like this to happen, but it's still something to think about.

Thoughts?


One more iscsi-ha question 6 years 3 months ago #1471

  • Salvatore Costantino
The solution is not coded to deal with that exact scenario. In your example, you would not necessarily get split-brain. The slave, in this case, would lose connectivity to the disk images. That would not constitute an HA event, and VMs would not be migrated. The master would immediately raise an alert when the DRBD resource changes state. The management link is key here, since it drives HA events and is not lost in this test case.

It seems that immediately shutting down the VMs on the slave would be the best course of action in this case, since the master would then immediately start them. However, the VMs on the slave would need to be abruptly powered off, since a clean shutdown is not possible while the disks are inaccessible. It's something to consider and worth looking into. I'll give it some thought and get back to you this week.
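Just to illustrate the idea, a rough sketch of what forcing the slave's VMs off might look like (assuming XenServer's xe CLI in dom0; the host lookup and loop below are illustrative only, not something the current release does):

    # Hypothetical sketch: force-stop all guest VMs resident on this host (the slave)
    # so that HA on the master can restart them once the disks are unreachable here.
    HOST_UUID=$(xe host-list hostname="$(hostname)" --minimal)
    for vm in $(xe vm-list resident-on="$HOST_UUID" is-control-domain=false --minimal | tr ',' ' '); do
        xe vm-shutdown uuid="$vm" force=true
    done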


One more iscsi-ha question 6 years 3 months ago #1475

  • Rob Hall (Topic Author)
I also thought of another scenario that I have a question about.
Sorry, I keep coming up with questions - I just want to understand all possible failure modes.

Say you have a 2 node cluster, each with a RAID array for storage, running iscsi-ha to replicate said storage.
What happens if you have disk failures in the master node?


One more iscsi-ha question 6 years 3 months ago #1476

  • Salvatore Costantino
There are several possibilities in this scenario.

- If a single disk fails within a RAID volume, then nothing will happen (relative to HA). It is really up to the administrator to watch the disks (more on this below). Assuming the RAID volume is still healthy, an administrator can swap the failed drive and things continue as normal. There would be no downtime in this case, and nothing to trigger an HA event.

- If the RAID volume itself fails because the disk failure tolerance is exceeded, then the cluster is in trouble, as is your data. Here too, hardware monitoring and preemptive maintenance are key to avoiding this situation. This condition could trigger an HA event if the master crashes outright as a result, but there is no guarantee of an HA event, since dom0 may be left running in some degraded state.

My preferred way of handling this is active monitoring of the disks, coupled with proper design, so that you never end up in this situation.

For example, say you have a cluster where each host has 2 RAID arrays:
Array1 - used for XenServer (small RAID 1+0)
Array2 - used for shared storage (large RAID 1+0)

In this case, DRBD is stacked on top of the storage Array2. You could literally lose 3 of the 4 disks in one host's Array2 and still have all of your data intact, because the peer host still holds a full replica via DRBD.
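For reference, that stacking might look roughly like this in a DRBD resource definition (the resource name, device names, hostnames, and addresses are assumptions for illustration; your actual layout will differ):

    # Hypothetical resource backed by the large Array2 volume on each host
    resource iscsi_ha_disk {
        device    /dev/drbd0;       # block device exported to the iSCSI target
        disk      /dev/sdb;         # Array2 (large RAID 1+0) on this host
        meta-disk internal;
        on host1 { address 10.10.10.1:7789; }
        on host2 { address 10.10.10.2:7789; }
    }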

With all this said, we are working on bundling active disk monitoring as part of the solution. This is intended to alert an admin that a disk has either failed or has exceeded a SMART threshold. You'll find a script in /etc/iscsi-ha/scripts/check_disk_smart_status. You can run it on any of your hosts; it will report the status of every SMART-enabled disk and return a non-zero exit status for any errors or thresholds exceeded. Until we get this into a release, you can call the script via cron or some other automation and trigger an alert on a non-zero exit status.
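For example, something along these lines could work as an interim measure (the cron schedule, recipient address, and reliance on a working mail command are assumptions on my part):

    # Hypothetical /etc/cron.d/check-disk-smart entry: run the bundled SMART check
    # every 30 minutes and mail an alert if it exits non-zero.
    */30 * * * * root /etc/iscsi-ha/scripts/check_disk_smart_status || echo "SMART check failed on $(hostname)" | mail -s "Disk alert on $(hostname)" admin@example.com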

As an aside, HA solutions are generally a compromise and are really several layers of protection stacked together. For example, a typical HA-Lizard deployment would have redundant power, RAID, DRBD, and HA-Lizard itself, with each layer responsible for one piece of the overall deployment. By design, HA-Lizard is intended primarily to protect against two scenarios: 1) a failed host and 2) a failed VM, as is the case with any hypervisor-based solution.


One more iscsi-ha question 6 years 3 months ago #1477

  • Rob Hall (Topic Author)
Right - I guess my primary question is: if the array exceeds the number of tolerable disk failures for one reason or another, would DRBD stop replicating, or would it corrupt both copies?

I wasn't aware of the SMART check script; thank you for that.



One more iscsi-ha question 6 years 3 months ago #1478

  • Salvatore Costantino
DRBD configuration would be the best place to handle this IMO.

For example, if DRBD encounters an I/O error on the primary node, it can call a script that abruptly reboots the master (forcing the slave to promote itself).
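A minimal sketch of that kind of DRBD configuration (the resource name and handler script path are placeholders, not something shipped with iscsi-ha):

    # In the DRBD resource definition (e.g. a file under /etc/drbd.d/):
    resource iscsi_ha_disk {
        disk {
            # Invoke the local-io-error handler when the backing device reports an I/O error
            on-io-error call-local-io-error;
        }
        handlers {
            # Placeholder script: log the failure and force an immediate reboot so the
            # peer can promote itself and HA-Lizard can restart the VMs there
            local-io-error "/usr/local/sbin/drbd-io-error-reboot.sh";
        }
    }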

