
TOPIC:

HA-lizard Version 2.2.0 Trial 5 years 3 months ago #1731

  • Nathan Scannell (Topic Author)
Test 3:

Simulate total environmental power failure:
Cleanly shut down both hosts, as a UPS would, then simulate restoration of environmental power.

Result:

Both hosts power up together and the pool is OK, BUT the iSCSI target is offline and the VMs cannot boot.

* No problem pinging each other's replication ports.
* ha-cfg status is good.
* iscsi-cfg status is good.

* I can right-click and Repair the SR in XenCenter without any problem; the devices plug back in and the VMs boot.

Any ideas?
Race condition when starting tgtd?


HA-lizard Version 2.2.0 Trial 5 years 3 months ago #1732

  • Nathan Scannell (Topic Author)
Manually plugging in the PBD brings the SR back online.
xe pbd-list
xe pbd-plug uuid=xxx
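
For anyone wanting to script the two commands above, here is a minimal sketch. It assumes that every PBD reported as not attached should in fact be plugged, which may not hold if you keep some SRs deliberately detached. The function name plug_unattached_pbds is made up for illustration and is not part of HA-lizard or iscsi-ha.

```shell
#!/bin/bash
# Sketch only: plug every PBD that XAPI reports as not currently attached.
# plug_unattached_pbds is a hypothetical helper name.
plug_unattached_pbds() {
    local uuids uuid
    # --minimal prints the matching UUIDs as a single comma-separated line
    uuids=$(xe pbd-list currently-attached=false --minimal) || return 1
    if [ -z "$uuids" ]; then
        echo "all PBDs attached"
        return 0
    fi
    local IFS=','
    for uuid in $uuids; do
        echo "plugging PBD $uuid"
        xe pbd-plug uuid="$uuid" || echo "failed to plug $uuid" >&2
    done
}
```

Run plug_unattached_pbds on the pool master once both hosts are up. A failed plug is reported on stderr and the loop continues with the next PBD.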


HA-lizard Version 2.2.0 Trial 5 years 3 months ago #1733

  • Salvatore Costantino
Debug output would tell us whether there is a race condition, but I don't think that is the case.

We have seen XenServer fail to plug an iSCSI SR on boot before. It happens sporadically, even with an external iSCSI SR. Years ago we added auto-plugging of any SR that should be plugged but isn't. That logic is delayed on the initial startup of iscsi-ha.

We tested that exact scenario in our dev environment this week and things eventually sorted themselves out, after 2 minutes or so.
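
To measure how long recovery actually takes after a simulated power restore, a polling loop along these lines could be run on the master. This is only a sketch and not the logic inside iscsi-ha: the function name wait_for_pbds, the 180 second default timeout, and the 10 second polling interval are all assumptions chosen for illustration.

```shell
#!/bin/bash
# Sketch only: poll XAPI until no PBD is reported unattached, or time out.
# wait_for_pbds and its defaults (180 s timeout, 10 s interval) are hypothetical.
wait_for_pbds() {
    local timeout=${1:-180} interval=${2:-10} waited=0 pending
    while [ "$waited" -le "$timeout" ]; do
        # Only treat an empty list as success if the xe call itself succeeded
        if pending=$(xe pbd-list currently-attached=false --minimal) \
                && [ -z "$pending" ]; then
            echo "all PBDs attached after ${waited}s"
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "PBDs still unattached after ${timeout}s" >&2
    return 1
}
```

Calling wait_for_pbds 300 15 right after boot would report roughly when the SR came back, or return non-zero if it never did within five minutes.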


HA-lizard Version 2.2.0 Trial 5 years 3 months ago #1734

  • Nathan Scannell (Topic Author)
None of my test cases have recovered... I can't seem to make sense of it but it really seems like a timing problem. I'm sure this did not occur when testing without the new RPM upgrade.

The Physical Block Device (PBD) fails to plug because the VG is unavailable.

Have a read of the attached xensource.log; it has some clues. It is a snippet from my master during boot.

See problems starting at line 1493 where SR backend failure begins to occur.


File Attachment: xensource.zip (272 KB)
File Attachment: user.zip (37 KB)


HA-lizard Version 2.2.0 Trial 5 years 3 months ago #1735

  • Nathan Scannell (Topic Author)
LVM handling might need inspection...

The VG that cannot be found at boot is in the lvm backup folder.

[root@ks1 master]# ls -l /etc/lvm/backup/
total 8
-rw 1 root root 2761 Nov 28 12:31 VG_XenStorage-2b6e949b-753c-a0a6-2de8-5ffffc6ac912
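
A quick way to confirm this at boot is to compare the names in the backup folder with what LVM can currently see. The sketch below assumes that each file in /etc/lvm/backup is named after its VG (as in the listing above) and that vgs is available on the host. The function name missing_vgs is made up for illustration.

```shell
#!/bin/bash
# Sketch only: list VGs that have an LVM backup file but are not visible to vgs.
# missing_vgs is a hypothetical helper name.
missing_vgs() {
    local backup_dir=${1:-/etc/lvm/backup} f vg visible rc=0
    # vgs pads its output with spaces; strip them so names compare exactly
    visible=$(vgs --noheadings -o vg_name 2>/dev/null | tr -d ' ')
    for f in "$backup_dir"/*; do
        [ -e "$f" ] || continue
        vg=$(basename "$f")
        if ! printf '%s\n' "$visible" | grep -qxF -- "$vg"; then
            echo "$vg"
            rc=1
        fi
    done
    return $rc
}
```

If missing_vgs prints the VG_XenStorage-... name right after boot and nothing after the SR is repaired, that would be consistent with the PBD plug failing because the VG is unavailable at that moment.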


HA-lizard Version 2.2.0 Trial 5 years 3 months ago #1736

  • Salvatore Costantino
Can you send the user.log for the same time period?
