TOPIC:

Slave can't connect to iSCSI after Master is turned off 1 month 20 hours ago #2235

Hello, I installed HA-Lizard on a 2-node cluster with internal storage. I use XCP-ng 8.1 as my hypervisor and want to make this cluster highly available, so that when node 1 crashes the other node takes over. My problem is that when my main node crashes, the secondary node can't take over, and the server (pool) no longer shows up until I add the secondary server manually in XenCenter. I looked in the logs to see what the secondary does when I disconnect the main server. It seems to have trouble finding the VDI, and from an SSH session I also can't ping 10.10.10.3, the address I set up for my iSCSI storage. It also tries to migrate the VM to the primary server but obviously can't find it.

Nov 4 15:50:55 xcp-ng-secondary ha-lizard: 8351 Ready to exec [find /dev/shm/ha-lizard-mail/ -name *.msg -type f -mmin +60 -delete]
Nov 4 15:50:55 xcp-ng-secondary ha-lizard: 8351 FLUSH_MAIL_EXEC returned [0]
Nov 4 15:50:55 xcp-ng-secondary ha-lizard: 8351 check_email_enabled: Email enabled for vm_mon
Nov 4 15:50:55 xcp-ng-secondary ha-lizard: 8351 email: Duplicate message - not sending. Content = vm_mon: Error starting failed VM: Windows 10 (64-bit) (1) UUID: 312abe84-9704-a079-3dc2-02cb08a1bf1f
Nov 4 15:50:55 xcp-ng-secondary ha-lizard: 8351 email: Message barred for 60 minutes
Nov 4 15:50:58 xcp-ng-secondary ha-lizard: 2547 ha-lizard Watchdog: ha-lizard running - OK
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 31214 Spawning new instance of ha-lizard
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 31214 check_logger_processes: Checking logger processes
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 31214 check_logger_processes: No processes to clear
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 10078 LOG_TERMINAL = [false]
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 10078 Mail Spool Directory Found /dev/shm/ha-lizard-mail
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 10078 This iteration is count 39
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 10078 Checking if this host is a Pool Master or Slave
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 10078 This host's pool status = master
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 10078 Checking if ha-lizard is enabled for this pool
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 10078 check_ha_enabled: Checking if ha-lizard is enabled for pool: e43e608e-4ead-c0df-8f26-9d52f628752a
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 10078 check_ha_enabled: ha-lizard is enabled
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 10078 check_ha_enabled: checking whether maintenance mode enabled
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 10078 ha-lizard is enabled
Nov 4 15:51:02 xcp-ng-secondary ha-lizard: 10078 check_xs_ha: Checking XenServer HA status
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 update_global_conf_params: Successfully updated global pool configuration settings in /etc/ha-lizard/ha-lizard.pool.conf.
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 update_global_conf_params: DISABLED_VAPPS=()#012DISK_MONITOR=1#012ENABLE_ALERTS=1#012ENABLE_LOGGING=1#012FENCE_ACTION=stop#012FENCE_ENABLED=1#012FENCE_FILE_LOC=/etc/ha-lizard/fence#012FENCE_HA_ONFAIL=0#012FENCE_HEURISTICS_IPS=192.168.255.254#012FENCE_HOST_FORGET=0#012FENCE_IPADDRESS=#012FENCE_METHOD=POOL#012FENCE_MIN_HOSTS=2#012FENCE_PASSWD=#012FENCE_QUORUM_REQUIRED=1#012FENCE_REBOOT_LONE_HOST=0#012FENCE_USE_IP_HEURISTICS=1#012GLOBAL_VM_HA=1#012HOST_SELECT_METHOD=0#012MAIL_FROM="root@localhost"#012MAIL_ON=1#012MAIL_SUBJECT="SYSTEM_ALERT-FROM_HOST:$HOSTNAME"#012MAIL_TO="root@localhost"#012MGT_LINK_LOSS_TOLERANCE=5#012MONITOR_DELAY=15#012MONITOR_KILLALL=1#012MONITOR_MAX_STARTS=20#012MONITOR_SCANRATE=10#012OP_MODE=2#012PROMOTE_SLAVE=1#012SLAVE_HA=1#012SLAVE_VM_STAT=0#012SMTP_PASS=""#012SMTP_PORT="25"#012SMTP_SERVER="127.0.0.1"#012SMTP_USER=""#012XAPI_COUNT=2#012XAPI_DELAY=10#012XC_FIELD_NAME='ha-lizard-enabled'#012XE_TIMEOUT=10
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_master_mgt_link_state: Checking management interface link state
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_master_mgt_link_state: Link State = [ true ] for management interface IP [ 192.168.10.2 ]
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_master_mgt_link_state: Link [ xenbr0 ] state UP
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 Master management link OK - checking prior link state
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 This host detected as pool Master
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 Found 2 hosts in pool
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 validate_vm_ha_state: Validating VM HA-state
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 validate_vm_ha_state: VM [ 312abe84-9704-a079-3dc2-02cb08a1bf1f ] state [ false ] = OK
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 Calling function write_pool_state
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 Calling function autoselect_slave
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 Calling function check_slave_status
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_master_mgt_link_state: Checking management interface link state
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 write_pool_state: MASTER UUID found: f543502f-1445-429e-8220-b360cd2a6946
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 autoselect_slave: This host UUID found: f543502f-1445-429e-8220-b360cd2a6946
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 write_pool_state: MASTER UUID: f543502f-1445-429e-8220-b360cd2a6946 written to local state storage
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 autoselect_slave: MASTER host UUID found: f543502f-1445-429e-8220-b360cd2a6946
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 write_pool_state: Calling function get_vms_on_host for UUID: b6dcd4f7-860c-4dfc-8e7c-22c00ef08b33
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_master_mgt_link_state: Link State = [ true ] for management interface IP [ 192.168.10.2 ]
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_master_mgt_link_state: Link [ xenbr0 ] state UP
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_slave_status: Management link OK - continue
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 get_vms_on_host: No VMs found on host: b6dcd4f7-860c-4dfc-8e7c-22c00ef08b33
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 autoselect_slave: Removing Slave UUID from list of Hosts - Slave: b6dcd4f7-860c-4dfc-8e7c-22c00ef08b33 is disabled or in maintenance mode
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 write_pool_state: Writing VM array to local state file host.b6dcd4f7-860c-4dfc-8e7c-22c00ef08b33.vmlist.uuid_array
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 autoselect_slave: f543502f-1445-429e-8220-b360cd2a6946 is Master UUID - excluding from list of available slaves
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 autoselect_slave: 0 available Slave UUIDs found:
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 write_pool_state: Calling function get_vms_on_host for UUID: f543502f-1445-429e-8220-b360cd2a6946
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 get_pool_host_list: returned b6dcd4f7-860c-4dfc-8e7c-22c00ef08b33#012f543502f-1445-429e-8220-b360cd2a6946
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 get_vms_on_host: No VMs found on host: f543502f-1445-429e-8220-b360cd2a6946
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 write_pool_state: Writing VM array to local state file host.f543502f-1445-429e-8220-b360cd2a6946.vmlist.uuid_array
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 No slaves available to become pool master
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_slave_status: Removing Slave UUID from list of Hosts - Slave: b6dcd4f7-860c-4dfc-8e7c-22c00ef08b33 is disabled or in maintenance mode
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_slave_status: Removing Master UUID from list of Hosts
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_slave_status: Host IP Address check Status Array for Slaves = ()
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_slave_status: Quorum check called
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_quorum: Checking host IPs: 192.168.10.2
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 pool autopromote_uuid = [none_available]
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 write_pool_state: Pool autopromote_uuid=none_available
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_quorum: Host IP: 192.168.10.2 Response = OK
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 write_pool_state: autopromote_uuid unchanged - not updating
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_quorum: LIVE HOSTs = 1
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_quorum: Using network points: 192.168.255.254 as possible additional vote
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_quorum: Heuristic IP: 192.168.255.254 Response = OK
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_quorum: Successful Replies = 1
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_quorum: Total enpoints checked = 1 with total successful replies = 1
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_quorum: Additional heuristic vote success. Incremeting vote by 1
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_quorum: Minimum number of hosts needed to allow fencing = 0 + 1
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_quorum: 2 Hosts found. Minimum needed = 0 + 1. Fencing allowed
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_slave_status: Failed slave count = 0
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_slave_status: No Failed slaves detected
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_ha_enabled: Checking if ha-lizard is enabled for pool: e43e608e-4ead-c0df-8f26-9d52f628752a
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 Function check_slave_status reported no failures: calling vm_mon
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_ha_enabled: ha-lizard is enabled
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 check_ha_enabled: checking whether maintenance mode enabled
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 vm_mon: ha-lizard is operating mode 2 - managing pool VMs
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 vm_mon: Retrived list of VMs for this poll: 312abe84-9704-a079-3dc2-02cb08a1bf1f
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 vm_mon: Removing Control Domains from VM list
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 vm_mon: VM list returned = 312abe84-9704-a079-3dc2-02cb08a1bf1f
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 vm_state: Machine state for 312abe84-9704-a079-3dc2-02cb08a1bf1f returned: halted
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 vm_mon: VM 312abe84-9704-a079-3dc2-02cb08a1bf1f state = halted
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 vm_mon: GLOBAL_VM_HA is enabled. Adding VM: 312abe84-9704-a079-3dc2-02cb08a1bf1f to list of failed VMs on this run.
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 vm_mon: 1 Eligible Halted VMs found
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 vm_mon: Halted VMs found: 312abe84-9704-a079-3dc2-02cb08a1bf1f
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 vm_mon: Attempting to start VMs in halted state
Nov 4 15:51:03 xcp-ng-secondary ha-lizard: 10078 write_pool_state: Pool contains 1 hosts. Writing to /etc/ha-lizard/state/pool_num_hosts
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 get_pool_host_list: enabled flag set - returning only hosts with enabled=true
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 validate_vm_safe_to_start_here: VM [312abe84-9704-a079-3dc2-02cb08a1bf1f] home pool validated [e43e608e-4ead-c0df-8f26-9d52f628752a] - safe to start here
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 get_pool_host_list: returned f543502f-1445-429e-8220-b360cd2a6946
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 get_pool_ip_list: returned 192.168.10.2
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 write_pool_state: Host IP List = 192.168.10.2
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 write_status_report: Writing status report
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 vm_mon: Starting VM: Windows 10 (64-bit) (1) UUID: 312abe84-9704-a079-3dc2-02cb08a1bf1f
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 email: Mail Spool Directory Found /dev/shm/ha-lizard-mail
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 Ready to exec [find /dev/shm/ha-lizard-mail/ -name *.msg -type f -mmin +60 -delete]
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 FLUSH_MAIL_EXEC returned [0]
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 check_email_enabled: Email enabled for vm_mon
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 email: Duplicate message - not sending. Content = vm_mon: Starting VM: Windows 10 (64-bit) (1) UUID: 312abe84-9704-a079-3dc2-02cb08a1bf1f
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 email: Message barred for 60 minutes
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 vm_mon: HOST_SELECT_METHOD set to [ 0 ] - checking for a healthy host
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 vm_mon: This host [ f543502f-1445-429e-8220-b360cd2a6946 ] start on serial [ 970 ]
Nov 4 15:51:04 xcp-ng-secondary ha-lizard: 10078 vm_mon: Host [ f543502f-1445-429e-8220-b360cd2a6946 ] health status = [ master ]
Nov 4 15:51:07 xcp-ng-secondary ha-lizard: 10078 vm_mon: VM start exit result = [ 1 ]
Nov 4 15:51:07 xcp-ng-secondary ha-lizard: 10078 vm_mon: VM start returned messages = [ Error code: SR_BACKEND_FAILURE_46#012Error parameters: , The VDI is not available [opterr=Command failed (5): /dev/sdc: open failed: No such device or address#012 Volume group "VG_XenStorage-3aeb126f-7d32-39ac-1626-a334dc5404ff" not found#012 Cannot process volume group VG_XenStorage-3aeb126f-7d32-39ac-1626-a334dc5404ff], ]
Nov 4 15:51:07 xcp-ng-secondary ha-lizard: 10078 vm_mon: Error code: SR_BACKEND_FAILURE_46#012Error parameters: , The VDI is not available [opterr=Command failed (5): /dev/sdc: open failed: No such device or address#012 Volume group "VG_XenStorage-3aeb126f-7d32-39ac-1626-a334dc5404ff" not found#012 Cannot process volume group VG_XenStorage-3aeb126f-7d32-39ac-1626-a334dc5404ff],
Nov 4 15:51:07 xcp-ng-secondary ha-lizard: 10078 reset_vm_vdi: Resetting VDI(s) for VM [ 312abe84-9704-a079-3dc2-02cb08a1bf1f ]
Nov 4 15:51:07 xcp-ng-secondary ha-lizard: 10078 reset_vm_vdi: Found VDI [ 85a74300-0ce0-4a65-b21d-34184c4a2e8b ]
Nov 4 15:51:07 xcp-ng-secondary ha-lizard-NOTICE-/etc/ha-lizard/init/ha-lizard.mon: VDI 85a74300-0ce0-4a65-b21d-34184c4a2e8b is not marked as attached anywhere, nothing to do
Nov 4 15:51:07 xcp-ng-secondary ha-lizard: 10078 reset_vm_vdi: VDI [ 85a74300-0ce0-4a65-b21d-34184c4a2e8b ] reset
Nov 4 15:51:07 xcp-ng-secondary ha-lizard: 10078 reset_vm_vdi: No VDI found for VBD [ 27bc9965-8a28-ed69-b3c5-bf9a9179f879 ]
Nov 4 15:51:07 xcp-ng-secondary ha-lizard: 10078 vm_mon: Reattempting vm [ 312abe84-9704-a079-3dc2-02cb08a1bf1f ] start
Nov 4 15:51:08 xcp-ng-secondary ha-lizard: 2547 ha-lizard Watchdog: ha-lizard running - OK
Nov 4 15:51:10 xcp-ng-secondary ha-lizard-ERROR-/etc/ha-lizard/init/ha-lizard.mon: Error code: SR_BACKEND_FAILURE_46
Nov 4 15:51:10 xcp-ng-secondary ha-lizard-ERROR-/etc/ha-lizard/init/ha-lizard.mon: Error parameters: , The VDI is not available [opterr=Command failed (5): /dev/sdc: open failed: No such device or address
Nov 4 15:51:10 xcp-ng-secondary ha-lizard-ERROR-/etc/ha-lizard/init/ha-lizard.mon: Volume group "VG_XenStorage-3aeb126f-7d32-39ac-1626-a334dc5404ff" not found
Nov 4 15:51:10 xcp-ng-secondary ha-lizard-ERROR-/etc/ha-lizard/init/ha-lizard.mon: Cannot process volume group VG_XenStorage-3aeb126f-7d32-39ac-1626-a334dc5404ff],
Nov 4 15:51:10 xcp-ng-secondary ha-lizard: 10078 vm_mon: Error starting failed VM: Windows 10 (64-bit) (1) UUID: 312abe84-9704-a079-3dc2-02cb08a1bf1f
Nov 4 15:51:10 xcp-ng-secondary ha-lizard: 10078 email: Mail Spool Directory Found /dev/shm/ha-lizard-mail
Nov 4 15:51:10 xcp-ng-secondary ha-lizard: 10078 Ready to exec [find /dev/shm/ha-lizard-mail/ -name *.msg -type f -mmin +60 -delete]
Nov 4 15:51:10 xcp-ng-secondary ha-lizard: 10078 FLUSH_MAIL_EXEC returned [0]
Nov 4 15:51:10 xcp-ng-secondary ha-lizard: 10078 check_email_enabled: Email enabled for vm_mon
Nov 4 15:51:10 xcp-ng-secondary ha-lizard: 10078 email: Duplicate message - not sending. Content = vm_mon: Error starting failed VM: Windows 10 (64-bit) (1) UUID: 312abe84-9704-a079-3dc2-02cb08a1bf1f
Nov 4 15:51:10 xcp-ng-secondary ha-lizard: 10078 email: Message barred for 60 minutes

If you are interested in the full log I will attach it as well, but here is just a part of it. Maybe it is enough for you to understand what I mean.

Thank you for your help!

Slave can't connect to iSCSI after Master is turned off 1 month 19 hours ago #2236

  • Salvatore Costantino
Looks like your slave has transitioned to master, but has not attached the storage.
Can you confirm whether iscsi-ha is running?
service iscsi-ha status

If it is running, can you provide the iscsi-ha logs from the slave for the time leading up to the event?
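A small sketch for cutting out just that window, in case it helps. It assumes the iscsi-ha entries are in syslog format like the ha-lizard lines posted above (month, day, HH:MM:SS, host, tag) and carry the tag "iscsi-ha". Both are assumptions, and the function name is made up for illustration. It compares only the time field, so it does not tell different days apart.

```shell
# Hypothetical helper: print syslog lines that contain TAG and whose
# time field (third column, HH:MM:SS) is at or before CUTOFF.
# Assumes syslog-style lines like the ha-lizard excerpt in this thread.
# usage: logs_before TAG "HH:MM:SS" < some_syslog_file
logs_before() {
    awk -v tag="$1" -v cutoff="$2" 'index($0, tag) && $3 <= cutoff'
}
```

For example, piping the relevant log file into `logs_before "iscsi-ha" "15:51:10"` would keep only the matching lines up to the failed VM start shown in the log above.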

The general logic works like this:

Master host fails -> slave becomes master -> iscsi-ha on the slave connects the storage and exposes it over iSCSI. It appears that this last step is not occurring in your case.
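As a rough way to see which of these steps a log excerpt reached, something like the sketch below could be used. The two marker strings are copied from the ha-lizard lines posted above ("pool status = master" and "SR_BACKEND_FAILURE_46"). The function name and the three result labels are made up for illustration and are not part of HA-Lizard.

```shell
# Hypothetical helper: read an ha-lizard log excerpt on stdin and report
# how far the failover sequence appears to have progressed.
# Marker strings come from the log posted in this thread.
classify_failover() {
    log=$(cat)
    if ! printf '%s\n' "$log" | grep -q "pool status = master"; then
        # The host never reported itself as pool master.
        echo "not-promoted"
    elif printf '%s\n' "$log" | grep -q "SR_BACKEND_FAILURE_46"; then
        # Promoted, but the VM start failed because the SR backend is missing.
        echo "promoted-storage-missing"
    else
        echo "promoted-no-storage-error"
    fi
}
```

Fed the excerpt from the first post, this would report "promoted-storage-missing", which matches the reading above: the promotion happened, but the storage was not attached.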
