
TOPIC: Some Pool Slaves Not Responding

Some Pool Slaves Not Responding 4 months 3 weeks ago #1904

Hi. I've run into an issue with an XCP-ng 7.6 pool that I'm not sure how to resolve. Any help would be appreciated.

I receive emails from XEN1, the pool master, containing:
check_slave_status: Server XEN1: Some Pool Slaves not not responding , ac2003fa-e767-4313-bf72-9f9cffe119aa
check_xapi: Pool Host on Server: **.**.**.12 not responding to HTTP - manual intervention may be required

I also receive an email from XEN2, the host that should be the slave:
check_xapi: Pool Host on Server: **.**.**.11 not responding to HTTP - manual intervention may be required

ha-cfg status on XEN1 - Master
| ha-lizard Version: 2.1.4 |
| Operating Mode: Mode [ 2 ] Managing All VMs in Pool |
| Host Role: master |
| Pool UUID: 62289581-3905-2627-1369-dc043a8beab8 |
| Host UUID: 471acdac-19fc-4c88-82eb-89343c7d9fe9 |
| Master UUID: 471acdac-19fc-4c88-82eb-89343c7d9fe9 |
| Daemon Status: ha-lizard is running [ OK ] |
| Watchdog Status: ha-lizard-watchdog is running [ OK ] |
| HA Enabled: true |
Pool HA Status: ENABLED

ha-cfg status on XEN2 - Slave
| ha-lizard Version: 2.1.4 |
| Operating Mode: Mode [ 2 ] Managing All VMs in Pool |
| Host Role: slave |
| Pool UUID: 62289581-3905-2627-1369-dc043a8beab8 |
| Host UUID: |
| Master UUID: |
| Daemon Status: ha-lizard is running [ OK ] |
| Watchdog Status: ha-lizard-watchdog is running [ OK ] |
| HA Enabled: true |
Pool HA Status: ENABLED

ha-cfg log on XEN2 - the slave
Sep 30 13:50:34 XEN2 ha-lizard: check_xapi: Pool Host **.**.**.11 xapi status = 0
Sep 30 13:50:34 XEN2 ha-lizard: Mail Spool Directory Found /dev/shm/ha-lizard-mail
Sep 30 13:50:34 XEN2 ha-lizard: check_email_enabled: Email enabled for check_xapi
Sep 30 13:50:34 XEN2 ha-lizard: email: Duplicate message - not sending. Content = check_xapi: Pool Host on Server: **.**.**.11 not responding to HTTP - manual intervention may be required
Sep 30 13:50:34 XEN2 ha-lizard: email: Message barred for 60 minutes
Sep 30 13:50:34 XEN2 ha-lizard: Pool Master NOT OK - Checking if ha-lizard is enabled in latest state file
Sep 30 13:50:34 XEN2 ha-lizard: Checking if ha-lizard is enabled
Sep 30 13:50:34 XEN2 ha-lizard: Statefile /etc/ha-lizard/state/ha_lizard_enabled found: checking if ha-lizard is enabled
Sep 30 13:50:34 XEN2 ha-lizard-ERROR-/etc/ha-lizard/init/ha-lizard.mon: /etc/ha-lizard/ha-lizard.sh: line 369: [: =: unary operator expected
Sep 30 13:50:34 XEN2 ha-lizard: ha-lizard is disabled - exiting
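
The `[: =: unary operator expected` message in the log above is what bash prints when a variable inside `[ ... ]` expands to nothing and is not quoted. The sketch below reproduces that class of error. It is not ha-lizard's code: the function names and the `state` variable are hypothetical. The empty Host UUID and Master UUID fields in the XEN2 status output suggest, but do not prove, that an empty value is involved here.

```shell
#!/usr/bin/env bash
# Hypothetical illustration only; not taken from ha-lizard.sh.

# Unquoted expansion: with an empty value the test collapses to `[ = true ]`,
# which bash rejects with "[: =: unary operator expected".
is_enabled_unquoted() {
  local state="$1"
  [ $state = "true" ]
}

# Quoted expansion: an empty value is compared as an empty string.
is_enabled_quoted() {
  local state="$1"
  [ "$state" = "true" ]
}

if is_enabled_unquoted "" 2>/dev/null; then
  echo "unquoted '' -> enabled"
else
  echo "unquoted '' -> test errored, so the caller falls into the 'disabled' branch"
fi

for value in "" "true" "false"; do
  if is_enabled_quoted "$value"; then
    echo "quoted '$value' -> enabled"
  else
    echo "quoted '$value' -> disabled"
  fi
done
```

In both versions an empty value ends up treated as "not enabled", but only the quoted form gets there without a syntax error in the log.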

ha-cfg log on XEN1 - the master

Sep 30 14:51:08 XEN1 ha-lizard: ha-lizard Watchdog: ha-lizard running - OK
Sep 30 14:51:13 XEN1 ha-lizard: 24830 Spawning new instance of ha-lizard
Sep 30 14:51:13 XEN1 ha-lizard: Mail Spool Directory Found /dev/shm/ha-lizard-mail
Sep 30 14:51:13 XEN1 ha-lizard: This iteration is count 336
Sep 30 14:51:13 XEN1 ha-lizard: Checking if this host is a Pool Master or Slave
Sep 30 14:51:13 XEN1 ha-lizard: This host's pool status = master
Sep 30 14:51:13 XEN1 ha-lizard: Checking if ha-lizard is enabled for this pool
Sep 30 14:51:13 XEN1 ha-lizard: check_ha_enabled: Checking if ha-lizard is enabled for pool: 62289581-3905-2627-1369-dc043a8beab8
Sep 30 14:51:13 XEN1 ha-lizard: check_ha_enabled: ha-lizard is enabled
Sep 30 14:51:13 XEN1 ha-lizard: check_ha_enabled: checking whether maintenance mode enabled
Sep 30 14:51:13 XEN1 ha-lizard: ha-lizard is enabled
Sep 30 14:51:13 XEN1 ha-lizard: check_xs_ha: Checking XenServer HA status
Sep 30 14:51:13 XEN1 ha-lizard: update_global_conf_params: Successfully updated global pool configuration settings in /etc/ha-lizard/ha-lizard.pool.conf.
Sep 30 14:51:13 XEN1 ha-lizard: update_global_conf_params: DISABLED_VAPPS=()#012ENABLE_LOGGING=1#012FENCE_ACTION=stop#012FENCE_ENABLED=1#012FENCE_FILE_LOC=/etc/ha-lizard/fence#012FENCE_HA_ONFAIL=0#012FENCE_HEURISTICS_IPS=**.**.**.254#012FENCE_HOST_FORGET=0#012FENCE_IPADDRESS=#012FENCE_METHOD=POOL#012FENCE_MIN_HOSTS=2#012FENCE_PASSWD=#012FENCE_QUORUM_REQUIRED=1#012FENCE_REBOOT_LONE_HOST=0#012FENCE_USE_IP_HEURISTICS=1#012GLOBAL_VM_HA=1#012HOST_SELECT_METHOD=0#012MAIL_FROM=*********#012MAIL_ON=1#012MAIL_SUBJECT="SYSTEM_ALERT-FROM_HOST:$HOSTNAME"#012MAIL_TO=********#012MGT_LINK_LOSS_TOLERANCE=5#012MONITOR_DELAY=15#012MONITOR_KILLALL=1#012MONITOR_MAX_STARTS=20#012MONITOR_SCANRATE=10#012OP_MODE=2#012PROMOTE_SLAVE=1#012SLAVE_HA=1#012SLAVE_VM_STAT=0#012SMTP_PASS=*********#012SMTP_PORT=587#012SMTP_SERVER=*********#012SMTP_USER=*********#012XAPI_COUNT=2#012XAPI_DELAY=10#012XC_FIELD_NAME='ha-lizard-enabled'#012XE_TIMEOUT=10
Sep 30 14:51:13 XEN1 ha-lizard: check_master_mgt_link_state: Checking management interface link state
Sep 30 14:51:13 XEN1 ha-lizard: check_master_mgt_link_state: Link State = [ true ] for management interface IP [ **.**.**.11 ]
Sep 30 14:51:13 XEN1 ha-lizard: check_master_mgt_link_state: Link [ xapi0 ] state UP
Sep 30 14:51:13 XEN1 ha-lizard: Master management link OK - checking prior link state
Sep 30 14:51:13 XEN1 ha-lizard: This host detected as pool Master
Sep 30 14:51:14 XEN1 ha-lizard: Found 2 hosts in pool
Sep 30 14:51:14 XEN1 ha-lizard: validate_vm_ha_state: Validating VM HA-state
Sep 30 14:51:14 XEN1 ha-lizard: validate_vm_ha_state: VM [ 1a091151-4173-bddf-66e3-4dddbf992242 ] state [ false ] = OK
Sep 30 14:51:14 XEN1 ha-lizard: validate_vm_ha_state: VM [ fb70a211-e0f6-f5e6-4657-e731076ded40 ] state [ false ] = OK
Sep 30 14:51:14 XEN1 ha-lizard: validate_vm_ha_state: VM [ a96598f0-bafa-2efe-a7a4-a98a1b4faeb9 ] state [ false ] = OK
Sep 30 14:51:14 XEN1 ha-lizard: Calling function write_pool_state
Sep 30 14:51:14 XEN1 ha-lizard: 25878 Calling function autoselect_slave
Sep 30 14:51:14 XEN1 ha-lizard: 25883 Calling function check_slave_status
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_master_mgt_link_state: Checking management interface link state
Sep 30 14:51:14 XEN1 ha-lizard: 25878 autoselect_slave: This host UUID found: 471acdac-19fc-4c88-82eb-89343c7d9fe9
Sep 30 14:51:14 XEN1 ha-lizard: 25878 autoselect_slave: MASTER host UUID found: 471acdac-19fc-4c88-82eb-89343c7d9fe9
Sep 30 14:51:14 XEN1 ha-lizard: get_vms_on_host: Returned 1a091151-4173-bddf-66e3-4dddbf992242#012fb70a211-e0f6-f5e6-4657-e731076ded40#012a96598f0-bafa-2efe-a7a4-a98a1b4faeb9
Sep 30 14:51:14 XEN1 ha-lizard: 25878 autoselect_slave: 471acdac-19fc-4c88-82eb-89343c7d9fe9 is Master UUID - excluding from list of available slaves
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_master_mgt_link_state: Link State = [ true ] for management interface IP [ **.**.**.11 ]
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_master_mgt_link_state: Link [ xapi0 ] state UP
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_slave_status: Management link OK - continue
Sep 30 14:51:14 XEN1 ha-lizard: get_vms_on_host: No VMs found on host: ac2003fa-e767-4313-bf72-9f9cffe119aa
Sep 30 14:51:14 XEN1 ha-lizard: 25878 autoselect_slave: 1 available Slave UUIDs found: ac2003fa-e767-4313-bf72-9f9cffe119aa
Sep 30 14:51:14 XEN1 ha-lizard: 25883 get_pool_host_list: returned 471acdac-19fc-4c88-82eb-89343c7d9fe9#012ac2003fa-e767-4313-bf72-9f9cffe119aa
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_slave_status: Removing Master UUID from list of Hosts
Sep 30 14:51:14 XEN1 ha-lizard: 25878 autoselect_slave: Selected Slave: ac2003fa-e767-4313-bf72-9f9cffe119aa = Current slave: ac2003fa-e767-4313-bf72-9f9cffe119aa - ignoring update
Sep 30 14:51:14 XEN1 ha-lizard: 25883 get_pool_ip_list: returned **.**.**.12
Sep 30 14:51:14 XEN1 ha-lizard: check_ha_enabled: Checking if ha-lizard is enabled for pool: 62289581-3905-2627-1369-dc043a8beab8
Sep 30 14:51:14 XEN1 ha-lizard: check_ha_enabled: ha-lizard is enabled
Sep 30 14:51:14 XEN1 ha-lizard: check_ha_enabled: checking whether maintenance mode enabled
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_xapi: Pool Host **.**.**.12 xapi status = 0
Sep 30 14:51:14 XEN1 ha-lizard: 25883 Mail Spool Directory Found /dev/shm/ha-lizard-mail
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_email_enabled: Email enabled for check_xapi
Sep 30 14:51:14 XEN1 ha-lizard: 25883 email: Duplicate message - not sending. Content = check_xapi: Pool Host on Server: **.**.**.12 not responding to HTTP - manual intervention may be required
Sep 30 14:51:14 XEN1 ha-lizard: 25883 email: Message barred for 60 minutes
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_slave_status: Slave host [ ac2003fa-e767-4313-bf72-9f9cffe119aa ] health status = [ failed ] - break
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_slave_status: Host IP Address check Status Array for Slaves = (0)
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_slave_status: Quorum check called
Sep 30 14:51:14 XEN1 ha-lizard: get_pool_host_list: enabled flag set - returning only hosts with enabled=true
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_quorum: Checking host IPs: **.**.**.11 **.**.**.12
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_quorum: Host IP: **.**.**.11 Response = OK
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_quorum: LIVE HOSTs = 1
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_quorum: Host IP: **.**.**.12 Response = OK
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_quorum: LIVE HOSTs = 2
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_quorum: Using network points: **.**.**.254 as possible additional vote
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_quorum: Heuristic IP: **.**.**.254 Response = OK
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_quorum: Successful Replies = 1
Sep 30 14:51:14 XEN1 ha-lizard: 25883 Total enpoints checked = 1 with total successful replies = 1
Sep 30 14:51:14 XEN1 ha-lizard: get_pool_host_list: returned 471acdac-19fc-4c88-82eb-89343c7d9fe9#012ac2003fa-e767-4313-bf72-9f9cffe119aa
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_quorum: Additional heuristic vote success. Incremeting vote by 1
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_quorum: Minimum number of hosts needed to allow fencing = 1 + 1
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_quorum: 3 Hosts found. Minimum needed = 1 + 1. Fencing allowed
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_slave_status: Failed slave count = 1
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_slave_status: Processing failed slave: ac2003fa-e767-4313-bf72-9f9cffe119aa on this iteration
Sep 30 14:51:14 XEN1 ha-lizard: 25883 Mail Spool Directory Found /dev/shm/ha-lizard-mail
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_email_enabled: Email enabled for check_slave_status
Sep 30 14:51:14 XEN1 ha-lizard: get_pool_ip_list: returned **.**.**.11
Sep 30 14:51:14 XEN1 ha-lizard: 25883 email: Duplicate message - not sending. Content = check_slave_status: Server XEN1: Some Pool Slaves not not responding , ac2003fa-e767-4313-bf72-9f9cffe119aa
Sep 30 14:51:14 XEN1 ha-lizard: 25883 email: Message barred for 60 minutes
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_slave_status: Some Pool Slaves not not responding , ac2003fa-e767-4313-bf72-9f9cffe119aa
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_slave_status: Calling function get_vms_on_host for UUID(s) ac2003fa-e767-4313-bf72-9f9cffe119aa
Sep 30 14:51:14 XEN1 ha-lizard: get_pool_ip_list: returned **.**.**.11 **.**.**.12
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_slave_status: Calling function fence_host to remove unresponsive host from pool. Failed Host(s) = ac2003fa-e767-4313-bf72-9f9cffe119aa
Sep 30 14:51:14 XEN1 ha-lizard: 25883 check_slave_status: fence_host ac2003fa-e767-4313-bf72-9f9cffe119aa executed on prior iteration - host already fenced
Sep 30 14:51:14 XEN1 ha-lizard: write_status_report: Writing status report
Sep 30 14:51:14 XEN1 ha-lizard: 25883 Function check_slave_status Host Power = Off, calling vm_mon
Sep 30 14:51:14 XEN1 ha-lizard: 25883 vm_mon: ha-lizard is operating mode 2 - managing pool VMs
Sep 30 14:51:14 XEN1 ha-lizard: 25883 vm_mon: Retrived list of VMs for this poll: 1a091151-4173-bddf-66e3-4dddbf992242#012fb70a211-e0f6-f5e6-4657-e731076ded40#012a96598f0-bafa-2efe-a7a4-a98a1b4faeb9
Sep 30 14:51:14 XEN1 ha-lizard: 25883 vm_mon: Removing Control Domains from VM list
Sep 30 14:51:14 XEN1 ha-lizard: 25883 vm_mon: VM list returned = 1a091151-4173-bddf-66e3-4dddbf992242#012fb70a211-e0f6-f5e6-4657-e731076ded40#012a96598f0-bafa-2efe-a7a4-a98a1b4faeb9
Sep 30 14:51:15 XEN1 ha-lizard: 25883 vm_state: Machine state for 1a091151-4173-bddf-66e3-4dddbf992242 returned: running
Sep 30 14:51:15 XEN1 ha-lizard: 25883 vm_mon: VM 1a091151-4173-bddf-66e3-4dddbf992242 state = running
Sep 30 14:51:15 XEN1 ha-lizard: 25883 vm_state: Machine state for fb70a211-e0f6-f5e6-4657-e731076ded40 returned: running
Sep 30 14:51:15 XEN1 ha-lizard: 25883 vm_mon: VM fb70a211-e0f6-f5e6-4657-e731076ded40 state = running
Sep 30 14:51:15 XEN1 ha-lizard: 25883 vm_state: Machine state for a96598f0-bafa-2efe-a7a4-a98a1b4faeb9 returned: running
Sep 30 14:51:15 XEN1 ha-lizard: 25883 vm_mon: VM a96598f0-bafa-2efe-a7a4-a98a1b4faeb9 state = running
Sep 30 14:51:15 XEN1 ha-lizard: 25883 vm_mon: 0 Eligible Halted VMs found
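
The `update_global_conf_params` line and the VM lists in the log above are hard to read because syslog has escaped each newline as `#012` (octal 012). A minimal sketch for turning those sequences back into line breaks when reading such a log follows; the helper name `decode_syslog_newlines` is made up for this example, and the sample input is a shortened excerpt of the settings shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expand syslog's "#012" escapes into real newlines.
decode_syslog_newlines() {
  awk '{ gsub(/#012/, "\n"); print }'
}

# Shortened excerpt of the settings line from the master's log.
printf '%s\n' 'FENCE_ENABLED=1#012FENCE_METHOD=POOL#012OP_MODE=2' | decode_syslog_newlines
```

Piping a saved copy of the log through the same function prints one setting per line.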

Last edit: by Patrick.

Some Pool Slaves Not Responding 4 months 3 weeks ago #1905

  • Salvatore Costantino
This is a known issue that was introduced with XCP-ng 7.6. We added a fix in ha-lizard 2.2. It looks like you are running ha-lizard 2.1.4. Upgrading to the latest release should solve the issue.

Release notes are here:
halizard.org/release/ha-lizard/RELEASE

The latest version of ha-lizard can be downloaded from here:
halizard.org/release/ha-lizard/

Please note that our older packaging (a tarball with an installer script) is still available, or you can install from the RPM. Either path will support an upgrade in your case.

Some Pool Slaves Not Responding 4 months 3 weeks ago #1906

Oof. Sorry I missed that in my looking around. Thanks for the pointer.
