Post vCenter upgrade to 5.5, ESXi host fails to configure HA with the error “There was an error unconfiguring the vSphere HA agent on this host”
Today I was working in a customer’s environment where the customer had upgraded vCenter from 5.5 Update 3 to a higher build. The vCenter upgrade completed successfully. During cluster reconfiguration for HA, one of the 9 ESXi hosts failed with the error “There was an error unconfiguring the vSphere HA agent on this host. To solve this problem, connect the host to a vCenter Server of version 5.0 or later. vSphere HA agent on this host is disabled”
The customer tried reconfiguring the individual host using the Reconfigure for vSphere HA option, but no luck; it ended with the same error. Even after rebooting the host we still faced the same issue.
The following steps were followed to troubleshoot and rectify the issue.
Opened an SSH session to the impacted host and ran the command: esxcli software vib list | grep -i fdm
The above command shows the currently installed version of the HA (FDM) agent along with its installation date. Based on these details, I verified that the old version of the FDM agent was still installed.
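The version check above can also be wrapped in a small helper. This is only a minimal sketch: the function name fdm_vib_info is made up for illustration, and it assumes that awk is available in the host's shell and that the VIB list uses the usual column order (Name, Version, Vendor, Acceptance Level, Install Date). Unlike the plain grep, it matches "fdm" in the name column only.

```shell
# Hypothetical helper (not part of esxcli): reads VIB list output on stdin
# and prints name, version and install date for any VIB whose name contains "fdm".
# Assumes the first column is the name, the second the version and the
# last column the install date.
fdm_vib_info() {
    awk 'tolower($1) ~ /fdm/ { print $1 " version=" $2 " installed=" $NF }'
}

# Intended usage on the host:
#   esxcli software vib list | fdm_vib_info
```

Comparing the printed version against a healthy host in the same cluster makes it easier to spot a host that is still carrying the old agent.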
Tried removing the VIB using the command esxcli software vib remove -n vmware-fdm, which ended with the error below.
Error in running rm /tardisks/vmware_f.v00:
Return code: 1
Output: rm: can't remove '/tardisks/vmware_f.v00': No such file or directory
From the above error I determined that the FDM agent removal tries to delete the file mentioned above, but since the file is already missing or corrupted, the uninstallation fails, which in turn causes the reconfiguration of the FDM agent with the new version to fail.
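To confirm this diagnosis before changing anything, the presence of the file can be checked on both the impacted host and a healthy one. The helper below is a minimal sketch; the name check_tardisk is hypothetical, and the default path is simply the one taken from the error message.

```shell
# Hypothetical helper: reports whether the tardisk file exists.
# Defaults to the path from the error message if no argument is given.
check_tardisk() {
    file="${1:-/tardisks/vmware_f.v00}"
    if [ -e "$file" ]; then
        echo "present: $file"
    else
        echo "missing: $file"
        return 1
    fi
}

# Intended usage on each host:
#   check_tardisk
```

A "missing" result on the impacted host together with a "present" result on a healthy host matches the situation described here. Note that this only tests for existence; it does not detect a corrupted file.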
Note: To be on the safe side, ensure that no VMs reside on the ESXi host while you perform the steps below.
The customer had other hosts with a similar configuration and build.
Opened an SSH session to a healthy host, browsed to /tardisks and confirmed that the file vmware_f.v00 was present there. Copied it with cp to a shared datastore that the impacted host can access.
Went back to the impacted host, browsed to that datastore path and tried copying the file directly to its actual location, i.e. /tardisks, but that ended with an access denied error. To get around this, first copied vmware_f.v00 to the /tmp directory of the impacted host, then ran a mv command to move it to the actual path:
mv /tmp/vmware_f.v00 /tardisks
The file was moved successfully. Reconfigured the ESXi host for HA from the vSphere Client, and the agent was successfully installed and upgraded to the current version.
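The copy-then-move workaround used on the impacted host can be sketched as a single function. This is an illustration under assumptions, not a supported procedure: restore_tardisk is a made-up name, the staging and destination directories default to /tmp and /tardisks as in the steps above, and the source path on the shared datastore has to be supplied by you.

```shell
# Hypothetical helper: stage a file in a temporary directory first,
# then move it into place (a direct copy to /tardisks was denied in this case).
#   $1 = source file on the shared datastore (required)
#   $2 = staging directory (default: /tmp)
#   $3 = destination directory (default: /tardisks)
restore_tardisk() {
    src="$1"
    stage_dir="${2:-/tmp}"
    dest_dir="${3:-/tardisks}"
    name=$(basename "$src")

    if [ ! -f "$src" ]; then
        echo "source not found: $src" >&2
        return 1
    fi

    # Step 1: copy from the shared datastore to the staging directory.
    cp "$src" "$stage_dir/$name" || return 1
    # Step 2: move the staged copy to its actual location.
    mv "$stage_dir/$name" "$dest_dir/" || return 1
    # Show the result so it can be compared with the healthy host.
    ls -l "$dest_dir/$name"
}
```

On the impacted host this would be called as restore_tardisk /vmfs/volumes/<shared-datastore>/vmware_f.v00, where <shared-datastore> is a placeholder for whichever datastore the file was copied to from the healthy host. The HA reconfiguration from the vSphere Client still has to be triggered afterwards.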
Hope this article was helpful.