Site Recovery Manager fails to start the service with the error message “The host certificate chain is incomplete”
Whenever you think you understand a product reasonably well, it proves you wrong. I was in just such a situation when a customer reported that the SRM service was failing to start at the recovery site. As is my regular practice, I went through the DR logs to find the error backtrace, and found an interesting one.
For those who do not like to memorize the path, the log is located here:
C:\ProgramData\VMware\VMware vCenter Site Recovery Manager\Logs\VMware-dr.log
[11644 verbose 'LocalPbmServer' connID=4bfe] Attempting to connect
[09252 error 'LocalSsoServer.ConnHandler' connID=sso-admin-2cce] `anonymous-namespace'::ConnectHandler::GetContentComplete: Failed to retrieve admin content while connecting to SSO.Exception:
--> std::exception 'class Vmacore::Ssl::SSLVerifyException' "SSL Exception: Verification parameters:
--> PeerThumbprint: 62:01:94:4C:A5:94:D1:2D:CF:BC:83:66:A6:83:63:7A:05:E2:EA:07
--> ExpectedPeerName: vCenter.vmware.com
--> The remote host certificate has these problems:
--> * The host certificate chain is incomplete.
--> * unable to get local issuer certificate"
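If you would rather not eyeball a large log, the bulleted certificate problems can be pulled out with a few lines of Python. This is only a rough sketch: the function name certificate_problems is my own, and it assumes the continuation lines in the log file start with an ASCII "-->" followed by "*", as in the excerpt above.

```python
import re

# Assumption: problem lines look like '--> * <text>' and the last one may
# end with the closing double quote of the exception message.
PROBLEM_LINE = re.compile(r'^\s*-->\s*\*\s*(?P<problem>.+?)"?\s*$')


def certificate_problems(log_lines):
    """Return the bulleted certificate problems found in the given log lines."""
    problems = []
    for line in log_lines:
        match = PROBLEM_LINE.match(line)
        if match:
            problems.append(match.group("problem"))
    return problems


# Sample lines copied from the backtrace in this article.
sample = [
    '--> The remote host certificate has these problems:',
    '--> * The host certificate chain is incomplete.',
    '--> * unable to get local issuer certificate"',
]
print(certificate_problems(sample))
```

To run it against the real log, pass the open file object instead of the sample list.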
A service does not simply stop starting on its own without some change in the environment. When I asked the customer, he confirmed that the issue first appeared after the PSC and vCenter certificates had been replaced.
I checked and found that the thumbprint in the stack trace matched the thumbprint of the PSC certificate, yet the service still failed to start.
Since the log kept reporting “The host certificate chain is incomplete”, I was certain that SRM could not validate the chain of the PSC certificate (in this case the PSC was external). The second message, “unable to get local issuer certificate”, points the same way: the certificate of the issuing CA could not be found in the local trust store.
So the plan was to log in to the PSC, retrieve the root certificate, and add it to the Trusted Root Certification Authorities store on the SRM server.
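The PeerThumbprint in the log is 20 bytes long, which is the size of a SHA-1 digest, so the comparison can be reproduced from a PEM copy of the certificate. The sketch below is one way to do it, not the way I did it on the day; the function names are my own and it assumes you already have the certificate as PEM text.

```python
import hashlib
import ssl


def sha1_thumbprint(pem_text):
    """Return the colon-separated SHA-1 thumbprint of a PEM certificate."""
    # Decode the base64 body between the BEGIN/END markers into DER bytes.
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    digest = hashlib.sha1(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def same_thumbprint(first, second):
    """Compare two thumbprints, ignoring case, colons and spaces."""
    def normalize(value):
        return value.replace(":", "").replace(" ", "").upper()
    return normalize(first) == normalize(second)
```

Feed sha1_thumbprint() the PEM text of the PSC certificate and pass the result, together with the PeerThumbprint value from the log, to same_thumbprint().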
I logged in to the PSC and ran the command below to list the trusted root certificates.
PSC appliance
/usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOTS
Windows vCenter Server
"C:\Program Files\VMware\vCenter Server\vmafdd\vecs-cli" entry list --store TRUSTED_ROOTS
Alias : ec5c6f2b6e16a299f5a6d352ca93539812e1d727
Entry type : Trusted Cert
Certificate : -----BEGIN CERTIFICATE-----
I pasted the certificate content, from BEGIN CERTIFICATE through END CERTIFICATE (including both marker lines), into Notepad and saved the file as root.cer, then copied it to the SRM server.
I logged in to the SRM server at the recovery site and performed the procedure below.
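Copying the block by hand is error-prone, so if the command output has been saved to a text file, the certificate can be cut out with a short script instead. This is a minimal sketch under that assumption; extract_pem_blocks and save_certificate are names I made up, and the payload in the sample is a placeholder, not a real certificate.

```python
import re

# Matches everything from a BEGIN marker to the next END marker.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def extract_pem_blocks(cli_output):
    """Return every PEM certificate block found in the given text."""
    blocks = []
    for block in PEM_BLOCK.findall(cli_output):
        # Drop stray indentation or trailing spaces picked up while copying.
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        blocks.append("\n".join(lines))
    return blocks


def save_certificate(pem_block, path):
    """Write one PEM block to a file such as root.cer."""
    with open(path, "w", newline="\n") as handle:
        handle.write(pem_block + "\n")


# Placeholder output: the base64 line is dummy data for illustration only.
sample_output = (
    "Alias : example-alias\n"
    "Entry type : Trusted Cert\n"
    "Certificate : -----BEGIN CERTIFICATE-----\n"
    "aGVsbG8=\n"
    "-----END CERTIFICATE-----\n"
)
print(extract_pem_blocks(sample_output)[0])
```

If the store holds more than one trusted root, the list will contain several blocks, and you still have to pick the one that issued the PSC certificate.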
Opened the Microsoft Management Console (Start-->Run-->mmc).
File-->Add/Remove Snap-in-->Certificates-->Add-->Computer Account-->Local Computer-->Finish.
Certificates-->Trusted Root Certification Authorities-->Certificates.
Right-click-->All Tasks-->Import-->Next-->Browse and locate the above root certificate-->Next-->Finish.
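I used the MMC snap-in, but the same import can be scripted with the Windows certutil tool, which is handy when there are several SRM servers to fix. The sketch below is an alternative I am suggesting, not the procedure I followed; the function names are mine, and it assumes an elevated prompt on the SRM server with root.cer in the current directory.

```python
import platform
import subprocess


def build_import_command(cert_path):
    """Build the certutil call that adds a certificate to the machine Root store."""
    # "Root" is certutil's name for Trusted Root Certification Authorities.
    return ["certutil", "-addstore", "-f", "Root", cert_path]


def import_root_certificate(cert_path):
    """Run the import; requires Windows and administrative rights."""
    if platform.system() != "Windows":
        raise RuntimeError("certutil is only available on Windows")
    subprocess.run(build_import_command(cert_path), check=True)


if __name__ == "__main__":
    # Only prints the command; call import_root_certificate() to execute it.
    print(" ".join(build_import_command("root.cer")))
```

Keeping the command builder separate from the function that executes it makes it easy to review exactly what will run before touching the trust store.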
Finally, I ran the SRM installer in Modify mode, after which the service started successfully.
Hope this article was helpful. Watch out for more.