Channel: VMware Communities : Popular Discussions - VI: VMware ESX® 3.0

Understanding HA and DRS in ESX3.5 and VirtualCenter 2.5


We are running ESX 3.5 and VirtualCenter 2.5 with HA and DRS enabled. DRS in one of the clusters is set to partially automated mode.

 

We had a situation yesterday with a VM in the HA/DRS cluster (which has only one resource pool). One of our administrators needed to replace a bad memory module on one of the 3 hosts in the cluster, so he manually migrated all VMs off that host. Later, another administrator worked an issue with one of the VMs in the same cluster and powered that VM off and on again. Either he migrated the VM, or it somehow migrated by itself, to the host that had no VMs on it (the one whose memory module the first administrator was going to replace).

 

Note: since the cluster was in partially automated mode, I don't know why the VM would have migrated automatically to the host that had no VMs on it. I believe that, at most, a recommendation would have popped up instead. Regardless, the VM somehow ended up back on the host where no VMs should have been running.

 

Later that day, the first administrator shut down the host to replace the memory module (he did not put the host into maintenance mode). He did not check that there were no running VMs on the host; after all, he had manually migrated them all off earlier. From here, things get a bit blurred based on testimony and logs. From what I could tell, the sole VM on the shut-down host stayed on that host and was powered off. This raised an alert and obviously caused problems for the users who were working on that VM.

 

From what I have read about HA and DRS, HA uses a "worst case scenario" to determine failover capability. The calculation is based on the VMs running in the cluster once one or more of the hosts in the cluster have failed. It takes the largest CPU reservation of any VM running in the cluster and the largest memory reservation, and applies those two values to every running VM to calculate the total resources required, if I read that correctly. So, if that total exceeds what is available, some of the migrating VMs will not be allowed to power on because they do not meet the Admission Control requirements.
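
To make that reading concrete, here is a minimal Python sketch of the calculation as I understand it. It is only an illustration of the interpretation above, not VMware's actual algorithm: the `VM` and `Host` classes, the function names, and the assumption that the largest hosts are the ones that fail are all made up for this example.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    cpu_reservation_mhz: int
    mem_reservation_mb: int


@dataclass
class Host:
    name: str
    cpu_mhz: int
    mem_mb: int


def worst_case_demand(vms):
    """Apply the largest CPU and the largest memory reservation to every running VM."""
    if not vms:
        return 0, 0
    max_cpu = max(vm.cpu_reservation_mhz for vm in vms)
    max_mem = max(vm.mem_reservation_mb for vm in vms)
    return max_cpu * len(vms), max_mem * len(vms)


def capacity_after_failures(hosts, failures_tolerated):
    """Total capacity left over, assuming (pessimistically) the largest hosts fail."""
    ordered = sorted(hosts, key=lambda h: (h.cpu_mhz, h.mem_mb))
    survivors = ordered[: max(len(hosts) - failures_tolerated, 0)]
    return sum(h.cpu_mhz for h in survivors), sum(h.mem_mb for h in survivors)


def admission_ok(hosts, vms, failures_tolerated=1):
    """True if the worst-case demand still fits on the surviving hosts."""
    need_cpu, need_mem = worst_case_demand(vms)
    have_cpu, have_mem = capacity_after_failures(hosts, failures_tolerated)
    return need_cpu <= have_cpu and need_mem <= have_mem
```

Under this reading, a single VM with a large reservation inflates the requirement for every VM in the cluster, so a cluster that looks lightly loaded could still fail the check.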

 

If my interpretation of what I read is correct, then that may explain why the VM on the shut-down host did not power back on. Is that correct?

