VxRail Upgrades and Controlling VM Actions – Part 2 – Leveraging DRS Settings

As mentioned in the previous post, there are some deployments where customers would like slightly more control over their VxRail upgrades.  We’ve built a lot of automation into our minimal-click (it’s not quite one click, but it’s also not 10) upgrade process, but some customers don’t want or need the upgrade to be fully automated.  The good news is that we’ve found another way to get that control.

The other option for achieving a similar result is to change the DRS setting to “Partially Automated” or “Manual”, or to disable DRS altogether.  Each of these settings has a similar result during the upgrade; the real difference is what happens when the node comes back online.

In the lab, I tested both the “Manual” and “Partially Automated” settings.  The choice didn’t really matter much, but I wanted to validate that this was actually the case.  By default, DRS is set to “Fully Automated” on a fresh VxRail install.

DRS-Enabled.png

For the first test, I set DRS to “Manual” mode, in which DRS only makes recommendations for both power-on placement and migrations of the VMs.  Each recommendation then has to be either applied or ignored.  The first step was to make the change on the cluster.

DRS-Manual
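If you’d rather script this change than click through the vSphere Client, it can be sketched with pyVmomi along the following lines.  This is a minimal sketch under assumptions, not a tested procedure: it assumes pyVmomi is installed and that `cluster` is a `vim.ClusterComputeResource` you have already looked up from a connected vCenter session.  The helper names `drs_behavior_for` and `set_drs_automation` are hypothetical and only for illustration.

```python
# UI label in the vSphere Client -> DrsBehavior value in the vSphere API
DRS_BEHAVIORS = {
    "Manual": "manual",
    "Partially Automated": "partiallyAutomated",
    "Fully Automated": "fullyAutomated",
}


def drs_behavior_for(label):
    """Translate a vSphere Client label into the API enum name."""
    try:
        return DRS_BEHAVIORS[label]
    except KeyError:
        raise ValueError(f"unknown DRS automation level: {label!r}") from None


def set_drs_automation(cluster, label, enabled=True):
    """Reconfigure DRS on `cluster` and return the resulting vCenter task.

    Assumption: `cluster` is a vim.ClusterComputeResource from a live session.
    """
    # Imported lazily so the label mapping above can be used without pyVmomi.
    from pyVmomi import vim

    behavior = getattr(vim.cluster.DrsConfigInfo.DrsBehavior,
                       drs_behavior_for(label))
    spec = vim.cluster.ConfigSpecEx(
        drsConfig=vim.cluster.DrsConfigInfo(
            enabled=enabled,
            defaultVmBehavior=behavior,
        )
    )
    # modify=True applies only the fields set in the spec and leaves the
    # rest of the cluster configuration untouched.
    return cluster.ReconfigureComputeResource_Task(spec, modify=True)
```

Calling `set_drs_automation(cluster, "Manual")` before the upgrade and `set_drs_automation(cluster, "Fully Automated")` afterwards would mirror the manual steps described here; passing `enabled=False` corresponds to disabling DRS.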

After the change was complete, I started the upgrade on the cluster.  I just continued the upgrade from the previous post, so the first node upgraded here is Node 2.  Once the upgrade reached Node 2, VxRail Manager raised an alert, both in the upgrade details and across the top of the window, stating that “Automatic DRS is not configured” and that I needed to migrate the virtual machines for the upgrade to continue.

DRS-Alert

For the upgrade to continue, I needed to log into vCenter and manually move the VMs.  One nice benefit of using this method over VM affinity rules is that the maintenance mode request doesn’t time out after 60 minutes.

Once the VMs were successfully moved to another host in the cluster, the maintenance mode request was fulfilled and VxRail was able to continue on with the upgrade.

Migrate-Complete.png

The other option I tested was setting DRS to “Partially Automated”.  This didn’t really change how VxRail Manager reacted during the upgrade; I saw the same messages in VxRail Manager.  Also, if you happen to look at the task list, you’ll see it took me just over 4 hours to get back to the system and migrate the VMs off Node 3 in the cluster.

The screenshots above show the DRS mode the cluster was set to, the migration of the VMs from Node 3 to other nodes in the cluster, the node going into maintenance mode, and VxRail Manager showing the node being rebooted after it successfully entered maintenance mode.  When all was said and done, the cluster was fully upgraded from 4.7.000 to 4.7.001.

Upgrade Complete

In addition to showing different ways we can control VM actions, the other nice thing is that VxRail is resilient enough that I was able to upgrade the entire cluster while continuing to change the cluster’s settings along the way.  In this scenario, Node 1 was upgraded with VM affinity rules (discussed in the previous post), Node 2 was upgraded with DRS set to “Manual”, and Node 3 was upgraded with DRS set to “Partially Automated”.

None of the 3 approaches caused VxRail Manager to “start over” on the upgrade.  With VM affinity rules, I was able to choose “Retry”, as documented in that post, although it did take 60 minutes for the maintenance mode request to time out.  In that scenario, had I been watching it closely, I could have manually vMotioned the VMs.  When it came to Nodes 2 and 3, the system raised the alert that DRS wasn’t set to fully automated and that some user intervention was required, but it never canceled the request for maintenance mode.

After the testing, I think the best way to control your VM movement is by leveraging DRS settings.  This lets you control the VM movement without causing the upgrade to time out.  That said, I wouldn’t say that either way is “wrong”; it just comes down to what will work best for your environment.
