Introducing Cluster OS Rolling Upgrades in Windows Server 2016


Welcome to Microsoft Mechanics. Coming up on the show we take a look at how you can
upgrade any Windows Server 2012 R2 cluster to Windows Server 2016 with minimal or no downtime and without the need for additional hardware. We’ll show you how it all works, demonstrate the steps to upgrade, and show how you can automate the
process with Virtual Machine Manager. I’m joined by Rob Hindman from the High
Availability and Storage Team, welcome. Thanks Matt, great to be here. So it used to be that you needed to stand up a current and next-gen cluster to upgrade. So two separate clusters, management overhead, extra hardware and it was quite
challenging to upgrade. How are we solving for that with the new
cluster OS rolling upgrade feature? With this new capability, we’re focusing on minimal or no interruptions to the workload that’s running in the cluster. So customers can upgrade that cluster in place. With a Hyper-V or Scale-Out File Server workload, there’s zero downtime. So, you don’t need to buy new
hardware at all. So it sounds like we’re streamlining the whole process, making it easier for admins to stay current and ensure that they can maintain
standardized server configurations. But how does it work? So in the past you needed to buy a new set of hardware with equal or greater capacity, stand up that new cluster, and then bring down the workload
and move it over to the new cluster. That was disruptive. So, now in this case you can upgrade in place. If we use an analogy with glasses of water where the glasses are cluster nodes, what you do is you take one of those nodes, you drain the water out of that and shift it over to the other glasses. Then you can move that node
or that glass out of the cluster. You can upgrade it to Windows Server 2016. And then you can move it back into the same cluster. And now you enter Mixed OS Mode, where that 2016 node is fully functional
and compatible with the 2012 R2 nodes in that cluster. And then you can distribute
the water across all the nodes. And then you can bring the water back and the water is balanced across those glasses or across the nodes in the cluster. Right, but in some cases you
might not have the capacity to upgrade. Or, you might want to allow for failover
during the upgrade process. Absolutely right. In that case, customers can actually add additional hardware temporarily into the cluster. So, it would be like adding a glass and you can then drain the workload
into that new glass and then you can pull that 2012 R2 node
or glass out of the cluster, upgrade it and then push it back up into the cluster. So is that the end of the process,
is the upgrade then complete? You have to do that for every node in the cluster. You have to upgrade it to 2016. Once you’re done there, then you
update the cluster functional level to 2016. That way you can take advantage
of any 2016 features and capabilities? Absolutely. Right, can we see it in action? Of course. So, starting here with Windows Server 2016. What I’m going to do is launch the
Failover Cluster Manager UI. And this needs to be done on 2016? Right. Now, I’m connecting to the cluster
that we have running here. This is a 2012 R2 cluster. It has three Windows Server 2012 R2 nodes in it, and if we click on these nodes we
can see their versions right there. So, these are all 2012 R2 nodes at the moment. These are Hyper-V machines, let’s go look at the VMs. So here you can see we have three VMs
that are running SQL Server. Now let me start an update for you. Here I’ve written this update loop. Let’s start it so you can see
that we’re going to run this VM as we upgrade the cluster that’s
underneath it. So a SQL Query executing while we upgrade
the cluster and no downtime? Should be no downtime. Now as we move over, let’s grab a node and we’ll do a pause
drain on that node. And as we do that you can see,
we have live migration occurring. So that’s kicked off automatically for us. And those VMs are now moving
over to the remaining nodes, or to those remaining glasses. And live migration, there’s no
downtime for the workload? There’s no downtime, so that
update loop is still running for us. And now that’s completed. Now if we look over here at the node,
we can see that is paused. Now we can evict that node, so let’s do that. So now the cluster just has these two
Windows Server 2012 R2 nodes in it. And we can see that everything’s running. Now let’s add a 2016 node into that cluster. Now, we could have gone and upgraded
that node in place with Windows Server 2016 and
re-added it to the cluster. But we’re using additional hardware here. For the sake of time, I’ve taken an identical system that has Windows Server 2016 on it. We’re not going to validate as we add
this node into the cluster, just for the sake of time. That would be required for a supported configuration? Yes, absolutely, and recommended as well. So adding the node takes about 20 seconds. And then finally we’re done with it. Now this cluster is in mixed OS mode. We have one Windows Server 2016 node in here, with the other two Windows Server 2012 R2 nodes. Ok, so we have this mixed OS mode cluster, but can we continue to run like this? Of course. We ask customers to run it like this for up to four weeks. In this case I’m going to live migrate that VM uplevel
to the Windows Server 2016 Node. Right. And there, that succeeded. So this is basically the same process
that we do on all the other nodes. Right. So mixed OS mode is still running at
the 2012 R2 functional level, correct? Correct, and so the next step is to run PowerShell, once we like it, to upgrade the
cluster fully to Windows Server 2016. At this point, we still could roll back? We absolutely can still roll back,
there’s no problem there. Ok, so can you show us how to
upgrade the cluster functional level? Sure, let’s do that. Remember, we do the operation on each node in sequence: we drain it, evict it, upgrade it to 2016, re-add it to the cluster,
and rebalance the workloads. Then we’re ready to run the PowerShell. So let’s go ahead and do that. So we just open up a PowerShell prompt here, like this. The first command
I’m going to run is a query. We’re going to get the cluster functional level. So we run that query and you can see that
the cluster functional level is eight. That means it’s Windows Server 2012 R2, right? The Update-ClusterFunctionalLevel
cmdlet itself is very quick. It only takes a few seconds. We’ve just now upgraded the cluster. And now we’re going to look at the
cluster functional level again. We see that it’s nine, which means
Windows Server 2016 compatibility. Ok. What about our SQL Server VM? Let’s go see that. So here on the roles, let’s just connect to that VM. Ok, here you can see the query is still
running and we just have to scroll down. And there you can see that it should just update. Yep, and we just upgraded the entire
cluster underneath that. Nice, so now that the entire process is complete what 2016 benefits can people expect to see? There are several. So in Windows Server 2016 Failover
Clustering there’s the Cloud Witness, we have VM compute resiliency
and VM storage resiliency. There’s Storage Replica and
of course Storage Spaces Direct. And many more? And others. So, is there anything else that we
need to do now that the cluster is fully upgraded? For a Scale-Out File Server workload, we’re done. But for a Hyper-V workload, what we should do is go and update the VM configuration versions so that we can take advantage of the
new capabilities of Hyper-V. Ok, so like Shielded VMs, Production
Checkpoints and many more? Exactly, and Production Checkpoints give you application-consistent snapshots, and that’s a big deal. Ok, so that was how to migrate nodes
on a smaller scale. But if you’ve got a 64-node cluster, that manual approach is going to take
a significant amount of time. So, could I automate this with PowerShell? Absolutely. So this process is fully automatable. So you could easily write PowerShell
automation to automate this process. Or, you can use a higher level orchestrator to orchestrate and make
those PowerShell calls. Nice. So there is one other alternative that you could use and that could be Virtual Machine Manager. So, let me show you what that looks like to
automate the process of the upgrade. So what we have here is Virtual Machine
Manager in System Center 2016. I’ve got a three-node cluster and a selection of
VMs distributed across that cluster. Now in my fabric view, if we take a look in a little bit
more depth at those nodes, you’ll see they’re all running Windows
Server 2012 R2 Datacenter edition. So there’s node two, node three, all
configured the same. Now in order to kick off the upgrade process I’m going to right-click the cluster and select Upgrade Cluster. I’m going to select all of my nodes in this wizard. A very simple approach. Now the next key step is to choose
a physical computer profile. This is a VMM object that essentially allows us to control the customization of the target
physical machine upon deployment. So what storage, what networking
configuration, and so on. We’re going to select a physical
computer profile for 2016. And then we’re going to move through the wizard. So click next. I’m going to communicate with these systems using the out-of-band management controller specified here. This is now at the stage in the wizard
when I get to customize the final steps of the process. So VMM has already looked at
all of these three nodes within the cluster. And, it’s captured all of the settings. The network config and the storage config. So we don’t actually have to change anything
unless we really wanted to change it. I can just leave things as they are and move forward. We’ve got a nice summary of all of the
steps that are gonna be performed. If I enter PowerShell I can view the script
that’s going to be executed under the covers. I could save that or I could
reuse it if I wanted as well. All that’s left for me to do is click finish and the job starts. Now to upgrade three physical servers in a cluster VMM is going to do a series of steps. That could take anywhere from 20 minutes
to an hour to even longer depending on how many cluster nodes you’ve got. It depends on your hardware,
the systems, the network and so on. Now if we look at one that we’ve already got upgraded we can see just the sheer number of steps that VMM goes through to actually orchestrate
this upgrade in an automated way. And I’m just looking at one of the upgraded nodes here. So, if you look at node three, you’ll see the start maintenance mode step is analogous to the drain that you showed earlier. We then evacuate all of the virtual machines
using live migration to move them from node three to other nodes in the cluster. We evict the node and we clean it up. We then start the process of
actually deploying a new node: a new operating system to that physical server, using that baseboard management
controller communication. You’ll see here that transfer VHD step. That’s important because VMM
is using boot from VHD, deploying a virtual disk to that target server, and starting up from that native VHD boot. And that VHD contains Windows Server 2016. You’ll see we continue on, enabling the Hyper-V role and
the clustering feature functionality, configuring all of the networking settings
and relevant storage settings as well. And then we finalize the process by updating that cluster functional level and then running validation for
a supported configuration. And that is your cluster upgraded. And VMM controls all of that, which is awesome. Now, it’s important to note that
there are some system requirements for using VMM. We recommend that you review
the documentation for VMM for further guidance for automating your cluster
OS rolling upgrades using the link below. The key question is,
where is all this going Rob? Moving forward this is the process
that we envision that you will use to move your clusters from Windows
Server 2012 R2 to 2016. We’ve developed this approach with a lot of feedback and we’re pretty confident that
we’ve hit the mark. We think that it will really allow you
to smoothly upgrade your clusters from 2012 R2 to 2016 with minimal downtime. Thanks Rob, great overview. You can learn more about cluster OS Rolling
upgrades at the link below. And don’t forget to keep watching Microsoft
Mechanics for the latest tech updates. Bye for now.
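
The rolling upgrade loop described in the transcript (drain a node, evict it, upgrade it, re-add it, rebalance, and only then raise the cluster functional level) can be sketched as a small Python model. This is an illustrative simulation only, not the Failover Clustering API: the `Node` and `Cluster` classes and the `rolling_upgrade` function are hypothetical names, and the "least-loaded node" placement rule is an assumption made for the sketch. The only values taken from the demo are the functional levels 8 (Windows Server 2012 R2) and 9 (Windows Server 2016).

```python
from dataclasses import dataclass, field

# Functional level values quoted in the demo.
LEVEL_2012R2 = 8
LEVEL_2016 = 9


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a cluster node (a 'glass' in the analogy)."""
    name: str
    os_level: int = LEVEL_2012R2
    vms: list = field(default_factory=list)


@dataclass
class Cluster:
    """Hypothetical stand-in for a failover cluster."""
    nodes: list
    functional_level: int = LEVEL_2012R2

    def drain(self, node):
        # Stands in for pause/drain with live migration: move every VM
        # to whichever remaining node currently hosts the fewest VMs.
        others = [n for n in self.nodes if n is not node]
        if not others:
            raise RuntimeError("no remaining node to receive the workload")
        while node.vms:
            target = min(others, key=lambda n: len(n.vms))
            target.vms.append(node.vms.pop())

    def rebalance(self):
        # Move VMs from the busiest to the idlest node until the
        # VM counts differ by at most one.
        while True:
            busiest = max(self.nodes, key=lambda n: len(n.vms))
            idlest = min(self.nodes, key=lambda n: len(n.vms))
            if len(busiest.vms) - len(idlest.vms) <= 1:
                return
            idlest.vms.append(busiest.vms.pop())

    def is_mixed_mode(self):
        # Mixed OS mode: nodes on more than one OS version.
        return len({n.os_level for n in self.nodes}) > 1

    def update_functional_level(self):
        # The level is raised only once every node runs the new OS.
        if any(n.os_level != LEVEL_2016 for n in self.nodes):
            raise RuntimeError("every node must run Windows Server 2016 first")
        self.functional_level = LEVEL_2016


def rolling_upgrade(cluster):
    """Upgrade one node at a time, then raise the functional level."""
    for node in list(cluster.nodes):
        cluster.drain(node)            # workload moves to the other nodes
        cluster.nodes.remove(node)     # evict the drained node
        node.os_level = LEVEL_2016     # upgrade (or reinstall) it
        cluster.nodes.append(node)     # re-add it: mixed OS mode until the last node
        cluster.rebalance()            # spread the workload again
    cluster.update_functional_level()
```

Calling `rolling_upgrade` on a three-node cluster with one VM per node, as in the demo, keeps every VM placed on some node at each step and raises the functional level only after the last node has been upgraded. The guard in `update_functional_level` mirrors the point made in the conversation that the cluster stays at the 2012 R2 functional level while it is in mixed OS mode.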

One Reply to “Introducing Cluster OS Rolling Upgrades in Windows Server 2016”

  1. Brian Petersen says:

    Thanks guys, that was a great overview! I'm really excited to get started upgrading my clusters.
