Tuesday, March 21, 2017

Server 2016 Failover Clustering Hands-On with the New Features


This lab will take you through some of the scenarios customers will be running with the release of Windows Server 2016 and Failover Clustering.  The scenarios we will run through are:

1.       Creating a Storage Spaces Direct (S2D) Cluster with a Scale-Out File Server (SOFS) role
2.       Rolling Cluster Upgrade of a Windows Server 2012 R2 Hyper-V Failover Cluster to a Windows Server 2016 Failover Cluster
3.       Experiencing Virtual Machine Compute Resiliency
4.       Setting up Virtual Machine Sets
5.       Setting up Virtual Machine Start Ordering
6.       Running a new diagnostic command for collecting data (Get-ClusterDiagnosticInfo)

For these sessions, we have the following machines:

DC = Domain Controller and iSCSI Target

2016-NODE3 = Server 2016 server for the Storage Spaces Direct SOFS; has four 11 GB drives attached
2016-NODE4 = Server 2016 server for the Storage Spaces Direct SOFS; has four 11 GB drives attached
2016-NODE5 = Server 2016 server for the Storage Spaces Direct SOFS; has four 11 GB drives attached
2016-NODE6 = Server 2016 server for the Storage Spaces Direct SOFS; has four 11 GB drives attached

2012R2-NODE1 = Windows 2012 R2 Failover Cluster currently running
2016-NODE1 = Server 2016 node that will be used for the rolling Cluster upgrade
2016-NODE2 = Server 2016 node that will be used for the rolling Cluster upgrade

All machines are joined to the CONTOSO.COM domain.  The account we will be using is below and is a member of the Domain Admins group:

Account: Contoso\Ignite
Password: Password1

If necessary, the Domain Administrator account is:

Account:  Contoso\Administrator
Password:  Password1





Scenario 1:  Storage Spaces Direct Cluster with SOFS

Storage Spaces Direct (S2D) is a new feature in Windows Server 2016 Failover Clustering that makes use of local direct-attached storage instead of shared storage.  Storage Spaces Direct Clusters are deployed and managed with PowerShell.

For this scenario, we will be using the following machines:

2016-NODE3
2016-NODE4
2016-NODE5
2016-NODE6

For this scenario, we will be creating a Storage Spaces Direct Cluster using the Scale-Out File Server role.  Each of the 4 servers has 4 drives. 

1.       Log on to the 2016-NODE3 machine using the Contoso\Ignite account
2.       From the Start Menu, right mouse click on the Windows PowerShell icon and choose More and Run as Administrator.
3.       The Failover Clustering feature is already installed on the Servers, so we must create a Cluster but ensure we do not add any storage to it. 
4.       We need to ensure that we can run active scripts and that the proper PowerShell modules are loaded.  Please run the following commands:
Set-ExecutionPolicy -ExecutionPolicy Unrestricted -Force
Import-Module Storage
Import-Module FailoverClusters
5.       From the local PowerShell window, run the command:
New-Cluster -Name S2D-CLUSTER -Node 2016-NODE3,2016-NODE4,2016-NODE5,2016-NODE6 -StaticAddress 1.0.0.45
a.       Once the cmdlet completes, you will receive a warning about issues creating the Cluster.  This is fine as the warning is regarding not being able to create a witness resource.
b.       Since the recommendation is to have a witness resource, we should create one.
                                                               i.      Open Failover Cluster Manager
                                                             ii.      In the left column, right mouse click on the name of the Cluster
                                                           iii.      Select More Actions and Configure Cluster Quorum Settings
                                                            iv.      Choose Next on the Before You Begin page
                                                             v.      Choose Select the quorum witness and Next
                                                            vi.      Select Configure a file share witness and Next
                                                          vii.      In the File Share Path box, input \\DC\FSW and Next
                                                        viii.      Choose Next on the Confirmation page
                                                            ix.      Choose the Finish button
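If you prefer PowerShell, the witness configuration steps above can be done in a single command (assuming the \\DC\FSW share already exists):

```powershell
# Configure a file share witness for the cluster in one step.
# Set-ClusterQuorum is part of the FailoverClusters module.
Set-ClusterQuorum -Cluster S2D-CLUSTER -FileShareWitness \\DC\FSW
```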
6.       Ensure that you have the Cluster created with the following commands:
Get-Cluster -Name S2D-CLUSTER
Get-Cluster -Name S2D-CLUSTER | Get-ClusterResource
7.       Log onto each Cluster node (2016-NODE3, 2016-NODE4, 2016-NODE5, and 2016-NODE6) and go into Disk Management.  Ensure that each of the 4 additional drives are Online.  Once they are online, ensure they are initialized as GPT disks.  Do not create any partitions or volumes.
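As an alternative to clicking through Disk Management on each node, the same preparation can be scripted with the Storage module.  A sketch to run on each node; note it targets every uninitialized (RAW) disk it finds:

```powershell
# Bring the additional drives online and initialize them as GPT,
# without creating any partitions or volumes.
Get-Disk | Where-Object PartitionStyle -eq 'RAW' | ForEach-Object {
    Set-Disk -Number $_.Number -IsOffline:$false
    Set-Disk -Number $_.Number -IsReadOnly:$false
    Initialize-Disk -Number $_.Number -PartitionStyle GPT
}
```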
8.       We now need to set this up to be a Storage Spaces Direct Cluster.  To do this, run the following PowerShell command, which will take a little while.  We will be using -SkipEligibilityChecks since this is a virtual environment and would otherwise receive warnings about disk types and counts, which can be ignored.  It may appear to stall in the mid-20% range, but it will continue to run.
Enable-ClusterStorageSpacesDirect -SkipEligibilityChecks
9.       In the window, run this command:
Get-StorageSubSystem | Where-Object {$_.FriendlyName -like "*cluster*"}
10.   You will see the name Clustered Windows Storage on S2D-CLUSTER, which is what we will be using.  To see how many disks are available, run the command below; it should show 16 (4 nodes with 4 drives each).
(Get-StorageSubSystem -FriendlyName "*cluster*" | Get-PhysicalDisk).Count
11.   When Enable-ClusterStorageSpacesDirect runs, it creates a Storage Pool out of the disks that are available.  The next step is to create a virtual disk, but first you will want to see the Friendly Name of this pool with the command:
Get-StoragePool
12.   In Failover Cluster Manager, if you go to Storage / Pools, you will see the Cluster Pool 1 that was created.  Highlight it and choose the Summary and Physical Disks tabs to see the disks involved.
13.   To create the virtual disk (called VDISK) from this pool, the command is below.  This creates a 3-way mirror named VDISK, 20 GB in size (-PhysicalDiskRedundancy 2 means it tolerates two disk failures, i.e. a 3-way mirror).  It will also automatically be added to the Cluster as a resource.
New-Volume -StoragePoolFriendlyName "S2D on S2D-CLUSTER" -FriendlyName VDISK -PhysicalDiskRedundancy 2 -FileSystem CSVFS_ReFS -Size 20GB
14.   In Failover Cluster Manager, if you go to Storage / Disks, you will see Cluster Virtual Disk (VDISK) that was created. 
15.   Our next step is to create a Scale-Out File Server.  However, we first need to add the File Server Role to the Servers.  From Failover Cluster Manager:
a.       Right mouse click on Roles
b.       Choose Configure Role to begin the wizard
c.        In the list of roles, choose File Server
d.       In the File Server Type window, select Scale-Out File Server for application data
e.       In the Client Access Point window, input S2D-SOFS
f.        Choose Next on the Confirmation page.
16.   Go under Roles and you will see the new role you created.
17.   We now must create a share for the role.
a.       In Failover Cluster Manager, go to Roles and click on S2D-SOFS
b.       In the far right pane, choose Add File Share
NOTE: If you receive a popup about the Client Access Point not being ready, switch to the DC and Stop/Start the DNS Client Service.
c.        On the Share Profile page, select SMB Share – Applications and Next
d.       On the Share Location page, choose Select by volume and ensure C:\ClusterStorage\Volume1 is highlighted and Next.
e.       In the Share name box, input VM-SHARE and Next.  Note the Local and Remote paths.
f.        On the Other Settings page, Enable continuous availability is already selected for you.  Choose Next.
g.       On the Specify permissions to control access page, choose Customize Permissions.
h.       On the Advanced Security Settings page, choose Add.
i.         On the Permission Entry page, choose Select a principal.
j.         In the Select User, Computer, Service Account, or Group dialog box, click Object Types and select only Computers.
k.       In the object name box, input the name of the Cluster S2D-CLUSTER, click Check Names and OK.
l.         In the Basic permissions box, select Full control, then click OK twice.
m.     On the Specify permissions to control access page, choose Next and Create.

You now have your Windows Server Storage Spaces Direct Cluster with a Scale-Out File Server containing the share \\S2D-SOFS\VM-SHARE.
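The share-creation wizard steps can also be approximated from PowerShell.  A sketch, assuming the default folder layout the wizard uses under C:\ClusterStorage\Volume1 (your path may differ):

```powershell
# Create a continuously available SMB share for application data and grant
# the cluster computer account full control. New-SmbShare sets the share
# ACL; Set-SmbPathAcl pushes matching permissions to the folder.
New-Item -Path C:\ClusterStorage\Volume1\Shares\VM-SHARE -ItemType Directory
New-SmbShare -Name VM-SHARE -Path C:\ClusterStorage\Volume1\Shares\VM-SHARE `
    -FullAccess 'CONTOSO\S2D-CLUSTER$' -ContinuouslyAvailable $true
Set-SmbPathAcl -ShareName VM-SHARE
```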









Scenario 2: Rolling Cluster Upgrades

Failover Cluster Operating System (OS) Rolling Upgrade is a new feature in Windows Server 2016 that enables an administrator to upgrade the operating system of the cluster nodes from Windows Server 2012 R2 to Windows Server 2016 without stopping the Hyper-V or the Scale-Out File Server workloads. Using this feature, the downtime penalties against Service Level Agreements (SLA) can be avoided for the Hyper-V and Scale-Out File Server workloads.

In this scenario, the currently running Windows Server 2012 R2 Cluster is a Hyper-V Cluster with a highly available virtual machine on it.  Do not start the virtual machine, as it will not start (the reason is explained later in this scenario).  To take advantage of the Storage Spaces Direct Cluster we just created, we will move the virtual machine storage (VHDX) to it and perform the rolling Cluster upgrade.

For this scenario, we will be using the following machines:

2012R2-NODE1
2016-NODE1
2016-NODE2

The basic premise here is to add the Server 2016 nodes, move the resources over, remove the 2012R2 node, and update the Cluster functional level.  When you are going to be in mixed mode with the versions, you will always want to use the tools from the higher version.

1.       Log on to the 2016-NODE1 machine using the Contoso\Ignite account.
2.       Go into Failover Cluster Manager:
a.       Start Server Manager from the icon on the Task Bar or run Servermanager.exe
b.       Choose the Tools menu
c.        Select Failover Cluster Manager
3.       We will then need to connect to the Cluster if a connection is not already made.
a.       From Failover Cluster Manager, select Connect to Cluster under either the Management window or the Actions pane
b.       In the Cluster name box, input the name of the Cluster, which is Ignite-Rolling, and select OK.
4.       In the far left pane under the name of the Cluster, you will see Nodes.  Going to it, you will see you have only a Windows Server 2012 R2 node.  We will need to add the Windows Server 2016 nodes to the Cluster.
5.       Right mouse click on Nodes and select Add Node.
6.       Click Next past the Before You Begin page.
7.       On the Select Servers page, add 2016-NODE1 and 2016-NODE2, then choose Next.
8.       On the Validation Warning page, choose No and Next.
9.       On the Confirmation page, choose Next.
10.   Click Finish on the Summary page.
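The same node addition can be scripted from PowerShell (run from a 2016 node, since in mixed mode you should use the tools from the higher version):

```powershell
# Add both Windows Server 2016 nodes to the running 2012 R2 cluster.
Add-ClusterNode -Cluster Ignite-Rolling -Name 2016-NODE1,2016-NODE2
```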

You will be returned to Failover Cluster Manager under Nodes, and you should now see all three nodes listed.  With two different versions of Windows participating in the Cluster, it is known as a mixed-mode Cluster, also known as Cluster Functional Level 8.  While in this mode, Windows Server 2012 R2 nodes can be removed and added back as needed; however, you cannot yet take advantage of any of the new features that Windows Server 2016 Failover Clustering provides.

This lab is a virtual environment and uses Nested Virtualization.  Nested virtualization only works with Server 2016 and Windows 10; it will not work with Server 2012 R2, which is why I mentioned previously not to turn on the VM running on the Server 2012 R2 node.

In a physical environment, our next step would be to live migrate the virtual machine to a Server 2016 node, so we will simulate that here. 

1.       Switch back to 2016-NODE1.
2.       In Cluster Manager, choose Roles
3.       Right mouse click on the Nano VM
4.       Choose Move and choose Quick Migration then Select Node.  Normally, you would choose Live Migration so that it moves over while it is up and production is not affected.
5.       In the list, select 2016-NODE1.
6.       Once it moves, right click on it and select Start.
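These move-and-start steps can also be scripted.  A sketch, where Nano is the name of the clustered VM role:

```powershell
# Quick-migrate the Nano VM role to 2016-NODE1, then start it.
# In production you would use -MigrationType Live to avoid downtime.
Move-ClusterVirtualMachineRole -Name Nano -Node 2016-NODE1 -MigrationType Quick
Start-ClusterGroup -Name Nano
```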

To get the Cluster up to the Server 2016 level and take advantage of the new features, all Server 2012 R2 nodes must be removed and the functional level needs to be updated.  One thing to note is that Server 2016 Failover Clustering will not allow the functional level to be updated while a Server 2012 R2 node still exists.

To see this error, attempt to update the functional level.

1.       While still on 2016-NODE1, run PowerShell as an administrator
2.       Run the command Get-Cluster | fl ClusterFunctionalLevel and you will see it is at Level 8
3.       Now let's try to increase the level with the command Update-ClusterFunctionalLevel, and you will see an error.
4.       We will need to evict the Server 2012 R2 node from the Cluster.  Since we are in PowerShell, you can run the command Remove-ClusterNode -Name 2012R2-NODE1 and select Yes to evict it.
5.       Run Update-ClusterFunctionalLevel again and it will now succeed.
6.       Run Get-Cluster | fl ClusterFunctionalLevel and you will see that it is now 9, the Server 2016 level.
7.       With it at this level, a Server 2012 R2 node will be blocked.  To see this, log onto the 2012R2-NODE1 machine.
8.       Run Failover Cluster Manager and try to connect to the Ignite-Rolling Cluster.  It will fail with an error.

9.       Switch over to 2016-NODE1 and, in Failover Cluster Manager, right mouse click on Nodes and choose Add Node to try and add the 2012R2-NODE1 machine.
10.   This is also going to fail.

Now we have the Server 2016 Cluster fully functional with the Nano VM up and running.






Scenario 3: Virtual Machine Compute Resiliency
========================================

For this next scenario, we are going to see how Virtual Machine Compute Resiliency works and what it is.  In previous versions of Hyper-V Clusters, if a node is removed from Cluster membership due to transient network failures, any virtual machines that were running on it crash and are moved to a new node and restarted.  This causes downtime even if the virtual machine’s VHD/VHDX files are located on a share and accessible.  

With VM Compute Resiliency, if the VHD/VHDX is still accessible, the virtual machine will continue to run in an unmonitored state.  The node is allowed to rejoin the Cluster several times (configurable).  If it continues to have issues, it is quarantined and not allowed to join back automatically for 2 hours.  When a node is quarantined, its virtual machines are Live Migrated to another node.
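The resiliency behavior is controlled by cluster common properties, which you can inspect and tune from PowerShell.  The default values in the comments are an assumption to verify against your build:

```powershell
# View the compute resiliency and quarantine settings on the cluster.
Get-Cluster | Format-List ResiliencyLevel, ResiliencyDefaultPeriod, QuarantineThreshold, QuarantineDuration
# ResiliencyDefaultPeriod - seconds a node may stay Isolated (default 240)
# QuarantineThreshold     - failures before quarantine (default 3)
# QuarantineDuration      - seconds quarantined (default 7200, i.e. 2 hours)
(Get-Cluster).QuarantineThreshold = 3
```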

So for this scenario, we will crash the Cluster Service on a node the VM is running on multiple times and you will see the VM is still accessible.

1.       Log onto 2016-NODE1
2.       Run Failover Cluster Manager and go to Roles
3.       Ensure that the Nano VM is running on 2016-NODE1.  If it is not, right mouse click on it and select Live or Quick Migration to move it to 2016-NODE1
4.       Right mouse click on Nano and choose Connect and you will be taken to the Recovery Console.  Leave it where it is.
5.       Start Failover Cluster Manager and go to Nodes
6.       Adjust the windows so that you can see the Nano VM and Failover Cluster Manager / Nodes, with PowerShell in front.
7.       Run PowerShell as an Administrator
8.       Run Stop-Process -ProcessName Clussvc* -Force, which will kill the Cluster Service.  Notice in Failover Cluster Manager that 2016-NODE1 is Isolated.  If you switch to Roles, you will see the VM is still running but Unmonitored, and still at the log in prompt
9.       The Cluster Service will restart on its own and all is back to normal.  Run Stop-Process -ProcessName Clussvc* -Force again to kill the Cluster Service
10.   The same thing will occur, with 2016-NODE1 going Isolated and restarting again
11.   Run Stop-Process -ProcessName Clussvc* -Force again to kill the Cluster Service
12.   This time, you will notice that the node is now Quarantined and no longer an active part of the Cluster
13.   Switch over to Roles and you will see the Nano VM, which was on 2016-NODE1, Live Migrate over to 2016-NODE2.
14.   To get 2016-NODE1 back into the Cluster, you must start the Cluster Service with a switch that clears the quarantine.  The command is Start-ClusterNode -ClearQuarantine (or its short form, -CQ).







Scenario 4: Virtual Machine Sets and Start Ordering
===========================================
A new concept is being introduced in Windows Server 2016 called a "Set".  A set can contain one or more cluster groups, and sets can have dependencies on each other.  This enables creating dependencies between cluster groups to control start ordering.  While Sets are primarily aimed at virtual machines, they are generic cluster infrastructure that will work with any clustered role.



In this scenario, which combines Sets with Start Ordering, we are going to create virtual machine sets that group virtual machines together.  We will then create dependencies between the sets to order their startup.

To do this, we will want to create several VMs.  However, these VMs will not have any operating system on them, as they do not need one for this exercise.

1.       Log onto 2016-NODE1 and start Failover Cluster Manager
2.       Go to Roles and make sure all the VMs are off
3.       In the Right Column, click Virtual Machines and New Virtual Machine
4.       Select 2016-NODE1 and OK
5.       Run through the wizard choosing these options
a.       VM Name: APP1
b.       Store the VM: \\S2D-SOFS\VM-SHARE
c.        Generation: 2
d.       Startup Memory: 128 MB
e.       Network Connection: Not Connected
f.        Virtual Disk: Leave Defaults
g.       Install an operating system later
h.       Finish
6.       Do the same for creating APP2, DB1, and DB2 so we have a total of 5 VMs
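Creating the extra VMs with the wizard gets repetitive, so here is a hedged PowerShell alternative mirroring the wizard choices above (the VHDX size is illustrative; no OS is installed):

```powershell
# Create four empty Generation 2 VMs on the SOFS share and make each one
# highly available in the cluster.
foreach ($name in 'APP1','APP2','DB1','DB2') {
    New-VM -Name $name -Path \\S2D-SOFS\VM-SHARE -Generation 2 `
        -MemoryStartupBytes 128MB -NewVHDPath "\\S2D-SOFS\VM-SHARE\$name.vhdx" `
        -NewVHDSizeBytes 40GB
    Add-ClusterVirtualMachineRole -VMName $name
}
```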

Now that we have created the VMs, we need to create the sets and add the VMs to them.  This is all done from an administrative PowerShell prompt.
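A sketch using the Set cmdlets that shipped with Windows Server 2016 (New-ClusterGroupSet / Add-ClusterGroupToSet); the set names DBSet, AppSet, and NanoSet are my own choices, so verify the cmdlet names in your build:

```powershell
# Create one set per tier and add the clustered VM roles to them.
New-ClusterGroupSet -Name DBSet
New-ClusterGroupSet -Name AppSet
New-ClusterGroupSet -Name NanoSet
Add-ClusterGroupToSet -Name DBSet   -Group DB1
Add-ClusterGroupToSet -Name DBSet   -Group DB2
Add-ClusterGroupToSet -Name AppSet  -Group APP1
Add-ClusterGroupToSet -Name AppSet  -Group APP2
Add-ClusterGroupToSet -Name NanoSet -Group Nano
```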


To view all the sets that have been created, you would run the command:
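Assuming the Windows Server 2016 Set cmdlets (verify in your build):

```powershell
# List every cluster group set defined on this cluster.
Get-ClusterGroupSet
```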


Once we have created the sets, we need to set the dependencies.  For this, we are going to make the Nano set depend on the App set.  Then the App set depend on the DB set.
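With the 2016 Set cmdlets (an assumption to verify in your build; DBSet, AppSet, and NanoSet are hypothetical set names), the dependencies would be created like this:

```powershell
# Make the Nano set depend on the App set, and the App set on the DB set,
# so the DB tier starts first, then App, then Nano.
Add-ClusterGroupSetDependency -Name NanoSet -Provider AppSet
Add-ClusterGroupSetDependency -Name AppSet  -Provider DBSet
```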


To see the set dependencies defined, it would be:
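Again assuming the 2016 Set cmdlets:

```powershell
# Show which set depends on which provider set.
Get-ClusterGroupSetDependency
```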


Once all the dependencies are set, your VMs will start up in the order that they should.  By default, the start trigger is a delay, with 20 seconds between each tier.  Another available trigger is waiting until the providing VMs are online, but for our purposes we will stick with the delay.

To see this in action, go to Failover Cluster Manager, right mouse click the Nano VM, and select Start.  You will see that the Nano VM shows "Starting" but first triggers the dependency tree.

It will start at the top, which is DB1 and DB2.  20 seconds later, it will start APP1 and APP2.  20 seconds after that, it will start Nano.







Scenario 5: Get-ClusterDiagnosticInfo
===============================

This is a new cmdlet used to gather diagnostic information about the Cluster for review, whether for troubleshooting or proactive analysis.

In the past, if you wanted to gather logs from a Cluster, you had to do it manually.  If you called into Microsoft for support (or other companies), they would send you a utility that would gather the information that you would upload for review.  However, you had to wait for it to be sent to you.  Now, you can gather the information prior to ever calling so it can be readily available.

When you run it, you will notice it is gathering information about the Cluster (all nodes).  If this is a Hyper-V Cluster, you will see information about the virtual machines.  Going through and giving a few highlights, you will see it is giving information about the top memory and CPU users, free memory and disk space available on the hosts.

One of the real keys is that it can show you whether you have had issues within the last 24 hours.  It collects events from numerous event channels, parses them, and gives you the top events.  This can be beneficial when you are having a problem and do not know where to start.

Once it has completed, it will zip these files up and place the ZIP under C:\Users\.  It also tags the date and time as part of the filename so you know when it was run.

1.       From 2016-NODE1, run an administrative PowerShell prompt. 
2.       Once there, run Get-ClusterDiagnosticInfo.
3.       Review the information on the screen. 
4.       At the end, it will give you a path.  Right mouse click the ZIP it created, choose Extract All, and review what it has collected.
5.       Review all the files it has collected and notice they come from both nodes.  Keep in mind, it will only get the last 24 hours.  So if you have event logs/channels that have not been written to in that time period, they will be blank.
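If you prefer PowerShell for the extraction in step 4, Expand-Archive (built into PowerShell 5) works; the ZIP name below is a placeholder for whatever path Get-ClusterDiagnosticInfo reported:

```powershell
# Extract the diagnostic ZIP for review. Substitute the actual file name
# that Get-ClusterDiagnosticInfo printed at the end of its run.
Expand-Archive -Path "$env:USERPROFILE\<reported ZIP>.zip" -DestinationPath "$env:USERPROFILE\ClusterDiag"
```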





























