SMB3 PowerShell changes in Windows Server 2012 R2: SMB Multi-instance

August 11, 2014, 10:20 am

≫ Next: SMB3 PowerShell changes in Windows Server 2012 R2: SMB Delegation

Introduction

Windows Server 2012 R2 introduced a new version of SMB. Technically it’s SMB version 3.02, but we continue to call it just SMB3. The main changes are described at http://technet.microsoft.com/en-us/library/hh831474.aspx.

With this new release, we made a few changes in SMB PowerShell to support the new scenarios and features. This includes a few new cmdlets and some changes to existing cmdlets, with extra care not break any of your existing scripts.

This blog post outlines one of the 7 set of changes related to SMB PowerShell in Windows Server 2012 R2.

SMB Multi-instance

SMB Multi-instance is a new feature in Windows Server 2012 R2 that separates regular SMB traffic from CSV-related inter-node SMB traffic in distinct SMB instances.

This is designed to improve isolation between the two types of traffic in improve the reliability of the SMB servers.

Information related to this new CSV-only instance in Windows Server 2012 R2 is typically hidden by default in all PowerShell cmdlets.

Showing hidden instance information

There are the changes in SMB PowerShell so an Administrator can view information related to the hidden CSV instance:

The "-SmbInstance CSV" option in Get-SmbConnection and Get-SmbMultichannelConnection will show the connections associated with the hidden CSV instance.
There is now an InstanceName property in the full output of Get-SmbConnection, Get-SmbMultichannelConnection, Get-SmbSession and Get-SmbOpenFile. It shows either “Default” or “CSV” (only shows if using the –SmbInstance option).

There is really little use in inspecting the information on the hidden CSV instance, except if you’re troubleshooting CSV inter-node communications.

Note

This blog post is an updated version of the September 2013 post at http://blogs.technet.com/b/josebda/archive/2013/09/03/what-s-new-in-smb-powershell-in-windows-server-2012-r2.aspx focused on a single topic.

↧

SMB3 PowerShell changes in Windows Server 2012 R2: SMB Delegation

August 11, 2014, 10:26 am

≫ Next: SMB3 PowerShell changes in Windows Server 2012 R2: SMB1 can now be completely removed

≪ Previous: SMB3 PowerShell changes in Windows Server 2012 R2: SMB Multi-instance

Introduction

This blog post outlines one of the 7 set of changes related to SMB PowerShell in Windows Server 2012 R2.

The need for SMB Delegation

When you configure Hyper-V over SMB and you manage your Hyper-V hosts remotely using Hyper-V Manager, you will might run into access denied messages. This is because you’re using your credentials from the remote machine running Hyper-V Manager in the Hyper-V host to access a third machine (the file server). This is what we call a “double-hop”, and it’s not allowed by default for security reasons.The main problem with the scenario is that an intruder that compromises one computer in your environment could then connect to other systems in your environments without the need to provide a username and password. One way to work around this issue is to connect directly to the Hyper-V host and providing your credentials at that time, avoiding the double-hop.

You can also address this by configuring Constrained Delegation for SMB shares, which is a process that involves changing properties in Active Directory. The security risk is reduced here because a potential intruder double-hop would be limited to that specific use case (using SMB shares on the specified servers). The constrained delegation process was greatly simplified in Windows Server 2012 when the the Active Directory team introduced resource-based Kerberos constrained delegation, as explained at http://technet.microsoft.com/library/hh831747.aspx. However, even with this new resource-based constrained delegation, there are still quite a few steps to enable it.

Requirements for SMB Delegation

Before you use the new SMB Delegation cmdlets, you must meet two specific requirements.

First, the new cmdlets do rely on Active Directory PowerShell to perform their actions. For this reason, you need to install the Active Directory cmdlets before using the SMB delegation cmdlets. To install the Active Directory cmdlets, use:

Install-WindowsFeature RSAT-AD-PowerShell

Second, these cmdlets rely on the the new resource-based delegation in Active Directory. Since that AD feature was introduced in Windows Server 2012, the Active Directory forest must be in “Windows Server 2012” functional level. To check the Active Directory Forest Functional level, use:

Get-ADForest

The new SMB Delegation cmdlets

For Hyper-V over SMB in Windows Server 2012, we provided TechNet and blog-based guidance on how to automate constrained delegation. In Windows Server 2012 R2, SMB has a new set of cmdlets to simplify the configuration of resource-based constrained Delegation in SMB scenarios.

Here are the new cmdlets introduced:

Get-SmbDelegation –SmbServer X
Enable-SmbDelegation –SmbServer X –SmbClient Y
Disable-SmbDelegation –SmbServer X [–SmbClient Y] [-Force]

Notes

1) For the Disable-SmbDelegation cmdlet, if no client is specified, delegation will be removed for all clients.

2) System Center Virtual Machine Manager uses a different method to remote into the Hyper-V host and configure SMB shares. When using VMM, constrained delegation is not required for management of Hyper-V over SMB.

3) This blog post is an updated version of the September 2013 post at http://blogs.technet.com/b/josebda/archive/2013/09/03/what-s-new-in-smb-powershell-in-windows-server-2012-r2.aspx focused on a single topic.

↧

SMB3 PowerShell changes in Windows Server 2012 R2: SMB1 can now be completely removed

August 11, 2014, 10:38 am

≫ Next: Sample C# code for using the latest WMI classes to manage Windows Storage

≪ Previous: SMB3 PowerShell changes in Windows Server 2012 R2: SMB Delegation

Introduction

This blog post outlines one of the 7 set of changes related to SMB PowerShell in Windows Server 2012 R2.

SMB1 can now be completely removed

In Windows Server 2012 R2, SMB1 became an optional component and can now be completely disabled, so that the associated binaries are not even loaded. For scenarios where SMB1 is not required, this means less resource utilization, less need for patching and improved security.

For instance, in the Hyper-V over SMB scenario, where you are storing Hyper-V virtual disks and virtual machine configuration in SMB file shares, SMB3 is a requirement. In this case, SMB1 is not necessary and can be safely disabled.

For information worker scenarios, if you have Windows XP clients, you absolutely still need SMB1, since that is the only SMB version supported by Windows XP. If *all* your clients are running Windows Vista or later, SMB1 is no longer required and you can disable SMB1. Windows Vista and Windows 7 do not need SMB1 since they support SMB2. Windows 8 and Windows 8.1 do not need SMB1, since they support both SMB2 and SMB3.

For classic server scenarios, if you have Windows Server 2003 or Windows Server 2003 R2 servers, you absolutely still need SMB1, since that is the only SMB version supported by them. Windows Server 2008 and Windows Server 2008 R2 do not need SMB1 since they support SMB2. Windows Server 2012 and Windows Server 2012 R2 do not need SMB1, since they support both SMB2 and SMB3.

Disable SMB1

Even though the component can now be removed, due to the compatibility issues listed above, it is still enabled by default.

To disable SMB1 completely, use the following PowerShell cmdlet:

Remove-WindowsFeature FS-SMB1

You can re-enable it by using:

Add-WindowsFeature FS-SMB1

Notes

1) A reboot is required after this feature is enabled or disabled.

2) For more details about SMB versions and dialects and which operating systems support them, see http://blogs.technet.com/b/josebda/archive/2013/10/02/windows-server-2012-r2-which-version-of-the-smb-protocol-smb-1-0-smb-2-0-smb-2-1-smb-3-0-or-smb-3-02-you-are-using.aspx

↧

Sample C# code for using the latest WMI classes to manage Windows Storage

August 11, 2014, 2:00 pm

≫ Next: Understanding the files collected by the Test-StorageHealth.ps1 PowerShell script

≪ Previous: SMB3 PowerShell changes in Windows Server 2012 R2: SMB1 can now be completely removed

This blog post shows a bit of C# code to use the Windows Storage Management API (SM-API) classes that were introduced in Windows Server 2012 and Windows 8.

You can find a list of these classes at class described at http://msdn.microsoft.com/en-us/library/hh830612.aspx, including MSFT_PhysicalDisk, MSFT_StoragePool or MSFT_VirtualDisk.

I found a number of examples with the old interface using the old classes like Win32_Volume, but few good ones with the new classes like MSFT_Volume.

This is some simple C# code using console output. The main details to highlight here are the use of System.Management and how to specify the scope, which allows you to manage a remote computer.

Please note that you might need to enable WMI on the computer, which can be easily done with the command line “winrm quickconfig”.

Here is my first attempt is below, using just only System.Management and the ManagementObject class.

It’s implemented in a simple console application, which lists information about volumes and physical disks on the local machine.

using System;
using System.Text;
using System.Threading;
using System.Management;

namespace SMAPIQuery
{
    class Program
    {
        static void Main(string[] args)
        {
            // Use the Storage management scope
            ManagementScope scope = new ManagementScope("\\\\localhost\\ROOT\\Microsoft\\Windows\\Storage");
            // Define the query for volumes
            ObjectQuery query = new ObjectQuery("SELECT * FROM MSFT_Volume");

            // create the search for volumes
            ManagementObjectSearcher searcher = new ManagementObjectSearcher(scope, query);
            // Get the volumes
            ManagementObjectCollection allVolumes = searcher.Get();
            // Loop through all volumes
            foreach (ManagementObject oneVolume in allVolumes)
            {
                // Show volume information
                if (oneVolume["DriveLetter"].ToString()[0] > ' ' )
                {
                    Console.WriteLine("Volume '{0}' has {1} bytes total, {2} bytes available", oneVolume["DriveLetter"], oneVolume["Size"], oneVolume["SizeRemaining"]);
                }
            }

// Define the query for physical disks
query = new ObjectQuery("SELECT * FROM MSFT_PhysicalDisk");

// create the search for physical disks
searcher = new ManagementObjectSearcher(scope, query);

// Get the physical disks
ManagementObjectCollection allPDisks = searcher.Get();

            // Loop through all physical disks
            foreach (ManagementObject onePDisk in allPDisks)
            {
                // Show physical disk information
                Console.WriteLine("Disk {0} is model {1}, serial number {2}", onePDisk["DeviceId"], onePDisk["Model"], onePDisk["SerialNumber"]);
            }

            Console.ReadLine();
         }
        }
    }

Here is some sample output from this application:

Volume 'D' has 500104687616 bytes total, 430712184832 bytes available
Volume 'E' has 132018860032 bytes total, 110077665280 bytes available
Volume 'F' has 500105216000 bytes total, 356260683776 bytes available
Volume 'C' has 255690010624 bytes total, 71789502464 bytes available

Disk 2 is model SD              , serial number
Disk 0 is model MTFDDAK256MAM-1K12, serial number         131109303905
Disk 3 is model 5AS             , serial number 00000000e45ca01b30c1
Disk 1 is model ST9500325AS, serial number             6VEK9B89

Next, I got some help from other folks from Microsoft, including Cosmos Darwin (PM Intern) and Gustavo Franco (Senior Developer).

I wanted to use the more modern CimInstance objects, which offer more flexibility. Here’s the same code as above, but now using the Microsoft.Management.Insfrastructure namespace:

using System;
using System.Text;
using System.Threading;
using Microsoft.Management.Infrastructure;

namespace SMAPIQuery
{
    class Program
    {
        static void Main(string[] args)
        {

string computer = "localhost";

// Create CIM session
CimSession Session = CimSession.Create(computer);

// Query Volumes, returns CimInstances
var allVolumes = Session.QueryInstances(@"root\microsoft\windows\storage", "WQL", "SELECT * FROM MSFT_Volume");

            // Loop through all volumes
            foreach (CimInstance oneVolume in allVolumes)
            {
                // Show volume information

if (oneVolume.CimInstanceProperties["DriveLetter"].ToString()[0] > ' ' )
{

Console.WriteLine("Volume '{0}' has {1} bytes total, {2} bytes available", oneVolume.CimInstanceProperties["DriveLetter"], oneVolume.CimInstanceProperties["Size"], oneVolume.CimInstanceProperties["SizeRemaining"]);
}

}

// Query Physical Disks, returns CimInstances
var allPDisks = Session.QueryInstances(@"root\microsoft\windows\storage", "WQL", "SELECT * FROM MSFT_PhysicalDisk");

            // Loop through all physical disks
            foreach (CimInstance onePDisk in allPDisks)
            {
                // Show physical disk information
                Console.WriteLine("Disk {0} is model {1}, serial number {2}", onePDisk.CimInstanceProperties["DeviceId"], onePDisk.CimInstanceProperties["Model"].ToString().TrimEnd(), onePDisk.CimInstanceProperties["SerialNumber"]);
            }

            Console.ReadLine();
         }
        }
    }

The output is the same as before, but you can see that the top portion of the code uses a completely different set of classes to create a CimSession, then query the objects, which return in the form of CimInstance objects.

That code works well if the user running it has the right credentials to access the objects. If you want to provide specific credentials for your CimSession, that’s also possible with a bit more code. Here’s what it would look like:

using System;
using System.Text;
using System.Threading;
using Microsoft.Management.Infrastructure;
using Microsoft.Management.Infrastructure.Options;
using System.Security;

namespace SMAPIQuery
{
    class Program
    {
        static void Main(string[] args)
        {

            string computer = "10.1.1.1";
            string domain = "Domain1";
            string username = "User1";

string plaintextpassword;

Console.WriteLine("Enter password:");
plaintextpassword = Console.ReadLine();

            SecureString securepassword = new SecureString();
            foreach (char c in plaintextpassword)
            {
                securepassword.AppendChar(c);
            }

// create Credentials
CimCredential Credentials = new CimCredential(PasswordAuthenticationMechanism.Default, domain, username, securepassword);

            // create SessionOptions using Credentials
            WSManSessionOptions SessionOptions = new WSManSessionOptions();
            SessionOptions.AddDestinationCredentials(Credentials);

// create Session using computer, SessionOptions
CimSession Session = CimSession.Create(computer, SessionOptions);

var allVolumes = Session.QueryInstances(@"root\microsoft\windows\storage", "WQL", "SELECT * FROM MSFT_Volume");
var allPDisks = Session.QueryInstances(@"root\microsoft\windows\storage", "WQL", "SELECT * FROM MSFT_PhysicalDisk");

            // Loop through all volumes
            foreach (CimInstance oneVolume in allVolumes)
            {
                // Show volume information

if (oneVolume.CimInstanceProperties["DriveLetter"].ToString()[0] > ' ' )
{

}

            Console.ReadLine();
         }
        }
    }

Notice that the only thing that changed is how you create the CimSession object, using the SessionOptions to provide explicit credentials.

This sample code if focused on the storage side of things, so I am letting the user enter a visible plaintext password here. It should go without saying that you should provide a better mechanism to enter passwords.

You can get all the details on the other classes in the Microsoft.Management.Infrastructure namespace at http://msdn.microsoft.com/en-us/library/microsoft.management.infrastructure(v=vs.85).aspx.

All right, that should give you plenty of food for thought. Now go write some code!

↧

Understanding the files collected by the Test-StorageHealth.ps1 PowerShell script

August 16, 2014, 8:30 am

≫ Next: Using file copy to measure storage performance – Why it’s not a good idea and what you should do instead

≪ Previous: Sample C# code for using the latest WMI classes to manage Windows Storage

I recently published a PowerShell script to check the health and capacity of a Storage Cluster based on Windows Server 2012 R2 Scale-Out File Servers.

You can find details about this script (including the download link) in this blog post: PowerShell script for Storage Cluster Health Test published to the TechNet Script Center.

This script, when used without any parameters, simply performs a number of health checks and provides a report on health and capacity for the storage cluster.

However, when used with the optional –IncludeEvents parameter, it will collect lots of diagnostic information. The script also creates a convenient ZIP archive for transport.

Inside that ZIP (when you use the -IncludeEvents parameter), you will find a number of files that can help troubleshoot the storage cluster. That includes:

System Information (txt file). You can open it with Notepad to review details about the system, including CPU, memory, network interfaces, OS version and much more.
Cluster log (log file). This is a text file containing events from Failover Clustering. You can open it with Notepad to find every detail of what happened to the cluster.
Event logs for Failover Clustering, Hyper-V Shared VHDX, SMB, Core Storage and Storage Spaces (evtx file). You can open these with Event Viewer to look at the specific events, with the usual options to filter and search.
Mini dumps (dmp file). Dumps files contain information about what happened to the system during a crash. You can find more about how to read them at http://support.microsoft.com/kb/315263
PowerShell objects (xml file). Use PowerShell to import these objects into memory and query them. For instance: Import-CliXml .\GetPhysicalDisk.XML | Select DeviceId, Model, FirmwareVersion
Performance details as a comma-separated file (txt file). Open with Excel to create a pivot table.

These files can be used by the Test-StorageHealth.ps1 script to recreate its output without being connected to the live cluster, using the –ReadFromPath parameter.

In addition to that, for a support professional or experienced administrator, these files include a wealth of information that can greatly help with troubleshooting.

Here is a comprehensive list of the files inside the ZIP created by the script:

File Name	Quantity	Type
node1.domain.com_cluster.log	1/node	Cluster Log
node1.domain.com_Event_Application.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-FailoverClustering-CsvFs-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-FailoverClustering-Manager-Admin.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-FailoverClustering-Manager-Tracing.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-FailoverClustering-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-FailoverClustering-WMIProvider-Admin.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Hyper-V-High-Availability-Admin.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Hyper-V-Shared-VHDX-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Hyper-V-Shared-VHDX-Reservation.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-SmbClient-Connectivity.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-SMBClient-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-SmbClient-Security.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-SMBDirect-Admin.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-SMBHashGeneration-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-SMBServer-Connectivity.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-SMBServer-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-SMBServer-Security.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Storage-ATAPort-Admin.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Storage-ATAPort-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Storage-ClassPnP-Admin.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Storage-ClassPnP-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Storage-Disk-Admin.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Storage-Disk-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Storage-MultipathIoControlDriver-Admin.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Storage-MultipathIoControlDriver-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-StorageSpaces-Driver-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-StorageSpaces-ManagementAgent-WHC.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Storage-Storport-Admin.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Storage-Storport-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-Storage-Tiering-Admin.EVTX	1/node	Event Log
node1.domain.com_Event_Microsoft-Windows-VHDMP-Operational.EVTX	1/node	Event Log
node1.domain.com_Event_System.EVTX	1/node	Event Log
node1.domain.com_SystemInfo.TXT	1/node	System Info
node1.domain.com _062414-129296-01.dmp	varies	Minidump
GetAllErrors.XML (Event Summary)	1	PowerShell Object
GetCluster.XML	1	PowerShell Object
GetClusterGroup.XML	1	PowerShell Object
GetClusterNetwork.XML	1	PowerShell Object
GetClusterNode.XML	1	PowerShell Object
GetClusterResource.XML	1	PowerShell Object
GetClusterSharedVolume.XML	1	PowerShell Object
GetParameters.XML	1	PowerShell Object
GetDedupVolume.XML	1	PowerShell Object
GetDrivers.XML	1	PowerShell Object
GetNetAdapter_node1.domain.com.XML	1/node	PowerShell Object
GetPartition.XML	1	PowerShell Object
GetPhysicalDisk.XML	1	PowerShell Object
GetReliabilityCounter.XML	1	PowerShell Object
GetSmbOpenFile.XML	1	PowerShell Object
GetSmbServerNetworkInterface_node1.domain.com.XML	1/node	PowerShell Object
GetSmbShare.XML	1	PowerShell Object
GetSmbWitness.XML	1	PowerShell Object
GetStorageEnclosure.XML	1	PowerShell Object
GetStoragePool.XML	1	PowerShell Object
GetVersion.XML	1	PowerShell Object
GetVirtualDisk.XML	1	PowerShell Object
GetVolume.XML	1	PowerShell Object
ShareStatus.XML	1	PowerShell Object
GetAssociations.XML	1	PowerShell Object
VolumePerformanceDetails.TXT (open with Excel to create a pivot table)	1	Text/CSV

↧

Using file copy to measure storage performance – Why it’s not a good idea and what you should do instead

August 18, 2014, 9:00 am

≫ Next: Adding Storage Performance to the Test-StorageHealth.ps1 script

≪ Previous: Understanding the files collected by the Test-StorageHealth.ps1 PowerShell script

1. Introduction

Every once in a while I hear from someone that they believe they have a performance problem with their Scale-Out File Server. When I dig a little further, it’s very common to find that file copies are being used as the mechanism for measuring storage performance for all kinds of scenarios.

This blog post is about why file copies are not a good metric for evaluating storage performance. It also covers what other tools you could use instead to measure things. As usual, my focus here is on copies that use SMB3 file shares or File Server Clusters.

2. Why people use file copies to measure storage performance

First of all, it’s important to understand why people use file copies to measure performance. There are actually quite a few reasons:

2.1. It’s so easy…

File copies are a simple thing to do. You can even do it from File Explorer.

2.2. It’s so pretty…

File Explorer now has a nice visualization of the file copy operation, including a chart showing bandwidth as the copy progresses.

clip_image002

2.3. I have lots of large VHDX files sitting around ready to copy

We now have many large VHDX files readily available for testing file copies, and they will take a while to transfer. When someone is looking for a simple way to generate IOs against a storage subsystem, they might be tempted to simply copy those files in or out.

2.4. Someone told me it was a good idea

There are a lot of blog posts and demo videos out there about how fast file copies are in this new storage system or this new protocol. I might have done that myself a few times (sorry about that).

3. Why using file copies to measure storage performance is not a good idea

Using file copy to measure storage performance has a number of issues. You might essentially end up reaching incorrect conclusions. Here are a few reasons for that:

3.1. Copy might not be optimized

In order to get the most of your storage subsystem, you need to queue up enough IO requests to keep the entire system busy end-to-end. For instance, when using hard disk drives, we like to watch the queue depth performance counters to make sure we maintain at least two IOs queued for each HDD you have in a system.

In order to take advantage of this, when you’re copying large files using the Windows copy engine, it will issue 8 asynchronous writes of 1MB each by default. However, when copying lots of small files (and using a fast storage backend), you are less likely to stand up the required IO queue depth unless you queue more IOs and use multiple threads.

To make matters worse, most file copy programs will copy only one file at a time, waiting for one file copy to finish before starting the next one. This serialization will give you a much lower storage performance than the system is actually capable of delivering when more requests are being queued in parallel.

The SMB protocol is able to queue up multiple asynchronous requests, but that does not happen unless the application (the file copy program in this case) takes advantage of it. ROBOCOPY is one of these programs that can use multiple threads to copy multiple files at once.

More importantly, the file copy behavior is not a good stand-in for all other workloads. You will find that its behavior is different from a SQL Server OLTP databases or a Hyper-V Virtual Desktop deployment, both in regards to IO sizes and the number of IOs that are queued up.

3.2. Every copy has two sides

People will often forget that file copies measure the performance of both source storage and the destination storage.

If you’re copying from a single slow disk at the source to 10 fast disks in a striped volume on the other end, the speed at which you can read from that source will determine the overall speed of the file copy process.

You’re basically measuring the performance of the slower of the two sides.

3.3. Offloads and caching

There are a number of specific technologies that can accelerate certain kinds of copies like Offloaded Data Transfers (ODX) and SMB file copy acceleration (COPYCHUNK) which will impact certain kinds of file copies and it might be hard to determine when they apply or not.

File copies will attempt to use buffered IOs, which can be accelerated by regular windows file caching. On the other hand, buffered file copies will not be accelerated by cluster CSV caching, which only accelerates unbuffered IOs.

As with item 3.1, these are examples of how file copy performance will not match the performance of other types of workload, like an OLTP database or a VDI deployment.

3.4. CA will not cache

Continuously Available (CA) file shares are commonly used to provide Hyper-V and SQL Server with a storage solution that will transparently failover in case of node failure.

In order to deliver on its availability promise, CA file shares will make sure that every write goes straight to disk (write-through) and no writes are cached only in RAM. That’s how SMB can recover from the failure of a file server node at any time without data loss.

While this is great for resiliency, it will slow down file copies, particularly if you are copying a single large file.

It’s important to note that most server workloads (including SQL Server and Hyper-V) will always use write-through regardless of CA, so while file copies are affected by the write-through behavior change caused by CA, most server workloads are not.

4. Better ways to measure performance

4.1. Run the actual workload

The best way to measure performance is to run the actual workload that you are targeting.

If you’re configuring some storage that will be used for an OLTP database, install SQL Server, run an OLTP workload and see how many transactions per second you can get out of it.

If you’re creating a solution for static web sites running inside a VM, install Hyper-V, create a VM configured with IIS, set up a few static web sites and see how they handle multiple clients accessing the static web content.

4.2. Run a workload simulator

When it’s not possible or practical to run the actual workload, you can at least run a synthetic workload simulator.

There are many of these simulators out there that mimic the actual behavior of the application and allow you to simulate a large number of clients accessing your machine.

For instance, to simulate running SQL Server databases, you might want to try the DiskSpd tool (see http://aka.ms/DiskSpd).

DiskSpd is actually fairly flexible and can go beyond simulating just SQL Server behavior. I even created a specific blog post on how to use DiskSpd, which you can find at http://blogs.technet.com/b/josebda/archive/2014/10/13/diskspd-powershell-and-storage-performance-measuring-iops-throughput-and-latency-for-both-local-disks-and-smb-file-shares.aspx.

4.3. Keep your simulations as real as possible

While it’s more practical to use workload simulators, you should try to stay as close as possible to the actual solution you will deploy.

For instance, if you planning to deploy 4 SQL Servers in a Hyper-V-based private cloud, you should measure the storage performance by actually creating 4 VMs and running your DiskSpd simulation inside each one.

That is a much better simulation than just running 4 instances of DiskSpd in the host, since the IO pattern of 4 instances of SQL Server running on bare metal will be different from the IO pattern of four VMs each running one instance of SQL Server.

5. If your workload is actually file copies

All that aside, there’s a chance that what you are actually trying to test a file copy workload. I mean, you actually have a production scenario where you will be transferring files. In that case (and only in that case), here are a few tips to optimize that specific scenario.

5.1. Check both sides of the copy

Remember to optimize both the source and the destination storage subsystems. As mentioned before, you will be as fast as the weakest link in the chain. You might want to redesign your storage solution so that source and destination have better performance or are “closer” to each other.

5.2. Use the right copy tool

Most file copy tools like the EXPLORER.EXE GUI, the COPY command in the shell, the XCOPY.EXE tool and the PowerShell Copy-Item cmdlet not optimized for performance. They are single-threaded, one-file-at-a-time solutions that will do the job but are not designed to transfer files as fast as possible.

The best file copy tool included in Windows is actually the ROCOBOPY.EXE tool. It includes very useful options like /MT (for using multiple threads to copy multiple files at once) and /J (copy using unbuffered I/O, which is recommended for large files).

That tool got some love from the Performance Fundamentals team at Microsoft and it’s usually much faster than anything else in Windows.

It’s important to note that even ROBOCOPY with the /MT option won’t help if you’re copying a single file. Like most other file copy programs, it uses a common file copy API instead of custom code.

5.3. Offload with COPYCHUNK

If it’s an option for you, put source and destination of your copy on the same file server to leverage the built-in SMB COPYCHUNK. This optimization is part of the SMB protocol and basically avoids sending data over the wire if the source and destination are on the same machine.

You can read about it at http://msdn.microsoft.com/en-us/library/cc246475.aspx (yes, this has been there since SMB1 and it’s still there in SMB3).

Note that COPYCHUNK only applies if the source and destination shares are on the same file server and if the file size is at least 64KB.

5.3. Offload with ODX

If your file server uses a SAN back-end, consider using the Offloaded Data Transfers (ODX). This T10 standard improves performance by using a system of tokens to avoid transferring actual data over the wire.

It works only if the source and destination paths live on the same SAN (or somehow connected in the back-end). This also works with SMB file shares (SMB basically lets the request pass down to the underlying storage subsystem).

ODX support was introduced in Windows Server 2012 and requires specific support from your SAN vendor. You can read about it at http://msdn.microsoft.com/en-us/library/windows/hardware/dn265439.aspx.

5.4. Create a non-CA file share

If your file server is clustered, you can use SMB Continuously Available file shares that allow you to lose any node of the cluster at any time without impact to the applications. The file clients and file servers will automatically recover though a process we call SMB Transparent Failover.

However, this requires that every write be written through to the storage (instead of potentially being cached). Most server workloads (like Hyper-V and SQL Server) already have this unbuffered IO behavior, but not file copies. So, CA has the potential of slowing down file copy operations, which are normally done with buffered IOs.

If you want to trade reliability for performance during file copies, you can create a file share with the Continuous Availability property turned off (it’s on by default on all clustered file shares).

In that case, if there is a failover during a file copy, you might get an error and the copy might be aborted. But if you don’t have any failovers, the copy will go faster.

For server workloads like Hyper-V and SQL Server, turning off CA will not make things any faster, but you will lose the ability to transparently failover.

Note that you can create two shares pointing to the same folder, one without CA for file copy operations only and one with CA for regular server workloads. Having those two shares might have the side effect of confusing your management software and your file server administrators.

5.5. Use SMB Multichannel

If you can afford some extra hardware, consider adding a second network interface of the same type and speed to leverage SMB Multichannel (using multiple network paths simultaneously).

This was introduced in Windows Server 2012 (along with SMB3) and you must have it on both sides of the copy to be effective.

SMB Multichannel might be able to help with many scenarios, including a single large file copy when you are constrained by bandwidth or IOPS.

Also check if you have a second port on your NIC that is not wired to the switch, which might be an even easier upgrade (you will still need some extra cables and switch ports to make it happen).

You can learn more about SMB Multichannel at http://blogs.technet.com/b/josebda/archive/2012/05/13/the-basics-of-smb-multichannel-a-feature-of-windows-server-2012-and-smb-3-0.aspx.

When using SMB Multichannel in a file server cluster, be sure to use multiple subnets, as described at http://blogs.technet.com/b/josebda/archive/2012/11/12/windows-server-2012-file-server-tip-use-multiple-subnets-when-deploying-smb-multichannel-in-a-cluster.aspx.

5.6. Windows Server 2012 R2 Update

There are specific copy-related improvements in the Windows Server 2012 R2 Update released in April 2014. That update is especially important if you are using a Continuously Available file share as the destination of your file copy. You can find information on how to obtain the update at http://technet.microsoft.com/en-us/library/dn645472.aspx.

By the way, we are constantly evolving Windows Server and the Scale-Out File Server. We release updates regularly and keeping up with them will give you the best results.

5.7. File copy for Hyper-V VM provisioning

One special case of file copy is related to the provisioning of virtual machines from a template. Typically you keep a “Library Share” with your VHDX files and you copy from this library to the deployment folder, where the VHDX will be associated with a running virtual machine.

You can avoid this by using differencing VHDX files, or you can use some interesting tricks (like live VHDX file re-parenting, introduced in Windows Server 2012) to optimize your VM provisioning.

You can find more details about your options at http://blogs.technet.com/b/josebda/archive/2012/03/20/windows-server-8-beta-hyper-v-over-smb-quick-provisioning-a-vm-on-an-smb-file-share.aspx.

5.8. System Center Virtual Machine Manager 2012 R2

If you’re using SCVMM for provisioning your VMs from a library, it’s highly recommended that you upgrade to SCVMM 2012 R2.

Before that release, SCVMM used the slower BITS protocol to transfer files from the library to their final destination. In the 2012 R2 release, VMM uses a new way to copy files, which will leverage things like SMB Multichannel, SMB COPYCHUNK and ODX offloads to significantly improve performance.

You can find some details on how VMM deploys virtual machines at http://technet.microsoft.com/en-us/library/hh368991.aspx(that bottom of the page includes the details on Fast File Copy).

6. Conclusion

If you’re using copies to measure performance of anything except file copies, I hope this post made it clear that’s not a good idea and convinced you to use other methods of storage performance testing.

If you’re actually trying to optimize file copies, I hope you were able to find at least one or two useful tips here.

Feel free to share your own file copy experiences and tips using the comment section below.

Note: This post was updated on 10/26/2014 to use DiskSpd instead of SQLIO.

↧

Adding Storage Performance to the Test-StorageHealth.ps1 script

September 7, 2014, 5:01 pm

≫ Next: DiskSpd, PowerShell and storage performance: measuring IOPs, throughput and latency for both local disks and SMB file shares

≪ Previous: Using file copy to measure storage performance – Why it’s not a good idea and what you should do instead

A few weeks ago, I published a script to the TechNet Script Center that gathers health and capacity information for a Windows Server 2012 R2 storage cluster. This script checks a number of components, including clustering, storage spaces, SMB file shares and core storage. You can get more details at http://blogs.technet.com/b/josebda/archive/2014/07/27/powershell-script-for-storage-cluster-health-test-published-to-the-technet-script-center.aspx

From the beginning, I wanted to add the ability to look at storage performance in addition to health and capacity. This was added yesterday in version 1.7 of the script and this blog describes the decisions along the way and the details on how it was implemented.

Layers and objects

The first decision was the layer we should use to monitor storage performance. You can look at the core storage, storage spaces, cluster, SMB server, SMB client or Hyper-V. The last two would actually come from the client side, outside the storage cluster itself. Ideally I would gather storage performance from all these layers, but honestly that would be just too much to capture and also to review later.

We also have lots of objects, like physical disks, pools, tiers, virtual disks, partitions, volumes and file shares. Most people would agree that looking at volumes would be the most useful, if you had to choose only one. It helps that most deployments will use a single file share per volume and a single partition and volume per virtual disk. In these deployments, a share matches a single volume that matches a single virtual disk. Looking at pool (which typically hosts multiple virtual disks) would also be nice.

In the end, I decided to go with the view from the CSV file system, which looks at volumes (CSVs, really) and offers the metrics we need. It is also nice that it reports a consistent volume ID across the many nodes of the cluster. That made it easier to consolidate data and then correlate to the right volume, SMB share and pool.

Metrics

The other important decision is which of the many metrics we would show. There is plenty of information you can capture, including number of IOs, latency, bandwidth, queue depth and several others. These can also be captured as current values, total value since the system was started or an average for a given time period. In addition, you can capture those for read operations, write operations, metadata operations or all together.

After much debating with the team on this topic, we concluded the single most important metric to find out if a system is reaching its performance limit would be latency. IOs per second (IOPs) was also important to get an idea of how much work is currently being handled by the cluster. Differentiating read and write IOs would also be desirable, since they usually show different behavior.

To keep it reasonable, I made the call to gather read IOPS (average number of read operations per second), write IOPS, read latency (average latency of read operations per second, as measured by the CSV file system) and write latency. These 4 performance counters were captured per volume on every node of the cluster. Since it was easy and also useful, the script also shows total IOPS (sum of reads and writes) and total latency (average of reads and writes).

Samples

The other important decision was how many samples we would take. Ideally we would constantly monitor the storage performance and would have a full history of everything that ever happened to a storage cluster by the millisecond. That would be a lot of data, though. We would need another storage cluster just to store the performance information for a storage cluster :-) (just half joking here).

The other problem is that the storage health script aims to be as nonintrusive as possible. To constantly gather performance information we require need some sort of agent or service running on every node and we definitely did not want to go there.

The decision was to take a few samples only during the execution of the script. It gathers 60 samples, 1 second apart. During those 60 seconds the script is simply waiting and it's doing nothing else. I considered starting the capture on a separate thread (PowerShell job) and let it run while we’re gathering other health information, but I was afraid that the results would be impacted. I figured that waiting for one minute would be reasonable.

Capture

There are a few different ways to capture performance data. Using the performance counter infrastructure seems like the way to go, but even there you have a few different options. We could save the raw performance information to a file and parse it later. We could also use Win32 APIs to gather counters.

Since this is a PowerShell script, I decided to go with the Get-Counter cmdlet. It provides a simple way to get a specified list of counters, including multiple servers and multiple samples. The scripts uses a single command to gather all the relevant data, which is kept in memory and later processed.

Here’s some sample code:

$Samples = 60
$Nodes = “Cluster123N17”, “Cluster123N18”, “Cluster123N19”, “Cluster123N20”
$Names = “reads/sec”, “writes/sec” , “read latency”, “write latency”
$Items = $Nodes | % { $Node=$_; $Names | % { (”\\”+$Node+”\Cluster CSV File System(*)\”+$_) } }
$RawCounters = Get-Counter -Counter $Items -SampleInterval 1 -MaxSamples $Samples

The script then goes on to massage the data a bit. For the raw data, I wanted to fill in some related information (like pool and share) for every line. This would make it a proper fact table for Excel pivot tables, once it's exported to a comma-separated file. The other processing needed is summarizing the raw data into per-volume totals and averages.

Summary

I spent some time figuring out the best way to show a summary of the 60 seconds of performance data from the many nodes and volumes. The goal was to have something that would fit into a single screen for a typical configuration with a couple of dozen volumes.

The script shows one line per volume, but also includes the associated pool name and file share name. For each line you get read/write/total IOPS and also read/write/total latency. IOPS are shown as an integer and latency is shown in milliseconds with 3 decimals. The data is sorted in descending order by average latency, which should show the busiest volume/share on top.

Here’s a sample output

Pool Volume   Share    ReadIOPS WriteIOPS TotalIOPS ReadLatency (ms) WriteLatency (ms)
---- ------   -----    -------- --------- --------- ---------------- -----------------
Pool2 volume15 TShare8       162         6       168           33.771            52.893
Pool2 volume16 TShare9        38       858       896           37.241             17.12
Pool2 volume10 TShare11        0         9         9                0             6.749
Pool2 volume17 TShare10       20        19        39            4.128              8.95
Pool2 volume13 HShare         13       243       256            3.845             8.424
Pool2 volume11 TShare12        0         7         7            0.339             5.959
Pool1 volume8 TShare6       552       418       970            5.041             4.977
Pool2 volume14 TShare7         3        12        15            2.988             5.814
Pool3 volume28 BShare28        0        11        11                0             4.955
Pool1 volume6 TShare4       232         3       235            1.626             5.838
Pool1 volume7 TShare5        62       156       218            1.807             4.241
Pool1 volume3 TShare1         0         0         0                0                 0
Pool3 volume30 BShare30        0         0         0                0                 0
…

Excel

Another way to look at the data is to get the raw output and use Excel. You can find that data in a file is saved as a comma-separated values on the output folder (C:\HealthTest by default) under the name VolumePerformanceDetails.TXT.

If you know your way with Excel and pivot tables, you can extract more details. You have access to all the 60 samples and to the data for each of the nodes. The data also includes pool name, share name and owner node, which do not come with a simple Get-Counter cmdlet. Here is another example of a pivot table in Excel (using a different data set from the one shown above):

Conclusion

I hope you liked the new storage performance section of the Test-StorageHealth script. As with the rest of the script (on health and capacity) the idea is to provide simple way to get a useful summary and collect additional data you could dig into.

Let me know how it works for you. I welcome feedback on the specific choices made (layer, metrics, samples, summary) and further ideas on how to make it more useful.

↧

DiskSpd, PowerShell and storage performance: measuring IOPs, throughput and latency for both local disks and SMB file shares

October 13, 2014, 3:57 pm

≫ Next: SmbStorageTier.ps1: A simple way to pin files to tiers in Scale-Out File Servers

≪ Previous: Adding Storage Performance to the Test-StorageHealth.ps1 script

1. Introduction

I have been doing storage-related demos and publishing blogs with some storage performance numbers for a while, and I commonly get questions such as “How do you run these tests?” or “What tools do you use to generate IOs for your demos?”. While it’s always best to use a real workload to test storage, sometimes that is not convenient. In the past, I frequently used and recommended a free tool from Microsoft to simulate IOs called SQLIO. However, there is a better tool that was recently released by Microsoft called DiskSpd. This is a flexible tool that can simulate many different types of workloads. And you can apply it to several configurations, from a physical host or virtual machine, using all kinds of storage, including local disks, LUNs on a SAN, Storage Spaces or SMB file shares.

2. Download the tool

To get started, you need to download and install the DiskSpd. You can get the tool from http://aka.ms/DiskSpd. It comes in the form of a ZIP file that you can open and copy local folder. There are actually 3 subfolders with different versions of the tool included in the ZIP file: amd64fre (for 64-bit systems), x86fre (for 32-bit systems) and armfre (for ARM systems). This allows you to run it in pretty much every Windows version, client or server.

In the end, you really only need one of the versions of DiskSpd.EXE files included in the ZIP (the one that best fits your platform). If you’re using a recent version of Windows Server, you probably want the version in the amd64fre folder. In this blog post, I assume that you copied the correct version of DiskSpd.EXE to the C:\DiskSpd local folder.

If you're a developer, you might also want to take a look at the source code for DiskSpd. You can find that at https://github.com/microsoft/diskspd.

3. Run the tool

When you’re ready to start running DiskSpd, you want to make sure there’s nothing else running on the computer. Other running process can interfere with your results by putting additional load on the CPU, network or storage. If the disk you are using is shared in any way (like a LUN on a SAN), you want to make sure that nothing else is competing with your testing. If you’re using any form of IP storage (iSCSI LUN, SMB file share), you want to make sure that you’re not running on a network congested with other kinds of traffic.

WARNING: You could be generating a whole lot of disk IO, network traffic and/or CPU load when you run DiskSpd. If you’re in a shared environment, you might want to talk to your administrator and ask permission. This could generate a whole lot of load and disturb anyone else using other VMs in the same host, other LUNs on the same SAN or other traffic on the same network.

WARNING: If you use DiskSpd to write data to a physical disk, you might destroy the data on that disk. DiskSpd does not ask for confirmation. It assumes you know what you are doing. Be careful when using physical disks (as opposed to files) with DiskSpd.

NOTE: You should run DiskSpd from an elevated command prompt. This will make sure file creation is fast. Otherwise, DiskSpd will fall back to a slower method of creating files. In the example below, when you're using a 1TB file, that might take a long time.

From an old command prompt or a PowerShell prompt, issue a single command line to start getting some performance results. Here is your first example using 8 threads of execution, each generating 8 outstanding random 8KB unbuffered read IOs:

PS C:\DiskSpd> C:\DiskSpd\diskspd.exe -c1000G -d10 -r -w0 -t8 -o8 -b8K -h -L X:\testfile.dat
Command Line: C:\DiskSpd\diskspd.exe -c1000G -d10 -r -w0 -t8 -o8 -b8K -h -L X:\testfile.dat
Input parameters:
        timespan:   1
        -------------
        duration: 10s
        warm up time: 5s
        cool down time: 0s
        measuring latency
        random seed: 0
        path: 'X:\testfile.dat'
                think time: 0ms
                burst size: 0
                software and hardware cache disabled
                performing read test
                block size: 8192
                using random I/O (alignment: 8192)
                number of outstanding I/O operations: 8
                stride size: 8192
                thread stride size: 0
                threads per file: 8
                using I/O Completion Ports
                IO priority: normal
Results for timespan 1:
*******************************************************************************
actual test time:       10.01s
thread count:           8
proc count:             4
CPU | Usage | User | Kernel | Idle
-------------------------------------------
   0|   5.31%|   0.16%|    5.15%| 94.76%
   1|   1.87%|   0.47%|    1.40%| 98.19%
   2|   1.25%|   0.16%|    1.09%| 98.82%
   3|   2.97%|   0.47%|    2.50%| 97.10%
-------------------------------------------
avg.|   2.85%|   0.31%|    2.54%| 97.22%
Total IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |        20480000 |         2500 |       1.95 |     249.77 |   32.502 |    55.200 | X:\testfile.dat (1000GB)
     1 |        20635648 |         2519 |       1.97 |     251.67 |   32.146 |    54.405 | X:\testfile.dat (1000GB)
     2 |        21094400 |         2575 |       2.01 |     257.26 |   31.412 |    53.410 | X:\testfile.dat (1000GB)
     3 |        20553728 |         2509 |       1.96 |     250.67 |   32.343 |    56.548 | X:\testfile.dat (1000GB)
     4 |        20365312 |         2486 |       1.94 |     248.37 |   32.599 |    54.448 | X:\testfile.dat (1000GB)
     5 |        20160512 |         2461 |       1.92 |     245.87 |   32.982 |    54.838 | X:\testfile.dat (1000GB)
     6 |        19972096 |         2438 |       1.90 |     243.58 |   33.293 |    55.178 | X:\testfile.dat (1000GB)
     7 |        19578880 |         2390 |       1.87 |     238.78 |   33.848 |    58.472 | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:         162840576 |        19878 |      15.52 |    1985.97 |   32.626 |    55.312
Read IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |        20480000 |         2500 |       1.95 |     249.77 |   32.502 |    55.200 | X:\testfile.dat (1000GB)
     1 |        20635648 |         2519 |       1.97 |     251.67 |   32.146 |    54.405 | X:\testfile.dat (1000GB)
     2 |        21094400 |         2575 |       2.01 |     257.26 |   31.412 |    53.410 | X:\testfile.dat (1000GB)
     3 |        20553728 |         2509 |       1.96 |     250.67 |   32.343 |    56.548 | X:\testfile.dat (1000GB)
     4 |        20365312 |         2486 |       1.94 |     248.37 |   32.599 |    54.448 | X:\testfile.dat (1000GB)
     5 |        20160512 |         2461 |       1.92 |     245.87 |   32.982 |    54.838 | X:\testfile.dat (1000GB)
     6 |        19972096 |         2438 |       1.90 |     243.58 |   33.293 |    55.178 | X:\testfile.dat (1000GB)
     7 |        19578880 |         2390 |       1.87 |     238.78 |   33.848 |    58.472 | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:         162840576 |        19878 |      15.52 |    1985.97 |   32.626 |    55.312
Write IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | X:\testfile.dat (1000GB)
     1 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | X:\testfile.dat (1000GB)
     2 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | X:\testfile.dat (1000GB)
     3 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | X:\testfile.dat (1000GB)
     4 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | X:\testfile.dat (1000GB)
     5 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | X:\testfile.dat (1000GB)
     6 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | X:\testfile.dat (1000GB)
     7 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:                 0 |            0 |       0.00 |       0.00 |    0.000 |       N/A
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
    min |      3.360 |        N/A |      3.360
   25th |      5.031 |        N/A |      5.031
   50th |      8.309 |        N/A |      8.309
   75th |     12.630 |        N/A |     12.630
   90th |    148.845 |        N/A |    148.845
   95th |    160.892 |        N/A |    160.892
   99th |    172.259 |        N/A |    172.259
3-nines |    254.020 |        N/A |    254.020
4-nines |    613.602 |        N/A |    613.602
5-nines |    823.760 |        N/A |    823.760
6-nines |    823.760 |        N/A |    823.760
7-nines |    823.760 |        N/A |    823.760
8-nines |    823.760 |        N/A |    823.760
    max |    823.760 |        N/A |    823.760

NOTE: The -w0 is the default, so you could skip it. I'm keeping it here to be explicit about the fact we're doing all reads.

For this specific disk, I am getting 1,985 IOPS, 15.52 MB/sec of average throughput and 32.626 milliseconds of average latency. I’m getting all that information from the blue line above.

That average latency looks high for small IOs (even though this is coming from a set of HDDs), but we’ll examine that later.

Now, let’s try now another command using sequential 512KB reads on that same file. I’ll use 2 threads with 8 outstanding IOs per thread this time:

PS C:\DiskSpd> C:\DiskSpd\diskspd.exe -c1000G -d10 -w0 -t2 -o8 -b512K -h -L X:\testfile.dat
Command Line: C:\DiskSpd\diskspd.exe -c1000G -d10 -w0 -t2 -o8 -b512K -h -L X:\testfile.dat
Input parameters:
        timespan:   1
        -------------
        duration: 10s
        warm up time: 5s
        cool down time: 0s
        measuring latency
        random seed: 0
        path: 'X:\testfile.dat'
                think time: 0ms
                burst size: 0
                software and hardware cache disabled
                performing read test
                block size: 524288
                number of outstanding I/O operations: 8
                stride size: 524288
                thread stride size: 0
                threads per file: 2
                using I/O Completion Ports
                IO priority: normal
Results for timespan 1:
*******************************************************************************
actual test time:       10.00s
thread count:           2
proc count:             4
CPU | Usage | User | Kernel | Idle
-------------------------------------------
   0|   4.53%|   0.31%|    4.22%| 95.44%
   1|   1.25%|   0.16%|    1.09%| 98.72%
   2|   0.00%|   0.00%|    0.00%| 99.97%
   3|   0.00%|   0.00%|    0.00%| 99.97%
-------------------------------------------
avg.|   1.44%|   0.12%|    1.33%| 98.52%
Total IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |       886046720 |         1690 |      84.47 |     168.95 |   46.749 |    47.545 | X:\testfile.dat (1000GB)
     1 |       851443712 |         1624 |      81.17 |     162.35 |   49.497 |    54.084 | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:        1737490432 |         3314 |     165.65 |     331.29 |   48.095 |    50.873
Read IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |       886046720 |         1690 |      84.47 |     168.95 |   46.749 |    47.545 | X:\testfile.dat (1000GB)
     1 |       851443712 |         1624 |      81.17 |     162.35 |   49.497 |    54.084 | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:        1737490432 |         3314 |     165.65 |     331.29 |   48.095 |    50.873
Write IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | X:\testfile.dat (1000GB)
     1 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:                 0 |            0 |       0.00 |       0.00 |    0.000 |       N/A
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
    min |      9.406 |        N/A |      9.406
   25th |     31.087 |        N/A |     31.087
   50th |     38.397 |        N/A |     38.397
   75th |     47.216 |        N/A |     47.216
   90th |     64.783 |        N/A |     64.783
   95th |     90.786 |        N/A |     90.786
   99th |    356.669 |        N/A |    356.669
3-nines |    452.198 |        N/A |    452.198
4-nines |    686.307 |        N/A |    686.307
5-nines |    686.307 |        N/A |    686.307
6-nines |    686.307 |        N/A |    686.307
7-nines |    686.307 |        N/A |    686.307
8-nines |    686.307 |        N/A |    686.307
    max |    686.307 |        N/A |    686.307

With that configuration and parameters, I got about 165.65 MB/sec of throughput with an average latency of 48.095 milliseconds per IO. Again, that latency sounds high even for 512KB IOs and we’ll dive into that topic later on.

5. Understand the parameters used

Now let’s inspect the parameters on those DiskSpd command lines. I know it’s a bit overwhelming at first, but you will get used to it. And keep in mind that, for DiskSpd parameters, lowercase and uppercase mean different things, so be very careful.

Here is the explanation for the parameters used above:

PS C:\> C:\DiskSpd\diskspd.exe -c1G -d10 -r -w0 -t8 -o8 -b8K -h -L X:\testfile.dat

Parameter	Description	Notes
-c	Size of file used.	Specify the number of bytes or use suffixes like K, M or G (KB, MB, or GB). You should use a large size (all of the disk) for HDDs, since small files will show unrealistically high performance (short stroking).
-d	The duration of the test, in seconds.	You can use 10 seconds for a quick test. For any serious work, use at least 60 seconds.
-w	Percentage of writes.	0 means all reads, 100 means all writes, 30 means 30% writes and 70% reads. Be careful with using writes on SSDs for a long time, since they can wear out the drive. The default is 0.
-r	Random	Random is common for OLTP workloads. Sequential (when –r is not specified) is common for Reporting, Data Warehousing.
-b	Size of the IO in KB	Specify the number of bytes or use suffixes like K, M or G (KB, MB, or GB). 8K is the typical IO for OLTP workloads. 512K is common for Reporting, Data Warehousing.
-t	Threads per file	For large IOs, just a couple is enough. Sometimes just one. For small IOs, you could need as many as the number of CPU cores.
-o	Outstanding IOs or queue depth (per thread)	In RAID, SAN or Storage Spaces setups, a single disk can be made up of multiple physical disks. You can start with twice the number of physical disks used by the volume where the file sits. Using a higher number will increase your latency, but can get you more IOPs and throughput.
-L	Capture latency information	Always important to know the average time to complete an IO, end-to-end.
-h	Disable hardware and software caching	No hardware or software buffering. Buffering plus a small file size will give you performance of the memory, not the disk.

For OLTP workloads, I commonly start with 8KB random IOs, 8 threads, 16 outstanding per thread. 8KB is the size of the page used by SQL Server for its data files. In parameter form, that would be: –r -b8K -t8 -o16. For reporting or OLAP workloads with large IO, I commonly start with 512KB IOs, 2 threads and 16 outstanding per thread. 512KB is a common IO size when SQL Server loads a batch of 64 data pages when using the read ahead technique for a table scan. In parameter form, that would be: -b512K -t2 -o16. These numbers will need to be adjusted if your machine has many cores and/or if you volume is backed up by a large number of physical disks.

If you’re curious, here are more details about parameters for DiskSpd, coming from the tool’s help itself:

PS C:\> C:\DiskSpd\diskspd.exe

Usage: C:\DiskSpd\diskspd.exe [options] target1 [ target2 [ target3 ...] ]
version 2.0.12 (2014/09/17)

Available targets:
       file_path
       #<physical drive number>
       <partition_drive_letter>:

Available options:
-?                 display usage information
-a#[,#[...]]       advanced CPU affinity - affinitize threads to CPUs provided after -a
                       in a round-robin manner within current KGroup (CPU count starts with 0); the same CPU
                       can be listed more than once and the number of CPUs can be different
                       than the number of files or threads (cannot be used with -n)
-ag                group affinity - affinitize threads in a round-robin manner across KGroups
-b<size>[K|M|G]    block size in bytes/KB/MB/GB [default=64K]
-B<offs>[K|M|G|b] base file offset in bytes/KB/MB/GB/blocks [default=0]
                       (offset from the beginning of the file)
-c<size>[K|M|G|b] create files of the given size.
                       Size can be stated in bytes/KB/MB/GB/blocks
-C<seconds>        cool down time - duration of the test after measurements finished [default=0s].
-D<bucketDuration> Print IOPS standard deviations. The deviations are calculated for samples of duration <bucketDuration>.
                       <bucketDuration> is given in milliseconds and the default value is 1000.
-d<seconds>        duration (in seconds) to run test [default=10s]
-f<size>[K|M|G|b] file size - this parameter can be used to use only the part of the file/disk/partition
                       for example to test only the first sectors of disk
-fr                open file with the FILE_FLAG_RANDOM_ACCESS hint
-fs                open file with the FILE_FLAG_SEQUENTIAL_SCAN hint
-F<count>          total number of threads (cannot be used with -t)
-g<bytes per ms>   throughput per thread is throttled to given bytes per millisecond
                       note that this can not be specified when using completion routines
-h                 disable both software and hardware caching
-i<count>          number of IOs (burst size) before thinking. must be specified with -j
-j<duration>       time to think in ms before issuing a burst of IOs (burst size). must be specified with -i
-I<priority>       Set IO priority to <priority>. Available values are: 1-very low, 2-low, 3-normal (default)
-l                 Use large pages for IO buffers
-L                 measure latency statistics
-n                 disable affinity (cannot be used with -a)
-o<count>          number of overlapped I/O requests per file per thread
                       (1=synchronous I/O, unless more than 1 thread is specified with -F)
                       [default=2]
-p                 start async (overlapped) I/O operations with the same offset
                       (makes sense only with -o2 or grater)
-P<count>          enable printing a progress dot after each <count> completed I/O operations
                       (counted separately by each thread) [default count=65536]
-r<align>[K|M|G|b] random I/O aligned to <align> bytes (doesn't make sense with -s).
                       <align> can be stated in bytes/KB/MB/GB/blocks
                       [default access=sequential, default alignment=block size]
-R<text|xml>       output format. Default is text.
-s<size>[K|M|G|b] stride size (offset between starting positions of subsequent I/O operations)
-S                 disable OS caching
-t<count>          number of threads per file (cannot be used with -F)
-T<offs>[K|M|G|b] stride between I/O operations performed on the same file by different threads
                       [default=0] (starting offset = base file offset + (thread number * <offs>)
                       it makes sense only with -t or -F
-v                 verbose mode
-w<percentage>     percentage of write requests (-w and -w0 are equivalent).
                     absence of this switch indicates 100% reads
                       IMPORTANT: Your data will be destroyed without a warning
-W<seconds>        warm up time - duration of the test before measurements start [default=5s].
-x                 use completion routines instead of I/O Completion Ports
-X<path>           use an XML file for configuring the workload. Cannot be used with other parameters.
-z                 set random seed [default=0 if parameter not provided, GetTickCount() if value not provided]

Write buffers:
-Z                        zero buffers used for write tests
-Z<size>[K|M|G|b]         use a global <size> buffer filled with random data as a source for write operations.
-Z<size>[K|M|G|b],<file> use a global <size> buffer filled with data from <file> as a source for write operations.
                              If <file> is smaller than <size>, its content will be repeated multiple times in the buffer.

By default, the write buffers are filled with a repeating pattern (0, 1, 2, ..., 255, 0, 1, ...)

Synchronization:
-ys<eventname>     signals event <eventname> before starting the actual run (no warmup)
                       (creates a notification event if <eventname> does not exist)
-yf<eventname>     signals event <eventname> after the actual run finishes (no cooldown)
                       (creates a notification event if <eventname> does not exist)
-yr<eventname>     waits on event <eventname> before starting the run (including warmup)
                       (creates a notification event if <eventname> does not exist)
-yp<eventname>     allows to stop the run when event <eventname> is set; it also binds CTRL+C to this event
                       (creates a notification event if <eventname> does not exist)
-ye<eventname>     sets event <eventname> and quits

Event Tracing:
-ep                   use paged memory for NT Kernel Logger (by default it uses non-paged memory)
-eq                   use perf timer
-es                   use system timer (default)
-ec                   use cycle count
-ePROCESS             process start & end
-eTHREAD              thread start & end
-eIMAGE_LOAD          image load
-eDISK_IO             physical disk IO
-eMEMORY_PAGE_FAULTS all page faults
-eMEMORY_HARD_FAULTS hard faults only
-eNETWORK             TCP/IP, UDP/IP send & receive
-eREGISTRY            registry calls

Examples:

Create 8192KB file and run read test on it for 1 second:

C:\DiskSpd\diskspd.exe -c8192K -d1 testfile.dat

Set block size to 4KB, create 2 threads per file, 32 overlapped (outstanding)
I/O operations per thread, disable all caching mechanisms and run block-aligned random
access read test lasting 10 seconds:

C:\DiskSpd\diskspd.exe -b4K -t2 -r -o32 -d10 -h testfile.dat

Create two 1GB files, set block size to 4KB, create 2 threads per file, affinitize threads
to CPUs 0 and 1 (each file will have threads affinitized to both CPUs) and run read test
lasting 10 seconds:

C:\DiskSpd\diskspd.exe -c1G -b4K -t2 -d10 -a0,1 testfile1.dat testfile2.dat

6. Tune the parameters for large sequential IO

Now that you have the basics down, we can spend some time looking at how you can refine your number of threads and queue depth for your specific configuration. This might help us figure out why we had those higher than expected latency numbers in the initial runs. You basically need to experiment with the -t and the -o parameters until you find the one that give you the best results. You first want to find out the latency for a given system with a queue depth of 1. Then you can increase the queue depth and check what happens in terms of IOPs, throughput and latency.

Keep in mind that many logical (and “physical”) disks may have multiple IO paths. That’s the case in the examples mentioned here, but also true for most cloud storage systems and some physical drives (especially SSDs). In general, increasing outstanding IOs will have minimal impact on latency until the IO paths start to saturate. Then latency will start to increase dramatically.

Here’s a sample script that measures queue depth from 1 to 16, parsing the output of DiskSpd to give us just the information we need. The results for each DiskSpd run are stored in the $result variable and parsed to show IOPs, throughput, latency and CPU usage on a single line. There is some fun string parsing going on there, first to find the line that contains the information we’re looking for, and then using the Split() function to break that line into the individual metrics we need. DiskSpd has the -Rxml option to output XML instead of text, but for me it was easier to parse the text.

1..16 | % {
   $param = "-o $_"
   $result = C:\DiskSpd\diskspd.exe -c1000G -d10 -w0 -t1 $param -b512K -h -L X:\testfile.dat
   foreach ($line in $result) {if ($line -like "total:*") { $total=$line; break } }
   foreach ($line in $result) {if ($line -like "avg.*") { $avg=$line; break } }
   $mbps = $total.Split("|")[2].Trim()
   $iops = $total.Split("|")[3].Trim()
   $latency = $total.Split("|")[4].Trim()
   $cpu = $avg.Split("|")[1].Trim()
   "Param $param, $iops iops, $mbps MB/sec, $latency ms, $cpu CPU"
}

Here is the output:

Param -o 1, 61.01 iops, 30.50 MB/sec, 16.355 ms, 0.20% CPU
Param -o 2, 140.99 iops, 70.50 MB/sec, 14.143 ms, 0.31% CPU
Param -o 3, 189.00 iops, 94.50 MB/sec, 15.855 ms, 0.47% CPU
Param -o 4, 248.20 iops, 124.10 MB/sec, 16.095 ms, 0.47% CPU
Param -o 5, 286.45 iops, 143.23 MB/sec, 17.431 ms, 0.94% CPU
Param -o 6, 316.05 iops, 158.02 MB/sec, 19.052 ms, 0.78% CPU
Param -o 7, 332.51 iops, 166.25 MB/sec, 21.059 ms, 0.66% CPU
Param -o 8, 336.16 iops, 168.08 MB/sec, 23.875 ms, 0.82% CPU
Param -o 9, 339.95 iops, 169.97 MB/sec, 26.482 ms, 0.55% CPU
Param -o 10, 340.93 iops, 170.46 MB/sec, 29.373 ms, 0.70% CPU
Param -o 11, 338.58 iops, 169.29 MB/sec, 32.567 ms, 0.55% CPU
Param -o 12, 344.98 iops, 172.49 MB/sec, 34.675 ms, 1.09% CPU
Param -o 13, 332.09 iops, 166.05 MB/sec, 39.190 ms, 0.82% CPU
Param -o 14, 341.05 iops, 170.52 MB/sec, 41.127 ms, 1.02% CPU
Param -o 15, 339.73 iops, 169.86 MB/sec, 44.037 ms, 0.39% CPU
Param -o 16, 335.43 iops, 167.72 MB/sec, 47.594 ms, 0.86% CPU

For large sequential IOs, we typically want to watch the throughput (in MB/sec). There is a significant increase until we reach 6 outstanding IOs, which gives us around 158 MB/sec with 19 millisecond of latency per IO. You can clearly see that if you don’t queue up some IO, you’re not extracting the full throughput of this disk, since we’ll be processing the data while the disks are idle waiting for more work. If we queue more than 6 IOs, we really don’t get much more throughput, we only manage to increase the latency, as the disk subsystem is unable to give you much more throughput. You can queue up 10 IOs to reach 170 MB/sec, but we increase latency to nearly 30 milliseconds (a latency increase of 50% for a gain of only 8% in throughput).

At this point, it is clear that using multiple outstanding IOs is a great idea. However, using more outstanding IOs than what your target application can drive will be misleading as it will achieve throughput the application isn’t architected to achieve. Using less outstanding IOs than what the application can drive may lead to an incorrect conclusion that the disk can’t achieve the necessary throughput, because the full parallelism of the disk isn’t being utilized. You should try to find what your specific application does to make sure that your DiskSpd simulation is a good approximation of your real workload.

So, looking at the data above, we can conclude that 6 outstanding IOs is a reasonable number for this storage subsystem. Now we can see if we can gain by spreading the work across multiple threads. What we want to avoid here is bottlenecking on a single CPU core, which is very common we doing lots and lots of IO. A simple experiment is to double the number of threads while reducing the queue depth by half. Let’s now try 2 threads instead of 1.

1..8 | % {
   $param = "-o $_"
   $result = C:\DiskSpd\diskspd.exe -c1000G -d10 -w0 -t2 $param -b512K -h -L X:\testfile.dat
   foreach ($line in $result) {if ($line -like "total:*") { $total=$line; break } }
   foreach ($line in $result) {if ($line -like "avg.*") { $avg=$line; break } }
   $mbps = $total.Split("|")[2].Trim()
   $iops = $total.Split("|")[3].Trim()
   $latency = $total.Split("|")[4].Trim()
   $cpu = $avg.Split("|")[1].Trim()
   "Param –t2 $param, $iops iops, $mbps MB/sec, $latency ms, $cpu CPU"
}

Here is the output with 2 threads and a queue depth of 1:

Param –t2 -o 1, 162.01 iops, 81.01 MB/sec, 12.500 ms, 0.35% CPU
Param –t2 -o 2, 250.47 iops, 125.24 MB/sec, 15.956 ms, 0.82% CPU
Param –t2 -o 3, 312.52 iops, 156.26 MB/sec, 19.137 ms, 0.98% CPU
Param –t2 -o 4, 331.28 iops, 165.64 MB/sec, 24.136 ms, 0.82% CPU
Param –t2 -o 5, 342.45 iops, 171.23 MB/sec, 29.180 ms, 0.74% CPU
Param –t2 -o 6, 340.59 iops, 170.30 MB/sec, 35.391 ms, 1.17% CPU
Param –t2 -o 7, 337.75 iops, 168.87 MB/sec, 41.400 ms, 1.05% CPU
Param –t2 -o 8, 336.15 iops, 168.08 MB/sec, 47.859 ms, 0.90% CPU

Well, it seems like we were not bottlenecked on CPU after all (we sort of knew that already). So, with 2 threads and 3 outstanding IOs per thread, we effective get 6 total outstanding IOs and the performance numbers match what we got with 1 thread and queue depth of 6 in terms of throughput and latency. That pretty much proves that 1 thread was enough for this kind of configuration and workload and that increasing the number of threads yields no gain. This is not surprising for large IO. However, for smaller IO size, the CPU is more taxed and we might hit a single core bottleneck. We can look at the full DiskSpd output to confirm that no single core has pegged with 1 thread:

PS C:\DiskSpd> C:\DiskSpd\diskspd.exe -c1000G -d10 -w0 -t1 -o6 -b512K -h -L X:\testfile.dat
Command Line: C:\DiskSpd\diskspd.exe -c1000G -d10 -w0 -t1 -o6 -b512K -h -L X:\testfile.dat
Input parameters:
        timespan:   1
        -------------
        duration: 10s
        warm up time: 5s
        cool down time: 0s
        measuring latency
        random seed: 0
        path: 'X:\testfile.dat'
                think time: 0ms
                burst size: 0
                software and hardware cache disabled
                performing read test
                block size: 524288
                number of outstanding I/O operations: 6
                stride size: 524288
                thread stride size: 0
                threads per file: 1
                using I/O Completion Ports
                IO priority: normal
Results for timespan 1:
*******************************************************************************
actual test time:       10.00s
thread count:           1
proc count:             4
CPU | Usage | User | Kernel | Idle
-------------------------------------------
   0|   2.03%|   0.16%|    1.87%| 97.96%
   1|   0.00%|   0.00%|    0.00%| 99.84%
   2|   0.00%|   0.00%|    0.00%| 100.15%
   3|   0.00%|   0.00%|    0.00%| 100.31%
-------------------------------------------
avg.|   0.51%|   0.04%|    0.47%| 99.56%
Total IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |      1664614400 |         3175 |     158.74 |     317.48 |   18.853 |    21.943 | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:        1664614400 |         3175 |     158.74 |     317.48 |   18.853 |    21.943
Read IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |      1664614400 |         3175 |     158.74 |     317.48 |   18.853 |    21.943 | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:        1664614400 |         3175 |     158.74 |     317.48 |   18.853 |    21.943
Write IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:                 0 |            0 |       0.00 |       0.00 |    0.000 |       N/A
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
    min |      7.743 |        N/A |      7.743
   25th |     13.151 |        N/A |     13.151
   50th |     15.301 |        N/A |     15.301
   75th |     17.777 |        N/A |     17.777
   90th |     22.027 |        N/A |     22.027
   95th |     29.791 |        N/A |     29.791
   99th |    102.261 |        N/A |    102.261
3-nines |    346.305 |        N/A |    346.305
4-nines |    437.603 |        N/A |    437.603
5-nines |    437.603 |        N/A |    437.603
6-nines |    437.603 |        N/A |    437.603
7-nines |    437.603 |        N/A |    437.603
8-nines |    437.603 |        N/A |    437.603
    max |    437.603 |        N/A |    437.603

This confirms we’re not bottleneck on any of CPU cores. You can see above that the busiest CPU core is at only around 2% use.

7. Tune queue depth for small random IOs

Performing the same tuning exercise for small random IOS is typically more interesting, especially when you have fast storage. For this one, we’ll continue to use the same PowerShell script. However, for small IOs, we’ll try a larger number for queue depth. This might take a while to run, though… Here’s a script that you can run from a PowerShell prompt, trying out many different queue depths:

1..24 | % {
   $param = "-o $_"
   $result = C:\DiskSpd\DiskSpd.exe -c1000G -d10 -w0 -r -b8k $param -t1 -h -L X:\testfile.dat
   foreach ($line in $result) {if ($line -like "total:*") { $total=$line; break } }
   foreach ($line in $result) {if ($line -like "avg.*") { $avg=$line; break } }
   $mbps = $total.Split("|")[2].Trim()
   $iops = $total.Split("|")[3].Trim()
   $latency = $total.Split("|")[4].Trim()
   $cpu = $avg.Split("|")[1].Trim()
   "Param $param, $iops iops, $mbps MB/sec, $latency ms, $cpu CPU"
}

As you can see, the script runs DiskSpd 24 times, using different queue depths. Here’s the sample output:

Param -o 1, 191.06 iops, 1.49 MB/sec, 5.222 ms, 0.27% CPU
Param -o 2, 361.10 iops, 2.82 MB/sec, 5.530 ms, 0.82% CPU
Param -o 3, 627.30 iops, 4.90 MB/sec, 4.737 ms, 1.02% CPU
Param -o 4, 773.70 iops, 6.04 MB/sec, 5.164 ms, 1.02% CPU
Param -o 5, 1030.65 iops, 8.05 MB/sec, 4.840 ms, 0.86% CPU
Param -o 6, 1191.29 iops, 9.31 MB/sec, 5.030 ms, 1.33% CPU
Param -o 7, 1357.42 iops, 10.60 MB/sec, 5.152 ms, 1.13% CPU
Param -o 8, 1674.22 iops, 13.08 MB/sec, 4.778 ms, 2.07% CPU
Param -o 9, 1895.25 iops, 14.81 MB/sec, 4.745 ms, 1.60% CPU
Param -o 10, 2097.54 iops, 16.39 MB/sec, 4.768 ms, 1.95% CPU
Param -o 11, 2014.49 iops, 15.74 MB/sec, 5.467 ms, 2.03% CPU
Param -o 12, 1981.64 iops, 15.48 MB/sec, 6.055 ms, 1.84% CPU
Param -o 13, 2000.11 iops, 15.63 MB/sec, 6.517 ms, 1.72% CPU
Param -o 14, 1968.79 iops, 15.38 MB/sec, 7.113 ms, 1.79% CPU
Param -o 15, 1970.69 iops, 15.40 MB/sec, 7.646 ms, 2.34% CPU
Param -o 16, 1983.77 iops, 15.50 MB/sec, 8.069 ms, 1.80% CPU
Param -o 17, 1976.84 iops, 15.44 MB/sec, 8.599 ms, 1.56% CPU
Param -o 18, 1982.57 iops, 15.49 MB/sec, 9.049 ms, 2.11% CPU
Param -o 19, 1993.13 iops, 15.57 MB/sec, 9.577 ms, 2.30% CPU
Param -o 20, 1967.71 iops, 15.37 MB/sec, 10.121 ms, 2.30% CPU
Param -o 21, 1964.76 iops, 15.35 MB/sec, 10.699 ms, 1.29% CPU
Param -o 22, 1984.55 iops, 15.50 MB/sec, 11.099 ms, 1.76% CPU
Param -o 23, 1965.34 iops, 15.35 MB/sec, 11.658 ms, 1.37% CPU
Param -o 24, 1983.87 iops, 15.50 MB/sec, 12.161 ms, 1.48% CPU

As you can see, for small IOs, we got consistently better performance as we increased the queue depth for the first few runs. After a certain number of outstanding IOs, adding more started giving us very little improvement until things flatten out completely. As we kept adding more queue depth, all we had was more latency with no additional benefit in IOPS or throughput. If you have a better storage subsystem, you might need to try even higher queue depths. If you don’t hit an IOPS plateau with increasing average latency, you did not queue enough IO to fully exploit the capabilities of your storage subsystem.

So, in this setup, we seem to reach a limit at around 10 outstanding IOs and latency starts to ramp up more dramatically after that. Let’s see the full output for queue depth of 10 to get a good sense:

PS C:\DiskSpd> C:\DiskSpd\DiskSpd.exe -c1000G -d10 -w0 -r -b8k -o10 -t1 -h -L X:\testfile.dat
Command Line: C:\DiskSpd\DiskSpd.exe -c1000G -d10 -w0 -r -b8k -o10 -t1 -h -L X:\testfile.dat
Input parameters:
        timespan:   1
        -------------
        duration: 10s
        warm up time: 5s
        cool down time: 0s
        measuring latency
        random seed: 0
        path: 'X:\testfile.dat'
                think time: 0ms
                burst size: 0
                software and hardware cache disabled
                performing read test
                block size: 8192
                using random I/O (alignment: 8192)
                number of outstanding I/O operations: 10
                stride size: 8192
                thread stride size: 0
                threads per file: 1
                using I/O Completion Ports
                IO priority: normal
Results for timespan 1:
*******************************************************************************
actual test time:       10.01s
thread count:           1
proc count:             4
CPU | Usage | User | Kernel | Idle
-------------------------------------------
   0|   8.58%|   1.09%|    7.49%| 91.45%
   1|   0.00%|   0.00%|    0.00%| 100.03%
   2|   0.00%|   0.00%|    0.00%| 99.88%
   3|   0.00%|   0.00%|    0.00%| 100.03%
-------------------------------------------
avg.|   2.15%|   0.27%|    1.87%| 97.85%
Total IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |       160145408 |        19549 |      15.25 |    1952.47 |    5.125 |     8.135 | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:         160145408 |        19549 |      15.25 |    1952.47 |    5.125 |     8.135
Read IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |       160145408 |        19549 |      15.25 |    1952.47 |    5.125 |     8.135 | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:         160145408 |        19549 |      15.25 |    1952.47 |    5.125 |     8.135
Write IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | X:\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:                 0 |            0 |       0.00 |       0.00 |    0.000 |       N/A
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
    min |      3.101 |        N/A |      3.101
   25th |      3.961 |        N/A |      3.961
   50th |      4.223 |        N/A |      4.223
   75th |      4.665 |        N/A |      4.665
   90th |      5.405 |        N/A |      5.405
   95th |      6.681 |        N/A |      6.681
   99th |     21.494 |        N/A |     21.494
3-nines |    123.648 |        N/A |    123.648
4-nines |    335.632 |        N/A |    335.632
5-nines |    454.760 |        N/A |    454.760
6-nines |    454.760 |        N/A |    454.760
7-nines |    454.760 |        N/A |    454.760
8-nines |    454.760 |        N/A |    454.760
    max |    454.760 |        N/A |    454.760

Note that there is some variability here. This second run with the same parameters (1 thread, 10 outstanding IOs) yielded slightly fewer IOPS. You can reduce this variability by running with longer duration or averaging multiple runs. More on that later.

With this system, we don’t seem to have a CPU bottleneck. The overall CPU utilization is around 2% and the busiest core is under 9% of usage. This system has 4 cores and anything with less than 25% (1/4) overall CPU utilization is probably not an issue. In other configurations, you might run into CPU core bottlenecks, though…

8. Tune queue depth for small random IOs, part 2

Now let’s perform the same tuning exercise for small random IOS with a system with better storage performance and less capable cores. For this one, we’ll continue to use the same PowerShell script. However, this is on system using an SSD for storage and 8 slower CPU cores. Here’s that same script again:

1..16 | % {
   $param = "-o $_"
   $result = C:\DiskSpd\DiskSpd.exe -c1G -d10 -w0 -r -b8k $param -t1 -h -L C:\testfile.dat
   foreach ($line in $result) {if ($line -like "total:*") { $total=$line; break } }
   foreach ($line in $result) {if ($line -like "avg.*") { $avg=$line; break } }
   $mbps = $total.Split("|")[2].Trim()
   $iops = $total.Split("|")[3].Trim()
   $latency = $total.Split("|")[4].Trim()
   $cpu = $avg.Split("|")[1].Trim()
   "Param $param, $iops iops, $mbps MB/sec, $latency ms, $cpu CPU"
}

Here’s the sample output from our second system:

Param -o 1, 7873.26 iops, 61.51 MB/sec, 0.126 ms, 3.96% CPU
Param -o 2, 14572.54 iops, 113.85 MB/sec, 0.128 ms, 7.25% CPU
Param -o 3, 23407.31 iops, 182.87 MB/sec, 0.128 ms, 6.76% CPU
Param -o 4, 31472.32 iops, 245.88 MB/sec, 0.127 ms, 19.02% CPU
Param -o 5, 32823.29 iops, 256.43 MB/sec, 0.152 ms, 20.02% CPU
Param -o 6, 33143.49 iops, 258.93 MB/sec, 0.181 ms, 20.71% CPU
Param -o 7, 33335.89 iops, 260.44 MB/sec, 0.210 ms, 20.13% CPU
Param -o 8, 33160.54 iops, 259.07 MB/sec, 0.241 ms, 21.28% CPU
Param -o 9, 36047.10 iops, 281.62 MB/sec, 0.249 ms, 20.86% CPU
Param -o 10, 33197.41 iops, 259.35 MB/sec, 0.301 ms, 20.49% CPU
Param -o 11, 35876.95 iops, 280.29 MB/sec, 0.306 ms, 22.36% CPU
Param -o 12, 32955.10 iops, 257.46 MB/sec, 0.361 ms, 20.41% CPU
Param -o 13, 33548.76 iops, 262.10 MB/sec, 0.367 ms, 20.92% CPU
Param -o 14, 34728.42 iops, 271.32 MB/sec, 0.400 ms, 24.65% CPU
Param -o 15, 32857.67 iops, 256.70 MB/sec, 0.456 ms, 22.07% CPU
Param -o 16, 33026.79 iops, 258.02 MB/sec, 0.484 ms, 21.51% CPU

As you can see, this SSD can deliver many more IOPS than the previous system which used multiple HDDs. We got consistently better performance as we increased the queue depth for the first few runs. As usual, after a certain number of outstanding IOs, adding more started giving us very little improvement until things flatten out completely and all we do is increase latency. This is coming from a single SSD. If you have multiple SSDs in Storage Spaces Pool or a RAID set, you might need to try even higher queue depths. Always make sure you increase –o parameter to reach the point where IOPS hit a peak and only latency increases.

So, in this setup, we seem to start losing steam at around 6 outstanding IOs and latency starts to ramp up more dramatically after queue depth reaches 8. Let’s see the full output for queue depth of 8 to get a good sense:

PS C:\> C:\DiskSpd\DiskSpd.exe -c1G -d10 -w0 -r -b8k -o8 -t1 -h -L C:\testfile.dat
Command Line: C:\DiskSpd\DiskSpd.exe -c1G -d10 -w0 -r -b8k -o8 -t1 -h -L C:\testfile.dat
Input parameters:
    timespan:   1
    -------------
    duration: 10s
    warm up time: 5s
    cool down time: 0s
    measuring latency
    random seed: 0
    path: 'C:\testfile.dat'
        think time: 0ms
        burst size: 0
        software and hardware cache disabled
        performing read test
        block size: 8192
        using random I/O (alignment: 8192)
        number of outstanding I/O operations: 8
        stride size: 8192
        thread stride size: 0
        threads per file: 1
        using I/O Completion Ports
        IO priority: normal
Results for timespan 1:
*******************************************************************************
actual test time:    10.00s
thread count:        1
proc count:        8
CPU | Usage | User | Kernel | Idle
-------------------------------------------
   0| 99.06%|   2.97%|   96.09%|   0.94%
   1|   5.16%|   0.62%|    4.53%| 94.84%
   2| 14.53%|   2.81%|   11.72%| 85.47%
   3| 17.97%|   6.41%|   11.56%| 82.03%
   4| 24.06%|   5.16%|   18.91%| 75.94%
   5|   8.28%|   1.56%|    6.72%| 91.72%
   6| 16.09%|   3.91%|   12.19%| 83.90%
   7|   8.91%|   0.94%|    7.97%| 91.09%
-------------------------------------------
avg.| 24.26%|   3.05%|   21.21%| 75.74%
Total IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |      2928967680 |       357540 |     279.32 |   35753.26 |    0.223 |     0.051 | C:\testfile.dat (1024MB)
-----------------------------------------------------------------------------------------------------
total:        2928967680 |       357540 |     279.32 |   35753.26 |    0.223 |     0.051
Read IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |      2928967680 |       357540 |     279.32 |   35753.26 |    0.223 |     0.051 | C:\testfile.dat (1024MB)
-----------------------------------------------------------------------------------------------------
total:        2928967680 |       357540 |     279.32 |   35753.26 |    0.223 |     0.051
Write IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | C:\testfile.dat (1024MB)
-----------------------------------------------------------------------------------------------------
total:                 0 |            0 |       0.00 |       0.00 |    0.000 |       N/A
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
    min |      0.114 |        N/A |      0.114
   25th |      0.209 |        N/A |      0.209
   50th |      0.215 |        N/A |      0.215
   75th |      0.224 |        N/A |      0.224
   90th |      0.245 |        N/A |      0.245
   95th |      0.268 |        N/A |      0.268
   99th |      0.388 |        N/A |      0.388
3-nines |      0.509 |        N/A |      0.509
4-nines |      2.905 |        N/A |      2.905
5-nines |      3.017 |        N/A |      3.017
6-nines |      3.048 |        N/A |      3.048
7-nines |      3.048 |        N/A |      3.048
8-nines |      3.048 |        N/A |      3.048
    max |      3.048 |        N/A |      3.048

Again you note that there is some variability here. This second run with the same parameters (1 thread, 8 outstanding IOs) yielded a few more IOPS. We’ll later cover some tips on how to average out multiple runs.

You can also see that apparently one of the CPU cores is being hit harder than others. There is clearly a potential bottleneck. Let’s look into that…

9. Tune threads for small random IOs with CPU bottleneck

In this 8-core system, any overall utilization above 12.5% (1/8 of the total) means a potential core bottleneck when using a single thread. You can actually see in the CPU table in our last run that our core 0 is pegged at 99%. We should be able to do better with multiple threads. Let’s try increasing the number of threads with a matching reduction of queue depth so we end up with the same number of total outstanding IOs.

$o = 8
$t = 1
While ($o -ge 1) {
   $paramo = "-o $o"
   $paramt = “-t $t”
   $result = C:\DiskSpd\DiskSpd.exe -c1G -d10 -w0 -r -b8k $paramo $paramt -h -L C:\testfile.dat
   foreach ($line in $result) {if ($line -like "total:*") { $total=$line; break } }
   foreach ($line in $result) {if ($line -like "avg.*") { $avg=$line; break } }
   $mbps = $total.Split("|")[2].Trim()
   $iops = $total.Split("|")[3].Trim()
   $latency = $total.Split("|")[4].Trim()
   $cpu = $avg.Split("|")[1].Trim()
   “Param $paramo $paramt, $iops iops, $mbps MB/sec, $latency ms, $cpu CPU"
   $o = $o / 2
   $t = $t * 2
}

Here’s the output:

Param -o 8 -t 1, 35558.31 iops, 277.80 MB/sec, 0.225 ms, 22.36% CPU
Param -o 4 -t 2, 37069.15 iops, 289.60 MB/sec, 0.215 ms, 25.23% CPU
Param -o 2 -t 4, 34592.04 iops, 270.25 MB/sec, 0.231 ms, 27.99% CPU
Param -o 1 -t 8, 34621.47 iops, 270.48 MB/sec, 0.230 ms, 26.76% CPU

As you can see, in my system, adding a second thread improved things a bit, reaching our best yet 37,000 IOPS without much of a change in latency. It seems like we were a bit limited by the performance of a single core. We call that being “core bound”. See below the full output for the run with two threads:

PS C:\> C:\DiskSpd\DiskSpd.exe -c1G -d10 -w0 -r -b8k -o4 -t2 -h -L C:\testfile.dat
Command Line: C:\DiskSpd\DiskSpd.exe -c1G -d10 -w0 -r -b8k -o4 -t2 -h -L C:\testfile.dat
Input parameters:
    timespan:   1
    -------------
    duration: 10s
    warm up time: 5s
    cool down time: 0s
    measuring latency
    random seed: 0
    path: 'C:\testfile.dat'
        think time: 0ms
        burst size: 0
        software and hardware cache disabled
        performing read test
        block size: 8192
        using random I/O (alignment: 8192)
        number of outstanding I/O operations: 4
        stride size: 8192
        thread stride size: 0
        threads per file: 2
        using I/O Completion Ports
        IO priority: normal
Results for timespan 1:
*******************************************************************************
actual test time:    10.00s
thread count:        2
proc count:        8
CPU | Usage | User | Kernel | Idle
-------------------------------------------
   0| 62.19%|   1.87%|   60.31%| 37.81%
   1| 62.34%|   1.87%|   60.47%| 37.66%
   2| 11.41%|   0.78%|   10.62%| 88.75%
   3| 26.25%|   0.00%|   26.25%| 73.75%
   4|   8.59%|   0.47%|    8.12%| 91.56%
   5| 16.25%|   0.00%|   16.25%| 83.75%
   6|   7.50%|   0.47%|    7.03%| 92.50%
   7|   3.28%|   0.47%|    2.81%| 96.72%
-------------------------------------------
avg.| 24.73%|   0.74%|   23.98%| 75.31%
Total IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |      1519640576 |       185503 |     144.92 |   18549.78 |    0.215 |     0.419 | C:\testfile.dat (1024MB)
     1 |      1520156672 |       185566 |     144.97 |   18556.08 |    0.215 |     0.404 | C:\testfile.dat (1024MB)
-----------------------------------------------------------------------------------------------------
total:        3039797248 |       371069 |     289.89 |   37105.87 |    0.215 |     0.411
Read IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |      1519640576 |       185503 |     144.92 |   18549.78 |    0.215 |     0.419 | C:\testfile.dat (1024MB)
     1 |      1520156672 |       185566 |     144.97 |   18556.08 |    0.215 |     0.404 | C:\testfile.dat (1024MB)
-----------------------------------------------------------------------------------------------------
total:        3039797248 |       371069 |     289.89 |   37105.87 |    0.215 |     0.411
Write IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | C:\testfile.dat (1024MB)
     1 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | C:\testfile.dat (1024MB)
-----------------------------------------------------------------------------------------------------
total:                 0 |            0 |       0.00 |       0.00 |    0.000 |       N/A
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
    min |      0.088 |        N/A |      0.088
   25th |      0.208 |        N/A |      0.208
   50th |      0.210 |        N/A |      0.210
   75th |      0.213 |        N/A |      0.213
   90th |      0.219 |        N/A |      0.219
   95th |      0.231 |        N/A |      0.231
   99th |      0.359 |        N/A |      0.359
3-nines |      0.511 |        N/A |      0.511
4-nines |      1.731 |        N/A |      1.731
5-nines |     80.959 |        N/A |     80.959
6-nines |     90.252 |        N/A |     90.252
7-nines |     90.252 |        N/A |     90.252
8-nines |     90.252 |        N/A |     90.252
    max |     90.252 |        N/A |     90.252

You can see now that cores 0 and 1 are being used, with both at around 62% utilization. So we have effectively eliminated the core bottleneck that we had before.

For systems with more capable storage, it’s easier to get “core bound” and adding more threads can make a much more significant difference. As I mentioned, it’s important to keep an eye on the per-core CPU utilization (not only the total CPU utilization) to look out for these bottlenecks.

10. Multiple runs are better than one

One thing you might have notice with DiskSpd (or any other tools like it) is that the results are not always the same given the same parameters. Each run is a little different. For instance, let’s try running our “-b8K –o4 -t2” with the very same parameters a few times to see what happens:

1..8 | % {
   $result = C:\DiskSpd\DiskSpd.exe -c1G -d10 -w0 -r -b8k -o4 -t2 -h -L C:\testfile.dat
   foreach ($line in $result) {if ($line -like "total:*") { $total=$line; break } }
   foreach ($line in $result) {if ($line -like "avg.*") { $avg=$line; break } }
   $mbps = $total.Split("|")[2].Trim()
   $iops = $total.Split("|")[3].Trim()
   $latency = $total.Split("|")[4].Trim()
   $cpu = $avg.Split("|")[1].Trim()
   “Run $_, $iops iops, $mbps MB/sec, $latency ms, $cpu CPU"
}

Here are the results:

Run 1, 34371.97 iops, 268.53 MB/sec, 0.232 ms, 24.53% CPU
Run 2, 37138.29 iops, 290.14 MB/sec, 0.215 ms, 26.72% CPU
Run 3, 36920.81 iops, 288.44 MB/sec, 0.216 ms, 26.66% CPU
Run 4, 34538.00 iops, 269.83 MB/sec, 0.231 ms, 36.85% CPU
Run 5, 34406.91 iops, 268.80 MB/sec, 0.232 ms, 37.09% CPU
Run 6, 34393.72 iops, 268.70 MB/sec, 0.214 ms, 33.71% CPU
Run 7, 34451.48 iops, 269.15 MB/sec, 0.232 ms, 25.74% CPU
Run 8, 36964.47 iops, 288.78 MB/sec, 0.216 ms, 30.21% CPU

The results have a good amount of variability. You can look at the standard deviations by specifying the -D option to check how stable things are. But, in the end, how can you tell which measurements are the most accurate? Ideally, once you settle on a specific set of parameters, you should run DiskSpd a few times and average out the results. Here’s a sample PowerShell script to do it, using the last set of parameters we used for the 8KB IOs:

$tiops=0
$tmbps=0
$tlatency=0
$tcpu=0
$truns=10
1..$truns | % {
   $result = C:\DiskSpd\DiskSpd.exe -c1G -d10 -w0 -r -b8k -o4 -t2 -h -L C:\testfile.dat
   foreach ($line in $result) {if ($line -like "total:*") { $total=$line; break } }
   foreach ($line in $result) {if ($line -like "avg.*") { $avg=$line; break } }
   $mbps = $total.Split("|")[2].Trim()
   $iops = $total.Split("|")[3].Trim()
   $latency = $total.Split("|")[4].Trim()
   $cpu = $avg.Split("|")[1].Trim()
   “Run $_, $iops iops, $mbps MB/sec, $latency ms, $cpu CPU"
   $tiops += $iops
   $tmbps += $mbps
   $tlatency += $latency
   $tcpu += $cpu.Replace("%","")
}
$aiops = $tiops / $truns
$ambps = $tmbps / $truns
$alatency = $tlatency / $truns
$acpu = $tcpu / $truns
“Average, $aiops iops, $ambps MB/sec, $alatency ms, $acpu % CPU"

The script essentially runs DiskSpd 10 times, totaling the numbers for IOPs, throughput, latency and CPU usage, so it can show an average at the end. The $truns variable represents the total number of runs desired. Variables starting with $t hold the totals. Variables starting with $a hold averages. Here’s a sample output:

Run 1, 37118.31 iops, 289.99 MB/sec, 0.215 ms, 35.78% CPU
Run 2, 34311.40 iops, 268.06 MB/sec, 0.232 ms, 38.67% CPU
Run 3, 36997.76 iops, 289.04 MB/sec, 0.215 ms, 38.90% CPU
Run 4, 34463.16 iops, 269.24 MB/sec, 0.232 ms, 24.16% CPU
Run 5, 37066.41 iops, 289.58 MB/sec, 0.215 ms, 25.14% CPU
Run 6, 37134.21 iops, 290.11 MB/sec, 0.215 ms, 26.02% CPU
Run 7, 34430.21 iops, 268.99 MB/sec, 0.232 ms, 23.61% CPU
Run 8, 35924.20 iops, 280.66 MB/sec, 0.222 ms, 25.21% CPU
Run 9, 33387.45 iops, 260.84 MB/sec, 0.239 ms, 21.64% CPU
Run 10, 36789.85 iops, 287.42 MB/sec, 0.217 ms, 25.86% CPU
Average, 35762.296 iops, 279.393 MB/sec, 0.2234 ms, 28.499 % CPU

As you can see, it’s a good idea to capture multiple runs. You might also want to run each iteration for a longer time, like 60 seconds instead of just 10 second.

Using 10 runs of 60 seconds (10 minutes total) might seem a little excessive, but that was the minimum recommended by one of our storage performance engineers. The problem with shorter runs is that they often don’t give the IO subsystem time to stabilize. This is particularly true when testing virtual file systems (such as those in cloud storage or virtual machines) when files are allocated dynamically. Also, SSDs exhibit write degradation and can sometimes take hours to reach a steady state (depending on how full the SSD is). So it's a good idea to run the test for a few hours in these configurations on a brand new system, since this could drop your initial IOPs number by 30% or more.

11. DiskSpd and SMB file shares

You can use DiskSpd to get the same type of performance information for SMB file shares. All you have to do is run DiskSpd from an SMB client with access to a file share.

It is as simple as mapping the file share to a drive letter using the old “NET USE” command or the new PowerShell cmdlet “New-SmbMapping”. You can also use a UNC path directly in the command line, instead of using drive letters.

Here are an example using the HDD-based system we used as our first few examples, now running remotely:

PS C:\diskspd> C:\DiskSpd\DiskSpd.exe -c1000G -d10 -w0 -r -b8k -o10 -t1 -h -L \\jose1011-st1\Share1\testfile.dat
Command Line: C:\DiskSpd\DiskSpd.exe -c1000G -d10 -w0 -r -b8k -o10 -t1 -h -L \\jose1011-st1\Share1\testfile.dat
Input parameters:
        timespan:   1
        -------------
        duration: 10s
        warm up time: 5s
        cool down time: 0s
        measuring latency
        random seed: 0
        path: '\\jose1011-st1\Share1\testfile.dat'
                think time: 0ms
                burst size: 0
                software and hardware cache disabled
                performing read test
                block size: 8192
                using random I/O (alignment: 8192)
                number of outstanding I/O operations: 10
                stride size: 8192
                thread stride size: 0
                threads per file: 1
                using I/O Completion Ports
                IO priority: normal
Results for timespan 1:
*******************************************************************************
actual test time:       10.01s
thread count:           1
proc count:             4
CPU | Usage | User | Kernel | Idle
-------------------------------------------
   0| 12.96%|   0.62%|   12.34%| 86.98%
   1|   0.00%|   0.00%|    0.00%| 99.94%
   2|   0.00%|   0.00%|    0.00%| 99.94%
   3|   0.00%|   0.00%|    0.00%| 99.94%
-------------------------------------------
avg.|   3.24%|   0.16%|    3.08%| 96.70%
Total IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |       158466048 |        19344 |      15.10 |    1933.25 |    5.170 |     6.145 | \\jose1011-st1\Share1\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:         158466048 |        19344 |      15.10 |    1933.25 |    5.170 |     6.145
Read IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |       158466048 |        19344 |      15.10 |    1933.25 |    5.170 |     6.145 | \\jose1011-st1\Share1\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:         158466048 |        19344 |      15.10 |    1933.25 |    5.170 |     6.145
Write IO
thread |       bytes     |     I/Os     |     MB/s   | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |               0 |            0 |       0.00 |       0.00 |    0.000 |       N/A | \\jose1011-st1\Share1\testfile.dat (1000GB)
-----------------------------------------------------------------------------------------------------
total:                 0 |            0 |       0.00 |       0.00 |    0.000 |       N/A
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
    min |      3.860 |        N/A |      3.860
   25th |      4.385 |        N/A |      4.385
   50th |      4.646 |        N/A |      4.646
   75th |      5.052 |        N/A |      5.052
   90th |      5.640 |        N/A |      5.640
   95th |      6.243 |        N/A |      6.243
   99th |     12.413 |        N/A |     12.413
3-nines |     63.972 |        N/A |     63.972
4-nines |    356.710 |        N/A |    356.710
5-nines |    436.406 |        N/A |    436.406
6-nines |    436.406 |        N/A |    436.406
7-nines |    436.406 |        N/A |    436.406
8-nines |    436.406 |        N/A |    436.406
    max |    436.406 |        N/A |    436.406

This is an HDD-based storage system, so most of the latency comes from the local disk, not the remote SMB access. In fact, we achieved numbers similar to what we had locally before.

12. Conclusion

I hope you have learned how to use DiskSpd to perform some storage testing of your own. I encourage you to use it to look at the performance of the storage features in Windows Server 2012, Windows Server 2012 R2 and Windows Server Technical Preview. That includes Storage Spaces, SMB3 shares, Scale-Out File Server, Storage Replica and Storage QoS. Let me know if you were able to try it out and feel free to share some of your experiments via blog comments.

Thanks to Bartosz Nyczkowski, Dan Lovinger, David Berg and Scott Lee for their contributions to this blog post.

↧

SmbStorageTier.ps1: A simple way to pin files to tiers in Scale-Out File Servers

October 18, 2014, 8:52 am

≫ Next: Storage Quality of Service Guide Released for Windows Server Technical Preview

≪ Previous: DiskSpd, PowerShell and storage performance: measuring IOPs, throughput and latency for both local disks and SMB file shares

Storage Spaces Tiering

Storage Spaces has an interesting new feature introduced in Windows Server 2012 R2: you can create a single space using different types of disk (typically HDDs and SSDs) and it will automatically move hot data to the fast tier and cold data to the slow tier. You can read more about it at http://technet.microsoft.com/en-us/library/dn789160.aspx. You can also try my step-by-step instructions at http://blogs.technet.com/b/josebda/archive/2013/08/28/step-by-step-for-storage-spaces-tiering-in-windows-server-2012-r2.aspx.

Pinning Files to a Tier

While Storage Spaces will automatically track “heat” and move data to correct tier, there are situations where you already know ahead of time that a certain file will work best when placed in the fast tier (like when you have the base OS VHDX file for many differencing disks in a VDI setup) or the slow tier (like when you create a VHDX file specifically for backups). For those situations, you have the option to pin a file to a specific tier. That means Storage Spaces will place those pinned files on the tier you specified, tracking heat and moving automatically only the other files in the space.

Here are the cmdlets used to pin, unpin and report on pinned files with some sample parameters:

Set-FileStorageTier -CimSession <computer> -FilePath <String> -DesiredStorageTierFriendlyName <String>
Clear-FileStorageTier -CimSession <computer> -FilePath <String>
Get-FileStorageTier -CimSession <computer> -Volume <CimInstance>

Tiers on a Scale-Out File Server

The cmdlets above are fine if you are using the Storage Spaces on a standalone system, but when you are using them with a Scale-Out File Server, they are not so convenient.

First of all, when running from a remote computer, they require you to know which server owns the storage space at that time. When you have your multiple spaces running on a Scale-Out File Server, they are naturally spread across the nodes of the cluster (that’s why we call it “Scale-Out”). You can, of course, query the cluster to find out which node owns the space, but that’s an extra step.

Another issue is that the cmdlets above require you to know the local path to the file you want to pin (or the volume you want to report on). When you’re using the Scale-Out File Server, you typically refer to files using the UNC path (also called the remote file path). For instance, you might refer a VHDX file as \\Server1\Share2\Folder3\File4.VHDX, even though the local path to that file on the cluster shared volume might be C:\ClusterStorage\Volume1\Shares\Share2\Folder3\File4.VHDX. There are ways to query the path behind a share and calculate the local path, but that’s again an extra step.

The third item is that the cmdlet requires you to know the name of the tier to pin the file. Each tiered storage space has two tiers (typically one associated with HDDs and the other associated with SSDs) and they each are independent objects. If you have 5 tiered storage spaces, you effectively have 10 tiers (2 per space), each with their own unique ID and friendly name. If you name things consistently, you can probably create a scheme where the name of the tier uses the name of the space plus the media type (SSD or HDD), but sometimes people get creative. So you need to gather that third piece of information before you can pin the file.

The last thing to keep in mind is that the pinning (or unpinning) will only happen after the Storage Spaces Tiering Optimization task is run. So, if you want this to happen immediately, you need to run that schedule task manually. It’s a simple command, but you should always remember to do it.

The SmbStorageTier.ps1 PowerShell Script

Faced with these issues, I decided to write a little PowerShell script that makes life easier for the Hyper-V or Scale-Out File Server administrator. Using the Cluster, SMB and Storage PowerShell cmdlets, the script helps you pin/unpin files to a storage tier in a Scale-Out File Server. To pin a file, all you need to specify is the UNC path and the media type. To unpin, just specify the UNC path. You can also get a report for files pinned on a specific file share (actually, on the volume behind that share).

Here are the script parameters used to pin, unpin and report on pinned files with some sample parameters:

.\SmbStorageTier.ps1 –PinFile <String> –MediaType <hdd/ssd>
.\SmbStorageTier.ps1 –UnpinFile <String>
.\SmbStorageTier.ps1 –ReportShare <String>

The file path specified is the UNC path, starting with \\ and the name of the server. The same format that you use in Hyper-V when creating VMs. The media type is either HDD or SSD.

The script will basically take those parameters and calculate the information required to run the Set-FileStorageTier, Clear-FileStorageTier or Get-FileStorageTier. It will also kick off the Storage Spaces Tiering Optimization task, to make the change effective immediately. It’s not all that complicated to do using PowerShell, but it has the potential to save you some time if you pin/unpin files on a regular basis.

This script has been tested with both Windows Server 2012 R2 and the Windows Server Technical Preview.

Sample Output

To help you understand how this works, here are some examples of the script in action.

Note that it actually shows the input parameters, the calculated parameters required by the native Storage cmdlets and the command line it actually executes.

PS C:\> .\SmbStorageTier.ps1 -PinFile \\josebda-fs\share2\VM2\VM2.vhdx -MediaType ssd

Input parameters
File to pin: \\JOSEBDA-FS\SHARE2\VM2\VM2.VHDX
Media type: SSD

Calculated parameters
Local file path: C:\ClusterStorage\Volume2\Share2\VM2\VM2.VHDX
Node owning the volume: JOSEBDA-A3
Tier Name: Space2_SSDTier

Executing command:
Set-FileStorageTier -CimSession JOSEBDA-A3 -FilePath C:\ClusterStorage\Volume2\Share2\VM2\VM2.VHDX –DesiredStorageTierFriendlyName Space2_SSDTier

PS C:\> .\SmbStorageTier.ps1 -ReportShare \\josebda-fs\share2\

Input parameter
Share Name: \\JOSEBDA-FS\SHARE2\

Calculated parameters
Volume behind the share: \\?\Volume{37bb3bf3-80fc-4a43-bc79-37246e5d2666}\
Node owning the volume: JOSEBDA-A3

Executing command:
Get-FileStorageTier –CimSession JOSEBDA-A3 –VolumePath \\?\Volume{37bb3bf3-80fc-4a43-bc79-37246e5d2666}\ | Select *

PlacementStatus              : Partially on tier
State                        : Pending
DesiredStorageTierName       : Space2_SSDTier
FilePath                     : C:\ClusterStorage\Volume2\Share2\VM2\VM2.vhdx
FileSize                     : 9063890944
FileSizeOnDesiredStorageTier : 7987003392
PSComputerName               : JOSEBDA-A3
CimClass                     : ROOT/Microsoft/Windows/Storage:MSFT_FileStorageTier
CimInstanceProperties        : {DesiredStorageTierName, FilePath, FileSize, FileSizeOnDesiredStorageTier...}
CimSystemProperties          : Microsoft.Management.Infrastructure.CimSystemProperties

PS C:\> .\SmbStorageTier.ps1 -UnpinFile \\josebda-fs\share2\VM2\VM2.vhdx

Input parameter
File to unpin: \\JOSEBDA-FS\SHARE2\VM2\VM2.VHDX

Calculated parameters
Local Path: C:\ClusterStorage\Volume2\Share2\VM2\VM2.VHDX
Node owning the volume: JOSEBDA-A3

Executing command:
Clear-FileStorageTier -CimSession JOSEBDA-A3 -FilePath C:\ClusterStorage\Volume2\Share2\VM2\VM2.VHDX

PS C:\>

The Link to Download

Well, now I guess all you need is the link to download the script in the TechNet Script Center. Here it is: https://gallery.technet.microsoft.com/scriptcenter/SmbStorageTierps1-A-simple-934c0f22

Let me know what you think. Add your comments to the blog or use the Q&A section in the TechNet Script Center link above.

↧

Storage Quality of Service Guide Released for Windows Server Technical Preview

October 24, 2014, 2:05 pm

≫ Next: Windows Server Storage Sessions from TechEd Europe 2014 (includes links to recordings and slides)

≪ Previous: SmbStorageTier.ps1: A simple way to pin files to tiers in Scale-Out File Servers

As part of the Windows Server Technical Preview released a few weeks ago, we announced the evolution of the Storage Quality of Service (Storage QoS) feature. Now, in addition to that TechNet page with an overview, we released the Storage QoS Step-by-Step Guide.

An Overview of Storage QoS

Storage Quality of Service in Windows Server Technical Preview provides a way to centrally monitor and manage storage performance for virtual machines using Hyper-V and the Scale Out File Server roles. The feature automatically improves storage resource fairness between multiple virtual machines using the same file server and allows specific Minimum and Maximum performance goals to be configured in units of normalized IOPs.

Windows Server 2012 R2 had the ability to enforce an IOPS maximum on a single Hyper-V host. With the Windows Server Technical Preview, this is extended to enforce both minimum and maximum IOPS on a set of Hyper-V hosts that share a Scale-Out File Server as their storage solution.

Note:Please keep in mind that this is an early pre-release build. Many of the features and scenarios are still in development and the experiences are still evolving. At this stage, Windows Server Technical Preview and Storage QoS are not intended for production environments, only for introductory evaluation.

Scenarios for Storage QoS

With this new solution, you will be able to address the following scenarios:

Noisy neighbor mitigation– By default, Storage QoS ensures that a single virtual machine cannot consume all storage resources and starve other virtual machines of storage bandwidth.
Deploy at high density with confidence– Storage QoS policies define performance minimums and maximums for virtual machines and ensures they are met. This provides consistent performance to virtual machines, even in dense and overprovisioned environments.
End to end storage monitoring– As soon as the virtual machines stored on a Scale Out File Server are started, their performance is monitored. Performance of all VMs running on the Scale Out File Server cluster can be viewed from a single location.

Architecture

clip_image002

Storage Quality of Service is built into the Microsoft Software-Defined Storage solution provided by Scale Out File Servers and Hyper-V. The Scale Out File Server exposes file shares to the Hyper-V servers using the SMB3 protocol. A new Policy Manager has been added to the File Server cluster, which provides the central storage performance monitoring.

As Hyper-V servers launch virtual machines, they are monitored by the policy manager. The Policy Manager will communicate the Storage QoS policy and any limits or reserves back to the Hyper-V server, which will control the performance of the virtual machine as appropriate.

When there are changes to Storage QoS policies or to the performance demands by virtual machines, the policy manager will notify the Hyper-V servers to adjust their behavior. This feedback loop ensures that all virtual machines perform consistently according to the Storage QoS policies defined.

A Nod to Microsoft Research

It's important to mention that this new feature implemented in Windows Server Technical Preview would not be possible without the incredible work done by Microsoft Research in this space.

Eno Thereska and his fellow researchers have been working on this QoS problem for years now. If you want to understand the Computer Science behind what Hyper-V implemented, you can access the MSR page about the Predictable Data Centers (PDC).

There you can find a few papers on the subject and a particularly relevant one is "IOFlow: A Software-Defined Storage Architecture", presented about a year ago at the 24th ACM Symposium on Operating Systems Principles (SOSP'13).

↧

Windows Server Storage Sessions from TechEd Europe 2014 (includes links to recordings and slides)

October 28, 2014, 7:50 pm

≫ Next: Testing Windows Server and the Scale-Out File Server – What should your lab look like?

≪ Previous: Storage Quality of Service Guide Released for Windows Server Technical Preview

This blog post includes a list of the Storage-related sessions from TechEd Europe 2014 in Barcelona, Spain.

You can use it as an easy reference to the Storage sessions, including links to the recordings and the slides for each one.

Day	Time	Code	Title (with link to recording and slides)	Speaker(s)
Tue	8:30 AM	KEY01	Keynote	Joe Belfiore, Jason Zander
Tue	11:00 AM	FDN03	Optimizing Your Datacenter with Windows Server, System Center, and Microsoft Azure	Brian Hillger, Jeff Woolsey, Jeremy Winter, Matt McSpirit, Patrick Lang
Tue	1:30 PM	CDP-B362	Architecting a Modern Datacenter: Windows Server 2012 R2 End-to-End Design	Philip Moss
Tue	3:15 PM	CDP-B232	Introducing the NEW Microsoft Cloud Platform System	Jonobie Ford, Michael Schulz, Vijay Tewari
Tue	5:00 PM	CDP-B222	Software Defined Storage in the Next Release of Windows Server	Ned Pyle, Patrick Lang, Siddhartha Roy
Tue	1:30 PM	CDP-B318	Building Scalable and Reliable Backup Solutions in the Next Release of Windows Server Hyper-V	Taylor Brown
Wed	8:30 AM	CDP-B323	Delivering Predictable Storage Performance with Storage Quality of Service in the Next Release of Windows	Patrick Lang
Wed	10:15 AM	CDP-B354	Advantages of Upgrading Your Private Cloud Infrastructure in the Next Release of Windows Server	Rob Hindman, Taylor Brown
Wed	10:15 AM	CDP-B361	Architecting Software Defined Storage: Design Patterns from Real-World Deployments	Joshua Adams
Wed	10:15 AM	CDP-B291	Dell Storage Spaces: An End-to-End Solution	Lee Harrison, Shai Ofek, Terry Storey
Wed	12:00 PM	CDP-B341	Architectural Deep Dive into the Microsoft Cloud Platform System	James Pinkerton, Spencer Shepler
Wed	3:15 PM	CDP-B363	End-to-End Design for a Highly Available Datacenter	Philip Moss
Wed	3:15 PM	CDP-B339	Leveraging SAN Replication for Enterprise Grade Disaster Recovery with Azure Site Recovery and System	Abhishek Agrawal, Hector Linares, Karsten Bott, Pavel Lobanov
Wed	5:00 PM	CDP-B352	Stretching Failover Clusters and Using Storage Replica for Disaster Recovery in the Next Release of Windows	Ned Pyle
Thu	8:30 AM	EM-B314	BYOD for File Server Home Folders: Understanding, Deploying and Managing Work Folders in Windows Server 2012 R2	Fabian Uhse, Gene Chellis
Thu	12:00 PM	CDP-B340	Using Tiered Storage Spaces for Greater Performance and Lower Costs	Spencer Shepler
Thu	3:15 PM	CDP-B349	Storage Management in a Hybrid Cloud Environment with Windows Server and System Center	Hector Linares
Thu	3:15 PM	CDP-B353	Automated Workload Provisioning with the Azure Pack and Windows PowerShell	Charles Joy, Jeff Goldner, Michael Greene, Tiander Turpijn
Fri	8:30 AM	CDP-B325	Design Scale-Out File Server Clusters in the Next Release of Windows Server	Claus Joergensen
Fri	10:15 AM	CDP-B334	Cloud Integrated Data Protection with System Center Data Protection Manager and Microsoft Azure Backup	Islam Gomaa, Shreesh Dubey
Fri	10:15 AM	CDP-321	Cluster-in-a-Box Meets Datacenter Convergence to Redefine Successful Private Cloud Deployments	John Loveall
Fri	2:45 PM	CDP-B358	Windows Server Data Deduplication at Scale: Dedup Updates for Large-Scale VDI and Backup Scenarios	Daniel Hefenbrock, John Loveall, Rutwick Bhatt

Note 1: The list above is focused on Private Cloud deployments using Windows Server. It does not include Azure Storage or SQL Server Storage sessions.

Note 2: The list above includes the keynote, foundational session and breakout sessions only. It does not include hands-on labs, instructor-led labs and lunch sessions (these are not recorded).

↧

Testing Windows Server and the Scale-Out File Server – What should your lab look like?

October 31, 2014, 2:47 pm

≫ Next: Migrating File Servers from Windows Server 2003 to Windows Server 2012 R2

≪ Previous: Windows Server Storage Sessions from TechEd Europe 2014 (includes links to recordings and slides)

With the release of Windows Server Technical Preview, a common question comes to the surface again. What kind of lab hardware do I need to play with it?

Ideally, you would get one of those new Cloud Platform System (CPS) racks with a 32-node Hyper-V cluster, a 4-node Scale-Out File Server and 4 JBODs with 60 disks each. However, not everyone can afford a lab like that :-).

I am recycling some of my previous step-by-step guides and offer some basic options on how to test the Windows Server 2012 R2 and Windows Server Technical Preview features.

IMPORTANT NOTE: Some of these configurations are not supported and are not recommended for production deployments. Please see each linked blog post for details.

Option 1 – Virtual Machines

You can do a lot with just a single physical machine running Hyper-V. With enough RAM and a nice Core i7 CPU you can configure half a dozen VMs and try out many scenarios.
This would be enough to to test Failover Clustering, Scale-Out File Server, Shared VHDX, Scale-Out Rebalancing and Storage Replica. You can even test some basic Storage Spaces and Storage QoS capabilities.
Obviously, you can’t do much testing of Hyper-V itself and certainly no Hyper-V clustering features. This will also not showcase storage performance at all, since you’re limited by hardware.
This is the setup I used many times on my beefy laptop with 32GB of RAM and used for many demos in the past.
Details at http://blogs.technet.com/b/josebda/archive/2013/07/31/windows-server-2012-r2-storage-step-by-step-with-storage-spaces-smb-scale-out-and-shared-vhdx-virtual.aspx

Option 2 – Physical Machines, basic networking

If you’re testing new Hyper-V features related to clustering, you can’t run virtualized. This would require at least two physical Hyper-V hosts.
Also, might want to test how Storage Spaces works with a physical JBOD plus a mix of HDD/SSD to test storage performance, tiering, etc.
A setup like this will be ideal to test Hyper-V high availability, Storage Spaces performance and the full capabilities of the new Storage QoS.
To do some basic testing you don't need high-end hardware. I've done a lot of testing with basic (even desktop-class) hardware. You do want to make they properly support Hyper-V.
Details at http://blogs.technet.com/b/josebda/archive/2013/07/31/windows-server-2012-r2-storage-step-by-step-with-storage-spaces-smb-scale-out-and-shared-vhdx-physical.aspx
While the setup in the link above suggests the use of RDMA, this will work fine with regular networking, even 1GbE. It will obviously perform accordingly.

Option 3 – Physical Machines, RDMA networking, all SSD

You can go one step further if you add RDMA networking to you setup. This means you need some RDMA cards and a more expensive switch. I might be a little noisier as well.
You might even go as far as to have multiple RDMA NICs so you can try out SMB Multichannel at high speeds. At this point you might want to move to an all-SSD configuration.
At that point you want to make sure the hardware for both the Hyper-V host and the File Server are server-class equipment with good performance, fast CPUs, lots of RAM.
Details at http://blogs.technet.com/b/josebda/archive/2014/03/10/smb-direct-and-rdma-performance-demo-from-teched-includes-summary-powershell-scripts-and-links.aspx
This is probably the simplest configuration to showcase extremely high performance. Note that the example above does not include failover clustering.

Option 4 – CPS

Beyond that, you are now in CPS territory, which includes multiple Hyper-V hosts, multiple file servers, dual RDMA paths, multiple JBODs, tiering, the whole thing.
Details at http://www.microsoft.com/cps

Bonus – Option 5 – Azure VMs

An interesting option is to skip the physical lab completely and go all Azure with your testing lab.
It's fairly easy to configure an Azure VM as a standalone file server and use Azure data disks to experiment with Storage Spaces. You can also try out Storage Replica.
With a little care on how you configure your Azure networking, you can also setup a Scale-Out File Server using iSCSI for your shared storage.
Details at http://blogs.technet.com/b/josebda/archive/2014/03/29/deploying-a-windows-server-2012-r2-scale-out-file-server-cluster-using-azure-vms.aspx

I hope that gives you a few ideas on how to get started with your testing.

↧

Migrating File Servers from Windows Server 2003 to Windows Server 2012 R2

November 5, 2014, 8:20 pm

≫ Next: Storage Spaces Survival Guide (Links to presentations, articles, blogs, tools)

≪ Previous: Testing Windows Server and the Scale-Out File Server – What should your lab look like?

Introduction

If you have SMB File Servers running Windows Server 2003, you are probably already aware that the extended support for that OS will end on July 14, 2015. You can read more about this at: http://blogs.technet.com/b/server-cloud/archive/2014/08/26/best-practices-for-windows-server-2003-end-of-support-migration.aspx.

If you're still using Windows Server 2003, you should be planning your migration to a newer version right now. The easiest path to migrate an old SMB File Server would be to use a virtual machine to replace your old box and move the data to the new VM. While it is fairly straightforward to perform the migration, you do have to be careful about it, since it does mean moving data around and requires at least a little bit of down time.

I’m glad you’re reading this, since it indicates you’re taking the steps to retire your old server before it falls out of support.

The Steps

To test this migration scenario, I configured a 32-bit Windows Server 2003 machine (which I called FILES) and a Hyper-V virtual machine running Windows Server 2012 R2. I also configured a domain controller running Windows Server 2012 R2 and joined both machines to a single domain.

In general, here are the steps I used to test and capture the details on how to perform the migration:

Configure accounts/groups in Active Directory *
Configure your WS2003 box *
Prepare for migration (initial data copy prior to final migration)
Export the share information
Changes happen after Step 3 *
Rename Windows Server 2003
Final data migration pass
Create shares at the destination
Rename the WS2012 R2
Verify you are all done

Items marked with * are needed only for simulation purposes and should not be executed in your existing environment already running a Windows Server 2003 File Server.

Find below the details for each of the steps above. Note that I tried to capture the commands I used in my environment, but you will obviously need to adjust the server names and paths as required in your specific configuration.

Step 1 – Configure accounts/groups in Active Directory

This step creates the several users and groups for the domain that we’ll use in the script.

This step should be run from an elevated PowerShell prompt on the test domain controller running Windows Server 2012 R2.

Older Windows Server versions for the DC are fine, but I cannot vouch the PowerShell below working on all older versions.

IMPORTANT: These commands should only be used for a test environment simulation. Do not run on your production environment.

$cred = get-credential
1..99 | % { New-ADUser -Name User$_ -AccountPassword $cred.password -CannotChangePassword $true -DisplayName "Test $_" -Enabled $true -SamAccountName User$_ }
1..99 | % { New-ADGroup -DisplayName "Project $_" -Name Project$_ -GroupCategory Security -GroupScope Global }
1..99 | % { $Group = $_; 1..99 | % { Get-ADGroup Project$Group | Add-ADGroupMember -Members User$_ } }

Step 2 – Configure your WS2003 box

This step creates several folders and shares, with different permissions at the share and the file system level. This simulates a production environment and helps test that files, folders, shares and permissions are being properly migrated.

This step should be run from a command prompt on the test Windows Server 2003 File Server. In the script, JOSE is the name of the domain.

IMPORTANT: These commands should only be used for a test environment simulation. Do not run on your production environment.

md C:\homefolder
for /L %%a in (1,1,99) do md C:\homefolder\user%%a
for /L %%a in (1,1,99) do NET SHARE share%%a=C:\homefolder\user%%a /GRANT:JOSE\Administrator,FULL /GRANT:JOSE\user%%a,FULL
for /L %%a in (1,1,99) do echo y | cacls C:\homefolder\user%%a /E /G JOSE\Administrator:F
for /L %%a in (1,1,99) do echo y | cacls C:\homefolder\user%%a /E /G JOSE\user%%a:F
md c:\projects
for /L %%a in (1,1,99) do md C:\projects\project%%a
for /L %%a in (1,1,99) do NET SHARE project%%a=C:\projects\project%%a /GRANT:JOSE\Administrator,FULL /GRANT:JOSE\Project%%a,FULL
for /L %%a in (1,1,99) do echo y | cacls c:\projects\project%%a /E /G JOSE\Administrator:F
for /L %%a in (1,1,99) do echo y | cacls c:\projects\project%%a /E /G JOSE\project%%a:F
for /L %%a in (1,1,99) do xcopy c:\windows\media\*.mid C:\homefolder\user%%a
for /L %%a in (1,1,99) do xcopy c:\windows\media\*.mid c:\projects\project%%a

Step 3 – Prepare for migration

This step performs an initial data copy from the Windows Server 2003 File Server to the Windows Server 2012 R2 machine prior to the final migration.

By doing this initial copy with the old file server still accessible to users, you minimize the downtime required for the final copy. If there are issues with open files or other errors during this step, that is OK. You will have a chance to grab those files later.

You should make sure to include all the folders used for all your file shares. In this example I am assuming relevant files are in the folders called c:\homefolder and c:\projects.

IMPORTANT: You must use the same drive letters and the exact same paths on your new Windows Server 2012 R2 server. If you don't, the share information won't match and your migration will not work.

IMPORTANT: This migration process only works if you only use domain accounts and domain groups for your permissions. If you are using local accounts for the file share or file system permissions, the permissions will not be migrated by ROBOCOPY.

In case you’re not familiar with ROCOBOPY, here are the details about the parameters used:

      /e – Copy subdirectories, including empty ones
     /xj – Exclude junction points
    /r:2 – 2 retries
    /w:5 – 5 second wait between retries
      /v – Verbose output for skipped files
     /it – Include tweaked files (identical size/timestamp, but different attributes)
/purge – Delete destination files/directories that no longer exist in source
/copyall – Copy data, attributes, timestamps, security (ACLs), owner, auditing info

Run this step at the Windows Server 2012 R2 server from an elevated command prompt.

md C:\homefolder
ROBOCOPY /e /xj /r:2 /w:5 /v /it /purge /copyall \\FILES\c$\homefolder c:\homefolder
md c:\projects
ROBOCOPY /e /xj /r:2 /w:5 /v /it /purge /copyall \\FILES\c$\projects c:\projects

Step 4 – Export the share information

This step exports the share information from the registry of the Windows Server 2003 machine. This will include share names, share path and share security (ACLs). There are more details on this export procedure at http://support.microsoft.com/kb/125996.

This command should be run from a command prompt on the test Windows Server 2003 File Server.

reg export HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\Shares c:\export.reg

Step 5 – Changes happen after Step 3

This step simulates changes being applied to the files after the initial copy in step 3. Since some time will pass between steps 3 and 6, we expect that users will still be making changes to their files and adding new files. This simulated step makes sure the command are able to property capture those changes.

This step should be run from a command prompt on the test Windows Server 2003 File Server.

IMPORTANT: These commands should only be used for a test environment simulation. Do not run on your production environment.

for /L %%a in (1,1,99) do xcopy c:\windows\media\r*.wav c:\homefolder\user%%a
for /L %%a in (1,1,99) do xcopy c:\windows\media\r*.wav c:\projects\project%%a

Step 6 – Rename Windows Server 2003 (***DOWNTIME STARTS HERE***)

This step asks you to rename the Windows Server 2003 computer and reboot it. This will mark the beginning of downtime for your File service.

Since Windows Server 2003 did not ship with a command-line option to perform this operation, use the GUI to manually rename the machine from FILES to XFILES. This assumes that FILES is the name of the existing file server (users access data using \\FILES\<sharename>) and XFILES is an unused name in your network. At this point, your FILES file server will become unavailable.

If you want to automate this step, download the Support Tools from http://www.microsoft.com/en-us/download/details.aspx?id=15326 and use the command below from the Windows Server 2003 machine:

NETDOM RENAMECOMPUTER /NEWNAME XFILES /REBOOT /FORCE

Step 7 – Final data migration pass

This step copies the changes in the shares (changed files, new files) after the initial copy. Since this happens after the system is renamed and rebooted, there should be no more users in the system and there should be no further problems with files being in use during the copy.

We’re using the same parameters as before and ROBOCOPY will basically copy just what changed since the initial copy. If the initial copy was not too far back, you will have fewer changes and this step will be shorter.

IMPORTANT: Since this is the last copy, you should watch for any failures and repeat the copy until there are no issues with any files you care for.

Run this from the Windows Server 2012 R2 server from an elevated command prompt.

ROBOCOPY /e /xj /r:2 /w:5 /v /it /purge /copyall \\XFILES\c$\homefolder c:\homefolder
ROBOCOPY /e /xj /r:2 /w:5 /v /it /purge /copyall \\XFILES\c$\projects c:\projects

Step 8 – Create shares at the destination

This step imports the share configuration from the Windows Server 2003 system, using the file you created on step 4.

Run this from the Windows Server 2012 R2 server from an elevated command prompt.

reg import \\XFILES\c$\export.reg

Step 9 – Rename the WS2012 R2

In this step, the Windows Server 2012 R2 system is renamed to the same name as the old Windows Server 2003 system and the migration is done.

As soon as the system is rebooted, the clients will be able to access the file shares in the new system and the downtime ends.

Run this from the Windows Server 2012 R2 server from an elevated PowerShell prompt.

Rename-Computer -NewName FILES -Restart -Force

Step 10 – Verify you are all done! (***DOWNTIME ENDS HERE***)

At this point, you migration is complete and you’re basically using these commands to verify that the shares were properly migrated. The commands also sample the permissions in some of the shares and folders to verify that portion worked as well.

Run this from the Windows Server 2012 R2 server from an elevated PowerShell prompt.

Get-SmbShare
Get-SmbShareAccess Share23
Get-SmbShareAccess Project9
Get-Acl c:\homefolder\user23 | Format-List Path, AccessToString
Get-Acl c:\projects\project9 | Format-List Path, AccessToString

Conclusion

I highly recommend that you setup a test environment before trying this on your production Windows Server 2003 File Server. You test environment should also mimic your production environment as closely as possible. This way you can learn details about the procedure and customize the scripts to the details of your environment.

Good luck on your migration!

↧

Storage Spaces Survival Guide (Links to presentations, articles, blogs, tools)

November 19, 2014, 4:30 pm

≫ Next: Using PowerShell to select Physical Disks for use with Storage Spaces

≪ Previous: Migrating File Servers from Windows Server 2003 to Windows Server 2012 R2

In this post, I'm sharing my favorite links related to Storage Spaces in Windows Server 2012 R2. This includes TechEd Presentations, TechNet articles, Blogs and tools related to Storage Spaces in general and more specifically about its deployment in a Failover Cluster or Scale-Out File Server configuration. It's obviously not a complete reference (there are always new blogs and articles being posted), but hopefully this is a useful collection of links.

TechEd Presentations

TechEd North America: Storage Spaces: What’s New in Windows Server 2012 R2
TechEd North America:Best Practices for Deploying Tiered Storage Spaces in Windows Server 2012 R2
TechEd Europe 2014: Upcoming sessions about Storage Spaces

TechNet Articles – Storage Spaces

TechNet Wiki – Storage Spaces

Microsoft Cloud Platform System (CPS) powered by Dell

Main page for the Microsoft Cloud Platform System (CPS)
Blog Post: Unveiling The Microsoft Cloud Platform System, powered by Dell
Video: Introducing the Microsoft Cloud Platform System
Data Sheet: Microsoft Cloud Platform System powered by Dell
White Paper: Microsoft’s Cloud Platform System Delivers Best Price-to-Performance
TechEd Europe 2014: Sessions about the Cloud Platform System (CPS)

TechNet Articles – Cost-Effective Storage for Hyper-V

Blogs - Storage Spaces

TechNet Download - Tools

Test-StorageHealth Script– PowerShell script for gathering info on health, capacity, performance and events
Storage Spaces Physical Disk Validation Script– PowerShell script for testing disks before deployment
Diskspd, a Robust Storage Testing Tool, Now Publically Available
SmbStorageTier.ps1: A simple way to pin files to tiers in Scale-Out File Servers

Updates required for deployment

Windows Server Catalog

Hardware Certified for Storage Spaces

Partner Articles on Storage Spaces (alphabetical order, just a sample of the many partners solutions out there)

Thanks for the suggestions in the comments section (some of them already added to the list). Keep them coming…

↧

Using PowerShell to select Physical Disks for use with Storage Spaces

February 8, 2015, 2:12 pm

≫ Next: Case Studies on Storage Spaces, Scale-Out File Servers with SMB3 or both

≪ Previous: Storage Spaces Survival Guide (Links to presentations, articles, blogs, tools)

1. Introduction

If you use PowerShell to configure Storage Spaces, you probably noticed that selecting physical disks is an important part of the process.

You must select disks to create a pool and you might also need to do it also when you create a virtual disk using a subset of the disks.

2. Select all poolable disks

The simplest way to create a pool is to select all physical disks available that can be pooled.

The Get-PhysicalDisk cmdlet has a parameter to filter by the “CanPool” property for this exact purpose:

Get-PhysicalDisk -CanPool $true

You can also do the same thing by filtering the output by the “CanPool” property, which I find even simpler:

Get-PhysicalDisk | ? CanPool

3. Creating a pool

When creating a pool, you can use this cmdlet (in parenthesis) directly in the command line:

New-StoragePool -FriendlyName Pool1 -PhysicalDisk (Get-PhysicalDisk | ? CanPool)

Note that you will need additional parameters on your “New-StoragePool” cmdlet, like providing the storage subsystem or the default provisioning type. I removed those here to keep things simple and focused on how you select the physical disks.

Some people (particularly those with a programming background) might prefer to store the list of physical disks in a variable and use that variable in the other cmdlets.

$Disks = Get-PhysicalDisk | ? CanPool
New-StoragePool -FriendlyName Pool1 -PhysicalDisk $Disks

If you’re writing a complex script, creating the variable helps break tasks into smaller chunks that are hopefully easier to understand.

4. Filtering by other properties

If you have just a simple configuration, it might be OK to put all your disks in a single pool.

However, you might want to create separate pools for different purposes and that’s when you want to make sure you can select exactly the disks you want.

For instance, you might want to put half your SSDs in a “high performance” pool by themselves and put all your HDDs plus the other half of your SSDs in a second pool where you will use tiering.

Or maybe you will put all your 4TB, 7.2k rpm disks in one pool and all your 1TB 15K rpm disks in a second pool.

Here are a few ways to filter disks, shown as commands to create a $Disks variable that you could later use to create a pool or a virtual disk.

1: All the HDDs:
$Disks = Get-PhysicalDisk | ? CanPool | ? MediaType –eq HDD

2: All disks of a certain size:
$Disks = Get-PhysicalDisk | ? CanPool | ? Size –gt 1TB

3: A specific disk model:
$Disks = Get-PhysicalDisk | ? CanPool | ? Model –like ST95*

4: Specific Serial Numbers:
$Disks = Get-PhysicalDisk | ? {"6VEK9B89;a011c94e;13110930".Contains($_.SerialNumber)}

5: A certain quantity of SSDs:
$Disks = Get-PhysicalDisk | ? CanPool | ? MediaType –eq SSD| Select –First 10

Most of the examples above show the use of simple comparisons using a property and a constant value.

I used a few tricks like using the Contains string function to match a list or using the –like operator for patter matching using * as a wildcard.

PowerShell is indeed a powerful language and there are many, many ways to filter or select from a list.

5. Complex scenarios with enclosures

I find that the most challenging scenarios for selecting disks involve the use of enclosures.

In those cases, you want to spread the disks across enclosures for additional fault tolerance when using the IsEnclosureAware property of the pool.

For instance, you might have 4 enclosures with 40 disks each and you want to create 2 pools, each with 20 disks from each enclosure.

That will require a little scripting, which I show below. You basically create a variable to keep the list of disks and add to it as you loop through the enclosures:

# To select 20 HDDs from each enclosure

$PDisks = @()

$HDDsPerE = 20

Get-StorageEnclosure | % {

$EDisks = $_ | Get-PhysicalDisk –CanPool $true | ? MediaType –eq HDD |

Sort Slot | Select –First $HDDsPerEnc

If ($EDisks.Count –ne $HDDsPerE) {

Write-Error “Could not find $HDDsPerE HDDs on the enclosure” –ErrorAction Stop

}

$PDisks += $EDisks

}

New-StoragePool -FriendlyName Pool1 -PhysicalDisk $PDisks -EnclosureAwareDefault $true

In the example above, the $PDisks variable will hold all disks to pool and the $EDisks hold the disks to pool from a specific enclosure.

We basically enumerate the enclosures and get a subset of the physical disks using the object filtering techniques we discussed previously.

Then we add the $EDisks variable to the $PDisks variable, which will accumulate disks from the various enclosures.

We finally use the $PDisks variable to actually create the pool. Note again that you will need more parameters in the cmdlet to create the pool, but that’s not our focus here.

For our final example, we’ll use 4 SSDs and 16 HDDs per enclosure. It’s somewhat similar to the example above, but it is a more realistic scenario if you are using tiering:

# To select 16 HDDs and 4 SSDs from each enclosure

$PDisks = @()

$HDDsPerE = 16

$SSDsPerE = 4

Get-StorageEnclosure | % {

$EName = $_.FriendlyName

$EDisks = $_ | Get-PhysicalDisk –CanPool $true | ? MediaType –eq HDD |

Sort Slot | Select –First $HDDsPerE

If ($EDisks.Count –ne $HDDsPerE) {

Write-Error “Could not find $HDDsPerE HDDs on enclosure $EName ” –ErrorAction Stop

}

$PDisks += $EDisks

$EDisks = $_ | Get-PhysicalDisk –CanPool $true | ? MediaType –eq SSD |

Sort Slot | Select –First $SSDsPerE

If ($EDisks.Count –ne $SSDsPerE) {

Write--Error “Could not find $SSDsPerE SSDs on enclosure $EName ” –ErrorAction Stop

}

$PDisks += $EDisks

}

New-StoragePool -FriendlyName Pool1 -PhysicalDisk $PDisks -EnclosureAwareDefault $true

Conclusion

I am sure you will need to customize things for your own environment, but I hope this blog post has put you on the right track.

It might be a good idea to write a little function for disk selection, maybe with a few parameters to fit your specific requirements.

And always remember to test your scripts carefully before deploying anything in a production environment.

↧

Case Studies on Storage Spaces, Scale-Out File Servers with SMB3 or both

March 13, 2015, 2:10 pm

≫ Next: New version of the Storage Spaces physical disk validation PowerShell script

≪ Previous: Using PowerShell to select Physical Disks for use with Storage Spaces

There are many customers out there using Storage Spaces and Scale-Out File Servers with SMB3 since their initial release in Windows Server 2012 a few years back.

Every once in a while, someone will ask me for details on how these technologies were deployed by customers. The best source for those examples is the Microsoft Case Studies site.

The list below includes case studies on how a customer deployed a solution using Storage Spaces, SMB3 file servers or both combined:

And you should also note that the recently release Cloud Platform System (CPS) is another example of a solution that uses both Storage Spaces and Scale-Out File Servers with SMB3:

If you’re focused on gathering data about the performance of Storage Spaces and Scale-Out File Servers, there are a few interesting white papers available:

For more information about Storage Spaces or SMB, you can check these blog posts:

↧

New version of the Storage Spaces physical disk validation PowerShell script

March 14, 2015, 9:33 am

≫ Next: Windows PowerShell equivalents for common networking commands (IPCONFIG, PING, NSLOOKUP)

≪ Previous: Case Studies on Storage Spaces, Scale-Out File Servers with SMB3 or both

The Storage Spaces team has just published a new version of the Storage Space disk validation script written in PowerShell.

This script makes sure that the physical disks in a system have everything that is needed by Storage Spaces. That includes checking functional requirements and performance characteristics.

In the same way that we ask you to run the Cluster Validation Wizard before creating a cluster, it is a great idea to always run this script before using a specific set of disks with Storage Spaces.

The changes in the version 2 of the script include:

Intelligent parallelized drive profiling dramatically speeds up script execution time. Script will first profile throughput and IOPS of a single drive from each group, then the group in aggregate, and use these numbers to figure out the maximum batch size possible while remaining under Throughput or CPU limitations.
Intelligent SSD preconditioning. Script will only pre-condition the SSDs to sufficient time to overwrite the address space twice.
Switched to more reliable percentile-based latency measurements, rather than Min/Avg/Max.
In addition to the existing relative performance testing and comparisons, there are absolute performance thresholds which drives must meet in order to pass.
(currently disabled) MPIO testing for supported configurations will compare Round Robin performance to Fail-over only in order to determine optimal policies for specific drives and firmware versions. Further, the script detects the number of paths to drives through the devcon utility - this can help determine if there is an uneven configuration where a set of drives only has a single path, while other drives have multiple.
Drive wear is evaluated.
Switched to the new Diskspd version 2 release which brings a number of improvements over SQLIO.
Significant improvement to the output report.

Download the script from:

https://gallery.technet.microsoft.com/scriptcenter/Storage-Spaces-Physical-7ca9f304

↧

Windows PowerShell equivalents for common networking commands (IPCONFIG, PING, NSLOOKUP)

April 18, 2015, 9:19 am

≫ Next: The Deprecation of SMB1 – You should be planning to get rid of this old SMB dialect

≪ Previous: New version of the Storage Spaces physical disk validation PowerShell script

Network troubleshooting is part any System Administrator’s life. Maybe you need to check the IP address of a machine or test if its networking connection is working. Maybe you need to see if DNS is properly configured or check the latency between two hosts.

If you have been in this field long enough, you probably have a few favorite commands that you learned years ago and use on a regular basis, like IPCONFIG or PING.

There are literally hundreds of networking-related PowerShell cmdlets in Windows these days. Just try out this command on your machine: Get-Command -Module Net* | Group Module

But more important than knowing every one of them, is to know the most useful cmdlets that have the potential to replace those old commands that you can’t live without.

And it’s when you combine the many networking cmdlets in ways that only PowerShell can do that you’ll find amazing new troubleshooting abilities…

IPCONFIG

Description: This command has many options, but the most common usage is just to show the IP address, subnet mask and default gateway for each network adapter in a machine.

PowerShell: Get-NetIPConfiguration or Get-NetIPAddress

Sample command lines:

Get-NetIPConfiguration
Get-NetIPAddress | Sort InterfaceIndex | FT InterfaceIndex, InterfaceAlias, AddressFamily, IPAddress, PrefixLength -Autosize
Get-NetIPAddress | ? AddressFamily -eq IPv4 | FT –AutoSize
Get-NetAdapter Wi-Fi | Get-NetIPAddress | FT -AutoSize

Sample output:

PS C:\> Get-NetIPConfiguration
InterfaceAlias       : Wi-Fi
InterfaceIndex       : 3
InterfaceDescription : Dell Wireless 1703 802.11b|g|n (2.4GHz)
NetProfile.Name      : HomeWifi
IPv6Address          : fded:b22c:44c4:1:88f2:9970:4082:4118
IPv4Address          : 192.168.1.2
IPv6DefaultGateway   :
IPv4DefaultGateway   : 192.168.1.1
DNSServer            : 192.168.1.1
InterfaceAlias       : Bluetooth Network Connection
InterfaceIndex       : 6
InterfaceDescription : Bluetooth Device (Personal Area Network)
NetAdapter.Status    : Disconnected
InterfaceAlias       : Ethernet
InterfaceIndex       : 4
InterfaceDescription : Realtek PCIe GBE Family Controller
NetAdapter.Status    : Disconnected
PS C:\> Get-NetIPAddress | Sort InterfaceIndex | FT InterfaceIndex, InterfaceAlias, AddressFamily, IPAddress, PrefixLength –Autosize
InterfaceIndex InterfaceAlias                                AddressFamily IPAddress                            PrefixLength
-------------- --------------                                ------------- ---------                            -------
             1 Loopback Pseudo-Interface 1                            IPv6 ::1                                      128
             1 Loopback Pseudo-Interface 1                            IPv4 127.0.0.1                                  8
             3 Wi-Fi                                                  IPv6 fe80::88f2:9970:4082:4118%3               64
             3 Wi-Fi                                                  IPv6 fded:b22c:44c4:1:f188:1e45:58e3:9242     128
             3 Wi-Fi                                                  IPv6 fded:b22c:44c4:1:88f2:9970:4082:4118      64
             3 Wi-Fi                                                  IPv4 192.168.1.2                               24
             4 Ethernet                                               IPv6 fe80::ce6:97c9:ae58:b393%4                64
             4 Ethernet                                               IPv4 169.254.179.147                           16
             6 Bluetooth Network Connection                           IPv6 fe80::2884:6750:b46b:cec4%6               64
             6 Bluetooth Network Connection                           IPv4 169.254.206.196                           16
             7 Local Area Connection* 3                               IPv6 fe80::f11f:1051:2f3d:882%7                64
             7 Local Area Connection* 3                               IPv4 169.254.8.130                             16
             8 Teredo Tunneling Pseudo-Interface                      IPv6 2001:0:5ef5:79fd:1091:f90:e7e9:62f0       64
             8 Teredo Tunneling Pseudo-Interface                      IPv6 fe80::1091:f90:e7e9:62f0%8                64
             9 isatap.{024820F0-C990-475F-890B-B42EA24003F1}          IPv6 fe80::5efe:192.168.1.2%9                 128
PS C:\> Get-NetIPAddress | ? AddressFamily -eq IPv4 | FT –AutoSize
ifIndex IPAddress       PrefixLength PrefixOrigin SuffixOrigin AddressState PolicyStore
------- ---------       ------------ ------------ ------------ ------------ -----------
7       169.254.8.130             16 WellKnown    Link         Tentative    ActiveStore
6       169.254.206.196           16 WellKnown    Link         Tentative    ActiveStore
3       192.168.1.2               24 Dhcp         Dhcp         Preferred    ActiveStore
1       127.0.0.1                  8 WellKnown    WellKnown    Preferred    ActiveStore
4       169.254.179.147           16 WellKnown    Link         Tentative    ActiveStore

PS C:\> Get-NetAdapter Wi-Fi | Get-NetIPAddress | FT -AutoSize
ifIndex IPAddress                            PrefixLength PrefixOrigin        SuffixOrigin AddressState PolicyStore
------- ---------                            ------------ ------------        ------------ ------------ -----------
3       fe80::88f2:9970:4082:4118%3                    64 WellKnown           Link         Preferred    ActiveStore
3       fded:b22c:44c4:1:f188:1e45:58e3:9242          128 RouterAdvertisement Random       Preferred    ActiveStore
3       fded:b22c:44c4:1:88f2:9970:4082:4118           64 RouterAdvertisement Link         Preferred    ActiveStore
3       192.168.1.2                                    24 Dhcp                Dhcp         Preferred    ActiveStore

PING

Description: Checks connectivity to a specific host. Commonly used to check for liveliness, but also used to measure network latency.

PowerShell: Test-NetConnection

Sample command lines:

Test-NetConnection www.microsoft.com
Test-NetConnection -ComputerName www.microsoft.com -InformationLevel Detailed
Test-NetConnection -ComputerName www.microsoft.com | Select -ExpandProperty PingReplyDetails | FT Address, Status, RoundTripTime
1..10 | % { Test-NetConnection -ComputerName www.microsoft.com -RemotePort 80 } | FT -AutoSize

Sample output

PS C:\> Test-NetConnection www.microsoft.com
ComputerName           : www.microsoft.com
RemoteAddress          : 104.66.197.237
InterfaceAlias         : Wi-Fi
SourceAddress          : 192.168.1.2
PingSucceeded          : True
PingReplyDetails (RTT) : 22 ms
PS C:\> Test-NetConnection -ComputerName www.microsoft.com -InformationLevel Detailed
ComputerName             : www.microsoft.com
RemoteAddress            : 104.66.197.237
AllNameResolutionResults : 104.66.197.237
                           2600:1409:a:396::2768
                           2600:1409:a:39b::2768
InterfaceAlias           : Wi-Fi
SourceAddress            : 192.168.1.2
NetRoute (NextHop)       : 192.168.1.1
PingSucceeded            : True
PingReplyDetails (RTT)   : 14 ms
PS C:\> Test-NetConnection -ComputerName www.microsoft.com | Select -ExpandProperty PingReplyDetails | FT Address, Status, RoundTripTime -Autosize
Address         Status RoundtripTime
-------         ------ -------------
104.66.197.237 Success            22
PS C:\> 1..10 | % { Test-NetConnection -ComputerName www.microsoft.com -RemotePort 80 } | FT -AutoSize
ComputerName      RemotePort RemoteAddress PingSucceeded PingReplyDetails (RTT) TcpTestSucceeded
------------      ---------- ------------- ------------- ---------------------- ----------------
www.microsoft.com 80         104.66.197.237 True          17 ms                  True
www.microsoft.com 80         104.66.197.237 True          16 ms                  True
www.microsoft.com 80         104.66.197.237 True          15 ms                  True
www.microsoft.com 80         104.66.197.237 True          18 ms                  True
www.microsoft.com 80         104.66.197.237 True          20 ms                  True
www.microsoft.com 80         104.66.197.237 True          20 ms                  True
www.microsoft.com 80         104.66.197.237 True          20 ms                  True
www.microsoft.com 80         104.66.197.237 True          20 ms                  True
www.microsoft.com 80         104.66.197.237 True          15 ms                  True
www.microsoft.com 80         104.66.197.237 True          13 ms                  True

NSLOOKUP

Description: Name server lookup. Mostly used to find the IP address for a given DNS name (or vice-versa). Has many, many options.

PowerShell: Resolve-DnsName

Sample command lines:

Resolve-DnsName www.microsoft.com
Resolve-DnsName microsoft.com -type SOA
Resolve-DnsName microsoft.com -Server 8.8.8.8 –Type A

Sample output

PS C:\> Resolve-DnsName www.microsoft.com
Name                           Type   TTL   Section    NameHost
----                           ----   ---   -------    --------
www.microsoft.com              CNAME 6     Answer     toggle.www.ms.akadns.net
toggle.www.ms.akadns.net       CNAME 6     Answer     www.microsoft.com-c.edgekey.net
www.microsoft.com-c.edgekey.ne CNAME 6     Answer     www.microsoft.com-c.edgekey.net.globalredir.akadns.net
t
www.microsoft.com-c.edgekey.ne CNAME 6     Answer     e10088.dspb.akamaiedge.net
t.globalredir.akadns.net

Name       : e10088.dspb.akamaiedge.net
QueryType : AAAA
TTL        : 6
Section    : Answer
IP6Address : 2600:1409:a:39b::2768
Name       : e10088.dspb.akamaiedge.net
QueryType : AAAA
TTL        : 6
Section    : Answer
IP6Address : 2600:1409:a:396::2768
Name       : e10088.dspb.akamaiedge.net
QueryType : A
TTL        : 6
Section    : Answer
IP4Address : 104.66.197.237
PS C:\> Resolve-DnsName microsoft.com -type SOA
Name                        Type TTL   Section    PrimaryServer               NameAdministrator           SerialNumber
----                        ---- ---   -------    -------------               -----------------           ------------
microsoft.com               SOA 2976 Answer     ns1.msft.net                msnhst.microsoft.com        2015041801
PS C:\> Resolve-DnsName microsoft.com -Server 8.8.8.8 –Type A
Name                                           Type   TTL   Section    IPAddress
----                                           ----   ---   -------    ---------
microsoft.com                                  A      1244 Answer     134.170.188.221
microsoft.com                                  A      1244 Answer     134.170.185.46

ROUTE

Description: Shows the IP routes in a given system (also used to add and delete routes)

PowerShell: Get-NetRoute (also New-NetRoute and Remove-NetRoute)

Sample command lines:

Get-NetRoute -Protocol Local -DestinationPrefix 192.168*
Get-NetAdapter Wi-Fi | Get-NetRoute

Sample output:

PS C:\WINDOWS\system32> Get-NetRoute -Protocol Local -DestinationPrefix 192.168*
ifIndex DestinationPrefix NextHop RouteMetric PolicyStore
------- ----------------- ------- ----------- -----------
2       192.168.1.255/32 0.0.0.0         256 ActiveStore
2       192.168.1.5/32    0.0.0.0         256 ActiveStore
2       192.168.1.0/24    0.0.0.0         256 ActiveStore
PS C:\WINDOWS\system32> Get-NetAdapter Wi-Fi | Get-NetRoute
ifIndex DestinationPrefix                        NextHop     RouteMetric PolicyStore
------- -----------------                        -------     ----------- -----------
2       255.255.255.255/32                       0.0.0.0             256 ActiveStore
2       224.0.0.0/4                              0.0.0.0             256 ActiveStore
2       192.168.1.255/32                         0.0.0.0             256 ActiveStore
2       192.168.1.5/32                           0.0.0.0             256 ActiveStore
2       192.168.1.0/24                           0.0.0.0             256 ActiveStore
2       0.0.0.0/0                                192.168.1.1           0 ActiveStore
2       ff00::/8                                 ::                  256 ActiveStore
2       fe80::d1b9:9258:1fa:33e9/128             ::                  256 ActiveStore
2       fe80::/64                                ::                  256 ActiveStore
2       fded:b22c:44c4:1:d1b9:9258:1fa:33e9/128 ::                  256 ActiveStore
2       fded:b22c:44c4:1:c025:aa72:9331:442/128 ::                  256 ActiveStore
2       fded:b22c:44c4:1::/64                    ::                  256 ActiveStore

TRACERT

Description: Trace route. Shows the IP route to a host, including all the hops between your computer and that host.

PowerShell: Test-NetConnection –TraceRoute

Sample command lines:

Test-NetConnection www.microsoft.com –TraceRoute
Test-NetConnection outlook.com -TraceRoute | Select -ExpandProperty TraceRoute | % { Resolve-DnsName $_ -type PTR -ErrorAction SilentlyContinue }

Sample output:

PS C:\> Test-NetConnection www.microsoft.com–TraceRoute
ComputerName           : www.microsoft.com
RemoteAddress          : 104.66.197.237
InterfaceAlias         : Wi-Fi
SourceAddress          : 192.168.1.2
PingSucceeded          : True
PingReplyDetails (RTT) : 16 ms
TraceRoute             : 192.168.1.1
                         10.0.0.1
                         TimedOut
                         68.86.113.181
                         69.139.164.2
                         68.85.240.94
                         68.86.93.165
                         68.86.83.126
                         104.66.197.237
PS C:\> Test-NetConnection outlook.com -TraceRoute | Select -ExpandProperty TraceRoute | % { Resolve-DnsName $_ -type PTR -ErrorAction SilentlyContinue }
Name                           Type   TTL   Section    NameHost
----                           ----   ---   -------    --------
125.144.85.68.in-addr.arpa     PTR    7200 Answer     te-0-1-0-10-sur02.bellevue.wa.seattle.comcast.net
142.96.86.68.in-addr.arpa      PTR    4164 Answer     be-1-sur03.bellevue.wa.seattle.comcast.net
6.164.139.69.in-addr.arpa      PTR    2469 Answer     be-40-ar01.seattle.wa.seattle.comcast.net
165.93.86.68.in-addr.arpa      PTR    4505 Answer     be-33650-cr02.seattle.wa.ibone.comcast.net
178.56.167.173.in-addr.arpa    PTR    7200 Answer     as8075-1-c.seattle.wa.ibone.comcast.net
248.82.234.191.in-addr.arpa    PTR    3600 Answer     ae11-0.co2-96c-1a.ntwk.msn.net

NETSTAT

Description: Shows current TCP/IP network connections.

PowerShell: Get-NetTCPConnection

Sample command lines:

Get-NetTCPConnection | Group State, RemotePort | Sort Count | FT Count, Name –Autosize
Get-NetTCPConnection | ? State -eq Established | FT –Autosize
Get-NetTCPConnection | ? State -eq Established | ? RemoteAddress -notlike 127* | % { $_; Resolve-DnsName $_.RemoteAddress -type PTR -ErrorAction SilentlyContinue }

Sample output:

PS C:\> Get-NetTCPConnection | Group State, RemotePort | Sort Count | FT Count, Name -Autosize
Count Name
----- ----
    1 SynSent, 9100
    1 Established, 40028
    1 Established, 65001
    1 Established, 27015
    1 Established, 5223
    1 Established, 49227
    1 Established, 49157
    1 Established, 49156
    1 Established, 12350
    1 Established, 49200
    2 Established, 5354
    2 TimeWait, 5357
    2 Established, 80
    3 Established, 443
   36 Listen, 0
PS C:\> Get-NetTCPConnection | ? State -eq Established | FT -Autosize
LocalAddress LocalPort RemoteAddress   RemotePort State       AppliedSetting
------------ --------- -------------   ---------- -----       --------------
127.0.0.1    65001     127.0.0.1       49200      Established Internet
192.168.1.2 59619     91.190.218.57   12350      Established Internet
192.168.1.2 57993     213.199.179.175 40028      Established Internet
192.168.1.2 54334     17.158.28.49    443        Established Internet
192.168.1.2 54320     96.17.8.170     80         Established Internet
192.168.1.2 54319     23.3.105.144    80         Established Internet
192.168.1.2 54147     65.55.68.119    443        Established Internet
192.168.1.2 49257     17.143.162.214 5223       Established Internet
127.0.0.1    49227     127.0.0.1       27015      Established Internet
127.0.0.1    49200     127.0.0.1       65001      Established Internet
192.168.1.2 49197     157.56.98.92    443        Established Internet
127.0.0.1    49157     127.0.0.1       5354       Established Internet
127.0.0.1    49156     127.0.0.1       5354       Established Internet
127.0.0.1    27015     127.0.0.1       49227      Established Internet
127.0.0.1    5354      127.0.0.1       49157      Established Internet
127.0.0.1    5354      127.0.0.1       49156      Established Internet
PS C:\> Get-NetTCPConnection | ? State -eq Established | ? RemoteAddress -notlike 127* | % { $_; Resolve-DnsName $_.RemoteAddress -type PTR -ErrorAction SilentlyContinue }
LocalAddress                        LocalPort RemoteAddress                       RemotePort State       AppliedSetting
------------                        --------- -------------                       ---------- -----       --------------
192.168.1.2                         59619     91.190.218.57                       12350      Established Internet
192.168.1.2                         57993     213.199.179.175                     40028      Established Internet
192.168.1.2                         54334     17.158.28.49                        443        Established Internet
192.168.1.2                         54320     96.17.8.170                         80         Established Internet
Name      : 170.8.17.96.in-addr.arpa
QueryType : PTR
TTL       : 86377
Section   : Answer
NameHost : a96-17-8-170.deploy.akamaitechnologies.com
192.168.1.2                         54319     23.3.105.144                        80         Established Internet
Name      : 144.105.3.23.in-addr.arpa
QueryType : PTR
TTL       : 7
Section   : Answer
NameHost : a23-3-105-144.deploy.static.akamaitechnologies.com
192.168.1.2                         54147     65.55.68.119                        443        Established Internet
Name      : 119.68.55.65.in-addr.arpa
QueryType : PTR
TTL       : 850
Section   : Answer
NameHost : snt404-m.hotmail.com
192.168.1.2                         49257     17.143.162.214                      5223       Established Internet
192.168.1.2                         49197     157.56.98.92                        443        Established Internet
Name      : 92.98.56.157.in-addr.arpa
QueryType : PTR
TTL       : 3600
Section   : Answer
NameHost : bn1wns1011516.wns.windows.com

Note: Including a PDF version of the output in case you can't see it too well on the web with the lines wrapping and all. See below.

↧

The Deprecation of SMB1 – You should be planning to get rid of this old SMB dialect

April 21, 2015, 11:28 am

≫ Next: Storage Sessions at Microsoft Ignite. Make sure to update your schedule!

≪ Previous: Windows PowerShell equivalents for common networking commands (IPCONFIG, PING, NSLOOKUP)

I regularly get a question about when will SMB1 be completely removed from Windows. This blog post summarizes the current state of this old SMB dialect in Windows client and server.

1) SMB1 is deprecated, but not yet removed

We already added SMB1 to the Windows Server 2012 R2 deprecation list in June 2013. That does not mean it’s fully removed, but that the feature is “planned for potential removal in subsequent releases”. You can find the Windows Server 2012 R2 deprecation list at https://technet.microsoft.com/en-us/library/dn303411.aspx.

2) Windows Server 2003 is going away

The last supported Windows operating system that can only negotiate SMB1 is Windows Server 2003. All other currently supported Windows operating systems (client and server) are able to negotiate SMB2 or higher. Windows Server 2003 support will end on July 14 of this year, as you probably heard.

3) SMB versions in current releases of Windows and Windows Server

Aside from Windows Server 2003, all other versions of Windows (client and server) support newer versions of SMB:

Windows Server 2008 or Windows Vista – SMB1 or SMB2
Windows Server 2008 R2 or Windows 7 – SMB1 or SMB2
Windows Server 2012 and Windows 8 – SMB1, SMB2 or SMB3
Windows Server 2012 R2 and Windows 8.1 – SMB1, SMB2 or SMB3

For details on specific dialects and how they are negotiated, see this blog post on SMB dialects and Windows versions.

4) SMB1 removal in Windows Server 2012 R2 and Windows 8.1

In Windows Server 2012 R2 and Windows 8.1, we made SMB1 an optional component that can be completely removed. That optional component is enabled by default, but a system administrator now has the option to completely disable it. For more details, see this blog post on how to completely remove SMB1 in Windows Server 2012 R2.

5) SMB1 removal in Windows 10 Technical Preview and Windows Server Technical Preview

SMB1 will continue to be an optional component enabled by default with Windows 10, which is scheduled to be released in 2015. The next version of Windows Server, which is expected in 2016, will also likely continue to have SMB as an optional component enabled by default. In that release we will add an option to audit SMB1 usage, so IT Administrators can assess if they can disable SMB1 on their own.

6) What you should be doing about SMB1

If you are a systems administrator and you manage IT infrastructure that relies on SMB1, you should prepare to remove SMB1. Once Windows Server 2003 is gone, the main concern will be third party software or hardware like printers, scanners, NAS devices and WAN accelerators. You should make sure that any new software and hardware that requires the SMB protocol is able to negotiate newer versions (at least SMB2, preferably SMB3). For existing devices and software that only support SMB1, you should contact the manufacturer for updates to support the newer dialects.

If you are a software or hardware manufacturer that has a dependency on the SMB1 protocol, you should have a clear plan for removing any such dependencies. Your hardware or software should be ready to operate in an environment where Windows clients and servers only support SMB2 or SMB3. While it’s true that today SMB1 still works in most environments, the fact that the feature is deprecated is a warning that it could go away at any time.

7) Complete removal of SMB1

Since SMB1 is a deprecated component, we will assess for its complete removal with every new release.

↧

Storage Sessions at Microsoft Ignite. Make sure to update your schedule!

April 26, 2015, 7:25 pm

≫ Next: SMB3 Networking Links for Windows Server 2012 R2

≪ Previous: The Deprecation of SMB1 – You should be planning to get rid of this old SMB dialect

Hi!

If you’re planning to attend Ignite, here is a list of sessions related to Storage at the event:

Code	Session title	Presenters
BRK3496	Deploying Private Cloud Storage with Dell Servers and Windows Server vNext	Claus Joergensen, Shai Ofek, Syama Poluri
BRK3474	Enabling New On-premises Scale-Out File Server with Direct-Attached Storage	Claus Joergensen, Michael Gray
BRK3489	Exploring Storage Replica in Windows Server vNext	Ned Pyle
BRK3504	Hyper-V Storage Performance with Storage Quality of Service	Jose Barreto, Senthil Rajaram
BRK3498	Managing Storage with System Center Virtual Machine Manager: A Deep Dive	Hector Linares
BRK2458	Overview of Microsoft Azure Storage and Key Usage Scenarios	Vamshidhar Kommineni
BRK2472	Overview of the Microsoft Cloud Platform System	Vijay Tewari, Wassim Fayed
BRK2485	Platform Vision & Strategy (4 of 7): Storage Overview	Jose Barreto, Siddhartha Roy
BRK3463	Spaces-Based, Software-Defined Storage: Design and Configuration Best Practices	Allen Stewart, Jason Gerend, Joshua Adams
BRK2494	StorSimple: Extending Your Datacenter into Microsoft Azure with Hybrid Cloud Storage	Badri Venkatachari, Meghan Liese
BRK3487	Stretching Failover Clusters and Using Storage Replica in Windows Server vNext	Elden Christensen, Ned Pyle
BRK2473	System Center Virtual Machine Manager: Technical Overview and Roadmap	Eric Winner, Jonobie Ford
BRK2469	The Power of the Windows Server Software Defined datacenter in action	Phillip Moss
BRK3484	Upgrading your private cloud to Windows Server 2012 R2 and beyond!	Ben Armstrong, Rob Hindman

There are other sessions on Storage and other sessions on Windows Server, but this list tries to cover the intersection of the two, plus some bonus Azure and Hybrid sessions. My apologies in advance if I missed anything and feel free to add comments about other sessions that you’re planning to attend at Ignite.

With hundreds of sessions in the event, I suggest that you get started building your schedule…