Friday, 23 March 2012

Xperf Rocks Part 2: Analyzing Storage Performance Data & Generating Graphs



Introduction

Xperf is one of Microsoft’s most useful diagnostic tools. It collects event trace information from components of the operating system and then displays the data in a graphical format. Its graph options let you quickly pinpoint storage bottlenecks, and the same data is available in detailed tabular form. This article continues where Xperf Rocks Part 1: Troubleshooting Storage Performance Problems left off. It discusses how to use Xperf to analyze event data and generate graphs and tables.



Xperf Viewer
In the previous Xperf article (part 1), we learned how Xperf is installed and used to collect event trace logs (ETLs). These trace logs contain data that characterizes the problem. The ETL data can be analyzed on the problem system, or it can be copied to another workstation where the Windows Performance Toolkit (WPT) is installed.
The following Xperf command will analyze the data and then use Xperfview.exe to generate graphs and tables:
Xperf tracedata.etl
As the Xperf tool parses the ETL log file, it makes two passes over the data. Once the data is analyzed, Xperf displays a viewer that you can use to study the various graphs. When you are investigating a storage bottleneck, focus on the Disk Utilization and the Disk Utilization by Process graphs. In figure 1 below, you can see that the Disk I/O graph displays the I/O counts for read and write operations. Xperf also shows an expandable frame on the left-hand side that lets you quickly switch between the different graphs.
Figure 1 (above): The Xperf Disk I/O graph illustrates Read and Write counts
Each graph reveals more detailed information as you hover the mouse cursor over a particular spot on the graph. In figure 2 below, you can see how the Disk Utilization by Process graph identifies the scan32.exe process when the mouse is positioned over the red line.
Figure 2 (above): The Xperf Disk Utilization by Process graph shows details when you hover the mouse cursor over the graph
You can quickly zoom in on a particular area of the graph to get more granularity on the performance counters or events. To do so, simply left-click and drag over the portion of the graph you are interested in, and then right-click to select “Zoom To Selection” as seen in figure 3 below. This will create a new graph that is focused on the region of time you selected. To unzoom the graph, just right-click anywhere on the graph and choose Unzoom.
Figure 3 (above): Xperf graph Zoom To Selection to refine time range
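Conceptually, Zoom To Selection restricts the view to events whose timestamps fall inside the selected time range. The following Python sketch illustrates that idea only; it is not how Xperf is implemented. The event list, the function names, and the time values are all hypothetical and do not come from a real trace.

```python
from collections import Counter

# Hypothetical disk I/O events as (timestamp in seconds, I/O type).
# These values are made up for illustration; they are not real trace data.
events = [
    (0.5, "Read"), (1.2, "Write"), (2.1, "Read"),
    (2.4, "Read"), (2.9, "Write"), (4.0, "Read"),
]

def zoom_to_selection(events, start_s, end_s):
    """Keep only the events whose timestamp lies in [start_s, end_s]."""
    return [event for event in events if start_s <= event[0] <= end_s]

def count_by_type(events):
    """Count how many events of each I/O type are present."""
    return Counter(io_type for _, io_type in events)

# "Zoom" to the window between 2.0 s and 3.0 s, then count per I/O type.
selected = zoom_to_selection(events, 2.0, 3.0)
counts = count_by_type(selected)
print(dict(counts))
```

Narrowing the window before counting is what makes a short spike stand out: activity outside the selected range no longer dilutes the numbers.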



As you can see below in figure 4, the newly refreshed graph shows just the time period you are interested in. You can use the pull-down graph legend in the upper right-hand corner to select just the items you want to graph. In the example below, you can see a variety of file I/O types, such as Create, Close, and QueryInfo, that contribute to the spike of activity on disk 0.
Figure 4 (above): Xperf File I/O graph displays types of file operations
Finally, you can drill into the details even further by right-clicking the graph and choosing Summary Table. Xperf generates a table that displays each of the related performance counters in a tabular format, so you can quickly see the minimum, maximum and specific values that help you isolate the storage issue. As you can see in figure 5 below, scan32.exe (the antivirus scanner) is clearly dominating the storage subsystem in terms of read I/Os:
Figure 5 (above): Xperf summary table view provides further details on performance metrics
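The kind of aggregation a summary table performs can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Xperf's actual implementation or export format: the row layout (process, I/O type, size in bytes), the function name, and every value below are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-I/O rows as (process, I/O type, size in bytes).
# The layout and values are assumptions made for this example.
rows = [
    ("scan32.exe", "Read", 65536),
    ("scan32.exe", "Read", 4096),
    ("scan32.exe", "Read", 32768),
    ("svchost.exe", "Read", 4096),
    ("svchost.exe", "Write", 8192),
    ("explorer.exe", "Write", 4096),
]

def summarize_reads(rows):
    """Group read I/Os by process and report count, min, max and total size."""
    sizes = defaultdict(list)
    for process, io_type, size in rows:
        if io_type == "Read":
            sizes[process].append(size)
    return {
        process: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "total": sum(values),
        }
        for process, values in sizes.items()
    }

summary = summarize_reads(rows)

# The process with the most read I/Os is the first candidate to investigate.
top_reader = max(summary, key=lambda process: summary[process]["count"])
print(top_reader, summary[top_reader])
```

Sorting or ranking by read count, as in the last step, mirrors how the summary table makes a dominant process such as scan32.exe easy to spot.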

Summary

As you can see, Xperf quickly revealed that the antivirus process scan32.exe was issuing a very large number of Read requests, which led to the spike in activity on disk 0. While the example was contrived, it shows how easy it is to use Xperf to identify potential storage bottlenecks. Watch for future articles on using the latest Microsoft tools for resolving Windows storage issues.
