EMC VMAX 40K: Mainframe Performance Accelerator

White Paper
Abstract
This document describes the Mainframe
Performance Accelerator feature of the VMAX 40K
for FICON environments.
October 2014
Copyright © 2014 EMC Corporation. All Rights
Reserved.
EMC believes the information in this publication is
accurate as of its publication date. The information
is subject to change without notice.
The information in this publication is provided “as
is.” EMC Corporation makes no representations or
warranties of any kind with respect to the
information in this publication, and specifically
disclaims implied warranties of merchantability or
fitness for a particular purpose.
Use, copying, and distribution of any EMC software
described in this publication requires an applicable
software license.
For the most up-to-date listing of EMC product
names, see EMC Corporation Trademarks on
EMC.com.
Introduction
EMC is a leading global supplier of mainframe-attached storage, with a twenty-year history of
supporting the world’s most demanding financial services, insurance, and government
mainframe sites. Over this period of time EMC has continually invested significant R&D
resources into mainframe storage, recently surpassing $1bn in cumulative investment at its
dedicated mainframe labs in Hopkinton, Massachusetts. The heritage of EMC Symmetrix, which
carries into the VMAX product, is that of a mainframe storage array, built, like the mainframe
itself, for high availability, high performance environments.
Users of the IBM System z platform demand high I/O throughput rates and low response times
for the mission critical applications that support their businesses. EMC has a long history of
delivering value to these environments in terms of reliability, availability, and performance, and
this commitment continues with the announcement of the Mainframe Performance Accelerator
for the VMAX 40K.
Overview
The Mainframe Performance Accelerator (MPA) is a no-charge Enginuity upgrade that can be
applied non-disruptively to any VMAX 40K containing FICON interfaces. It improves the
maximum IOPS by up to 60%, and also reduces I/O response times by up to 30%. These
performance improvements are realized when using a single port or both ports on the VMAX
two-port FICON adapter.
As a result of implementing the MPA, VMAX 40K FICON configurations can support more I/O on a
given engine configuration, or require fewer FICON engines than would otherwise have been
required for a given workload. This can result in a smaller footprint for the VMAX configuration,
reducing the total cost of ownership and providing improved performance. Additional response
time improvements will be realized during I/O bursts which previously could have saturated the
FICON adapters and resulted in queuing delays. This reduction in queuing can improve both
response time and batch run times, and these improvements will be realized on both standard
and zHPF FICON channels.
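As a rough illustration of the sizing argument above, the following Python sketch estimates how many engines a given peak workload would need with and without the accelerator. The workload and per-engine IOPS figures are hypothetical placeholders, not measured VMAX 40K values, and the 60% gain is the "up to" maximum quoted in this paper, so the result is a best case rather than a sizing recommendation.

```python
import math


def engines_required(peak_iops: float, iops_per_engine: float,
                     mpa_gain: float = 0.0) -> int:
    """Estimate how many engines are needed to carry peak_iops.

    mpa_gain is the fractional IOPS uplift per engine (0.6 means 60%).
    All inputs are illustrative assumptions, not EMC sizing data.
    """
    if peak_iops < 0 or iops_per_engine <= 0 or mpa_gain < 0:
        raise ValueError("inputs must be non-negative and capacity positive")
    effective_capacity = iops_per_engine * (1.0 + mpa_gain)
    # At least one engine is always required, even for an idle workload.
    return max(1, math.ceil(peak_iops / effective_capacity))


if __name__ == "__main__":
    # Hypothetical workload and per-engine capacity, chosen only for illustration.
    peak = 400_000
    per_engine = 100_000
    print("without MPA:", engines_required(peak, per_engine))
    print("with MPA (best case):", engines_required(peak, per_engine, mpa_gain=0.6))
```

Because the quoted improvement is an upper bound, an actual configuration should still be validated with modeling based on user-supplied data.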
Architecture
MPA is essentially a software change that reallocates one core on each of the Intel Westmere
6-core processors used in a VMAX 40K engine from the ‘back end’ disk adapter (DA) thread to the
‘front end’ FICON adapter’s 2107 control unit emulation thread.
A VMAX engine is comprised of two director boards in a fault tolerant design. Figure 1 below
depicts this change using a representation of a single VMAX director board before and after the
MPA is applied. MPA only reallocates a core on a processor that is supporting FICON emulation.
It is worth noting that the FICON emulation is actually comprised of two software components:
the link emulation, which supports the FICON link protocol itself, and the 2107 control unit
emulation, which implements the actual IBM 2107 control unit specification. The reallocated
processor core is assigned to the 2107 control unit emulation.
Note that in the left image there are four processor cores (shown here in green) supporting the
DA emulation and one processor core (shown in yellow) supporting the 2107 emulation per
processor. In the image on the right a core has been reallocated leaving three cores supporting
the DA emulation, and two cores supporting the 2107 emulation.
Figure 1: Mainframe Performance Accelerator Core reallocation
The MPA will be delivered via maintenance to Enginuity level 5876.268 and is physically
implemented via a configuration option to enable simultaneous multi-processing (SMP) within
both the FICON link and 2107 emulations.
Additionally, MPA enables exploitation of the simultaneous multi-threading (SMT2) hardware
feature of the Intel processor. SMT2 allows a processor core to execute two independent units
of work simultaneously on a single physical core by activating a second hardware thread, called
a hyper-thread.
Enabling MPA has the following effect within Enginuity:

• The FICON link emulation has increased I/O processing capacity on a single physical
  core by exploiting SMT2.
• The 2107 emulation will employ a new multi-threading software architecture to process
  I/O requests on a second physical core.
• Four instances of the disk adapter emulation will now operate on three physical cores,
  with one of the cores hyper-threaded by SMT2.
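The per-processor core bookkeeping described above can be checked with a small Python model. This is only a sketch of one reading of the text: the `Emulation` class and its field names are hypothetical, the assignment of the sixth core to the FICON link emulation before MPA is inferred from the 6-core processor and the four DA cores plus one 2107 core stated earlier, and SMT2 is assumed not to be in use before MPA is enabled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Emulation:
    """Cores assigned to one emulation on a single 6-core processor."""
    name: str
    physical_cores: int
    smt2_cores: int = 0  # physical cores that also run a second hardware thread

    @property
    def hardware_threads(self) -> int:
        # Each SMT2-enabled core contributes one extra hardware thread.
        return self.physical_cores + self.smt2_cores


# Layout before MPA (link core and lack of SMT2 are inferred, see lead-in).
BEFORE = (
    Emulation("DA", 4),
    Emulation("2107", 1),
    Emulation("FICON link", 1),
)

# Layout after MPA, following the three effects listed above.
AFTER = (
    Emulation("DA", 3, smt2_cores=1),
    Emulation("2107", 2),
    Emulation("FICON link", 1, smt2_cores=1),
)


def total_cores(layout) -> int:
    """Sum the physical cores used by every emulation in the layout."""
    return sum(e.physical_cores for e in layout)


if __name__ == "__main__":
    for label, layout in (("before", BEFORE), ("after", AFTER)):
        print(label, "physical cores:", total_cores(layout))
        for e in layout:
            print(f"  {e.name}: {e.physical_cores} cores, "
                  f"{e.hardware_threads} hardware threads")
```

In this model the DA emulation keeps four hardware threads for its four instances even though it gives up one physical core, which matches the description in the list above.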
The resulting additional I/O processing capacity now possible through the FICON emulation is
illustrated by the red line from the ports to the link emulation and on to the 2107 emulation
depicting additional work running concurrently. The distribution of work between the two
threads within the 2107 emulation is not done on a port boundary, but rather on a logical
control unit (LCU) boundary. One process handles I/O requests bound for even numbered LCUs
and the other process handles I/O requests bound for odd numbered LCUs.
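The even/odd LCU split can be sketched in a few lines of Python. This illustrates the distribution rule only and is not Enginuity code: the function names, the queue structure, and the choice of worker 0 for even LCUs are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple

Request = Tuple[int, str]  # (LCU number, opaque request description)


def worker_for_lcu(lcu: int) -> int:
    """Return 0 for even-numbered LCUs and 1 for odd-numbered LCUs."""
    if lcu < 0:
        raise ValueError("LCU number must be non-negative")
    return lcu % 2


def distribute(requests: Iterable[Request]) -> Dict[int, List[Request]]:
    """Place each request on the queue of the worker that owns its LCU."""
    queues: Dict[int, List[Request]] = {0: [], 1: []}
    for lcu, payload in requests:
        queues[worker_for_lcu(lcu)].append((lcu, payload))
    return queues


if __name__ == "__main__":
    # Hypothetical requests; the port a request arrived on plays no part.
    sample = [(0x00, "read"), (0x01, "write"), (0x02, "read"), (0x03, "read")]
    for worker, queue in distribute(sample).items():
        print(worker, queue)
```

One consequence of splitting on LCU parity is that a workload concentrated on only even or only odd LCUs would land on a single worker, so spreading active volumes across both kinds of LCU should help use the added capacity.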
Enabling the MPA
Once the maintenance containing the MPA has been applied, MPA is enabled via a new IMPL bin
setting in the CKD options entitled ‘FICON SMP mode’. It has two options: ‘DISABLED’ (the
default) and ‘ENABLE SMP + take 1 DA core’. This is depicted in figure 2 below.
Figure 2: Configuration option to enable Mainframe Performance Accelerator
In addition, within Mainframe Enabler version 7.6, the SRDF Host Component has been
enhanced to show the enabled/disabled state of the mainframe performance accelerator in the
SQ CNFG command output:
EMCGM11I SRDF-HC DISPLAY FOR (5) &SQ CNFG,3C00 816
SERIAL #:0001957-00080/0GKHL MEM:73728 MB TYPE:2107 MODEL:VMAX40K
CNTRL:UIG1
MICROCODE LEVEL: 5876-268
CONCURRENT-RDF
CONCURRENT DRDF: YES
3-DYN-MIRROR
SYMMETRIX DATA ENCRYPTION: DISABLED
FICON ACCELERATOR: ENABLED
SWITCHED-RDF DYNAMIC-RDF NO-AUTO-LINKS RDFGRP
LINKS-OFF-ONPOWERUP
LINKS-DOMINO: RDFGRP
SYNCH_DIRECTION: GLOBAL
LINK: LOCAL
Figure 3: SRDF Host Component Display of MPA status
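Where the display output is captured to a file or log, the accelerator state can be extracted with a simple text match. The Python sketch below assumes the output has already been captured as a string; the function name is hypothetical, and the `DISABLED` value is assumed by analogy with the other fields in the display, since only the `ENABLED` form appears in Figure 3.

```python
import re
from typing import Optional

# Matches the status line as it appears in Figure 3. The DISABLED
# alternative is an assumption; only ENABLED is shown in this paper.
_STATUS = re.compile(
    r"^\s*FICON ACCELERATOR:\s*(ENABLED|DISABLED)\s*$", re.MULTILINE
)


def mpa_enabled(display_output: str) -> Optional[bool]:
    """Return True/False for the reported state, or None if the line is absent."""
    match = _STATUS.search(display_output)
    if match is None:
        return None
    return match.group(1) == "ENABLED"


if __name__ == "__main__":
    captured = (
        "MICROCODE LEVEL: 5876-268\n"
        "SYMMETRIX DATA ENCRYPTION: DISABLED\n"
        "FICON ACCELERATOR: ENABLED\n"
        "SYNCH_DIRECTION: GLOBAL\n"
    )
    print(mpa_enabled(captured))
```

Returning `None` when the line is missing distinguishes a display that does not report the field at all from one that reports the accelerator as disabled.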
Performance
The Mainframe Performance Accelerator will benefit configurations where FICON front end
capacity is a bottleneck. Simply put, MPA adds additional processing capacity to the 2107
emulation instance. This additional processing capacity can be exploited by a single port, up to
the saturation point of the FICON processor, or by employing both ports on the adapter card
instead of using only one port.
Figure 4 below illustrates that with MPA enabled, IOPS and response time benefits are seen
regardless of whether a single port or both ports are used on the FICON adapter:
Figure 4: MPA equivalent IOPS and response time benefit when using one vs. two ports
Most importantly, MPA provides additional FICON IOPS for workload growth or to absorb
workload peaks. MPA extends the maximum workload that can be supported and maintains
response time during such periods, as this four engine workload graph shows:
Figure 5: 4 Engine VMAX 40K, 8 FICON paths active using MPA
Finally, batch run times can be reduced since MPA provides a higher I/O rate as well as
improved response time for large block batch workloads as seen in figure 6 below:
Figure 6: MPA - Large Block increased IOPS and improved response times
Conclusion
The VMAX 40K introduced a 6-core processor to the VMAX product line, which had been using
4-core processors. The two additional cores were assigned to the DA emulation at
introduction to balance overall system performance. What has been observed in mainframe
environments is that the back end workload did not warrant the use of four cores and that
more balanced performance for mainframe workloads would be achieved by evenly deploying
the processor capacity between the front end and back end of the array. This was the motivation
behind the development of the Mainframe Performance Accelerator.
The Mainframe Performance Accelerator also enables the FICON link and disk adapter
emulations to exploit the SMT2 feature of the Intel processor to improve I/O processing capacity.
In addition to the performance benefits of increased IOPS and reduced response time, the
reduction in FICON port requirements provides additional flexibility to add SRDF connectivity or
reduce engine counts in mainframe configurations. Peak workloads can now be absorbed
without the need to add engines to the configuration.
MPA is an enhancement that is designed for easy deployment and will yield significant
performance benefits and configuration savings. EMC performance specialists have modeling
tools which have been updated to reflect the capabilities of the MPA. As with any configuration
change, a detailed analysis using user supplied data should be performed prior to
implementation.