White Paper

EMC VMAX 40K: Mainframe Performance Accelerator

Abstract

This document describes the Mainframe Performance Accelerator feature of the VMAX 40K for FICON environments.

October 2014

Copyright © 2014 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

Introduction

EMC is a leading global supplier of mainframe-attached storage, with a twenty-year history of supporting the world’s most demanding financial services, insurance, and government mainframe sites. Over this period EMC has continually invested significant R&D resources in mainframe storage, recently surpassing $1bn in cumulative investment at its dedicated mainframe labs in Hopkinton, Massachusetts. The heritage of EMC Symmetrix, which carries into the VMAX product, is that of a mainframe storage array built, like the mainframe itself, for high-availability, high-performance environments.

Users of the IBM System z platform demand high I/O throughput rates and low response times for the mission-critical applications that support their businesses. EMC has a long history of delivering value to these environments in terms of reliability, availability, and performance, and this commitment continues with the announcement of the Mainframe Performance Accelerator for the VMAX 40K.

Overview

The Mainframe Performance Accelerator (MPA) is a no-charge Enginuity upgrade that can be applied non-disruptively to any VMAX 40K containing FICON interfaces. It improves the maximum IOPS by up to 60% and reduces I/O response times by up to 30%. These performance improvements are realized when using a single port or both ports on the VMAX two-port FICON adapter.

With MPA implemented, VMAX 40K FICON configurations can support more I/O on a given engine configuration, or require fewer FICON engines than would otherwise have been needed for a given workload. This can result in a smaller footprint for the VMAX configuration, reducing the total cost of ownership while providing improved performance.

Additional response time improvements will be realized during I/O bursts that previously could have saturated the FICON adapters and resulted in queuing delays. This reduction in queuing can improve both response time and batch run times, and these improvements will be realized on both standard and zHPF FICON channels.

Architecture

MPA is essentially a software change that reallocates one core on each of the Intel Westmere 6-core processors used in a VMAX 40K engine from the ‘back end’ disk adapter (DA) thread to the ‘front end’ FICON adapter’s 2107 control unit emulation thread. A VMAX engine consists of two director boards in a fault-tolerant design. Figure 1 below depicts this change using a representation of a single VMAX director board before and after the MPA is applied. MPA only reallocates a core on a processor that is supporting FICON emulation.

It is worth noting that the FICON emulation actually consists of two software components: the link emulation, which supports the FICON link protocol itself, and the 2107 control unit emulation, which implements the actual IBM 2107 control unit specification. The reallocated processor core is assigned to the 2107 control unit emulation.
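
The core reallocation can be pictured with a small model. The following Python sketch is illustrative only and is not EMC code: the class and function names are hypothetical, and the per-processor counts (four DA cores and one 2107 core before MPA, with the link emulation on a single physical core) are taken from the Figure 1 description in this paper.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CoreAllocation:
    """Physical cores per 6-core processor, by emulation (illustrative model)."""
    ficon_link: int    # FICON link emulation
    cu_2107: int       # 2107 control unit emulation
    disk_adapter: int  # back-end disk adapter (DA) emulation

    def total(self) -> int:
        return self.ficon_link + self.cu_2107 + self.disk_adapter


# Allocation before MPA, as described for Figure 1 (hypothetical constant name).
BEFORE_MPA = CoreAllocation(ficon_link=1, cu_2107=1, disk_adapter=4)


def apply_mpa(alloc: CoreAllocation, supports_ficon: bool = True) -> CoreAllocation:
    """Move one core from the DA emulation to the 2107 emulation.

    Processors that do not support FICON emulation are left unchanged,
    mirroring the statement that MPA only reallocates a core on a
    processor that is supporting FICON emulation.
    """
    if not supports_ficon:
        return alloc
    if alloc.disk_adapter < 1:
        raise ValueError("no DA core available to reallocate")
    return replace(
        alloc,
        cu_2107=alloc.cu_2107 + 1,
        disk_adapter=alloc.disk_adapter - 1,
    )


if __name__ == "__main__":
    after = apply_mpa(BEFORE_MPA)
    print(BEFORE_MPA, "->", after)
```

The total number of physical cores per processor is unchanged in this model; only the split between front end and back end moves.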
Note that in the left image there are four processor cores (shown in green) supporting the DA emulation and one processor core (shown in yellow) supporting the 2107 emulation per processor. In the image on the right a core has been reallocated, leaving three cores supporting the DA emulation and two cores supporting the 2107 emulation.

Figure 1: Mainframe Performance Accelerator core reallocation

The MPA will be delivered via maintenance to Enginuity level 5876.268 and is physically implemented via a configuration option to enable simultaneous multi-processing (SMP) within both the FICON link and 2107 emulations. Additionally, MPA enables exploitation of the simultaneous multi-threading (SMT2) hardware feature of the Intel processor. SMT2 allows a single physical core to execute two independent units of work simultaneously by activating a second hardware thread, called a hyper-thread.

Enabling MPA has the following effects within Enginuity:

- The FICON link emulation has increased I/O processing capacity on a single physical core by exploiting SMT2.
- The 2107 emulation employs a new multi-threaded software architecture to process I/O requests on a second physical core.
- Four instances of the disk adapter emulation now operate on three physical cores, with one of the cores hyper-threaded by SMT2.

The additional I/O processing capacity now available through the FICON emulation is illustrated in Figure 1 by the red line running from the ports to the link emulation and on to the 2107 emulation, which depicts additional work running concurrently.

The distribution of work between the two threads within the 2107 emulation is not done on a port boundary, but rather on a logical control unit (LCU) boundary. One process handles I/O requests bound for even-numbered LCUs and the other process handles I/O requests bound for odd-numbered LCUs.
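
The even/odd LCU split described above can be sketched in a few lines of Python. This is a minimal illustration of the stated rule, not the Enginuity implementation: the function names, the (port, lcu) request tuple, and the queue numbering (0 for even, 1 for odd) are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple

Request = Tuple[int, int]  # (port, lcu): a hypothetical request shape


def thread_for_lcu(lcu: int) -> int:
    """Return 0 for even-numbered LCUs and 1 for odd-numbered LCUs."""
    if lcu < 0:
        raise ValueError("LCU number must be non-negative")
    return lcu % 2


def partition_requests(requests: Iterable[Request]) -> Dict[int, List[Request]]:
    """Group requests into two queues by LCU parity.

    The port is carried along but deliberately ignored when choosing
    the queue, since the split is on an LCU boundary, not a port boundary.
    """
    queues: Dict[int, List[Request]] = {0: [], 1: []}
    for port, lcu in requests:
        queues[thread_for_lcu(lcu)].append((port, lcu))
    return queues


if __name__ == "__main__":
    sample = [(0, 0x00), (0, 0x01), (1, 0x02), (1, 0x03)]
    for thread, queue in partition_requests(sample).items():
        print(thread, queue)
```

In this sketch, requests arriving on the same port can land in different queues, and requests from different ports can share a queue, which is the point of splitting on LCU rather than on port.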

Enabling the MPA

Once the maintenance containing the MPA has been applied, MPA is enabled via a new IMPL bin setting in the CKD options entitled ‘FICON SMP mode’. It has two options: ‘DISABLED’ (the default value) and ‘ENABLE SMP + take 1 DA core’. This is depicted in Figure 2 below.

Figure 2: Configuration option to enable Mainframe Performance Accelerator

In addition, within Mainframe Enabler version 7.6, the SRDF Host Component has been enhanced to show the enabled/disabled state of the Mainframe Performance Accelerator in the SQ CNFG command output:

    EMCGM11I SRDF-HC DISPLAY FOR (5) &SQ CNFG,3C00 816
    SERIAL #:0001957-00080/0GKHL MEM:73728 MB TYPE:2107 MODEL:VMAX40K
    CNTRL:UIG1 MICROCODE LEVEL: 5876-268
    CONCURRENT-RDF CONCURRENT DRDF: YES 3-DYN-MIRROR
    SYMMETRIX DATA ENCRYPTION: DISABLED
    FICON ACCELERATOR: ENABLED
    SWITCHED-RDF DYNAMIC-RDF NO-AUTO-LINKS RDFGRP
    LINKS-OFF-ONPOWERUP
    LINKS-DOMINO: RDFGRP
    SYNCH_DIRECTION: GLOBAL LINK: LOCAL

Figure 3: SRDF Host Component display of MPA status

Performance

The Mainframe Performance Accelerator will benefit configurations where FICON front-end capacity is a bottleneck. Simply put, MPA adds processing capacity to the 2107 emulation instance. This additional capacity can be exploited by a single port, up to the saturation point of the FICON processor, or by employing both ports on the adapter card instead of only one. Figure 4 below illustrates that, with MPA enabled, IOPS and response time benefits are seen regardless of whether a single port or both ports are used on the FICON adapter:

Figure 4: MPA equivalent IOPS and response time benefit when using one vs. two ports

Most importantly, MPA provides additional FICON IOPS for workload growth or to absorb workload peaks.
MPA extends the maximum workload that can be supported and maintains response time during such periods, as the four-engine workload graph in Figure 5 shows:

Figure 5: 4 Engine VMAX 40K, 8 FICON paths active using MPA

Finally, batch run times can be reduced, since MPA provides a higher I/O rate as well as improved response time for large-block batch workloads, as seen in Figure 6 below:

Figure 6: MPA - Large Block increased IOPS and improved response times

Conclusion

The VMAX 40K introduced a 6-core processor to the VMAX product line, which had previously used 4-core processors. The two additional cores were assigned to the DA emulation at introduction to balance overall system performance. What has been observed in mainframe environments is that the back-end workload did not warrant the use of four cores, and that more balanced performance for mainframe workloads would be achieved by dividing the processor capacity evenly between the front end and back end of the array. This was the motivation behind the development of the Mainframe Performance Accelerator.

The Mainframe Performance Accelerator also enables the FICON link and disk adapter emulations to exploit the SMT2 feature of the Intel processor to improve I/O processing capacity.

In addition to the performance benefits of increased IOPS and reduced response time, the reduction in FICON port requirements provides additional flexibility to add SRDF connectivity or reduce engine counts in mainframe configurations. Peak workloads can now be absorbed without the need to add engines to the configuration.

MPA is an enhancement that is designed for easy deployment and will yield significant performance benefits and configuration savings. EMC performance specialists have modeling tools that have been updated to reflect the capabilities of the MPA. As with any configuration change, a detailed analysis using user-supplied data should be performed prior to implementation.
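
As a rough illustration of the engine-count reasoning in this paper, the Python sketch below estimates how many engines a FICON-bound workload might need with and without MPA. It is a back-of-envelope model only and no substitute for the detailed analysis recommended above. It assumes that front-end FICON capacity is the sole constraint and scales linearly with engine count, it treats the "up to 60%" figure as a best-case upper bound, and the function names and workload numbers are hypothetical.

```python
import math

# "Up to 60%" maximum IOPS improvement quoted in this paper; a best case,
# not a guaranteed gain for any particular workload.
MAX_IOPS_GAIN = 0.60


def engines_needed(workload_iops: float, iops_per_engine: float) -> int:
    """Smallest whole number of engines whose combined capacity covers the workload."""
    if workload_iops < 0 or iops_per_engine <= 0:
        raise ValueError("workload must be >= 0 and per-engine capacity > 0")
    return math.ceil(workload_iops / iops_per_engine)


def engines_with_mpa(workload_iops: float,
                     baseline_iops_per_engine: float,
                     gain: float = MAX_IOPS_GAIN) -> int:
    """Engine estimate when per-engine FICON capacity is scaled by (1 + gain)."""
    return engines_needed(workload_iops, baseline_iops_per_engine * (1.0 + gain))


if __name__ == "__main__":
    # Hypothetical figures chosen only to show the arithmetic.
    workload = 400_000
    per_engine = 100_000
    print("without MPA:", engines_needed(workload, per_engine))
    print("with MPA (best case):", engines_with_mpa(workload, per_engine))
```

Passing a smaller `gain` value shows how sensitive the estimate is to the improvement actually achieved, which is why measured workload data should drive any real sizing decision.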