Why we chose HP-UX for our Mission Critical Application
Olivier S. Massé, Systems Administrator, Hydro Quebec Distribution
November 2008
[email protected]
14 November 2008

What are we going to cover?
• I’ll spend most of my time on this chapter:
  − Part 1: Making a case for a migration to HP-UX
• Then I’ll go quickly through the following two:
  − Part 2: Easing the transition for your developers
  − Part 3: Smoothing things out for your system administrators

Part 1: Making a case for a migration to HP-UX
Description of our application
What factors motivated us to move
What were our criteria
Why we chose HP-UX
What technology was chosen
Tech Talk

Who are we?
• Hydro-Québec Distribution is one of the largest power distributors in North America and the biggest public electric utility in Canada.
• Serving Quebec, a Canadian province spanning 1.6 million square km, Hydro-Québec delivers energy to 3.5 million customers with an innovative distribution management solution based on Integrity servers running HP-UX.
• A large portion of our energy is hydro-electric.
• Quebec is a land of climatic extremes, and large distances separate generation facilities from load centres.

Overview of our project
• In 2007, we successfully migrated from a timesharing OS to HP-UX
  − 6 data centers scattered throughout the province (one on an isolated island!).
  − Up to 800 km apart, around 50 servers total
• Our migration went very well. This presentation will show you “Why We Chose HP-UX” on Integrity over other platforms.

What is our Application?
• Since the 1980s, when the use of SCADA (Supervisory Control And Data Acquisition) for distribution was rare, we have pioneered tools and processes to improve the efficiency of our network
  − Our network includes more than 100,000 km of overhead lines and underground cables, 500,000 transformers and 190,000 switching devices.
• Our SCADA application is a DMS (Distribution Management System).
• It manages many tasks related to electrical distribution to the end customer:
  − Outage and workforce management
  − Mobile communications with work crews
  − Construction projects
  − Automation of the electrical network
  − Interfaces with corporate CRM systems

What are the components?
• The main app itself is a traditional client-server application.
  − The clients run on Windows XP.
  − The work processes and databases are on HP-UX 11i v2:
    • Developed mostly in C/C++, with a few bits of Java and legacy Fortran code as well
    • Oracle instances (most at 10g, some at 9i)
• We have an abundance of other tools and interfaces, all running on HP-UX as well:
  − SOAP web service and FTP/SFTP server for data exchange
  − Data warehouse
  − PHP web applications
  − “Collateral” servers: Stratum-2 NTP servers, SOCKS proxy, Subversion server
  − Mobile radio application (API with Motorola software)
  − Etc.

Example: State of the electrical network
• Each electrical line, and all its components, has a “state”.
• Some states are entered manually; others are acquired in real time through remote interfaces.
• Most of the management is done using the Windows client, with the HP-UX backend doing all the processing.

Example: Remote control
Manual cut-offs in the electrical network are managed by automated switches such as this one.
Inside the box is a remote-controlled motor.
An operator remotely manages the state of the whole electrical line, communicating with the switch if necessary.

Example: PHP Web Application
• Web-based PHP applications are catching on, as simple frontends can be deployed to end-users without the need to develop and package a client.
• We have a few legacy ASP applications that are slowly being rewritten in PHP as well.
Screenshot of our custom bug database, used internally.

Our client systems over the years
• 1980s: MicroVAX stations; text-based
• Mid 1990s: PCs with MFC GUI
• Since June 2007: PCs with Itanium backend running on Integrity HP-UX

Why did you move away from your older platform?
• We already had a very reliable platform. The question is, why bother?
  − The market share of our previous OS was slowly shrinking.
    • Our DMS is for sale through a partner, and there were commercial interests in porting the application to increase its marketability.
    • System administration and development expertise for the other platform was becoming harder to find.
  − The CPU architecture we had been using for years was going to be discontinued.
    • An effort was already required to recompile the code for a new CPU architecture, so there was an opportunity to migrate.
  − Oracle performance was not up to our expectations.

What criteria were used to choose HP-UX among others?
• The usual reasons:
  − Performance
  − Reliability
  − TCO
• And a few more:
  − Security
  − Scalability
  − Support from the manufacturer
  − Manageability
  − Available applications
• I’ll detail these criteria in the next few slides

Performance
• Performance is the most important factor for us
• During normal workloads, the application doesn’t consume a lot of resources, but we need a lot of headroom to be able to meet a sudden demand.
• The efficiency of our ground teams, and what the customer thinks of that efficiency, ultimately depend on the performance of our servers.
• If there is a major outage due to a natural disaster, the load will go way up.
  − Case in point: A major ice storm in 1998 broke up large parts of our electrical network and required intensive computing resources
  − Recent, less spectacular event: Two weeks ago, one region had an early snowfall, with heavy flakes, and this interrupted electricity to more than 50,000 customers. We have similar events many times each year.

Reliability and TCO
• Planned downtime is allowed, depending on the weather… but any unplanned downtime must be kept to a minimum when teams are on the ground
  − As with slow performance, any downtime reduces operational efficiency, and this has a direct, visible impact on the customer.
• TCO is a factor: We’re not looking for the theoretical 100% uptime that some other platforms strive to offer with a heftier price tag
  − We can live with 99.995%: 26 minutes of downtime per year.
• Clustering technology must be available
  − To minimize planned (and unplanned) downtime, and reach that 99.995%
  − To implement a DR plan with an extended-distance cluster (e.g. Metrocluster)

Security, Scalability, Support
• Increased security awareness requires a platform that has a good security record
  − Regular audits are done within the company and we must document our security process
• The scalability of the platform is important
  − A natural disaster could require additional emergency capacity
  − We have 6 data centers → future consolidation is possible
  − The platform should support virtualization
• Support from a top-tier manufacturer is required
  − Security patches, training and frontline support
  − Complete documentation on the OS and subsystems
  − Long support commitment and sales availability for OS releases

Manageability and applications
• We must be able to perform trivial tasks online without disruption, e.g.:
  − Adding and reducing disk space
    • Especially thin provisioning in the future
  − Online addition and replacement of failed components
  − Dynamic reconfiguration of the OS kernel
  − Migration from one SAN to another
• The OS must require minimal maintenance
  − System Administrators must be able to go on vacation sometimes
• A good ecosystem of applications must be available
  − We need to be able to run basic web services
  − Development tools must be good

How does HP-UX measure up to our criteria?
(1/4)
• The Itanium2 processor was, and still is, a leader in terms of performance.
  − Especially with Oracle, which we needed
  − Sample benchmarks we wrote showed a potential 400% increase in performance
• The fact that the Integrity platform can run a variety of operating systems reduced investment risks
• We already had experience with other HP products, and going to Integrity was a logical migration path for us
  − All our Windows environments run on ProLiant
  − Our SANs are based on StorageWorks products; HP-UX integrates well with them (especially with Metrocluster)

How does HP-UX measure up to our criteria? (2/4)
• We ordered a study to help us choose the new platform and OS
  − Concerning HP-UX, unsurprisingly, the first thing the report did was quote Gartner:
    • “HP-UX is an excellent operating system choice for the enterprise and it will be a solid migration vehicle for the HP user base as well as developers looking for an evolving platform” (Gartner, 2004)
  − They also wrote this concerning security:
    • “It is preferable to minimize security risks by favoring an operating system that has an emphasis on performance, stability and ROI, rather than a mass technological choice. Most proprietary Unix operating systems fall in that category”.
  − To get performance metrics, they extracted a few results from TPC (Transaction Processing Performance Council):
    • Integrity servers, either with HP-UX or Linux, were generally among the leaders in Oracle 9i performance.

How does HP-UX measure up to our criteria? (3/4)
• Linux on Integrity was considered since it was on par with HP-UX for performance, but it had to be dropped in the early stages
  − We were looking for an outstanding support level
• Our corporation was already deploying a major HP-UX environment in another department
  − A SAP solution on HP-UX replaced the mainframe for our CRM and invoicing (millions of customers). The project went into full production in early 2008.
  − Since we’re already an HP-UX customer, the learning curve is reduced for any internal resource that could be brought on to work on our environment.

How does HP-UX measure up to our criteria? (4/4)
• We needed applications to be able to run our PHP and SOAP web services:
  − We checked whether HP-UX could “get the job done”, and it did:
    • Apache Server, Tomcat and PHP are bundled in the HP-UX Web Server Suite
    • Axis and XML libraries are available on Internet Express to run SOAP services, but they’re unsupported
    • Increased performance was expected by migrating PHP applications from Windows 2003 to HP-UX
• The bottom line is that we didn’t need YAP (yet another platform) for our web services.
• The HP-UX Web Server Suite and many of its modules are officially supported by HP
  − When we were running Apache + PHP on Windows, we had no support.

Building a roadmap
• A roadmap was produced, covering every task to follow:
  − 2005: Design, studies and architecture.
  − 2006: Porting effort and first phase of equipment purchase for our test environments
  − 2007: Second phase of acquisition and go live.
• We referred to the roadmap throughout the migration, and this helped us keep our vision.
• Our older MA-based SANs were migrated to EVAs in 2005 and 2006 to have a head start before the migration to Integrity.

What technology was chosen? 1/2
• For test environments, we use vPars and Integrity VMs on various servers such as the rx7640
  − We’re very satisfied with the flexibility that vPars and Integrity VMs give us
  − vPars are perfect for performance testing since they use “real” CPUs and I/O without any VM-related overhead, and can be reconfigured quickly.
  − VMs are well suited for quickly building test environments where you don’t need to measure strict performance metrics.
• Development environments are on a big rx6600 with lots of memory (144 GB) running Integrity Virtual Machines
  − We were running on a few rx2600s until recently, but they became clogged and could not be expanded.

What technology was chosen? 2/2
• Mission critical production is on Serviceguard clusters consisting of:
  − rx7640s (main servers)
  − rx6600s (standby servers)
• Other production is on rx2660s, rx3600s and rx6600s, some with Serviceguard.
  − rx2660s are inexpensive, but have a drawback: no OLAR
• A multi-node Metrocluster is in production, and used to implement our DR plan
• We’re currently investigating blade servers for future upgrades.
  − Integrity blades can be mixed with ProLiants, which reduces costs.
  − When combined with a Virtual Connect infrastructure and Ignite-UX, blades can be deployed very quickly.
Pictured: bl860c

Tech talk: What is a Metrocluster?
• A Metrocluster is a Serviceguard cluster that spans multiple sites and disk arrays
• It harnesses the power of your XP or EVA-based CA data replication
• It’s a turnkey solution that works well and is well integrated
• Drawback: three data centers are required, the third one hosting a quorum server

Tech talk: Praise for blade servers and Virtual Connect
• I’ve been playing with a few BladeSystem chassis since early 2008, along with the Virtual Connect technology, and I find these products to be of outstanding value
• Virtual Connect simplifies network connections and management
  − If your chassis is well planned, physical installation of a server is as quick and painless as it can be since you don’t have to do any cabling
• The architecture encourages you to double up LAN and FC connections, and firmware upgrades of all the components can be made online.
• Good platform for virtualization
  − Allows a lot of density.
    Works wonders with ESX clusters; I can’t wait to see what’s cooking for Integrity VM 5.0

Part 2: Easing the transition for your developers
Choosing the right development tools
Source control
Tips for the HP-UX developer
Open source tools you can’t live without

Choosing the right compilers
• As far as C goes, you have two choices:
  − HP’s own C/C++ compiler, named “aC++”
    • Itanium-optimized compiler
  − The ia64 version of GCC (GNU Compiler Collection)
    • An open-source and well-respected C/C++ compiler
• The open source GCC can be used for simple programs, although every major project should use aC++. We chose aC++.
  − It is not expensive, and available for free to qualified ISVs
• We also purchased the HP Fortran compiler.
• Our few Java programs use the JDK handed out by HP, as well as some of their tools (such as jmeter)

Hint to help reduce the porting effort
• For years, every low-level function in the code of our application has been using UTL, a homemade portable API.
• #ifdefs were added throughout the code so that the same sources build on both the older OS and HP-UX. New operating systems could also be added easily in the future.
• Future porting was expected, so this preparation was started a few years ago.
  − We had a workable PA-RISC version for a while as a proof of concept before recompiling on Itanium2.

Debugging
• Our previous OS could not generate core files in a timely manner, so they were disabled
  − The core file can be used to play back and inspect the source code line by line, as it was at the moment of the crash.
• HP-UX, like all Unixes, generates a core file quickly when a process crashes.
  − Hint: “kill -3” can be used on a hung process to have it generate a core file.
• The wdb debugger included with aC++ is a graphical front-end to gdb (GNU debugger), which is already well known by many developers.
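A minimal sketch of the “kill -3” hint, written in Python for brevity (our application itself is C/C++). Signal 3 is SIGQUIT, whose default action on POSIX systems is to terminate the process and dump core; whether a core file actually lands on disk depends on the core size limit and on the process being able to write to its working directory. The sleeping child and the names quit_hung_process and demo are stand-ins invented for this illustration, not part of our tooling.

```python
import signal
import subprocess
import sys


def quit_hung_process(proc):
    """Send SIGQUIT (signal 3) to a child and return the signal that ended it."""
    proc.send_signal(signal.SIGQUIT)
    proc.wait(timeout=10)
    # Popen reports "terminated by signal N" as a returncode of -N.
    return -proc.returncode


def demo():
    # Hypothetical stand-in for a hung work process: a child that only sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    return quit_hung_process(child)


if __name__ == "__main__":
    print("child terminated by signal", demo())
```

From a shell, the equivalent is simply kill -3 followed by the process ID; the resulting core file, if one was written, can then be opened in wdb or gdb together with the executable.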

Source control
• The open source Subversion (SVN) product is our source control system
  − It’s solid: absolutely no outage or problem since we migrated all our server sources to SVN in 2006.
• Running a complete SVN server on HP-UX requires a custom compilation, as there are no binaries available.
  − The basic “svnserve” server available from the Porting Center is too limited. We favored an Apache+SVN WebDAV server.
• The developers liked SVN so much that they migrated the sources of the client application from Microsoft VSS to SVN in 2007.

Obtaining low-level HP-UX documentation 1/2
• There are few resources available for the programmer who wants or needs to know how HP-UX works at the lowest level.
• Quadrants, executable magic numbers, etc. are not well explained, and documentation on them is hard to find.
• It doesn’t help that the available documentation is old or out of print.
• We decided to stay with a 32-bit application to reduce development efforts, but had to cope with HP-UX’s memory management limits for 32-bit apps.

Obtaining low-level HP-UX documentation 2/2
• Ask your developers to subscribe to the cxx-dev and devtools mailing lists.
• Have them read the following whitepapers available on docs.hp.com:
  − HP-UX memory management whitepaper
  − HP-UX process management whitepaper
• Try to find the following out-of-print books:
  − HP-UX internals by Chris Cooper (2004)
    • PA-RISC oriented, but can be useful to understand memory management
  − HP-UX performance and tuning by Sauers, Ruemmler, Weygant
    • Also PA-RISC
• The DSPP (developer and solution partner) site might have info, but it is hard to navigate; you also need to register as a company.
• Look at the new “knowledge on demand” webcasts: http://www.hp.com/go/kod
• “Advanced Programming in the Unix environment” is an excellent book, strongly recommended if you need to develop low-level stuff.

Glance and caliper
• It is recommended that you purchase Glance, either with the minimum of the VSE OE or separately. Glance has been invaluable to our developers, who use it as a troubleshooting tool.
• Caliper, a free profiler for Itanium, has also been of tremendous help in increasing software quality
  − One of our senior developers attended a lab at HPTF 2006 in Houston, where he discovered Caliper.
• Ktracer should be investigated (I haven’t tried it yet)
Screenshot: Glance in action

Quote from our developers
“How did we make your application faster than before? Caliper! Caliper! Caliper!”
“Caliper has helped us track bugs and performance issues we would have completely missed otherwise”

The development environment
• Security restrictions prevent me from setting up a fancy IDE environment such as Eclipse using a CIFS shared mountpoint on HP-UX.
• So developers write code directly on the server through an SSH session, using emacs.
• Emacs is complex, but extremely customizable
  − They built a lot of custom macros on top of it.
  − They also benefited from color syntax highlighting.
• Emacs was taken from the HP-UX Internet Express DVD
Screenshot: XEmacs with syntax highlighting

Open-Source tools your developers (and sysadmins) can’t live without
• lsof: lists open files on the system, including sockets
• tusc: traces user system calls, helps with debugging
• tcpdump or wireshark (formerly ethereal): network sniffers
• All are available from the HP-UX Porting Center and/or HP-UX Internet Express
• N.B. Perl 5.x is now officially included with the OS

Part 3: Easing the transition for your system administrators and users
Change Management
Training
Host Access Software
Customized environment

Things I’ve heard... and the tips I suggest
• I’ll never move to HP-UX. Never. (long pause, breathing deeply) I’m staying with what I have! YOU DIG, PAL?
  − Seek help with change management
• It’s hard to use! I’m too old for this!
  − Give people proper training
  − Give them good host access software
  − Enhance your OS with helper tools
• My current OS has been working for 20 years!
  − That’s a good point. So show that HP-UX does work, too.
  − Try to find an evangelist to help push your agenda
  − Don’t make promises you can’t keep

“I’ll never move to HP-UX” Change management (1/2)
• If you feel like you’re on a warpath with your sysadmins and users, try to seek help from change management professionals.
• HR sent us someone who gave us a two-hour presentation on change management.
  − He showed us how he predicted some would react to our change of platform, from resistance to acceptance. Things happened exactly like that.
• This helped us prepare so that the “human” aspect of the migration would go as smoothly as possible under the circumstances.
• We also did an extensive risk analysis to ensure nothing was left out during the migration.

“I’ll never move to HP-UX” Change management (2/2)
• A quick look on Wikipedia gives us the “formula for change” by Richard Beckhard and David Gleicher: D × V × F > R
• D = Dissatisfaction with how things are now; V = Vision of what is possible; F = First, concrete steps that can be taken towards the vision
• If the product of these three factors is greater than R (Resistance), then change is possible.

“It’s hard to use!” Plan training for your users and Sysadmins (1/2)
• Developers can read any general Unix book or follow the Unix Fundamentals course.
• Every System Administrator should follow:
  − Unix Fundamentals
  − HP-UX System Administration I
  − HP-UX System Administration II
• Other more advanced courses can be offered to “super” sysadmins based on your environment.
  − Logical Volume Manager
  − Serviceguard
  − Etc.

“It’s hard to use!” Plan training for your users and Sysadmins (2/2)
• HP can prepare custom training for you upon request.
  − I had them combine Sysadmin I and II into an accelerated 5-day course by removing concepts that were not relevant to our environment, such as printer configuration, NIS administration, etc.
  − This saved time and money.
• Don’t be tempted by onsite training at your office
  − Attendees will have all sorts of reasons to be distracted

“It’s hard to use!” Choose good host access software
• As a transition to a new OS can be a hard sell to some people… do your best to at least give them proper host access tools.
• Host access software will be the primary interface to your servers. If it sucks, so will the perception of your OS.
• Take your time evaluating this kind of software, and be sure to pick software that works! An abundance of features is useless if the software is buggy or hard to use.

What Host access software do you use? 1/2
• Telnet is history. SSH is the way to go now.
• Be sure to pick an SSH client that supports lesser-known SSH features such as port redirection or X11 tunneling; they might be handy some day.
• Terminal types supported by your application should be able to render colors; just emulating plain VT is not enough.
  − Some programmer’s editors use colors to do syntax highlighting.
• We use PuTTY, which most administrators are already familiar with. I could spend hours praising this piece of software.
• For those who can’t use free software for various reasons, Van Dyke’s SecureCRT is a good choice.

What Host access software do you use? 2/2
• If you disallow FTP for security reasons, you’ll need an SFTP client.
  − PuTTY has a command-line client, but nobody will want to use it on Windows
  − We use WinSCP. It’s free and easy to use, but it’s not the best client in my opinion: it has too many features
  − Van Dyke’s SecureFX seems to be good.
• An X Server can be useful, especially to run glance.
  − We purchased Starnet X-Win32
  − For those looking for a free product, try Xming
  − Logging in to an X desktop is possible, but we don’t do this. HP-UX comes bundled with CDE (outdated) and MWM (even more outdated)

Helper tools: Open-Source software to help your new sysadmins and users
• The developer tools mentioned earlier (lsof, tusc, tcpdump and wireshark) are not included in the vanilla OS, but are very useful to the sysadmin. Other suggested tools:
• Bash shell: Users will be able to recall their command history with the arrow keys.
• Curl or wget: Lets you download data directly from an HTTP or FTP URL.
  − It’s also useful if you need to write scripts for monitoring your web services
• Rsync: File replication program, uses the rsync protocol
• Gcc: If you don’t want to purchase aC++, but plan on compiling C programs (open source or your own).
  − I use gcc to compile most open source software, even though we purchased aC++ for our own application.
• Install alternate text editors to complement vi: Nano is a good choice.
• I did a trial of Midnight Commander but it did not catch on; people still prefer a good old shell interface.

Example: Curl in action
• Curl (or wget): These are two similar tools that let you download data directly from an HTTP or FTP URL.
• I often use curl to quickly retrieve patches from the ITRC. It’s quicker than downloading them to my PC, then transferring them to the server.

Example: Adding colors (1/2)
• I added lots of colors and custom shell prompts to help system administrators and support personnel in their everyday use.
• It’s friendlier, and most people appreciate the use of colors.
• They can be easily customized with PuTTY by changing its ANSI color settings, or disabled if necessary.

Example: Adding colors (2/2)
• Colored shell prompts replace the blank “$” or “#”
• I aliased “ls” to GNU’s color-ls.
Screenshots: a vanilla HP-UX session, and a system enhanced with a custom colored prompt and color-ls

“My older system has been working for 20 years!” Finding an evangelist
• Finding an experienced admin to act as an evangelist will be of immense help.
• Strong experience with any Unix is required. With HP-UX, even better.
• He will be able to answer common questions quickly to reassure your people, and show even the most doubtful that the OS does “work”.
• Don’t let him promise something you can’t deliver
  − Example: don’t disclose performance metrics based strictly on advertised performance... Raw performance is not the same as real-world performance.

Tech talk: Ignite-UX and Software Distributor
• I use Ignite-UX extensively to install my servers
  − The software is complex, but at least it’s well documented. Check the quick install guide.
  − Golden images can be burned on DVD, and they help deploy servers quickly to our remote sites
  − Virtual Machines can be installed very quickly when using Ignite-UX with a golden image: I counted 30 minutes.
• SD Depots are also very useful to manage the environment
  − I have a reference SD depot (“Golden Depot”) that I use to keep all the servers up to date
  − All software I use is stored in various Depots for future use; I rarely have to search around for DVDs

Conclusion

Our migration went well
• End-user satisfaction is high
  − End-users perceived an almost 400% performance increase, as we expected from our early benchmarks
  − Better development and debugging tools increased overall software quality.
• These kinds of things are noticed by management

What about your support personnel?
• Wide acceptance from support personnel was tough to gain in the beginning, but by planning our migration correctly and keeping our promises, we managed a smooth transition.
• Their satisfaction is high as well:
  − The system requires minimal maintenance
  − No unplanned downtime since we went into production; many were surprised by the reliability of the new “Unix” system.
  − The learning curve was not as steep as some expected

Any questions?

Thanks for attending!
Olivier S. Massé
[email protected]