Why we chose HP-UX for our
Mission Critical Application
Olivier S. Massé
Systems Administrator, Hydro Quebec Distribution
November 2008
[email protected]
1
14 November
2008
What are we going to cover?
• I’ll spend most of my time on this part:
  − Part 1: Making a case for a migration to HP-UX
• Then I’ll go quickly through the following two:
  − Part 2: Easing the transition for your developers
  − Part 3: Smoothing things out for your system administrators

Part 1: Making a case for a
migration to HP-UX
Description of our application
What factors motivated us to move
What were our criteria
Why we chose HP-UX
What technology was chosen
Tech Talk
Who are we?
• Hydro-Québec Distribution is one of the largest power distributors in North America and the biggest public electric utility in Canada.
• Serving Quebec, a Canadian province spanning 1.6 million square km, Hydro-Québec delivers energy to 3.5 million customers with an innovative distribution management solution based on Integrity servers running HP-UX.
• A large portion of our energy is hydro-electric.
• Quebec is a land of climatic extremes, and large distances separate generation facilities from load centres.

Overview of our project
• In 2007, we successfully migrated from a timesharing OS to HP-UX
  − 6 data centers scattered throughout the province (one on an isolated island!)
  − Up to 800 km apart, around 50 servers in total
• Our migration went very well. This presentation will show you “Why We Chose HP-UX” on Integrity over other platforms.

What is our Application?
• Since the 1980s, when the use of SCADA (Supervisory Control And Data Acquisition) for distribution was rare, we pioneered tools and processes to improve the efficiency of our network
  − Our network includes more than 100,000 km of overhead lines and underground cables, 500,000 transformers and 190,000 switching devices.
• Our SCADA application is a DMS (Distribution Management System).
• It manages many tasks that are related to electrical distribution to the end customer:
  − Outage and workforce management
  − Mobile communications with work crews
  − Construction projects
  − Automation of the electrical network
  − Interfaces with corporate CRM systems

What are the components?
• The main app itself is a traditional client-server application.
  − The clients run on Windows XP.
  − The work processes and databases are on HP-UX 11i v2:
    • Developed mostly in C/C++, with a few bits of Java and legacy Fortran code as well
    • Oracle instances (most at 10g, some at 9i)
• We have an abundance of other tools and interfaces, all running on HP-UX as well:
  − SOAP web service and FTP/SFTP server for data exchange
  − Data warehouse
  − PHP web applications
  − “Collateral” servers: Stratum-2 NTP servers, SOCKS proxy, Subversion server
  − Mobile radio application (API with Motorola software)
  − Etc.

Example: State of the electrical network
• Each electrical line, and all its components, has a “state”.
• Some states are entered manually; others are acquired in real time through remote interfaces.
• Most of the management is done using the Windows client, with the HP-UX backend doing all the processing.

Example: Remote control
• Manual cut-offs in the electrical network are managed by automated switches such as this one.
• Inside the box is a remote-controlled motor.
• An operator remotely manages the state of the whole electrical line, communicating with the switch if necessary.

Example: PHP Web Application
• Web-based PHP applications are catching on, as simple frontends can be deployed to end-users without the need to develop and package a client.
• We have a few legacy ASP applications that are slowly being rewritten in PHP as well.
[Figure: screenshot of our custom bug database, used internally]

Our client systems over the years
• 1980s: MicroVAX stations; text-based
• Mid-1990s: PCs with MFC GUI
• Since June 2007: PCs with an Itanium backend running HP-UX on Integrity

Why did you move away from your older platform?
• We already had a very reliable platform. The question is, why bother?
  − The market share of our previous OS was slowly shrinking.
    • Our DMS is for sale through a partner, and there was a commercial interest in porting the application to increase its marketability.
    • System administration and development expertise for the other platform was becoming harder to find.
  − The CPU architecture we had been using for years was going to be discontinued.
    • An effort was already required to recompile the code for a new CPU architecture, so there was an opportunity to migrate.
  − Oracle performance was not up to our expectations.

What criteria were used to choose HP-UX among others?
• The usual reasons:
  − Performance
  − Reliability
  − TCO
• And a few more:
  − Security
  − Scalability
  − Support from the manufacturer
  − Manageability
  − Available applications
• I’ll detail these criteria in the next few slides

Performance
• Performance is the most important factor for us
• During normal workloads, the application doesn’t consume a lot of resources, but we need a lot of headroom to be able to meet a sudden demand.
• The efficiency of our ground teams, and what the customer thinks of that efficiency, ultimately depend on the performance of our servers.
• If there is a major outage due to a natural disaster, the load will go way up.
  − Case in point: a major ice storm in 1998 broke up large parts of our electrical network and required intensive computing resources
  − Recent, less spectacular event: two weeks ago, one region had an early snowfall, with heavy flakes, and this interrupted electricity to more than 50,000 customers. We have similar events many times each year.

Reliability and TCO
• Planned downtime is allowed, depending on the weather… but any unplanned downtime must be kept to a minimum when teams are on the ground
  − Like with slow performance, any downtime reduces operational efficiency and this has a direct, visible impact to the customer.
• TCO is a factor: We’re not looking for the theoretical, 100% uptime that some other platforms strive to offer with a heftier price tag
  − We can live with 99.995%: 26 minutes of downtime per year.
• Clustering technology must be available
  − To minimize planned (and unplanned) downtime, and reach that 99.995%
  − To implement a DR plan with an extended-distance cluster (e.g. Metrocluster)
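The 26-minute figure follows directly from the 99.995% target. A minimal sketch of the arithmetic in Python, assuming a 365-day year; the function name is only illustrative and is not part of our tooling:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare a few common availability targets with the one we settled on.
    for target in (99.9, 99.99, 99.995):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}% -> {minutes:.1f} minutes per year")
```

Each extra "nine" divides the downtime budget by ten, which is why the price tag climbs so quickly beyond this point.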
Security, Scalability, Support
• Increased security awareness requires a platform that has a good security record
  − Regular audits are done within the company and we must document our security process
• The scalability of the platform is important
  − A natural disaster could require additional emergency capacity
  − We have 6 data centers → future consolidation is possible
  − The platform should support virtualization
• Support from a top-tier manufacturer is required
  − Security patches, training and frontline support
  − Complete documentation on the OS and subsystems
  − Long support commitment and sales availability for OS releases

Manageability and applications
• We must be able to perform trivial tasks online without disruption, e.g.:
  − Adding and reducing disk space
    • Especially thin provisioning in the future
  − Online addition and replacement of failed components
  − Dynamic reconfiguration of the OS kernel
  − Migration from one SAN to another
• The OS must require minimal maintenance
  − System administrators must be able to go on vacation sometimes
• A good ecosystem of applications must be available
  − We need to be able to run basic web services
  − Development tools must be good

How does HP-UX measure up to our criteria? (1/4)
• The Itanium 2 processor was, and still is, a leader in terms of performance.
  − Especially with Oracle, which we needed
  − Sample benchmarks we wrote showed a potential 400% increase in performance
• The fact that the Integrity platform can run a variety of operating systems reduced investment risks
• We already had experience with other HP products, and going to Integrity was a logical migration path for us
  − All our Windows environments run on ProLiant
  − Our SANs are based on StorageWorks products; HP-UX integrates well with them (especially with Metrocluster)

How does HP-UX measure up to our criteria? (2/4)
• We ordered a study to help us choose the new platform and OS
  − Concerning HP-UX, unsurprisingly, the first thing the report did was quote Gartner:
    • “HP-UX is an excellent operating system choice for the enterprise and it will be a solid migration vehicle for the HP user base as well as developers looking for an evolving platform” (Gartner, 2004)
  − They also wrote this concerning security:
    • “It is preferable to minimize security risks by favoring an operating system that has an emphasis on performance, stability and ROI, rather than a mass technological choice. Most proprietary Unix operating systems fall in that category”.
  − To get performance metrics, they extracted a few results from the TPC (Transaction Processing Performance Council):
    • Integrity servers, with either HP-UX or Linux, were generally among the leaders in Oracle 9i performance.

How does HP-UX measure up to our criteria? (3/4)
• Linux on Integrity was considered since it was on par with HP-UX for performance, but it had to be dropped in the early stages
  − We were looking for an outstanding support level
• Our corporation was already deploying a major HP-UX environment in another department
  − An SAP solution on HP-UX replaced the mainframe for our CRM and invoicing (millions of customers). The project went into full production in early 2008.
  − Since we’re already an HP-UX customer, the learning curve is reduced for any internal resource that could be brought in to work on our environment.

How does HP-UX measure up to our criteria? (4/4)
• We needed applications to run our PHP and SOAP web services:
  − We checked whether HP-UX could “get the job done”, and it did:
    • Apache, Tomcat and PHP are bundled in the HP-UX Web Server Suite
    • Axis and XML libraries are available on Internet Express to run SOAP services, but they’re unsupported
    • Increased performance was expected from migrating PHP applications from Windows 2003 to HP-UX
    • The bottom line is that we didn’t need YAP (yet another platform) for our web services.
• The HP-UX Web Server Suite and many of its modules are officially supported by HP
  − When we were running Apache + PHP on Windows, we had no support.

Building a roadmap
• A roadmap was produced, covering every task to follow:
  − 2005: Design, studies and architecture
  − 2006: Porting effort and first phase of equipment purchase for our test environments
  − 2007: Second phase of acquisition and go-live
• We referred to the roadmap throughout the migration, and this helped us keep our vision.
• Our older MA-based SANs were migrated to EVAs in 2005 and 2006 to get a head start before the migration to Integrity.

What technology was chosen? 1/2
• For test environments, we use vPars and Integrity VMs on various servers such as the rx7640
  − We’re very satisfied with the flexibility that vPars and Integrity VMs give us
  − vPars are perfect for performance testing since they use “real” CPUs and I/O without any VM-related overhead, and can be reconfigured quickly.
  − VMs are well suited for quickly building test environments where you don’t need to measure strict performance metrics.
• Development environments are on a big rx6600 with lots of memory (144 GB) running Integrity Virtual Machines
  − We were running on a few rx2600s until recently, but they became overloaded and could not be expanded.
[Figures: rx7640, rx6600]

What technology was chosen? 2/2
• Mission-critical production is on Serviceguard clusters consisting of:
  − rx7640s (main servers)
  − rx6600s (standby servers)
• Other production is on rx2660s, rx3600s and rx6600s, some with Serviceguard.
  − rx2660s are inexpensive, but have a drawback: no OLAR (online addition and replacement)
• A multi-node Metrocluster is in production, and is used to implement our DR plan
• We’re currently investigating blade servers for future upgrades.
  − Integrity blades can be mixed with ProLiants, which reduces costs.
  − When combined with a Virtual Connect infrastructure and Ignite-UX, blades can be deployed very quickly.
[Figure: bl860c]

Tech talk: What is a Metrocluster?
• A Metrocluster is a Serviceguard cluster that spans multiple sites and disk arrays
• It harnesses the power of your XP- or EVA-based CA (Continuous Access) data replication
• It’s a turnkey solution that works well and is well integrated
• Drawback: three data centers are required, the third one hosting a quorum server

Tech talk: Praise for blade servers and Virtual Connect
• I’ve been playing with a few BladeSystem chassis since early 2008, along with the Virtual Connect technology, and I find these products to be of outstanding value
• Virtual Connect simplifies network connections and management
  − If your chassis is well planned, physical installation of a server is as quick and painless as it can be since you don’t have to do any cabling
• The architecture encourages you to double up LAN and FC connections, and firmware upgrades of all the components can be done online.
• Good platform for virtualization
  − Allows a lot of density. Works wonders with ESX clusters; can’t wait to see what’s cooking for Integrity VM 5.0

Part 2: Easing the transition
for your developers
Choosing the right development tools
Source control
Tips for the HP-UX developer
Open-source tools you can’t live without

Choosing the right compilers
• As far as C goes, you have two choices:
  − HP’s own C/C++ compiler, named « aC++ »
    • Itanium-optimized compiler
  − The ia64 version of GCC (GNU Compiler Collection)
    • An open-source and well-respected C/C++ compiler
• The open-source GCC can be used for simple programs, but every major project should use aC++. We chose aC++.
  − It is not expensive, and is available for free to qualified ISVs
• We also purchased the HP Fortran compiler.
• Our few Java programs use the JDK provided by HP, as well as some of their tools (such as HPjmeter)

Hint to help reduce the porting effort
• For years, every low-level call in our application’s code has gone through UTL, a homemade portable API.
• #ifdefs were added throughout the code so that the same source builds on both the older OS and HP-UX. New operating systems could also be added easily in the future.
• Future porting was expected, so this preparation was started a few years ago.
  − We had a workable PA-RISC version for a while as a proof of concept before recompiling on Itanium 2.
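The idea behind a portability layer can be sketched as follows. This is an illustration in Python rather than our C code, where the same branching is done at compile time with #ifdef blocks. The function names are hypothetical and are not the real UTL API, and the "hp-ux" prefix is an assumption about what Python reports as sys.platform on that OS.

```python
import sys
from typing import Optional

# Assumed mapping from sys.platform prefixes to a short platform family.
_PLATFORM_PREFIXES = (
    ("hp-ux", "hpux"),
    ("linux", "linux"),
)


def utl_platform_family(platform: str = sys.platform) -> str:
    """Classify a platform string; unknown platforms are reported as 'generic'."""
    for prefix, family in _PLATFORM_PREFIXES:
        if platform.startswith(prefix):
            return family
    return "generic"


def utl_syscall_tracer(platform: str = sys.platform) -> Optional[str]:
    """Single portable entry point; callers never see the per-platform branches."""
    family = utl_platform_family(platform)
    if family == "hpux":
        return "tusc"    # system call tracer commonly used on HP-UX
    if family == "linux":
        return "strace"  # its usual counterpart on Linux
    return None          # no known tracer for this platform


if __name__ == "__main__":
    print(utl_platform_family(), utl_syscall_tracer())
```

Supporting a new operating system then means adding one branch inside the layer, while the application code that calls it stays untouched.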
Debugging
• Our previous OS could not generate core files in a timely manner, so they were disabled
  − A core file can be used to inspect the program, down to the source line, as it was at the moment of the crash.
• HP-UX, like all Unixes, generates a core file quickly when a process crashes.
  − Hint: « kill -3 » can be used on a hung process to have it generate a core file.
• The wdb debugger included with aC++ is a graphical front end to gdb (GNU debugger), which is already well known to many developers.
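The « kill -3 » hint amounts to sending SIGQUIT (signal number 3) to the hung process. A minimal sketch of the same action from Python, assuming a POSIX system; the helper name is hypothetical. Whether a core file is actually written depends on the core size limit (ulimit -c) and on the target process not handling SIGQUIT itself.

```python
import os
import signal


def request_core_dump(pid: int) -> None:
    """Send SIGQUIT to `pid`, the programmatic equivalent of `kill -3 <pid>`.

    The default action for SIGQUIT is to terminate the process and dump core.
    """
    os.kill(pid, signal.SIGQUIT)
```

The resulting core file can then be loaded in gdb or wdb together with the executable to inspect the stack of the hung process.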
wdb
Source control
• The open-source Subversion (SVN) product is our source control system
  − It’s solid: absolutely no outage or problem since we migrated all our server sources to SVN in 2006.
• Running a complete SVN server on HP-UX requires a custom compilation, as there are no binaries available.
  − The basic “svnserve” server available from the Porting Center is too limited. We favored an Apache+SVN WebDAV server.
• The developers liked SVN so much that they migrated the sources of the client application from Microsoft VSS to SVN in 2007.

Obtaining low-level HP-UX documentation 1/2
• There are few resources available for the programmer who wants/needs to know how HP-UX works at the lowest level.
• Quadrants, executable magic numbers, etc. are not well explained, and documentation on them is hard to find.
• It doesn’t help that the available documentation is old or out of print.
• We decided to stay with a 32-bit application to reduce development efforts, but had to cope with HP-UX’s memory management limits for 32-bit apps.

Obtaining low-level HP-UX documentation 2/2
• Ask your developers to subscribe to the cxx-dev and devtools mailing lists.
• Have them read the following whitepapers available on docs.hp.com:
  − HP-UX memory management whitepaper
  − HP-UX process management whitepaper
• Try to find the following out-of-print books:
  − HP-UX internals by Chris Cooper (2004)
    • PA-RISC oriented, but can be useful to understand memory management
  − HP-UX performance and tuning by Sauers, Ruemmler, Weygant
    • Also PA-RISC
• The DSPP (developer and solution partner) site might have info, but it is hard to navigate; you also need to register as a company.
• Look at the new “knowledge on demand” webcasts: http://www.hp.com/go/kod
• The book “Advanced Programming in the Unix environment” is excellent; strongly recommended if you need to develop low-level stuff.

Glance and Caliper
• It is recommended that you purchase Glance, either by buying at least the VSE OE or separately. Glance has been invaluable to our developers, who use it as a troubleshooting tool.
• Caliper, a free profiler for Itanium, has also been of tremendous help in increasing software quality
  − One of our senior developers attended a lab at HPTF 2006 in Houston, where he discovered Caliper.
• Ktracer should be investigated (I haven’t tried it yet)

Glance in action
Quote from our developers
“How did we make your application
faster than before? Caliper! Caliper!
Caliper!”
“Caliper has helped us track bugs and
performance issues we would have
completely missed otherwise”
The development environment
• Security restrictions prevent me from setting up a fancy IDE environment, such as Eclipse using a CIFS shared mount point on HP-UX.
• So developers write code directly on the server through an SSH session, using Emacs.
• Emacs is complex, but extremely customizable
  − They built a lot of custom macros on top of it.
  − They also benefited from color syntax highlighting.
• Emacs was taken from the HP-UX Internet Express DVD
[Figure: XEmacs with syntax highlighting]

Open-source tools your developers (and sysadmins) can’t live without
• lsof: lists open files on the system, including sockets
• tusc: traces user system calls, helps with debugging
• tcpdump or Wireshark (formerly Ethereal): network sniffers
• All are available from the HP-UX Porting Center and/or HP-UX Internet Express
• N.B. Perl 5.x is now officially included with the OS
[Figure: Wireshark]

Part 3: Easing the transition
for your system
administrators and users
Change Management
Training
Host Access Software
Customized environment
Things I’ve heard... and the tips I suggest
• “I’ll never move to HP-UX. Never. (long pause, breathing deeply) I’m staying with what I have! YOU DIG, PAL?”
  − Seek help with change management
• “It’s hard to use! I’m too old for this!”
  − Give people proper training
  − Give them good host access software
  − Enhance your OS with helper tools
• “My current OS has been working for 20 years!”
  − That’s a good point. So show that HP-UX does work, too.
  − Try to find an evangelist to help push your agenda
  − Don’t make promises you can’t keep

“I’ll never move to HP-UX”
Change management (1/2)
• If you feel like you’re on the warpath with your sysadmins and users, seek help from change management professionals.
• HR sent us someone who gave us a two-hour presentation on change management.
  − He showed us how he predicted people would react to our change of platform, from resistance to acceptance. Things happened exactly like that.
• This helped us prepare so that the “human” aspect of the migration would go as smoothly as possible under the circumstances.
• We also did an extensive risk analysis to ensure nothing was left out during the migration.

“I’ll never move to HP-UX”
Change management (2/2)
• A quick look at Wikipedia gives us the « formula for change » by Richard Beckhard and David Gleicher:
D × V × F > R
• D = Dissatisfaction with how things are now
• V = Vision of what is possible
• F = First, concrete steps that can be taken towards the vision
• R = Resistance
• If the product of the first three factors is greater than R, then change is possible.

“It’s hard to use!”
Plan training for your users and sysadmins (1/2)
• Developers can read any general Unix book or take the Unix Fundamentals course.
• Every system administrator should take:
  − Unix Fundamentals
  − HP-UX System Administration I
  − HP-UX System Administration II
• Other, more advanced courses can be offered to « super » sysadmins based on your environment:
  − Logical Volume Manager
  − Serviceguard
  − Etc.

“It’s hard to use!”
Plan training for your users and sysadmins (2/2)
• HP can prepare custom training for you upon request.
  − I had them combine Sysadmin I and II into an accelerated 5-day course by removing concepts that were not relevant to our environment, such as printer configuration, NIS administration, etc.
  − This saved time and money.
• Don’t be tempted by onsite training at your office
  − Attendees will have all sorts of reasons to be distracted

“It’s hard to use!”
Choose good host access software
• As a transition to a new OS can be a hard sell to some people, do your best to at least give them proper host access tools.
• Host access software will be the primary interface to your servers. If it sucks, so will the perception of your OS.
• Take your time evaluating this kind of software, and be sure to pick software that works! An abundance of features is useless if the software is buggy or hard to use.

What host access software do you use? 1/2
• Telnet is history. SSH is the way to go now.
• Be sure to pick an SSH client that supports lesser-known SSH features such as port forwarding or X11 tunneling; they might come in handy some day.
• The terminal types supported by your host access software should be able to render colors; just emulating a plain VT is not enough.
  − Some programmer’s editors use colors for syntax highlighting.
• We use PuTTY, which most administrators are already familiar with. I could spend hours praising this piece of software.
• For those who can’t use free software for various reasons, VanDyke’s SecureCRT is a good choice.
[Figure: PuTTY]

What host access software do you use? 2/2
• If you disallow FTP for security reasons, you’ll need an SFTP client.
  − PuTTY has a command-line client, but nobody will want to use this on Windows
  − We use WinSCP; it’s free and easy to use, but it’s not the best client in my opinion, as it has too many features
  − VanDyke’s SecureFX seems to be good.
• An X server can be useful, especially to run Glance.
  − We purchased StarNet X-Win32
  − For those looking for a free product, try Xming
  − Logging in to an X desktop is possible, but we don’t do this. HP-UX comes bundled with CDE (outdated) and MWM (even more outdated)
[Figure: WinSCP]

Helper tools: Open-source software to help your new sysadmins and users
• The developer tools mentioned earlier (lsof, tusc, tcpdump and wireshark) are not included in the vanilla OS, but are very useful to the sysadmin.
Other suggested tools:
• Bash shell: Users will be able to recall their history with the arrow keys.
• Curl or wget: Lets you download data directly from an HTTP or FTP URL.
  − It’s also useful if you need to write scripts for monitoring your web services
• Rsync: File replication program, uses the rsync protocol
• Gcc: If you don’t want to purchase aC++, but plan on compiling C programs (open source or your own).
  − I use gcc to compile most open-source software, even though we purchased aC++ for our own application.
• Install alternate text editors to complement vi: Nano is a good choice.
• I did a trial of Midnight Commander but it did not catch on; people still prefer a good old shell interface.

Example: Curl in action
• Curl (or wget): these are two similar tools that let you download data directly from an HTTP or FTP URL.
• I often use curl to quickly retrieve patches from the ITRC. It’s quicker than downloading them to my PC, then transferring them to the server.

Example: Adding colors (1/2)
• I added lots of colors and custom shell prompts to help system administrators and support personnel in their everyday use.
• It’s friendlier, and most people appreciate the use of colors.
• They can be easily customized in PuTTY by changing its ANSI color settings, or disabled if necessary.
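Colored prompts and listings rely on ANSI escape sequences, which a terminal emulator such as PuTTY interprets. A minimal sketch in Python of how such a sequence is built; the helper names and the choice of red for root are illustrative assumptions, not our actual shell configuration:

```python
ESC = "\x1b"  # the ASCII escape character that starts every ANSI sequence

# Standard ANSI SGR foreground color codes.
RED, GREEN, YELLOW, BLUE = 31, 32, 33, 34
RESET = f"{ESC}[0m"  # returns the terminal to its default attributes


def colorize(text: str, color_code: int) -> str:
    """Wrap `text` in an ANSI color sequence followed by a reset."""
    return f"{ESC}[{color_code}m{text}{RESET}"


def build_prompt(user: str, host: str) -> str:
    """Build a prompt: red for root, green for everyone else (an arbitrary choice)."""
    is_root = user == "root"
    label = colorize(f"{user}@{host}", RED if is_root else GREEN)
    symbol = "#" if is_root else "$"
    return f"{label} {symbol} "


if __name__ == "__main__":
    print(build_prompt("root", "server1"))
    print(build_prompt("oracle", "server1"))
```

Because the colors are plain escape codes, a terminal that remaps or disables ANSI colors changes the look without any change on the server.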
Example: Adding colors (2/2)
• Colored shell prompts replace the plain “$” or “#”
• I aliased “ls” to GNU’s color-ls.
[Figure: vanilla HP-UX session]
[Figure: system enhanced with a custom colored prompt and color-ls]

“My older system has been working for 20 years!”
Finding an evangelist
• Finding an experienced admin to act as an evangelist will be of immense help.
• Strong experience with any Unix is required; experience with HP-UX is even better.
• He will be able to answer common questions quickly to reassure your people, and show even the most doubtful that the OS does « work ».
• Don’t let him promise something you can’t deliver
  − Example: don’t disclose performance metrics based strictly on advertised performance. Raw performance is not the same as real-world performance.

Tech talk: Ignite-UX and Software Distributor
• I use Ignite-UX extensively to install my servers
  − The software is complex, but at least it’s well documented. Check the quick install guide.
  − Golden images can be burned to DVD, and they help deploy servers quickly at our remote sites
  − Virtual machines can be installed very quickly when using Ignite-UX with a golden image: I counted 30 minutes.
• SD depots are also very useful to manage the environment
  − I have a reference SD depot (“Golden Depot”) that I use to keep all the servers up to date
  − All the software I use is stored in various depots for future use; I rarely have to search around for DVDs

Conclusion
Our migration went well
• End-user satisfaction is high
  − End-users perceived an almost 400% performance increase, as we expected from our early benchmarks
  − Better development and debugging tools increased overall software quality.
• These kinds of things are noticed by management

What about your support personnel?
• Wide acceptance from support personnel was tough to gain in the beginning, but by planning our migration correctly and keeping our promises, we managed a smooth transition.
• Their satisfaction is high as well:
  − The system requires minimal maintenance
  − No unplanned downtime since we went into production; many were surprised by the reliability of the new “Unix” system.
  − The learning curve was not as steep as some expected

Any questions?
Thanks for attending!
Olivier S. Massé
[email protected]