Introduction to the Duwamish Online Sample Application

Pedro Silva and Michael D. Edwards
Microsoft Developer Network
July 2000
Summary: This article provides an overview of the history of the Duwamish sample
application and discusses the process of turning it into a real, live e-commerce startup.
(14 printed pages)
View the Duwamish5.exe sample code in the MSDN Online Code Center.
Download the Duwamish5.exe sample file (7,800 KB).
Contents
Introduction
Overview
www.DuwamishOnline.com
Duwamish Online Goals
Duwamish Online Application Upgrades
Duwamish Online Deployment
Application Architecture
Layered Architecture
Network Architecture
The Internet Zone
The Server Farm Zone
Server Hardware and Software
Conclusion
Introduction
Ask anybody who has experienced the launch of a Web application what went wrong,
and you'll get an earful. The problems are widespread, and by no means purely
technical in nature. We should know—through our work on the MSDN® Duwamish
Online project (documented at http://msdn.microsoft.com/voices/sampleapp.asp)
we've been immersed in the task of designing, implementing, deploying, and
operating a worldwide Internet e-commerce startup for the past two years. Now hold
on—before you laugh yourself to tears at the thought of a real Internet startup taking
two years to deploy, stop to consider how the schedule would be impacted by a
primary objective of teaching the rest of the world how to reproduce our success!
For the Duwamish team, this educational objective made identifying and solving all
the problems associated with launching a Web application "only" half the work. The
rest of our time was heavily invested in creating and updating extremely detailed lab
notebooks and procedures, hundreds of pages of application software and network
specifications, and a host of reports and analysis documents.
In other words, successfully launching http://DuwamishOnline.com, but failing to
teach you how to do so yourselves, was not an option. So, what is Duwamish Online,
you ask? Read on, and we'll explain our project objectives, go over technical details
of the Duwamish Online software and network architecture, and give you a taste of
the deployment preparations required to launch your own Microsoft® Windows®
DNA 2000 application, which we will be writing about all summer on MSDN Online.
Overview
Last August MSDN released Phase 4 of the Duwamish Books sample application, an
ambitious project demonstrating a business and architectural migration to the Web.
Phase 1 included a set of monolithic desktop applications for operating a single retail
bookstore. Phase 2 migrated all the data access code into a shared COM component to
support a growing business, now with multiple stores, and client-server architecture.
As the fictional Duwamish bookstore business continued to expand into other cities
and states, Phase 3 demonstrated migrating to a logical, three-tier architecture in order
to support different business rules per store, in a business logic layer. Phase 3.5
integrated Microsoft Transaction Server (MTS) to manage the components in a
physical three-tier architecture and to control transactions. And, finally, Phase 4
migrated the application workflow code into a shared COM component, and the
presentation logic into Active Server Pages (ASP)—now Duwamish Books was a
fully Web-based Windows DNA application.
www.DuwamishOnline.com
Well before we released Duwamish Books, Phase 4, we knew there would be a Phase
5. That's because through Phase 4 we had focused on demonstrating the software
architecture of a Windows DNA application. That's only a third of the problem—
another third is actually deploying a new Web application, and the final third is
operating it. So, immediately upon releasing the Phase 4 milestone, we set out on a
mission to deploy Duwamish live on the Internet.
Duwamish Online Goals
Our primary objective for Duwamish Online is to teach you how to successfully
launch your own Web application. We will describe every step in detail, from A to Z.
(The key to Web application success is in the details.)
Given that the Duwamish team had very little experience deploying and operating
Web applications when we finished Phase 4, we decided to actually launch Duwamish
ourselves, and operate it for a substantial period of time. Because we anticipated a
summer 2000 launch, following the Windows 2000 release, we decided on a
secondary objective of demonstrating how (and why) to upgrade the Duwamish
application architecture to COM+ Services—essentially, how to migrate Windows
DNA to Windows DNA 2000.
Duwamish Online Application Upgrades
A new Duwamish phase would be incomplete unless our architectural migration was
stimulated by fundamental business change, and Duwamish Online is no exception.
So, flush with IPO cash and the certainty that we must grow or die, Duwamish Online
expanded its product catalog by acquiring a vendor of official logo casual and sports
wear.
Additional application upgrades were driven by our belief that more complete
application presentation and workflow were required to provide generally applicable
scalability and performance metrics. High availability and reliability requirements led
to further architectural (both software and hardware) enhancements that were not so
much driven by Duwamish Online business changes as by the reality of successfully
operating a Web application (as opposed to "just" building it, as we did in Phase 4).
Database layer
Changes in the database layer were driven by two factors:
• Our new business requirement to sell apparel and gear, in addition to books
  (and the assumption that future acquisitions would further expand the
  Duwamish Online product catalog).
• The substantive expansion of the Duwamish application workflow.
These changes led to modifications in the Duwamish database schema and a
substantial increase in the number of tables, fields, and stored procedures. We took
this opportunity to remove legacy database objects that were no longer relevant to the
application. We also migrated the database to Microsoft SQL Server™ 2000 in order
to take advantage of new features such as full-text search in a clustered environment.
Middle tier and presentation
In the middle tier we have redesigned and implemented the order pipeline and added
additional workflow and business logic to support our improved presentation features.
In order to provide higher scalability, availability, and reliability, we introduced a
COM+ Queued Component to handle interoperation with third-party partners. This
allowed us to execute orders much more quickly (providing significant scalability gains)
and without depending on our partners' real-time server performance and uptime (thus
making the availability and reliability of our order workflow something we could
completely control on our own domain).
If you downloaded and installed Phase 4, you'll recognize the huge strides we made in
the presentation layer for Duwamish Online. Not only did we dramatically increase
the complexity of each page, we added important additional features such as account
history. We increased application complexity to provide more credible performance
and scalability numbers. (For example, the Duwamish Online home page is a dynamic
page that requires 10 times as much processing to deliver as the static Phase 4 home
page.) But because Duwamish Online is live on the Internet, we also needed a more
complete application to keep you engaged and interested.
Third-party interoperation
We implemented full interoperation with a credit card authorization vendor and a
fulfillment vendor. This was important to do from both a complexity and
completeness perspective. There were a number of interesting problems we had to
solve here, from pulling messages off the Queued Component server to auto-generating
e-mail confirmations.
Application setup and build
Because we planned on installing Duwamish Online a multitude of times over its
lifetime, we needed a very robust and maintainable setup application. So, we decided
to throw out the Duwamish Books, Phase 4 setup (an unwieldy piece of code with
origins going back to the Microsoft Visual Basic® Setup Kit—we started using this in
Phase 3.5 to automate MTS component management) and start over. We ended up
with the world's first completely automated, Windows 2000 logo-certified, Web
application setup utility. (At least it's the first one that anybody is giving away for
free!)
The previous phases of Duwamish utilized an ad hoc build procedure (a fancy way of
saying there was no formal build procedure). Because we wanted to enable our
customers to reproduce our testing results, we needed a very reliable and easy-to-modify
build utility. It was very important that our customers be able to build and test
the same bits that we built and tested. So, we created a new application build facility,
which we include with the Duwamish Online download.
One of our best engineers spent several weeks on these two enhancements, and we are
very proud of this work.
Duwamish Online Deployment
Half of the resources expended on Duwamish Online were devoted to determining
and applying the 1,001 procedures necessary to deploy a Web application: From
building and testing various network configurations, to making final staging and
production server purchase decisions. From writing database backup utilities and
practicing procedures to restore the database from backups, to testing database
failover. From purchasing a domain name, to "hardening" the production server farm
against hacker attacks. From researching use scenarios and building load-test scripts,
to isolating lock contention in scale-up testing. These were just a few of the A-Z
details that took us almost a year to accomplish and will take us all summer to tell you
about on MSDN Online.
Application Architecture
Layered Architecture
Duwamish Online extends the n-tier design of earlier phases of the sample
application, including the presentation layer, workflow layer, business logic layer,
data access layer, and the data source. Although earlier phases implemented multiple
presentation types, for the release of Duwamish Online we concentrated on HTML
3.2 and CSS 1.0 clients so we could support the largest browser audience possible.
However, the ability to support multiple presentation types is still part of the design,
and the application can easily be extended to take advantage of new browser functionality.
Because we target HTML 3.2 clients, all of the XML and XSL transformations must be
done on the Web server.
Figure 1. The application layers of the Duwamish Online sample
Data is stored in a relational database, accessed and manipulated with components
running under COM+ Services. It is converted into XML format between the middle-tier
COM+ components and the presentation layer. The presentation layer formats the
pages and transforms XML data into HTML 3.2, which is then returned by Internet
Information Services (IIS).
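As a rough illustration of this server-side transformation, the following Python sketch turns a small XML fragment into HTML 3.2-style markup before it is sent to the browser. It is not the Duwamish Online code, which applies XSL style sheets from ASP. The element names (catalog, item, title, price), the sample data, and the function render_catalog are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML, shaped as it might arrive from a middle-tier component.
CATALOG_XML = """
<catalog>
  <item id="1"><title>Inside COM+</title><price>39.99</price></item>
  <item id="2"><title>Logo T-Shirt &amp; Cap</title><price>24.50</price></item>
</catalog>
"""


def render_catalog(xml_text):
    """Transform catalog XML into a plain HTML table on the server."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for item in root.findall("item"):
        # Escape text again on output so special characters stay valid HTML.
        title = escape(item.findtext("title", default=""))
        price = escape(item.findtext("price", default=""))
        rows.append('<TR><TD>%s</TD><TD ALIGN="RIGHT">%s</TD></TR>' % (title, price))
    return '<TABLE BORDER="1">\n%s\n</TABLE>' % "\n".join(rows)


html_page = render_catalog(CATALOG_XML)
```

Because the browser only ever receives the finished table, the same XML could later feed a different transformation for a richer client without touching the middle tier.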
Table 1. Duwamish Online Layered Application Architecture
Logical n-tier
• Presentation tier: HTML 3.2
• Workflow tier: work spanning or incorporating multiple autonomous business transactions
• Business logic tier: boundary for autonomous business transactions
• Data access tier: handles disconnected data access
Database tier
• SQL Server 2000 database
Factoring an application into these logical layers makes it easier to write modular,
reusable, and maintainable code than it would be in a monolithic Web application.
Thinking about the application in these terms also instills the discipline to design
and implement individual features, and entire applications, with modularity, reuse,
and maintainability in mind.
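To make the layering concrete, here is a minimal Python sketch of the same separation of responsibilities. It is an illustration only: the class names, method names, and in-memory "database" are hypothetical and do not correspond to the actual Duwamish Online COM+ components.

```python
class DataAccess:
    """Data access layer: the only layer that touches the data source."""

    def __init__(self):
        # In-memory stand-in for the database (hypothetical sample data).
        self._items = {1: {"title": "Inside COM+", "price": 39.99, "stock": 3}}
        self._orders = []

    def get_item(self, item_id):
        return dict(self._items[item_id])  # return a copy: disconnected data

    def save_order(self, order):
        self._orders.append(order)
        return len(self._orders)  # use the position as the order number


class BusinessLogic:
    """Business logic layer: each method is one autonomous business transaction."""

    def __init__(self, data):
        self._data = data

    def price_order(self, item_id, quantity):
        item = self._data.get_item(item_id)
        if quantity < 1 or quantity > item["stock"]:
            raise ValueError("quantity not available")
        return {"item_id": item_id, "quantity": quantity,
                "total": round(item["price"] * quantity, 2)}

    def record_order(self, order):
        return self._data.save_order(order)


class Workflow:
    """Workflow layer: combines several business transactions into one task."""

    def __init__(self, logic):
        self._logic = logic

    def place_order(self, item_id, quantity):
        order = self._logic.price_order(item_id, quantity)
        order["order_number"] = self._logic.record_order(order)
        return order


def present(order):
    """Presentation layer: formatting only, no business rules."""
    return "<P>Order %d: total $%.2f</P>" % (order["order_number"], order["total"])
```

Each layer talks only to the layer directly beneath it, so a business rule such as the stock check lives in exactly one place no matter which page or workflow triggers it.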
Also, this logical factoring of layers doesn't necessarily need to correspond to layers
of COM+ components, although this is how the Duwamish application is divided.
With new scripting functionality like Visual Basic Scripting Edition (VBScript)
classes, code can be easily encapsulated in the classes and all of the layers can be
written in script. Although this might help simplify your application development,
you would not be able to take advantage of other features like COM+ security,
Queued Components, and more.
Although much in the middle-tier components has changed to accommodate new
functionality, including a diverse item catalog and order history, the principles behind
the workflow, business logic, and data access layers remain similar to those in
Phase 4. Therefore, let's focus on some of the new components—the queued
workflow and fulfillment components—and see how they fit into the overall system.
Queued workflow component
For a Web site handling millions of transactions per day, with usage peaks of
thousands per second, delaying costly operations can vastly improve response time.
Queued operations free up IIS threads, so they can respond to more requests instead
of waiting for costly synchronous operations to complete. In addition, queued
operations can make the site more reliable. If the queued portions of the site go
offline, or if there are a huge number of transactions that need to be processed,
messages accumulate in the queue until the system is back online or there is a lull in
traffic that allows the system to catch up.
The COM+ Queued Components feature makes it easy to implement and configure
objects to run on Microsoft's queuing technologies—in our case MSMQ. The difficult
part is deciding which parts of your site don't require immediate user feedback and
can be queued instead. Database operations are somewhat expensive, but credit card
payment authorization is very expensive. As you know from swiping your credit card
at the gas pump or department store, it can take on the order of a few seconds. When
all of a Web server's worker threads (IIS uses a pool of 25 worker threads) are busy
servicing payment authorization requests, your site's response time goes through the
roof.
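A back-of-the-envelope calculation shows why. If every request holds a worker thread for its full duration, one server can complete at most (threads / seconds per request) requests per second. The sketch below applies that bound to the 25-thread pool mentioned above. The 2-second authorization time and the 0.25-second time for a page that only enqueues the order are assumed figures chosen for illustration, not measurements from Duwamish Online.

```python
def max_requests_per_second(worker_threads, seconds_per_request):
    """Upper bound on throughput when each request occupies one thread."""
    return worker_threads / seconds_per_request


# Assumed timings, for illustration only.
synchronous = max_requests_per_second(25, 2.0)   # page waits for authorization
queued = max_requests_per_second(25, 0.25)       # page only enqueues the order

print("synchronous: %.1f requests/s" % synchronous)
print("queued:      %.1f requests/s" % queued)
```

Under these assumptions the bound rises from 12.5 to 100 requests per second per server, simply because threads are no longer parked waiting on the payment provider.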
Duwamish Online makes extensive use of COM+ Queued Component functionality
for the order pipeline. When a customer clicks Buy, the presentation layer passes the
XML-encoded order information to a local workflow component. The local
component invokes a remote queued workflow component and executes a
ProcessOrder method. All order processing is performed by the remotely hosted
queued workflow component and is entirely out-of-band with IIS.
Order processing includes inserting the sale and payment information into the
database, authorizing the credit card purchase, preparing order data for fulfillment,
sending e-mail notification, and billing the credit card for the purchase after the
order has been fulfilled.
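The pattern can be sketched in a few lines of Python. This is not how COM+ Queued Components or MSMQ are programmed; an in-process queue and a worker thread stand in for them here, and every name (order_queue, process_order, handle_buy_click, and so on) is hypothetical. The point is only that the Web request returns as soon as the order is enqueued, while the slow steps run elsewhere.

```python
import queue
import threading

order_queue = queue.Queue()  # stand-in for a message queue
processed = []               # records each completed step, for illustration


def authorize_payment(order):
    # Placeholder for the slow call to the payment provider.
    return True


def process_order(order):
    """Out-of-band processing, following the order-processing steps in the text."""
    processed.append(("saved", order["id"]))
    if authorize_payment(order):
        processed.append(("authorized", order["id"]))
        processed.append(("ready_for_fulfillment", order["id"]))
        processed.append(("email_sent", order["id"]))


def worker():
    while True:
        order = order_queue.get()
        try:
            if order is None:  # sentinel: stop the worker
                break
            process_order(order)
        finally:
            order_queue.task_done()


def handle_buy_click(order):
    """What the Web request does: enqueue the order and return immediately."""
    order_queue.put(order)
    return "Your order has been received."


thread = threading.Thread(target=worker, daemon=True)
thread.start()
reply = handle_buy_click({"id": 1001, "total": 39.99})
order_queue.join()   # wait here only so the example finishes deterministically
order_queue.put(None)
thread.join()
```

If the worker were stopped for a while, calls to handle_buy_click would still succeed and orders would simply accumulate in the queue, which is the reliability benefit described earlier.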
Fulfillment subsystem
We decided to go with a third-party fulfillment company, Interact Inc. This company
will be responsible for keeping the inventory on hand in their warehouse, boxing it,
and shipping it to our customers. We immediately found that our database and
message formats were incompatible with Interact's formats (ah, the joys of business-to-business
integration). We could communicate with our fulfillment provider only by
using File Transfer Protocol (FTP) to send messages to an address on their site. With
the widespread adoption of XML messaging and server applications, such as
Microsoft BizTalk™ Server, integration of these types of external services should
soon become easier.
The fulfillment system runs as several scheduled operations using the Microsoft
Windows NT® Task Scheduler service. These scheduled events make calls into the
fulfillment workflow component to send the purchase order to Interact and update
order status and inventory from our provider.
The fulfillment workflow is integrated with the other Duwamish components and uses
the business logic and data access layers to perform database operations in the order
tables, as well as in its own database used specifically for synchronization with
Interact.
Send Purchase Order
Send Purchase Order uses the Duwamish Books business logic and workflow layers to
identify order records that are ready to be fulfilled. Once Send Purchase Order
determines that an order is ready to be fulfilled, it transforms the order data from the
Duwamish Online format into a format that is compatible with the external system.
The data is then transmitted via FTP to the external system.
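The two steps, transform and then transmit, might look like the following Python sketch. The pipe-delimited record layout, the field names, and the file-naming scheme are invented for the example; the real format expected by the fulfillment provider is not described in this article. The upload uses Python's standard ftplib module, and the host and credentials are parameters the caller would have to supply.

```python
import io
from ftplib import FTP


def to_fulfillment_record(order):
    """Flatten an order into one pipe-delimited line per item (made-up format)."""
    lines = []
    for item in order["items"]:
        fields = [str(order["order_id"]), item["sku"],
                  str(item["quantity"]), order["ship_to"]]
        lines.append("|".join(fields))
    return "\r\n".join(lines) + "\r\n"


def send_purchase_order(order, host, user, password):
    """Upload the transformed order to the provider's FTP server."""
    payload = to_fulfillment_record(order).encode("ascii")
    filename = "po_%s.txt" % order["order_id"]  # hypothetical naming scheme
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.storbinary("STOR " + filename, io.BytesIO(payload))
    return filename
```

Keeping the transformation in its own function means the external format can be tested, and later replaced, without touching the transport code.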
Update Inventory and Update Order Status
Update Inventory and Update Order Status transform inventory status and order status
in files downloaded from the fulfillment server. Then, they update the order status and
inventory information in the Duwamish Online system through the workflow and
business logic layers. Using these components to make these changes preserves the
business rules we have set up to govern changes to the inventory and its status. These
operations also call into the queued workflow component to do the final credit card billing.
Network Architecture
One of the greatest distinctions between Duwamish Online and the previous phases of
Duwamish is the extensive work we did to design the network architecture that our
application runs on. Oftentimes the focus of product development is on the software
architecture and features. However, equally important in a Web application is the
network architecture. Many well-designed applications can fail miserably on the
Internet if they are not deployed and operated correctly.
Although Duwamish Online is designed as a logical n-tier application, it is deployed on
two physical tiers. After running a broad set of configuration tests on Duwamish, we
discovered that the physical two-tier approach performed best because it minimized
cross-machine communication—which is a huge performance killer. In this
configuration, the Web servers run all of the ASP pages for the Web site and all of our
COM+ components, and the second tier runs the database server. Load is balanced
between the Web servers using Network Load Balancing (NLB). The tests we ran
with components running on their own middle tier of machines all had lower
throughput and higher response time.
Table 2. Duwamish Online Two-Tier Architecture
Physical two-tier
Web tier (NLB cluster)
• VBScript in ASP
• All HTML generated using XML/XSL transforms
• Visual C++® ATL Cache component in ASP application space caches infrequently changing HTML/XML
• Visual Basic COM+ workflow, business logic, data access components (COM+ library)
Database tier (Windows Cluster Service)
• SQL Server 2000 using stored procedures
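The idea behind caching infrequently changing HTML/XML on the Web tier can be sketched as a small time-based cache. The Python class below is an assumption-laden illustration, not a description of the Visual C++ ATL component: the class name, the expiry policy, and the default five-minute lifetime are all hypothetical.

```python
import threading
import time


class FragmentCache:
    """Cache rendered fragments and re-render them only after they expire."""

    def __init__(self, max_age_seconds=300, clock=time.monotonic):
        self._max_age = max_age_seconds
        self._clock = clock          # injectable so the cache can be tested
        self._entries = {}           # key -> (time stored, rendered value)
        self._lock = threading.Lock()

    def get(self, key, render):
        now = self._clock()
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and now - entry[0] < self._max_age:
                return entry[1]
        # Render outside the lock so a slow render does not block other keys.
        value = render()
        with self._lock:
            self._entries[key] = (now, value)
        return value
```

A page would call something like cache.get("category-list", build_category_list), so the database is queried only when the cached copy has expired.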
Our production network configuration can be divided into two major network zones:
the Internet and the server farm.
The Internet Zone
The Internet Zone represents the network traffic external to our router and firewall.
We are connected to the Internet through a 1.5 Mbps connection provided by our
Internet Service Provider (ISP)—the Information Technology Group at Microsoft.
We expect Internet traffic from three types of sources: customers,
service providers, and remote monitoring clients.
Customers
Customers are typical Internet users, and will access our site through a variety of Web
browsers. They will be allowed access to the site only through HTTP at port 80. They
will browse our catalog, select items to buy, and make purchases.
Service providers
We will also communicate, through the Internet, with the servers of our vendors that
provide payment and fulfillment processing services. All communication with the
service providers will be confined to the Queued Component (QC) server. Table 3
shows a list of the required communication methods on that server.
Table 3. Required Communication Methods on QC Server
Service provider | Type of service | Required network protocol
CyberSource | Payment processing | Custom protocol, port 80
Interact | Order fulfillment | FTP, port 21
Remote monitoring clients
The most important consideration is that a Web site be accessible to its customers. It
is essential for us to have a way to check the site's availability from outside our actual
server network. Some network problems—such as losing an ISP connection—cannot
be monitored from within the private network.
We will be setting up at least one client machine outside our firewall to remotely
monitor the health of our service. This client will be pinging key services of our site at
regular intervals, and will alert Operations personnel of failing services. Ideally, we
would deploy multiple client machines through different ISP connections. However,
for our initial deployment, we will simply set up one client machine through an
external connection.
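A monitoring client of this kind can be very small. The following Python sketch checks a list of URLs at a fixed interval and raises an alert for each one that fails to respond. It is a simplified stand-in for a real monitoring product; the function names, the five-minute default interval, and the use of print as the alert mechanism are assumptions.

```python
import time
import urllib.error
import urllib.request


def check_url(url, timeout=10):
    """Return True if the URL answers with a successful HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        return False


def monitor(urls, check=check_url, alert=print, interval=300, rounds=1,
            sleep=time.sleep):
    """Check every URL once per round and alert on each failure."""
    failures = []
    for round_number in range(rounds):
        for url in urls:
            if not check(url):
                failures.append(url)
                alert("ALERT: %s is not responding" % url)
        if round_number < rounds - 1:
            sleep(interval)  # wait before the next round of checks
    return failures
```

Because the check runs from outside the firewall, a failure here covers problems that internal monitoring cannot see, such as a lost ISP connection.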
The Server Farm Zone
Our server farm's only external dedicated network connections are the direct Internet
tap from our ISP and the secured dial-up connectivity to the Administration Server for
operation and network management.
Figure 2. Network diagram of Duwamish Online production farm
Our production server farm consists of three network segments. Each machine has a
separate network interface card (NIC) for each segment it is connected to.
Front-end network
This is the public network segment that is accessible from the Internet. All servers in
this segment are connected to a 100-Mbps LAN switch. The front-end network
consists of connections of the following servers and services:
• Four Web servers, configured as one NLB cluster.
• One Queued Component (QC) server, with SMTP server enabled.
• One Primary Domain Controller (PDC)/Domain Name System (DNS) server.
  (We also have the QC server configured as a backup domain controller in case
  the primary server goes down.)
• One 1.5-Mbps Internet connection from our ISP, with IP Filtering Firewall
  enabled at the router level.
Back-end network
The back-end network is the internal private network segment that allows secured
communications between the front-end servers and the back-end database servers.
This network is not directly accessible from the Internet, so nothing can
connect to our database servers except through the front-end servers.
All servers in this segment are connected to a 100-Mbps LAN switch. The back-end
network consists of connections of the following servers:
• Four Web servers, configured as one NLB cluster.
• One Queued Component (QC) server, with SMTP server enabled.
• One Primary Domain Controller (PDC)/Domain Name System (DNS) server.
  (We also have the QC server configured as a backup domain controller in case
  the primary server goes down.)
• Two database servers, configured with Active-to-Passive Server Clustering.
  (The two database servers are both connected to an external RAID5 storage
  system.)
Management network
The management network is another internal private network segment dedicated to
the operation and management of the individual servers in the production farm. It
consists of all the servers in the back-end network as well as an administration server.
All servers are connected to a 100-Mbps LAN switch.
The administration server will:
• Provide Terminal Client access to all servers.
• Monitor the health of all servers.
• Work as a remote access server (RAS) for remote access to the farm.
• Serve as a backup server.
Server Hardware and Software
Along with the network setup, it is vital to document the server machine hardware and
software specifications so that everyone on the team knows what machines are in the
farm and what software is installed on each server. At this time, the software list for
our server farm is relatively short, because we're using the new Windows 2000
release. However, as service packs and software updates are released, it becomes even
more important to keep track of what is on each server. Table 4 describes the
hardware and software specifications for the Duwamish Online production server
farm.
Table 4. Duwamish Online Server Hardware and Software Specifications
Server/device type | # of devices | Hardware spec | Software spec
Web server | 4 | Dell PowerEdge 2300; dual-processor, 2 x 500 MHz; 512 MB RAM, 9 GB HD; 3 x 100 Mbps NIC | Windows 2000 Advanced Server; Network Load Balancing; Microsoft Message Queuing
Database server | 2 | Dell PowerEdge 2300; dual-processor, 2 x 500 MHz; 512 MB RAM, 9 GB HD; 3 x 100 Mbps NIC | Windows 2000 Advanced Server; SQL Server 2000; Microsoft Cluster Services
Queued Component server | 1 | Dell PowerEdge 2300; dual-processor, 2 x 500 MHz; 512 MB RAM, 9 GB HD; 3 x 100 Mbps NIC | Windows 2000 Advanced Server; Microsoft Message Queuing; SMTP; Active Directory™ Services
PDC/DNS server | 1 | Dell Precision 610; single-processor, 550 MHz; 256 MB RAM, 9 GB HD; 3 x 100 Mbps NIC | Windows 2000 Advanced Server; Active Directory™ Services
Administration server | 1 | Dell Precision 610; single-processor, 550 MHz; 256 MB RAM, 9 GB HD; 2 x 100 Mbps NIC; 56K Modem; 20G Backup Tape Drive | Windows 2000 Advanced Server; SiteScope/Microsoft Cluster Sentinel
Remote Monitoring Client | 1 | Dell Precision 610; single-processor, 550 MHz; 256 MB RAM, 9 GB HD; 1 x 100 Mbps NIC | Windows 2000 Professional
100-Mbps LAN switch | 3 | Allied Telesyn CentreCOM FS708 100 Mbps Ethernet Switch | N/A
Conclusion
Duwamish Online has been an exciting adventure into some of the challenges and
issues you would face as a developer or operations manager deploying a new site to
the Internet. Although this latest release goes a long way toward completing our Web
store—with payment and fulfillment processing—there are still areas that were left
undone, because it's not a real business.
Along with operating the site, the Duwamish team will spend the summer releasing
the sample code for the site, running further performance tests, and publishing
articles that explain how to leverage the work we've done on Duwamish in your
own applications. Visit our column on MSDN Online
(http://msdn.microsoft.com/voices/sampleapp.asp) throughout the summer to
download our source code and for more in-depth articles about Duwamish Online.