What is Cloud Computing?

This is the first of a four-part series on cloud computing for BI professionals.
There is a lot of confusion about cloud computing, even among professionals in the field. But that’s true
of any new, fast-moving field in which technologies and methods are still taking shape. After
reading a few definitions of cloud computing that caused me to nod off at my keyboard, I created a
simpler one:
Shared, online compute resources that you rent from a service provider and dynamically configure
yourself.
Let’s unpack this definition a bit:
• Shared: You share compute resources with other groups or companies, even your direct
competitors! Obviously, this raises security and privacy concerns.
• Online: You access compute resources via a Web browser or a programmatic Web application
programming interface. In this respect, cloud computing delivers online “services.”
• Compute resources: Compute resources consist of the infrastructure (servers, storage, and
networks), development tools, and applications. Basically, the whole stack, accessible via a Web
browser or service call.
• Rent: You only pay for what you use and you can terminate the service at any time (although
there may be exit fees). This is usage-based pricing. Cloud infrastructure vendors generally
charge by the hour while cloud software providers generally charge by user per month.
• Service provider: A service provider could be your internal IT department (private cloud) or an
external company (public cloud).
• Dynamically configure: Unlike traditional hardware and software, you don’t purchase, install,
test, tune, and maintain cloud-based resources. With cloud-based infrastructure, you simply
configure a virtual image of your compute environment (hardware, storage, network) using a
Web browser. With cloud-based software, you simply configure your application using a Web
browser to conform with your branding and workflow requirements.
Three Services
As you have probably already surmised, cloud computing is divided into three classes of services, each of
which can be applied to the business intelligence market: 1) software-as-a-service (applications), 2)
platform-as-a-service (development), and 3) infrastructure-as-a-service (compute resources). (See figure 1.)
Figure 1. Three Types of Cloud Services with BI Examples
Software-as-a-Service (SaaS). SaaS delivers packaged applications tailored to specific workflows and
users. SaaS was first popularized by Salesforce.com, which was founded in 1999 to deliver online sales
applications to small- and medium-sized businesses. Salesforce.com now has 92,000 customers of all
sizes and has spawned a multitude of imitators. A big benefit of SaaS is that it obviates the need for
customers to maintain and upgrade application code and infrastructure. Many SaaS customers are
astonished to see new software features automatically appear in their application without notice or
additional expense.
Within the BI market, many startups and established BI players offer SaaS BI services that deliver
ready-made reports and dashboards for specific commercial applications, such as Salesforce, NetSuite,
Microsoft Dynamics, and others. SaaS BI vendors include Birst, PivotLink, GoodData, Indicee, Rosslyn
Analytics, and SAP, among others.
Platform-as-a-Service (PaaS). PaaS enables developers to build applications online. PaaS services
provide development environments, such as programming languages and databases, so developers can
create and deliver applications without having to purchase and install hardware. In the BI market, the
SaaS BI vendors (above) for the most part double as PaaS BI vendors.
In a PaaS environment, a developer must first build a data mart, which is often tedious and highly
customized work since it involves integrating data from multiple sources, cleaning and standardizing the
data, and finally modeling and transforming the data. Although SaaS BI applications deploy quickly, PaaS
BI applications do not. Basically, SaaS BI applications are packaged while PaaS BI applications are
custom. In the world of BI, most applications are custom. This is the primary reason why growth of
Cloud BI in general is slower than anticipated. (See part III in this series.)
Infrastructure-as-a-Service (IaaS). IaaS provides online computing resources (servers, storage, and
networking), which customers use to augment or replace their existing compute resources. In 2006,
Amazon popularized IaaS when it began renting capacity in its own data centers to outside parties
through virtualization services. Some BI vendors are beginning to offer software infrastructure within public cloud or
hosted environments. For example, analytic databases Vertica and Teradata are now available as public
services within Amazon EC2, while Kognitio offers a private hosted service. ETL vendors Informatica and
SnapLogic also offer services in the cloud.
Key Characteristics of the Cloud
Virtualization is the foundation of cloud computing. You can’t do cloud computing without
virtualization; but virtualization by itself doesn’t constitute cloud computing.
Virtualization abstracts or virtualizes the underlying compute infrastructure using a piece of software
called a hypervisor. With virtualization, you create virtual servers (or virtual machines) to run your
applications. Your virtual server can have a different operating system than the physical hardware upon
which it is running. For the most part, users no longer have to worry whether they have the right
operating system, hardware, and networking to support a BI or other application. Virtualization shields
users and developers from the underlying complexity of the compute infrastructure (as long as the IT
department has created appropriate virtual machines for them to use.)
Figure 2 shows that with virtualization, organizations can run multiple, heterogeneous virtual servers on
a single physical server to maximize utilization (A), or they can run a single virtual server on multiple
physical servers to increase scalability (B). They also can spawn multiple instances of a single application
using virtual servers and run them in parallel on a single physical server to improve application
performance and throughput (C). In addition, because virtualization decouples applications from the
underlying hardware, IT administrators can migrate applications to new hardware without having to
reinstall software.
Figure 2. Virtualization Use Cases
[Figure 2 panels: A) Maximize server utilization; B) Improve application scalability; C) Increase
application efficiency (parallelize). Each panel stacks applications and guest operating systems on
hypervisors running on one or more physical servers (HW/OS).]
In short, virtualization increases the flexibility, scalability, efficiency, and availability of data center
resources, and it dramatically lowers data center costs by enabling the IT department to consolidate
servers and reduce power, cooling, space, and staffing overhead.
To the Cloud: Dynamic Provisioning
Browser Interface. To turn virtualization into cloud computing, you need to add software that enables
business users to dynamically provision their own virtual servers and use the servers as long as they
desire.
For instance, developers using a Web browser can configure a custom virtual server to support a new
development and test bed. Or, they can select a virtual image (i.e., server and applications) from a
library of virtual images created in advance by the IT department. Once the developers are finished
using the virtual images, they “release” them. Thus, developers no longer need to submit requests to
the IT department for servers, storage, and networking capacity. They either configure their own virtual
machine or select one from a library that meets their application’s processing requirements. They no
longer have to wait for purchasing and legal to execute a purchase order or the IT department to install,
tune, test, and deploy the systems.
Services Interface. To make the leap to cloud computing, you also need a services interface so
administrators can programmatically provision servers based on a schedule or events (e.g., an ETL job
that begins). Administrators use Web services interfaces to support auto-scaling, failover, and backups.
With auto-scaling, a BI administrator uses a cloud services interface to automatically provision and
release virtual BI servers during the course of a day to efficiently allocate processing power among
servers to support various BI workloads. For example, at 2 a.m. in a typical BI environment, the system
fires up an ETL server and database server to run nightly ETL jobs, while at 4 a.m. it releases the ETL
server and provisions a BI server to process and burst daily reports. At 10 a.m. it provisions an additional
BI server and database server to handle peak usage. Failovers and backups work much the same way.
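The daily schedule described above can be sketched as a small table of provisioning actions. This is only an illustration: the Action class and the servers_running function are hypothetical names of my own, and the sketch tracks which servers would be running rather than calling any real cloud provider's interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    hour: int   # hour of day (0-23) at which the action fires
    verb: str   # "provision" or "release"
    role: str   # kind of virtual server: "etl", "database", or "bi"


# Hypothetical schedule mirroring the 2 a.m. / 4 a.m. / 10 a.m. example in the text.
SCHEDULE = [
    Action(2, "provision", "etl"),
    Action(2, "provision", "database"),
    Action(4, "release", "etl"),
    Action(4, "provision", "bi"),
    Action(10, "provision", "bi"),
    Action(10, "provision", "database"),
]


def servers_running(schedule, hour):
    """Return role -> number of servers running once all actions up to `hour` have fired."""
    counts = {}
    for action in sorted(schedule, key=lambda a: a.hour):
        if action.hour > hour:
            break
        delta = 1 if action.verb == "provision" else -1
        counts[action.role] = counts.get(action.role, 0) + delta
    # Drop roles whose servers have all been released.
    return {role: n for role, n in counts.items() if n > 0}


if __name__ == "__main__":
    for hour in (1, 3, 5, 11):
        print(hour, servers_running(SCHEDULE, hour))
```

In a real deployment, each action would translate into a call to the cloud provider's services interface, and events (such as an ETL job starting) could trigger actions in addition to the clock.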
Cloud Management Software. Cloud computing also requires management software to help IT
administrators keep track of all the moving parts in a virtualized environment. Cloud management
software enables IT administrators to define systems-level policies (e.g., security and usage), create and
manage virtual images which enforce the policies, manage virtual server versions, monitor servers and
performance, manage user roles and access, track usage, and manage chargebacks or accounting,
among other things. There are a variety of vendors that offer cloud management software, including
cloud data center providers, such as Amazon.com and Rackspace, and independent software vendors,
such as Eucalyptus and RightScale.
Multi-tenancy
Another key characteristic of cloud computing (in particular, Software-as-a-Service) is that applications
are multi-tenant, which means multiple users from different organizations run the same application
code running on the same hardware. This is different from a traditional hosting or outsourcing
environment in which each customer owns or rents a dedicated set of hardware and software in the
service provider’s data center. The hosted model leads to a lot of wasted compute resources since
customers only use their own compute resources even when other machines in the data center are idle.
In contrast, multi-tenancy makes much more efficient use of hardware and software resources,
delivering economies of scale that make cloud computing an attractive business model to service
providers, as long as they can attract enough customers.
One problem with multi-tenancy is that applications must be designed from scratch to support it.
Multi-tenancy creates virtual partitions within the application and database for each distinct customer.
Customers usually configure the application to match their unique branding and workflow
requirements. On the data side, customer data is either interleaved by row and separated using unique
identifiers or partitioned into separate tables or database instances.
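The row-interleaved option can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module; the table, the column names, and the tenant names are hypothetical, and a real multi-tenant service would enforce the tenant filter in its data access layer rather than trust every query to include it.

```python
import sqlite3


def build_shared_table():
    """Create one table that holds rows for several customers (tenants)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (tenant_id TEXT NOT NULL, region TEXT, amount REAL)"
    )
    # Rows from different customers are interleaved in the same table.
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("acme", "East", 100.0),
            ("globex", "East", 250.0),
            ("acme", "West", 50.0),
        ],
    )
    return conn


def total_sales(conn, tenant_id):
    """Sum sales for one tenant; the tenant_id filter is the virtual partition."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE tenant_id = ?",
        (tenant_id,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_shared_table()
    print("acme:", total_sales(conn, "acme"))
    print("globex:", total_sales(conn, "globex"))
```

The alternative mentioned above, separate tables or database instances per customer, trades this shared-table efficiency for stronger isolation.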
Legacy applications not designed for multi-tenancy have to fudge it. Either the service provider must
create dedicated environments for each customer, which is highly inefficient (e.g., the old application
service provider model), or it must use virtualization software to run parallel instances of each
application (e.g., a virtual appliance). In some respects, the virtual appliance approach is more flexible
than multi-tenancy because the virtual appliances can be ported to run on almost any hardware. However,
it also requires managing many different application and database instances. (See figure 3.)
Figure 3. Application Architectures
[Figure 3 panels: A. Traditional On Premise; B. Single Tenant (Hosted); C. Multi-Tenant; D. Virtual
Appliance. Diagram labels: Users, Customer, LAN, VPN, SSL, Application, Database, Virtualization, Server.]
Traditional on-premise software (A) tightly couples logic and data to hardware in a LAN environment. A
hosted environment (B) gives each customer their own dedicated hardware and software resources in a
third party data center which they access via a virtual private network. A true multi-tenant environment
(C) partitions a single application and database so different customers get their own unique views while
sharing the same application, database, hardware, and network connection. A virtual appliance model
(D) enables legacy software not written for multi-tenancy to run parallel instances, essentially
virtualizing multi-tenancy.
SaaS BI vendors have long waged battles over whether their respective software is truly multi-tenant or
not. The virtual appliance model gives legacy software vendors venturing into SaaS a more equal footing
on which to compete.
Summary
This article defined cloud computing and discussed some of its more salient attributes. However, there
are several ways to deploy the cloud, and these deployment options have significant implications for
costs, security, and staffing. The next article in this series discusses the differences between public
clouds, private clouds, and hybrid clouds and shows how an organization might architect its BI
environment to leverage public cloud offerings.
Deployment Options for Cloud Computing
This is the second in a four-part series on cloud computing for BI professionals.
Cloud computing offers a compelling new way for organizations to manage and consume compute
resources. Rather than purchase, install, and maintain hardware and software, organizations rent shared
resources from an online service provider and dynamically configure the services themselves. This
model of computing dramatically speeds deployment times and lowers costs. (See prior article “What is
Cloud Computing?”)
Although cloud computing shares the above attributes, it can be deployed in several different ways. The
key factor is whether the cloud service provider is an external vendor or an internal IT department.
There are three deployment options for cloud computing:
• Public Cloud. Application and compute resources are managed by a third-party service
provider.
• Private Cloud. Application and compute resources are managed by an internal data center
team.
• Hybrid Cloud. Either a private cloud that leverages the public cloud to handle peak capacity, or a
reserved “private” space within a public cloud, or a hybrid architecture in which some
components run in a data center and others in the public cloud.
Public Cloud. Most of the discussion about cloud computing in the press refers to public cloud offerings.
The public cloud offers the most potential benefits and the greatest potential risks. With a public cloud,
organizations can obtain application and computing resources without having to make an upfront
capital expenditure or use internal IT resources. Moreover, customers only pay for what they use on a
usage or monthly subscription basis, and they can terminate at any time. Thus, public clouds accelerate
deployments and reduce costs, at least in the short run. This is sweet news to BI teams that often must
spend millions of dollars and months of development time before they can deliver their first application.
In addition, a public cloud frees up IT departments to focus on more value-added activities rather than
hardware and software upgrades and maintenance. In short, there is something for everyone to like
about the public cloud.
Security and Privacy. But the public cloud also comes with risks. Security and privacy are the biggest
bugaboos. Some executives fear that moving data and processing beyond their own firewalls exposes
them to security and privacy risks. They fear that moving data across public networks and commingling it
with other companies’ data in a public cloud might make it easier for sensitive corporate data to get into
the wrong hands.
While security and privacy are always an issue, the fact is that most corporate resources are more
secure in the public cloud than in a corporate data center. Public cloud providers, after all, specialize in
data center operations and must meet the most stringent requirements for security and privacy.
However, there are compliance regulations that legally require some organizations to maintain data
within corporate firewalls or pinpoint the exact location of their data, which is generally impossible in a
public cloud which virtualizes data and processing across a grid of computers and possibly
geographically distinct data centers.
Other Challenges. The public cloud poses other challenges:
• Reliability. Executives may question the reliability of public cloud resources. For example,
Amazon EC2 has had two short but high-profile outages, which left companies that ran
mission-critical parts of their business there stranded without much visibility into the nature or
duration of the outage.
• Costs. It can be extremely difficult to estimate public cloud costs because pricing is complex and
often companies can’t accurately estimate their usage (which is why they want to migrate
workloads to the cloud in the first place).
• Blank Slate. Administrators must redefine corporate policies and application workflows from
scratch in the public cloud, which generally provides plain vanilla services.
• Vendor and Technology Viability. The public cloud market is evolving fast, so it’s difficult to
know which vendors and technologies will be around in the future.
Private Clouds. For the above reasons, many organizations are beginning their journey into the
cloud with private clouds. This is especially true in the infrastructure-as-a-service arena where IT
administrators are implementing virtualization software to consolidate servers and increase overall
server utilization, flexibility, and efficiency. In addition, a private cloud gives an organization greater
control over its processing and data resources, providing peace of mind for worried executives, if not
greater security and privacy for sensitive data. And since a private cloud runs in an existing data center,
IT administrators don’t have to recreate security and other policies from scratch in a new environment.
But the private cloud has its own challenges. IT administrators have to learn and install new software
(hypervisors and cloud management utilities). They need to manage two compute environments side by
side and keep IT policies aligned in both. This adds to complexity and staff workload. And it goes without
saying that a private cloud still runs in an existing corporate data center with all of its staffing and other
fixed costs.
Hybrid Cloud. As a result, companies are increasingly pursuing a two-pronged strategy that uses the
private cloud for the bulk of processing and the public cloud to handle peak loads. The key to a hybrid
cloud is obtaining cloud management software that spans both private and public cloud environments.
The software supports the same hypervisors used in each environment (ideally it’s the same hypervisor)
and has built-in interfaces to the public cloud provider so internal IT policies and virtual images can be
transferred to the public cloud environment.
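The two-pronged strategy boils down to a simple allocation rule: fill the private cloud first and send only the overflow to the public cloud. The sketch below illustrates that rule under the assumption that workload and capacity can be expressed in the same arbitrary units; split_load and public_hours are hypothetical names, not part of any cloud management product.

```python
def split_load(demand, private_capacity):
    """Return (private_units, public_units) for one time period."""
    if demand < 0 or private_capacity < 0:
        raise ValueError("demand and capacity must be non-negative")
    private = min(demand, private_capacity)
    # Whatever the private cloud cannot absorb bursts to the public cloud.
    return private, demand - private


def public_hours(hourly_demand, private_capacity):
    """List the hours (by index) in which work overflows to the public cloud."""
    return [
        hour
        for hour, demand in enumerate(hourly_demand)
        if split_load(demand, private_capacity)[1] > 0
    ]


if __name__ == "__main__":
    print(split_load(80, 100))
    print(split_load(130, 100))
    print(public_hours([50, 120, 100, 101], 100))
```

The hard part in practice is not this arithmetic but keeping policies and virtual images consistent across the two environments, which is the job of the cloud management software.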
In addition, many public cloud vendors allow customers to carve out private clouds within the public
cloud domain. For example, Amazon.com offers a virtual private cloud within its Elastic Compute Cloud
(EC2) environment that lets customers reserve dedicated machines and static IP addresses, which they
can link to their internal data centers via virtual private networks. Hybrid clouds are obviously more
complex and challenging to manage. Currently, few people have experience blending private and public
clouds in a seamless way.
Adding Public Cloud Components to a BI Architecture
Another form of hybrid cloud uses public cloud facilities to enhance an existing architecture. In a BI
environment, there are several ways that organizations can mix and match on premises and public cloud
offerings.
Scenario #1 - Analytic Sandbox. When a data warehouse runs at full capacity, administrators might
consider offloading complex, ad hoc queries submitted by a handful of business analysts to a replica
database in the public cloud. In this scenario, the IT staff decides that it’s faster and cheaper to provision
a new analytic data mart in the public cloud and load it with a replica of data warehouse data than to
bulk up the data warehouse with additional, expensive hardware. The IT staff (or analysts) increase or
decrease capacity as usage patterns change using self-provisioning capabilities of the public cloud. (See
figure 4.)
Figure 4. Analytic Sandbox Using a Public Cloud
The primary challenge in this scenario is the cost and time required to move data across the internet
from an internal data center to the cloud. Since the initial load may take days or weeks depending on
data volumes, IT staff will usually ship a disk to the cloud provider to load manually. Thereafter, the IT
staff needs to figure out whether it can move daily deltas across the internet within an allotted batch
window. Considering that it takes six days to move 100GB across a T-1 line, organizations may need to
skip doing batch loads and instead trickle feed data into the data warehouse replica. In addition, it is
often difficult to estimate pricing for such data transfers and charges may add up quickly. Cloud
providers generally charge for transferring data in and out of the cloud and storing it. (Amazon,
however, has recently discontinued fees for transferring data into EC2.)
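The six-day figure can be checked with a little arithmetic. The sketch below assumes a nominal T-1 rate of 1.544 Mbit/s, decimal gigabytes (1 GB = 10^9 bytes), and a link that is fully available for the transfer. The function names and the efficiency parameter are my own, introduced only for illustration.

```python
T1_BITS_PER_SECOND = 1.544e6  # nominal T-1 line rate (assumption: no protocol overhead)
SECONDS_PER_DAY = 86_400


def transfer_days(gigabytes, bits_per_second=T1_BITS_PER_SECOND, efficiency=1.0):
    """Days needed to move `gigabytes` (decimal GB) over a link.

    `efficiency` is the hypothetical fraction of the line rate actually achieved.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = gigabytes * 1e9 * 8
    return bits / (bits_per_second * efficiency) / SECONDS_PER_DAY


def fits_batch_window(delta_gigabytes, window_hours, **link):
    """True if a daily delta can cross the link within the batch window."""
    return transfer_days(delta_gigabytes, **link) * 24 <= window_hours


if __name__ == "__main__":
    print(f"100 GB over a T-1: {transfer_days(100):.1f} days")
    print("1 GB delta fits a 4-hour window:", fits_batch_window(1, 4))
    print("5 GB delta fits a 4-hour window:", fits_batch_window(5, 4))
```

Under these assumptions, a 100 GB load comes out at roughly six days, and a 1 GB nightly delta needs about an hour and a half, while a 5 GB delta would overrun a four-hour window. Real throughput will be lower once protocol overhead and competing traffic are taken into account.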
Also, depending on the speed of network connections, the business analysts might experience delays in
query response times due to internet latency. Invariably, internet speeds won’t match internal LAN
speeds so users might notice a difference. Finally, there are security and privacy issues discussed in the
previous article. (See “What is Cloud Computing?”)
Scenario #2. Cloud-based Departmental Dashboard
A more common scenario is when a department head purchases a SaaS or PaaS BI solution from a Cloud
BI vendor. Here, an organization’s source systems and data warehouse remain in the corporate data
center but the dashboard and associated data mart run in the cloud. (See figure 5.)
Cloud BI tools are popular among department heads who want a dashboard on the cheap and don’t
want to involve corporate IT. While SaaS applications are quick and easy to deploy, most Cloud BI
implementations require custom data design and development using PaaS tools. Unfortunately,
designing a data mart, whether in the cloud or on premise, is never easy or quick, especially if it involves
integrating multiple operational sources. This is not a problem if organizations are willing to pay the
costs of creating a custom data mart and wait three to four months, which is the time it usually takes to
design and build a relatively complex, custom BI environment. However, if they believe the cloud
provides quick, easy, and inexpensive deployments for any type of BI application, they will be
disappointed. Also, they still need to transfer data to the cloud and users may experience response time
delays due to internet latencies.
Figure 5. Cloud-based Departmental Dashboard
Scenario #3. BI in the Cloud Without the Data
To eliminate security, privacy, and data transfer issues, companies may want to keep data locally in a
corporate data center while maintaining the BI application in the cloud. (See figure 6.) BI developers can
configure the SaaS BI tool to meet their branding and workflow requirements, gaining the speed and
cost advantages of cloud deployments, while minimizing data security and privacy problems.
While this scenario sounds like it optimally balances the risks and rewards of cloud-based BI
deployments, it has a major deficiency: it requires the IT department to open a port in the corporate
firewall to support incoming queries. If the organization is worried enough about data security to want
to keep data locally where it’s safe, they will not allow this option as soon as they recognize the security
vulnerability it presents.
Figure 6. BI in the Cloud Without Data
Scenario #4. Data Warehouse in the Cloud.
The final scenario is to put the entire data warehousing environment in the cloud. (See figure 7.) Today,
this only makes sense if all your operational applications also run in the cloud. Obviously, this scenario
applies to only a few companies, namely internet startups that have fully embraced cloud computing for
all application processing. However, these companies have to manage all the problems associated with
the public cloud (i.e., security, reliability, availability, and vendor viability). At some point in the future,
this architecture may prove dominant once we get past security and latency hurdles.
Figure 7. Data Warehouse in the Cloud
Summary
There are three major deployment options for cloud computing: public, private, and hybrid. As with most
things in life, there is rarely a clear cut solution. Organizations will experiment with public and private
clouds, and most will probably have a mix of both. Most data center shops have already implemented
virtualization, which is the first step to a private cloud. Once they get comfortable with private clouds,
they will experiment with hybrid clouds to support peak periods of processing rather than spend
millions of dollars on new hardware. If the data is somewhat sensitive, they may opt for a private virtual
cloud inside a public cloud to ease their fears about security, privacy, and reliability of the public cloud.
When push comes to shove, economics and convenience always trump principles and ideals. This is how
e-commerce overcame the security bogeyman and gained its footing in the consumer marketplace. I
suspect the same will happen with cloud computing.
Expectations Versus Reality: Understanding the Dynamics of the Cloud BI Market
This is the third in a four-part series on cloud computing for BI professionals.
There are no shortcuts in business intelligence (BI). And Cloud BI vendors and some of their customers
are finding this out the hard way.
I’m a firm believer that most computing will eventually move to the Cloud, but I’ve been surprised that
the adoption of Cloud BI services has been slower than expected. Most pure-play Cloud BI vendors today
are small, and leading BI vendors no longer market their Cloud BI solutions to a significant degree (if at
all).
Red Herrings. The two most commonly cited obstacles to Cloud BI adoption are security and data
transfer rates. The security issue is mostly a red herring, in my opinion, except at organizations with
strict compliance regulations. Data is actually safer in the Cloud than in many corporate data centers. In
terms of data transfer rates, a majority of organizations simply don’t generate enough daily data to
overwhelm a reasonable internet connection. And internet speeds are getting faster and cheaper all the
time. Another red herring.
The Missing Link
There is something deeper going on. There is a serious flaw in the Cloud BI equation. And I think I’ve
found it.
But first, it’s important to recognize that there is a lot to like about the Cloud. There are numerous
benefits to running applications and infrastructure as a service. There is no hardware and software to
buy, install, tune, and upgrade. Consequently, there are no IT people to hire, pay, and manage.
Applications upgrade automagically and can be scaled seamlessly. As a result, the Cloud drives down
costs and speeds delivery. What’s not to like?
Preparing Data. Nonetheless, we’ve hit a speed bump on the way to Cloud BI nirvana. That’s because
the hard part about delivering BI applications is not what users see—the graphical report or
dashboard—it’s collecting, cleaning, normalizing, integrating, and aggregating data from various systems
so it can be viewed in a clear, coherent way by business users. This is true for the majority of BI
applications, which, by necessity, are custom-built applications that source data from many unique
combinations of applications and files.
Preparing data is hard, tedious work, but it’s the foundation of BI. Do it right, and you can ice your cake
with sweet-tasting frosting. Do it wrong or not at all, and there is no cake to ice! Too many vendors have
peddled the icing and downplayed the need to bake the cake. Someone, somewhere has to do the dirty
work of preparing data or else everyone goes hungry.
Software Services or Professional Services?
Let me take another slight digression: What’s the difference between a Cloud BI vendor and a BI
consultancy? Not much.
Custom Data Marts. On one hand, you can argue that pure-play Cloud BI vendors, such as GoodData,
Indicee, Birst, and PivotLink, offer software, which consultancies don’t, and that the best offer true
multi-tenant BI services that run in a virtualized environment to achieve economies of processing scale.
But on the other hand, Cloud BI vendors, just like BI consultancies, provide professional services to build
custom data marts for their customers. Like consultants, they need to gather requirements, build a data
model, extract and map source data, and build reports. This is a lot of work. If you peel back the covers
on many Cloud BI deployments, they are really custom consulting jobs masquerading as a software
service. But that’s not the end of it.
Operational Management. There is one big difference between Cloud BI providers and BI consultancies:
once the development work is done, BI consultancies go home or move on to the next job, but Cloud BI
vendors have to stick around and run the BI environment, just like an in-house IT staff would. They have
to schedule and execute jobs to extract and clean data and then transform and load it into the data
mart. They have to manage change control and error processes, troubleshoot problems, and staff a help
desk to answer any questions customers might have. And before they can upgrade their software, they
need to test every customization that they’ve built for every customer (which happens to undermine
one of the major benefits of Cloud-based services, which is rapid delivery of software upgrades.)
Fixed Costs. Adding insult to injury, before Cloud BI vendors can begin collecting money, they have to
build out and staff a highly secure and scalable data center that offers full backup/recovery, failover, and
disaster recovery services. Customers have been trained to demand the highest level of IT
administration services possible from a Cloud or hosting vendor, even though many would not pay for
the same level of services in their own data centers.
Subscription Pricing. Obviously, all of this involves a lot of work and is very expensive. So you would
think that Cloud BI vendors command premium prices, right? Well, not really. In fact, mostly the
opposite. Customers pay only for what they use on a monthly basis and they can cancel their
subscription at any time (although there may be exit fees.) Compared to on-premise software where
vendors get all their money upfront, Cloud BI vendors have to wait several years before they accrue a
comparable sum. But, in the meantime, they have to finance an expensive technical and organizational
infrastructure that requires large upfront and ongoing outlays. In short, the business model for Cloud BI
is problematic.
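A back-of-the-envelope calculation shows why the wait is measured in years. The figures in the sketch below are purely hypothetical and are not drawn from any vendor's price list. It also ignores maintenance fees on the on-premise license, customer churn, and the time value of money, all of which would shift the result.

```python
import math


def months_to_match(upfront_license, monthly_subscription):
    """Whole months of subscription revenue needed to reach the upfront license sum."""
    if upfront_license < 0 or monthly_subscription <= 0:
        raise ValueError("license must be >= 0 and subscription must be > 0")
    return math.ceil(upfront_license / monthly_subscription)


def cumulative_revenue(monthly_subscription, months):
    """Subscription revenue collected after a number of months (no churn assumed)."""
    return monthly_subscription * months


if __name__ == "__main__":
    # Hypothetical figures: a $120,000 upfront license vs. a $3,000 monthly subscription.
    months = months_to_match(120_000, 3_000)
    print(f"{months} months ({months / 12:.1f} years) to match the upfront license")
```

With these made-up numbers, the subscription vendor needs 40 months, or more than three years, to collect what the on-premise vendor banks on day one, while carrying data center and staffing costs the whole time.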
Wrong Audience? Some Cloud BI vendors have backed themselves into this corner by touting their
services as low cost, easy to use, and fast to deploy. They’ve had a receptive audience among the
unwashed masses of small- and medium-sized businesses that have no or minimal IT budget and staff
and little knowledge of BI. They’ve also done well selling to department heads at large companies which
have clamped down hard on IT budgets. In short, Cloud BI vendors have done a good job of selling an
information-rich vision to data-hungry business people who have few capital dollars, tight budgets, and
minimal understanding of BI.
Unfortunately, unlike on-premises software vendors, Cloud BI vendors have to back up their claims.
They can’t sell a promise and then vacate the premises. They have to live daily with the expectations
that they’ve created among their customers who expect low-cost, high-speed delivery of robust BI
services. They are held to a higher standard than other BI vendors.
Market Strategies – The Way Forward
As I see it, Cloud BI vendors have several options to thread this needle:
1) Customize—Call a Spade a Spade. If Cloud BI vendors want to deliver a complete BI solution that
solves real business problems, they could shift from selling software services to professional services,
and compete head-on with BI consultancies. Cloud BI vendors would have several advantages here:
a) Cloud BI vendors can not only develop custom solutions, they can run them. And they can do so
in a cost-effective (but not inexpensive) way due to the economies of scale of a virtualized,
hosted infrastructure.
b) They can also develop solutions faster than BI consultancies because they can leverage prebuilt
software, models, and metrics built for other customers (although veteran consultancies will
also have at least prebuilt models and metrics.)
I haven’t come across any Cloud BI vendor that is taking this approach overtly, although many are doing
so in practice. Perhaps the closest is SAP BusinessObjects OnDemand.
2) Simplify and Shift. Another approach is for Cloud BI vendors to strip out all the custom work from the
equation by making the application as simple as possible for the customer to implement without
assistance. Here, the burden of uploading, modeling, and mapping data shifts from the Cloud BI vendor
to the customer. In other words, the Cloud BI vendor does the easy stuff and the customer does the
hard stuff.
The challenge here is making the modeling and mapping tools both easy to use and suitably
sophisticated. This is a devilish tradeoff and, in most cases, a Cloud BI vendor will side with simplicity
rather than power and flexibility. This means that their customers will likely hit the wall with such tools
once they want to do something complex, and then the Cloud BI vendor will need to step in and provide
custom development work. And if the application is really simple, then the customer might find it is
more cost effective to build it in Excel than in the Cloud. Of all the Cloud BI vendors, Indicee seems to be
following this path most closely.
3) Package and Configure. Another way to minimize the amount of custom development is to deliver
packaged analytic applications that come with canned but configurable data mappings, data models,
metrics, and reports. The mappings extract, transform, and load data from a specific source application
(e.g., Salesforce.com) to a target data model with predefined dimensions, hierarchies, and metrics.
Packaged analytic applications streamline development and accelerate deployment.
The challenge with packaged analytic applications is that they only work if the customer has the same
source application that the package supports and they can live with the packaged reports, dashboards,
and metrics that come with the package. Packages typically fall apart when customers want to
customize rather than configure the application or they want to extract data from more than one source
application to feed the data models and reports. Then, the Cloud BI vendor must customize the
environment to meet the customer’s need, which becomes a consulting engagement. The trick here for
vendors is to build out a sizable portfolio of packaged applications so the majority of customers can
work with off-the-shelf solutions and only a minority need customization services.
GoodData seems to be following this approach. It packages back-end connections with various Cloud-based
applications, such as Salesforce.com, Microsoft Dynamics CRM Online, and SugarCRM, while
leaving the front-end fully customizable through templates that embed metrics and reports for each
source application. In essence, GoodData delivers packaged operational reports for various Cloud-based
applications and customization services to extend those packages.
4) Go On Premise. The last option is for vendors to abandon the Cloud, either in part or in full, and
deliver on-premise software. Here, the vendor gets its money upfront and leaves customers with the
responsibility of creating their own custom BI solution. Offering an on-premises solution in conjunction
with a Cloud-based one can highlight the advantages of going with a Cloud solution. The vendor can use
the price differential between the two offerings to educate customers about the true cost of building
and maintaining custom BI solutions. This might convince some customers that the price of a Cloud BI
solution is worth paying.
The challenge here is that vendors who straddle both models need to offer both Cloud services and
on-premises software. This is a mixed business model which perhaps isn’t sustainable, especially for
startups. They will still have the overhead costs of the Cloud model with fewer customers, while at the
same time, they have to provide standard maintenance and support to their on-premises customers. It could
work but it’s unproven. Currently, Birst is pursuing this course.
Conclusion
The only way to make money in the Cloud is to have a lot of customers. The only way to get a lot of
customers quickly is to give everyone the same application and do a minimal amount of custom
development work. In the Cloud, economies of scale are everything. But BI is largely a custom
development effort. Unfortunately, most business customers don’t realize this and most Cloud BI
vendors have done little to disabuse them of the notion. In addition, most Cloud BI vendors have
underestimated the challenge of delivering robust BI services that address real business needs and are
now struggling to find a sustainable business model that will deliver real profitability.
Ultimately, the industry will figure out a way to make Cloud BI work for everyone involved. We may
have to ratchet down our expectations on both sides of the equation. But there is too much value in
running applications remotely in a virtualized environment for Cloud BI not to succeed in the long run.
Cloud BI Adoption: Gauging Market Demand
This is the fourth in a four-part series on cloud computing for business intelligence (BI) professionals.
Business intelligence in the Cloud is inevitable. In fact, it’s already happening. Although Cloud BI hasn’t
slowed the growth of on-premises BI software, an increasing number of organizations are using
Cloud-based BI services and many more are waiting until the time is right.
By Cloud BI, I primarily mean reports and dashboards that run in a multi-tenant, hosted environment
and which users access via a Web browser. The reports and applications can be packaged (i.e.,
Software-as-a-Service or SaaS) or custom-built by the service provider or customer using Platform-as-a-Service
(PaaS) tools. (See Part II of this series for more detailed explanations of SaaS and PaaS.)
Benefits. The Cloud offers numerous benefits for organizations that want to run reports and
dashboards. There is no hardware and software to buy, install, tune, and upgrade. Consequently, there
are no IT people to hire, pay, and manage. Applications upgrade automagically and can be scaled
seamlessly. Customers pay a monthly subscription based on usage rather than an upfront licensing fee.
Essentially, the Cloud speeds delivery and drives down costs. What’s not to like?
Impediments. But there are concerns. One of the biggest impediments to Cloud BI today is security, or
at least the perception that data in the Cloud is less secure than data housed in a corporate data center.
In reality, data is actually safer in the Cloud than in many corporate data centers. The real issue is not
security, it’s “control.” Executives today simply feel safer if their data is housed in a corporate data
center. (Ironically, most companies have already outsourced sensitive data, like payroll and sales, to
third party providers.) To be fair, some companies, especially those in financial services, must comply
with regulations that currently require them to keep data on premise.
E-Commerce. Interestingly, many industry experts raised the same security bogeyman with regard to
e-commerce. In the late 1990s, many experts said, “Consumers will never type their credit card number
into a Web browser and ship it off to an unknown destination via the public internet because it could be
stolen.” Of course, we know how this story played out. In 2009, more than 154 million people in the U.S.
bought something online, and online sales are growing four times faster than retail sales in general,
according to Forrester Research. When it comes to security, convenience trumps fear, especially when
fear isn’t grounded in reality. The same appears to be happening with the Cloud.
The biggest challenge with running BI in the Cloud involves packaging custom BI development into a
cost-effective, online service. By its nature, BI involves creating custom applications that integrate data
from unique combinations of data sources. Cloud BI vendors are still figuring out how to deliver BI
services without losing their shirts or turning into a custom development shop. (See Part III of this
series.)
Finally, some Cloud BI vendors only deliver interactive reports and dashboards (e.g., SAP, Indicee, and
GoodData), while only a few offer more in-depth analysis using online analytical processing (OLAP)
(e.g., Birst) or pivot table functionality against big data (e.g., PivotLink). However, for most organizations
getting started with BI, reporting and dashboarding functionality is more than sufficient to satisfy their
information appetites.
Gaining Traction
Despite these obstacles, Cloud BI is gaining ground, according to a recent survey of the BI Leadership Forum,
a global network of BI Directors and other BI professionals. (See www.bileadership.com.) More than
one-third of organizations are currently using the Cloud for some part of their BI program, according to
the survey. (See figure 1.)
Figure 1. Are you using the Cloud for any part of your BI program?
Not Sure - 1%
Yes - 36%
No - 63%
Source: BI Leadership Forum, June, 2011. Based on 112 responses. www.bileadership.com.
Organizations that have embraced the Cloud point to “speed of implementation” (30%) and “reduced
maintenance” (30%) as their top reasons, followed by “flexibility” (19%) and “cost” (11%). (See figure 2.)
Figure 2. Motivating Factors: What is your top reason for using Cloud BI?
Reduced maintenance - 30%
Speed of implementation - 30%
Flexibility - 19%
Cost - 11%
Performance - 5%
Other - 5%
Momentum. So far, Cloud BI users are happy campers. Almost two-thirds (65%) said they plan to
increase their usage of Cloud BI in the next 12 months. Only 3% said they would decrease usage while
another 16% will keep their implementation the same and 16% weren’t sure. (See figure 3.)
Among respondents who are not using Cloud BI, 16% said they plan to implement Cloud BI in the next
12 months and 32% were not sure. So Cloud BI has momentum. However, it may take five to 10 years
for Cloud BI to reach the tipping point where it becomes a mainstream component of every BI program.
Given Cloud BI’s benefits, this trajectory is inevitable.
Figure 3. Future Usage: Are you planning to increase or decrease your use of Cloud BI in the next 12 months?
Increase - 65%
Decrease - 3%
Stay the same - 16%
Not sure - 16%
Small Companies Lead the Way
A closer look at the data confirms what many pundits have said about the target market for Cloud BI
software: that it’s currently ideal for small companies with few IT resources, limited capital to spend on
servers and software, and minimal to no BI expertise. Almost half of small companies under $100M in
annual revenues (46%) use Cloud BI in some shape or form. In contrast, large companies with over $1B
in annual revenues are markedly less likely to adopt the Cloud (29%), while medium-sized
companies with between $100M and $1B in annual revenues lag further behind with less than one-fifth
using BI in the Cloud (18%). (See figure 4.)
Figure 4. Cloud BI Deployment by Company Size: Are you using the Cloud for any elements of your BI program?
Small Companies (<$100M): Yes - 46%, No - 54%
Medium Companies ($100M to $1B): Yes - 18%, No - 77%
Large Companies ($1B+): Yes - 29%, No - 71%
Small Companies. For small businesses without legacy BI applications, Cloud BI services are a godsend.
The economics and convenience are compelling. Instead of passing around spreadsheets, small
companies can implement a Cloud BI service to standardize reports and dashboards and make them
available to all employees anywhere via a Web browser.
“What’s refreshing for me is that I can go in at any time of day and [run a] report on any metric in our
organization, such as item received, delivered, inspected at the category, personnel, or employee level
and track it by any time period,” says Wayne Deer, vice president of operations, at Gazelle, an
electronics recycler, which uses GoodData’s Cloud BI service.
Large Companies. Interestingly, large companies are the next most prevalent users of Cloud BI services.
Often, it’s a department head who wants to build a BI application quickly without getting corporate IT
involved. Like small companies, departments at larger companies often have limited budgets and BI
expertise and most don’t want the headaches and expense of having to maintain servers and software.
But enterprise BI managers have assessed the potential of Cloud BI and like what they see.
Unfortunately, many are hamstrung by legacy BI implementations. “I see us moving very slowly with
adoption because of installed base and switching costs,” wrote Darrell Piatt, Director and Chief Architect
at a large professional services firm based in Virginia, in a BI Leadership Forum discussion thread. “When
and if we decide to replace our BI infrastructure, Cloud BI offerings will be seriously considered.”
Mid-size Companies. According to figure 4, medium-sized companies are least likely to adopt Cloud BI
services. The reason is that most have already implemented an enterprise IT platform, usually Microsoft
SQL Server, which bundles BI tools and applications for free. If the organization has assigned an analyst
or IT administrator to build and maintain enterprise reports and dashboards using the platform, it likely
has little bandwidth, incentive, or capital to change course and introduce an alternative BI stack, unless
it is having difficulty meeting user requirements.
Cloud BI Vendor Perspective
Given that they provide a service that makes them an extension of an organization’s IT team, Cloud BI
vendors have a good handle on who their customers are and what they are doing with their services.
For example, Sam Boonin, vice president of products and marketing at GoodData, says that his
company’s customers fall into three camps: 1) fast-growing technology companies that run all their
applications in the Cloud, 2) departments in larger companies, many of which have implemented
Salesforce.com and are comfortable with the SaaS model, and 3) SaaS vendors who OEM their product.
Boonin also said that 90% of GoodData’s customers start by using one of its packaged applications,
which generate reports against a single SaaS-based, front-office application, such as Salesforce.com,
ZenDesk, and Google Analytics, or an on-premise package, such as Microsoft Dynamics CRM. (GoodData
currently offers 20 packaged applications.) These applications, which deploy in hours, start at $1,000 a
month and are often bundled into third party, SaaS products.
Many customers then extend their packaged SaaS BI application by customizing data models and adding
other data sources. GoodData configures or customizes the application to the customer’s specifications.
The process takes roughly six weeks and typically raises the monthly subscription price to between
$3,000 and $10,000 a month. “For companies used to spending $500,000 on BI and getting virtually
nothing, they see us as a godsend,” says Boonin.
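As a back-of-the-envelope check on that comparison, the sketch below converts the monthly prices cited above into annual costs and shows how many years of subscription a $500,000 budget would cover. The fees and the $500,000 figure come from Boonin's remarks; the function name and the flat-price assumption (no setup charges, exit fees, or price changes) are simplifications introduced here.

```python
def years_funded(budget: float, monthly_fee: float) -> float:
    """Years of subscription that a one-time budget would pay for at a flat monthly fee."""
    if monthly_fee <= 0:
        raise ValueError("monthly_fee must be positive")
    return budget / (monthly_fee * 12)


PRIOR_BI_SPEND = 500_000  # spending figure quoted by Boonin, not a measured benchmark

# Monthly prices cited in the text above
MONTHLY_FEES = {
    "packaged application (starting price)": 1_000,
    "customized (low end)": 3_000,
    "customized (high end)": 10_000,
}

for label, fee in MONTHLY_FEES.items():
    annual_cost = fee * 12
    years = years_funded(PRIOR_BI_SPEND, fee)
    print(f"{label}: ${annual_cost:,} per year, {years:.1f} years per ${PRIOR_BI_SPEND:,}")
```

Even at the high end of the customized range, $500,000 would pay for roughly four years of service under these assumptions, which helps explain why customers find the price attractive and also why vendors take years to collect a comparable sum.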
Today, almost half (43%) of GoodData’s customers generate reports that run against multiple
applications. These customers generate the majority of GoodData’s revenues. Currently, GoodData has
about 100 direct customers and 3,000 indirect customers through OEMs. This is comparable to other
pure-play Cloud BI vendors. “Business is good,” says Boonin. “We have a $25 billion failed market to
disrupt.”
Conclusion
Some experts see dark clouds in the Cloud BI market. While the ramp-up of Cloud BI services hasn’t been as
fast as some anticipated, it’s clearly catching on. The BI market poses unique challenges compared to
other SaaS market segments that automate operational business processes. That’s because BI
applications are generally custom built and require companies to integrate data from multiple sources.
Cloud BI vendors have taken different approaches to “servitize” a custom application, which is a logical
contradiction. Not all have succeeded, but those still in the market are making headway. The value of
Cloud computing is high, and the BI industry will eventually find a way to succeed with it.