How to Select BI Tools for Different Types of Analysis

A White Paper
Table of Contents
Executive Summary
Data Discovery Vs. Ad Hoc Analysis
Data Discovery Vs. Statistics
The Data Discovery Market Gap
Fully Integrated Data Discovery, Analysis, and Dashboarding
Information Consumers
Power Users, Business Analysts, and Developers
Business Analysts and Business Users
WebFOCUS InfoAssist
WebFOCUS Active Technologies
New Advanced Analytic Components in InfoDiscovery
WebFOCUS InfoDiscovery
Next-Generation Multivariate Analysis
Conclusion
About Information Builders
Executive Summary
Data discovery is a quickly maturing discipline within the business intelligence (BI) and analytics
industry. Its growing popularity can be attributed to the rising demand to extract more insight
from business data to drive product innovation, marketing advances, and customer acquisition
and loyalty. The increasing need for data discovery is driven by the realization that knowledge
trapped in vast amounts of data can have a profound impact on the bottom line. It is also driven
by the fact that information is largely underused in organizations, which are now seeking to
maximize returns on their information-related investments.
There is a huge shortage of data scientists who can analyze the data and extract the insights.
Therefore, vendors are driven to create easier-to-use tools to enable business analysts to perform
deeper, more sophisticated types of analysis, without having to undergo extensive training.
Most vendors have focused on creating separate tools for data discovery, because they are
startups that saw an opportunity to fill a niche in a mature industry. This, however, requires
users to learn a new tool and forces IT to struggle to integrate the new tool into their existing
architectures.
This white paper discusses the importance of employing advanced data visualization as part
of a broader enterprise business intelligence solution. It demonstrates how this approach will
expand the scope of analytic capabilities to include ad hoc report and dashboard creation, so
organizations can uncover insights and measure related outcomes – while leveraging existing
tools, talent, and infrastructure.
Information Builders
Data Discovery Vs. Ad Hoc Analysis
Data discovery is the process of using visual components to interactively explore data to identify
underlying patterns. It differs fundamentally from ad hoc query and analysis because the analyst
does not have prior knowledge of the data, assumptions, or a hypothesis. In other words, ad
hoc query starts with a known question: What are the total sales for our Eastern locations for the
last three months? How do this year’s sales differ from last year’s sales? Such questions are well
formulated, and the analyst can drag and drop fields to get the answers and facts.
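As a minimal sketch of such a well-formulated question, the following Python snippet answers "total sales for Eastern locations for the last three months" against a small in-memory list. The records, the field names (`region`, `month`, `sales`), and the `total_sales` helper are hypothetical and only illustrate the idea; they are not part of any WebFOCUS API.

```python
from datetime import date

# Hypothetical sales records; field names and values are illustrative only.
SALES = [
    {"region": "East", "month": date(2014, 1, 1), "sales": 120.0},
    {"region": "East", "month": date(2014, 2, 1), "sales": 135.0},
    {"region": "East", "month": date(2014, 3, 1), "sales": 150.0},
    {"region": "East", "month": date(2013, 12, 1), "sales": 90.0},
    {"region": "West", "month": date(2014, 3, 1), "sales": 200.0},
]


def total_sales(records, region, start, end):
    """Sum sales for one region where start <= month <= end."""
    return sum(
        r["sales"]
        for r in records
        if r["region"] == region and start <= r["month"] <= end
    )


# The known question: Eastern sales for January through March 2014.
east_q1 = total_sales(SALES, "East", date(2014, 1, 1), date(2014, 3, 1))
print(east_q1)
```

The question fixes the filter and the measure in advance, which is what makes it an ad hoc query rather than a discovery exercise.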
On the other hand, an analyst may be tasked with identifying the most profitable customer
segments. He or she has to explore relationships between different variables, and determine
what constitutes a profitable segment. Furthermore, he or she has to analyze and learn about the
behaviors of the customers within this segment, and understand how this segment differs from
other customer segments. The learnings are then used to formulate an actionable strategy, such
as how to drive similar behaviors in laggard segments to increase their spend and share-of-wallet.
Typically, ad hoc analysis is a factual reporting project, while data discovery is more of a learning
project. This fundamental difference drives the varying methods and visual components needed
to perform each project. For ad hoc analysis, facts are clearly displayed and their significance is
indicated visually, e.g., by using conditional styling. For data discovery, correlations and relationships
between variables need to be highlighted. For ad hoc analysis, analysts must be able to drill from
one fact to another to get to the root causes behind the facts. In data discovery, analysts need
the ability to subset the data (create new segments, combine segments, etc.) to see how the
patterns and relationships change, or what drives certain behaviors. In other words, does income,
education, or occupation contribute to higher customer profitability? How does spending change
if certain income categories are included or excluded? Learning such impact allows the analyst to
recommend more precise targeting for customer acquisition campaigns, for example.
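The subsetting questions above can be sketched in a few lines of Python. The customer records, the `income_band` and `spend` fields, and the `mean_spend` helper are assumptions made for illustration; a real analysis would run against the organization's own data.

```python
from statistics import mean

# Hypothetical customer records (illustrative values only).
CUSTOMERS = [
    {"income_band": "low", "spend": 100.0},
    {"income_band": "low", "spend": 140.0},
    {"income_band": "middle", "spend": 200.0},
    {"income_band": "middle", "spend": 260.0},
    {"income_band": "high", "spend": 500.0},
]


def mean_spend(customers, exclude_bands=()):
    """Average spend after excluding the given income bands."""
    kept = [c["spend"] for c in customers if c["income_band"] not in exclude_bands]
    return mean(kept)


# Compare the full population with a subset that drops one income category.
all_bands = mean_spend(CUSTOMERS)
without_high = mean_spend(CUSTOMERS, exclude_bands=("high",))
print(all_bands, without_high)
```

Comparing the two averages shows how much a single income category drives overall spending, which is the kind of impact the analyst is trying to learn.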
Based on the needs for each type of analyst, the workflows – how analysts access data, build
components, apply filters, create calculations, etc. – do not differ significantly. What do differ
are the types of components and the logical interactions between them. Therefore, the most
effective approach to data discovery is to implement it as a module within existing ad hoc tools,
which offers the following advantages:
■ Provides a single tool for all types of analysis
■ Allows organizations to leverage existing skills and upgrade analysts to the more advanced module
■ Minimizes installation and administration
■ Allows analysts to combine factual and insight content into single reports, dashboards, and documents
Data Discovery Vs. Statistics
Pattern detection is not a new discipline. Statisticians have long used algorithms (i.e., data mining) to
derive insights from data. The simplest form of statistical discovery is to find correlations between
two variables. Correlations tell the researcher whether two things occur together. For example,
people frequently buy fish and white wine. Therefore, one can expect that if a person buys fish,
they will likely also buy white wine. This knowledge can be used to make recommendations. That is
why coupons or advertisements for white wine are often placed among fish recipes.
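One simple way to quantify "occur together" for purchases is to use two measures from association analysis, confidence and lift, rather than a correlation coefficient. The sketch below assumes a handful of made-up shopping baskets; the data and the `confidence_and_lift` helper are hypothetical.

```python
# Hypothetical shopping baskets (illustrative only).
BASKETS = [
    {"fish", "white wine"},
    {"fish", "white wine", "bread"},
    {"fish", "bread"},
    {"white wine"},
    {"bread", "milk"},
    {"fish", "white wine"},
    {"milk"},
    {"bread"},
]


def confidence_and_lift(baskets, antecedent, consequent):
    """Return (confidence, lift) for the rule antecedent -> consequent.

    confidence = share of baskets with the antecedent that also hold the consequent
    lift       = confidence divided by the overall share of the consequent
    """
    with_antecedent = [b for b in baskets if antecedent in b]
    both = sum(1 for b in with_antecedent if consequent in b)
    confidence = both / len(with_antecedent)
    base_rate = sum(1 for b in baskets if consequent in b) / len(baskets)
    return confidence, confidence / base_rate


conf, lift = confidence_and_lift(BASKETS, "fish", "white wine")
print(conf, lift)
```

A lift above 1 suggests that fish buyers pick up white wine more often than shoppers in general, which is the signal a recommendation would rely on.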
Relationships between two variables are easy to detect and display on scatter plot charts. What
happens, however, if there are more than two variables (known as a multivariate analysis)? How
do you detect the relationships among all those variables? Machine learning is often used to
analyze such multivariate data sets and to discover the relationships between all variables. For
example, clustering can be used to analyze all customer attributes and activities to infer customer
groupings. Clustering is based on a simple concept: Birds of a feather flock together. If customer
groups can be determined, they can be targeted more precisely to influence the behavior of each
group in a desirable way.
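The paper does not name a specific clustering algorithm. As one common example, the sketch below implements plain k-means (Lloyd's algorithm) using only the Python standard library. The customer points, the two attributes, and the choice of starting centroids are all assumptions for illustration.

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Plain k-means on tuples of numbers.

    The caller supplies the initial centroids. Returns (labels, centroids).
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster where it was
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers described by (annual spend, visits per month).
POINTS = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 9.0), (8.5, 9.5), (9.0, 8.8)]
labels, centers = k_means(POINTS, centroids=[POINTS[0], POINTS[3]])
print(labels)
```

Customers that sit close together in attribute space end up with the same label, which is the "birds of a feather" idea in code form.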
The problem is that the machine produces a formula that only the statisticians can understand.
It takes years to educate and train a statistician, and there is a tremendous shortage of statistical
talent. Hence, data visualization tools are evolving to provide an intermediate solution where more
analysts can use advanced data visualization components to perform more complex analysis.
While the first generation of data discovery tools were entirely focused on bivariate analysis, the
next generation will incorporate machine learning directly into the process to ensure higher
accuracy. Machine learning also detects patterns faster than a human being, thus allowing the
analysts to accelerate the analysis of large multivariate data sets. It is not unusual for a data mining
expert to look at millions of records across 800 different variables. It takes time to understand the
variables, as well as the patterns. Machine learning does this much faster.
As with any new technology, more features will be added to data discovery solutions; therefore,
the toolset must be extensible.
The Data Discovery Market Gap
The data discovery vendors have built standalone, single-function tools dedicated strictly to data
visualization and simple dashboarding. This process is a natural outcome of their emergence in the
BI space. As start-ups, they were looking for clear differentiation and niche markets to penetrate
the space without incurring huge launch and marketing expenses. Furthermore, their resources
were limited, so narrow specialization led to very constrained functional sets. Their primary focus
was, and still is, on the components and interactions needed for data discovery. This leaves huge
functional gaps in the areas of reporting, parameterized dashboards, metadata management, etc.
Those functional constraints naturally limit the use and scope of the tools, and force
organizations to maintain two sets of solutions – one for ad hoc reporting and one for data
discovery. As a result, either different users work with different tools, or the same users have
to be trained on two different sets of tools. Making matters worse is the lack of interoperability:
dashboards developed within the data visualization tools cannot be deployed directly in the BI
platform. This leads to further integration, maintenance, and licensing gaps.
Why do traditional reporting and dashboarding need to be integrated with data discovery?
Why can’t those tools be run separately? As stated earlier, data discovery is more of a research
project. Thus, while businesses can benefit tremendously from discoveries and insights, they
cannot be run on insights alone. As the saying goes: “What cannot be measured cannot be
managed.” If insights help to craft strategies and tactics, those strategies and tactics still need to
be operationalized to ensure successful execution. This requires reporting and dashboarding to
monitor the outcomes.
It appears only natural to extend the functionality of the tools in the BI platform with pattern and
insight discovery capabilities so that the information system as a whole can deliver the right tools
to ensure discoveries, while enabling the monitoring of the outcomes from those strategies based
on these discoveries. Organizations need a quick way to transform discoveries into manageable
actions, and this is what a robust BI platform provides.
As mentioned in the prior section, to increase the accuracy and validity of the analysis, data
discovery tools have to be enhanced to include more robust machine-learning algorithms in
the process. At the same time, they must hide the complexity of the process from the end user.
Platforms that already include advanced statistics can make the leap to this next generation
more easily than non-platform vendors.
Fully Integrated Data Discovery, Analysis, and Dashboarding
Information Builders’ robust WebFOCUS BI and analytics platform provides fully integrated data
discovery, plus advanced statistical modeling. It offers the broadest range of solutions to support
data discovery, as well as reporting, dashboarding, and other types of analysis. This enables users
to participate in rich, interactive information visualization, while eliminating the gaps associated
with standalone data discovery tools.
With WebFOCUS, you can easily deploy BI applications that are tailored for different types of users
– self-service analytics for everyone.
Information Consumers
Anyone who needs information to do their job needs an InfoApp to help them get insights and
answers from their data. Sometimes these users are outside the firewall (customers, partners,
agents, suppliers, etc.).
InfoApps are designed to make BI content and data visualizations more readily accessible to nontechnical users who don’t need tools for reporting, analysis, or data discovery – they just need an
interactive app.
Power Users, Business Analysts, and Developers
Some people want to generate their own reports, run their own queries, and conduct their own
analyses without the assistance of IT staff. They know how the organization’s data is organized and
want to “surf” this information themselves.
Business Analysts and Business Users
Analysts and other users may want to explore relationships between multiple variables, and discover
patterns and insights using data from multiple sources including external data. This is a great way to
gain new kinds of insight, previously unattainable by traditional analysis of enterprise data.
The security, scalability, and licensing models in WebFOCUS allow the content produced by the
power users and analysts to be shared and viewed by everyone.
WebFOCUS InfoAssist
WebFOCUS InfoAssist is a robust web-based tool for query, reporting, drag-and-drop ad hoc
analysis, and interactive InfoApp development. With InfoAssist, business users can rapidly create
intricate InfoApps, reports, and dashboards in an interactive, fully customizable WYSIWYG (What
You See Is What You Get) development environment.
Business analysts often need to perform in-depth analysis to determine why something has
happened – why business is slow, why sales are going down, etc. InfoAssist enables a business user
to report against any data sources, drag and drop objects against predefined metadata, and answer
questions about sales by region, by customer segments, or by different product categories.
InfoAssist extends the power of WebFOCUS, allowing business users to access data from multiple
enterprise sources, including multidimensional sources. There is no need to worry about errors or
data integrity caused by complex data structures. Users can reach root causes by dragging and
dropping objects, filtering, applying conditions, etc.
Many of these reports run routinely to allow business managers to monitor performance and
desired outcomes over time. Businesses are run based on measures. The ad hoc analysts create
reports and dashboards around those measures. When an exception or change of trend occurs
in the business, the analysts create additional reports and measures, so those new trends and
outcomes can be measured and managed.
Users can take advantage of a high-quality HTML5 data visualization library, and easily progress to
more sophisticated activities such as publishing and sharing documents. They can visualize measures and their performance in Active Dashboards, which can be easily shared with other users.
Best of all, InfoAssist is user-friendly, with an easy-to-use ribbon-based interface that resembles the
Microsoft Office tools that most users are already familiar with.
WebFOCUS Active Technologies
As more users throughout an organization require access to ever-increasing amounts of data, huge
amounts of IT and end-user time and effort are spent struggling with spreadsheets. WebFOCUS
Active Technologies was created to facilitate the sharing of the results of ad hoc analysis among
business users in an interactive format. With Active Technologies, users can not only review
measures and dashboards, but also continue with their own analysis and share again and again.
Over time, Active Technologies has evolved into a powerful visualization platform with a robust
embedded data transformation engine.
Active Technologies combines data and interactive analytic capabilities into InfoApps, reports,
and dashboards that can be accessed on the web, or delivered via e-mail or to any type of mobile
device. Users can sort, filter, calculate, and interactively rearrange data in any way using the built-in
visualizations library to make comparisons and discover patterns and relationships.
Active Technologies also dynamically exploits the gestures, animations, and interactivity of any type
of smartphone or tablet, without the need for additional development or specialized software.
Whether users want HTML5 dashboards to run on their iPhones, or highly visualized interactive
dashboards in PDF to access via their desktops, they are guaranteed a truly seamless experience.
This approach delivers the right information to the right person at the right time, while minimizing
costs and eliminating the need for user training. Most importantly, Active Technologies helps
organizations capitalize on their BI investments by allowing any user inside or outside the
enterprise to readily analyze and explore data in real time to uncover critical trends and patterns.
Many users, especially business managers, rely on dashboards to support their day-to-day operations.
Active Technologies has become an important interactive visualization platform because of its ease of use and its simplified and intuitive user experience. Active Technologies allows even non-analysts
and non-technical business users to quickly and easily perform ad hoc analysis. Therefore, it has
become the backbone for performing univariate analysis with advanced visualization.
WebFOCUS users can use Active Technologies to distribute InfoApps, reports,
and dashboards to peers or users outside the firewall. Active Technologies
comes with the analytics engine already embedded, allowing users to continue
their own analysis, unlike other tools that only have static viewers.
New Advanced Analytic Components in InfoDiscovery
The advanced data visualization modules in WebFOCUS InfoDiscovery are built on Active
Technologies, and give users a step-up option from InfoAssist to perform univariate and bivariate
analysis. Univariate analysis is the simplest form of quantitative (statistical) analysis. The analysis is
carried out with the description of a single variable in terms of the applicable unit of analysis. For
example, if the variable “age” was the subject of the analysis, the researcher would look at how
many subjects fall into given age attribute categories.
Univariate analysis contrasts with bivariate analysis (the analysis of two variables simultaneously) or
multivariate analysis. Univariate analysis is commonly used in the first, descriptive stages of research,
before being supplemented by more advanced, inferential bivariate or multivariate analysis.
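The difference between the two simplest forms can be sketched with the Python standard library. The univariate step counts subjects per age category, as in the example above; the bivariate step computes a Pearson correlation between two variables. The ages, category boundaries, spend values, and helper names are all hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical subjects: ages and a second variable (spend), illustrative only.
AGES = [23, 35, 41, 29, 52, 67, 38, 45, 19, 71]
SPEND = [10.0, 22.0, 30.0, 15.0, 41.0, 55.0, 26.0, 33.0, 8.0, 60.0]


def age_band(age):
    """Map an age to an assumed category."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50 and over"


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Univariate: how many subjects fall into each age category?
band_counts = Counter(age_band(a) for a in AGES)

# Bivariate: how strongly do age and spend move together?
age_spend_r = pearson(AGES, SPEND)
print(band_counts, round(age_spend_r, 3))
```

The counts describe one variable on its own, while the correlation describes a relationship between two, which is why the second typically follows the first in a research workflow.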
WebFOCUS InfoDiscovery
The advanced data visualization modules in InfoDiscovery combine and extend the power
of InfoAssist and the usability of Active Technologies. Business users who are already familiar
with InfoAssist can follow the same familiar workflow they use for building reports, charts, and
dashboards to create data visualization modules.
Users can drag and drop fields, switch from a report to chart in a single click, or select advanced
data visualization types, such as maps, fisheye mekkos, histograms, paraboxes, and tree maps/
heat maps, in addition to hundreds of chart types already available in the HTML5 chart library.
Each data visualization module can be dragged and snapped into a grid, taking advantage of auto
resizing and auto layout to rapidly build an interactive dashboard. Selectors such as list boxes,
sliders, and calendars come with built-in intelligence for associative coloring. Users can highlight
the selected items, while unselected items are grayed out using a different color. Related items
can also be color-coordinated across the dashboard to provide an additional layer of visibility
into data relationships. Automatic binding allows selectors to be bound together to create
multiple conditions automatically. Additionally, selectors can be used to apply grouping, filtering,
calculations, and selections of fields.
All chart types in the advanced data visualization modules have also been enhanced with additional
interactivity such as scrolling, natural indicators, and automatic zoom based on the movement of the
mouse over the chart. Users can categorize fields in the data visualization modules by group, field,
color, or associated fields to provide them with different perspectives of data.
Users can also apply filters from a selection of a chart (e.g., a bar or a slice of a pie), which will
then filter everything in the dashboard. This is the same analytic interaction within an Active
Technologies dashboard, which analysts are already familiar with. So there is no learning curve
required to interact with the data visualization dashboard. Users can make selections on one or
more components and engage in further analysis by comparing the highlighted data against the
unselected items or against the overall information displayed in the dashboard. They can also
apply recalculation to the measures on all other components to see the variations.
Data visualization uses charts, lines, geometric shapes, colors, and proximity to visually represent
data. It helps analysts find deeper insights into the nature of a problem and discover new
understanding that would normally go undetected in spreadsheet rows of data.
Great visualizations allow users to look at vast amounts of data to quickly and easily identify outliers
that stand out. The advanced data visualization modules incorporate common interactions that
users are already familiar with, such as dragging the mouse to group and selecting a section of a
chart (the unusual data or problem area) to highlight it or apply a filter. The dashboard then grays out the unselected parts
automatically. Comparisons are highlighted by visually displaying the selected and
unselected values to reveal patterns, trends, and correlations in the dashboard. Because the data
visualization modules are seamlessly integrated with WebFOCUS, users can then pass all selections
and the result of their analysis to another report or application and share with other users.
Next-Generation Multivariate Analysis
As analysts are working with larger data sets, there is a need to move from univariate and bivariate
analysis to multivariate analysis.
By taking advantage of WebFOCUS RStat in the same BI product stack, the data set that the user
has selected for advanced visualization can be automatically passed through a machine-learning
algorithm behind the scenes. The algorithm determines the number of patterns or clusters, and
labels each record based on which pattern or group it falls into. The patterns are then displayed on a
special visual map that gives the user a high-level conceptual overview of the patterns, the variables
that go into each, and their relative sizes. Machine learning detects patterns, but does not explain
them. Therefore, the user has to explore the patterns with advanced data visualization charts to infer
the business meaning behind them. This is where the combination of machine learning and data
discovery is particularly beneficial. Users can leverage the pattern map to navigate and filter the data
discovery charts to understand their meaning.
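To make the "label, then explore" step concrete, the sketch below takes records that a clustering step has already labeled and summarizes each pattern by its size and the average of each variable. This is only an illustration of the idea in plain Python, not the RStat or InfoDiscovery implementation; the records, field names, and the `profile_clusters` helper are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records that a clustering step has already labeled.
LABELED = [
    {"cluster": 0, "income": 40.0, "spend": 1.0},
    {"cluster": 0, "income": 50.0, "spend": 2.0},
    {"cluster": 1, "income": 90.0, "spend": 8.0},
    {"cluster": 1, "income": 110.0, "spend": 10.0},
    {"cluster": 1, "income": 100.0, "spend": 9.0},
]


def profile_clusters(records, variables):
    """Summarize each cluster by its size and the mean of each variable."""
    groups = defaultdict(list)
    for record in records:
        groups[record["cluster"]].append(record)
    return {
        label: {
            "size": len(rows),
            "means": {v: mean(row[v] for row in rows) for v in variables},
        }
        for label, rows in sorted(groups.items())
    }


profiles = profile_clusters(LABELED, ["income", "spend"])
for label, summary in profiles.items():
    print(label, summary)
```

A summary like this gives only the outline of each pattern; the analyst still has to filter the discovery charts by cluster to infer what each group means for the business.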
Conclusion
Although many companies are seeking to derive value from standalone data discovery tools,
there is a better, more effective way to enable efficient data analysis and visualization. By making
visualization part of a broader-reaching BI platform, organizations can make visual analytics a
crucial part of their decision-making culture and enterprise architecture.
Information Builders’ strategic approach offers advanced data visualization as a module within our
comprehensive suite of web-based ad hoc tools. Users who are familiar with the tool workflows
can easily leverage the advanced data visualization module with minimal training. This empowers
organizations to combine data visualization with predictive modeling, ad hoc reporting, and
dashboard creation to provide a more complete view of their business – from the discovery of
critical insights to the development of strategies and tactics through the tracking of execution and
outcomes – from a single BI platform.
About Information Builders
Information Builders helps organizations transform data into business value. Our software
solutions for business intelligence and analytics, integration, and data integrity empower people
to make smarter decisions, strengthen customer relationships, and drive growth. Our dedication
to customer success is unmatched in the industry. That’s why tens of thousands of leading
organizations rely on Information Builders to be their trusted partner. Founded in 1975, Information
Builders is headquartered in New York, NY, with offices around the world, and remains one of the
largest independent, privately held companies in the industry. Visit us at informationbuilders.com,
follow us on Twitter at @infobldrs, like us on Facebook, and visit our LinkedIn page.
Corporate Headquarters Two Penn Plaza, New York, NY 10121-2898 (212) 736-4433 Fax (212) 967-6406
Connect With Us
informationbuilders.com
DN7507678.0514
Copyright © 2014 by Information Builders. All rights reserved. All products and product names mentioned in this publication are
trademarks or registered trademarks of their respective companies.