How to Select BI Tools for Different Types of Analysis

A White Paper

WebFOCUS iWay Software Omni

Table of Contents

Executive Summary
Data Discovery Vs. Ad Hoc Analysis
Data Discovery Vs. Statistics
The Data Discovery Market Gap
Fully Integrated Data Discovery, Analysis, and Dashboarding
Information Consumers
Power Users, Business Analysts, and Developers
Business Analysts and Business Users
WebFOCUS InfoAssist
WebFOCUS Active Technologies
New Advanced Analytic Components in InfoDiscovery
WebFOCUS InfoDiscovery
Next-Generation Multivariate Analysis
Conclusion
About Information Builders

Executive Summary

Data discovery is a quickly maturing discipline within the business intelligence (BI) and analytics industry. Its growing popularity can be attributed to the rising demand to extract more insight from business data to drive product innovation, marketing advances, and customer acquisition and loyalty.

The increasing need for data discovery is driven by the realization that knowledge trapped in vast amounts of data can have a profound impact on the bottom line. It is also driven by the fact that information is largely underused in organizations, which are now seeking to maximize returns on their information-related investments.

There is a huge shortage of data scientists who can analyze the data and extract the insights. Therefore, vendors are driven to create easier-to-use tools to enable business analysts to perform deeper, more sophisticated types of analysis, without having to undergo extensive training.

Most vendors have focused on creating separate tools for data discovery, because they are startups that saw an opportunity to fill a niche in a mature industry. This, however, requires users to learn a new tool and forces IT to struggle to integrate the new tool into their existing architectures.
This white paper discusses the importance of employing advanced data visualization as part of a broader enterprise business intelligence solution. It demonstrates how this approach will expand the scope of analytic capabilities to include ad hoc report and dashboard creation, so organizations can uncover insights and measure related outcomes – while leveraging existing tools, talent, and infrastructure.

Data Discovery Vs. Ad Hoc Analysis

Data discovery is the process of using visual components to interactively explore data to identify underlying patterns. It differs fundamentally from ad hoc query and analysis because the analyst does not have prior knowledge of the data, assumptions, or a hypothesis. In other words, ad hoc query starts with a known question: What are the total sales for our Eastern locations for the last three months? How do this year’s sales differ from last year’s sales? Such questions are well formulated, and the analyst can drag and drop fields to get the answers and facts.

On the other hand, an analyst may be tasked with identifying the most profitable customer segments. He or she has to explore relationships between different variables, and determine what constitutes a profitable segment. Furthermore, he or she has to analyze and learn about the behaviors of the customers within this segment, and understand how this segment differs from other customer segments. The learnings are then used to formulate an actionable strategy, such as how to drive similar behaviors in laggard segments to increase their spend and share-of-wallet.

Typically, ad hoc analysis is a factual reporting project, while data discovery is more of a learning project. This fundamental difference drives the varying methods and visual components needed to perform each project. For ad hoc analysis, facts are clearly displayed and their significance is indicated visually, e.g., by using conditional styling.
For data discovery, correlations and relationships between variables need to be highlighted. For ad hoc analysis, analysts must be able to drill from one fact to another to get to the root causes behind the facts. In data discovery, analysts need the ability to subset the data (create new segments, combine segments, etc.) to see how the patterns and relationships change, or what drives certain behaviors. In other words, does income, education, or occupation contribute to higher customer profitability? How does spending change if certain income categories are included or excluded? Learning such impact allows the analyst to recommend more precise targeting for customer acquisition campaigns, for example.

Based on the needs for each type of analyst, the workflows – how analysts access data, build components, apply filters, create calculations, etc. – do not differ significantly. What differs is the types of components and the logical interactions between them. Therefore, the most effective approach to data discovery is to implement it as a module within existing ad hoc tools, which offers the following advantages:

■ Provides a single tool for all types of analysis
■ Allows organizations to leverage existing skills and upgrade analysts to the more advanced module
■ Minimizes installation and administration
■ Allows analysts to combine factual and insight content into single reports, dashboards, and documents

Data Discovery Vs. Statistics

Pattern detection is not a new discipline. Statisticians have long used algorithms (i.e., data mining) to derive insights from data. The simplest form of statistical discovery is to find correlations between two variables. Correlations tell the researcher whether two things occur together. For example, people frequently buy fish and white wine. Therefore, one can expect that if a person buys fish, they will likely also buy white wine.
This knowledge can be used to make recommendations. That is why coupons or advertisements for white wine are often placed among fish recipes. Relationships between two variables are easy to detect and display on scatter plot charts. What happens, however, if there are more than two variables (known as a multivariate analysis)? How do you detect the relationships among all those variables? Machine learning is often used to analyze such multivariate data sets and to discover the relationships between all variables. For example, clustering can be used to analyze all customer attributes and activities to infer customer groupings. Clustering is based on a simple concept: Birds of a feather flock together. If customer groups can be determined, they can be targeted more precisely to influence the behavior of each group in a desirable way. The problem is that the machine produces a formula that only the statisticians can understand. It takes years to educate and train a statistician, and there is a tremendous shortage of statistical talent. Hence, data visualization tools are evolving to provide an intermediate solution where more analysts can use advanced data visualization components to perform more complex analysis. While the first generation of data discovery tools were entirely focused on bivariate analysis, the next generation will incorporate machine learning directly into the process to ensure higher accuracy. Machine learning also detects patterns faster than a human being, thus allowing the analysts to accelerate the analysis of large multivariate data sets. It is not unusual for a data mining expert to look at millions of records across 800 different variables. It takes time to understand the variables, as well as the patterns. Machine learning does this much faster. As with any new technology, more features will be added to data discovery solutions; therefore the toolset must be extensible. 
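As a minimal sketch of the bivariate case described in this section, the following Python computes the Pearson correlation coefficient for every pair of variables in a small data set. The column names and values are hypothetical, chosen only for illustration, and the function names pearson and correlation_matrix are not taken from any product discussed in this paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, each left unscaled: the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {(name_a, name_b): r} for every unordered pair of columns."""
    names = list(columns)
    return {
        (a, b): pearson(columns[a], columns[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical customer attributes (illustrative values only).
customers = {
    "income": [32, 45, 58, 71, 90],
    "spend": [3.1, 4.0, 5.2, 6.9, 8.8],
    "age": [51, 29, 44, 35, 62],
}

for pair, r in correlation_matrix(customers).items():
    print(pair, round(r, 3))
```

With k variables there are k(k − 1)/2 pairs to inspect, so the 800 variables mentioned above would produce 319,600 pairwise comparisons. That growth is one reason pair-by-pair inspection alone does not scale to multivariate data sets.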
The Data Discovery Market Gap

The data discovery vendors have built standalone, single-function tools dedicated strictly to data visualization and simple dashboarding. This is a natural outcome of their emergence in the BI space. As start-ups, they were looking for clear differentiation and niche markets to penetrate the space without incurring huge launch and marketing expenses. Furthermore, their resources were limited, so narrow specialization led to very constrained functional sets. Their primary focus was, and still is, on the components and interactions needed for data discovery. This leaves huge functional gaps in the areas of reporting, parameterized dashboards, metadata management, etc.

Those functional constraints naturally limit the use and scope of the tools, and force organizations to maintain two sets of solutions – one for ad hoc reporting and one for data discovery. That implies that either different users are using different tools, or the same users have to be trained on two different sets of tools. Making matters worse is the lack of interoperability, as dashboards developed within the data visualization tools cannot be deployed directly in the BI platform. This leads to further integration, maintenance, and licensing gaps.

Why does traditional reporting and dashboarding need to be integrated with data discovery? Why can’t those tools be run separately? As stated earlier, data discovery is more of a research project. Thus, while businesses can benefit tremendously from discoveries and insights, they cannot be run on insights alone. As the saying goes: “What cannot be measured cannot be managed.” If insights help to craft strategies and tactics, those strategies and tactics still need to be operationalized to ensure successful execution. This requires reporting and dashboarding to monitor the outcomes.
It appears only natural to extend the functionality of the tools in the BI platform with pattern and insight discovery capabilities, so that the information system as a whole can deliver the right tools for making discoveries, while enabling the monitoring of outcomes from the strategies based on those discoveries. Organizations need a quick way to transform discoveries into manageable actions, and this is what a robust BI platform provides.

As mentioned in the prior section, to increase the accuracy and validity of the analysis, data discovery tools have to be enhanced to include more robust machine-learning algorithms in the process. At the same time, they must hide the complexity of the process from the end user. Platforms that already have advanced statistics can make the leap to this next-generation evolution more easily than non-platform vendors.

Fully Integrated Data Discovery, Analysis, and Dashboarding

Information Builders’ robust WebFOCUS BI and analytics platform provides fully integrated data discovery, plus advanced statistical modeling. It offers the broadest range of solutions to support data discovery, as well as reporting, dashboarding, and other types of analysis. This enables users to participate in rich, interactive information visualization, while eliminating the gaps associated with standalone data discovery tools. With WebFOCUS, you can easily deploy BI applications that are tailored for different types of users – self-service analytics for everyone.

Information Consumers

Anyone who needs information to do their job needs an InfoApp to help them get insights and answers from their data. Sometimes these users are outside the firewall (customers, partners, agents, suppliers, etc.).
InfoApps are designed to make BI content and data visualizations more readily accessible to nontechnical users who don’t need tools for reporting, analysis, or data discovery – they just need an interactive app.

Power Users, Business Analysts, and Developers

Some people want to generate their own reports, run their own queries, and conduct their own analyses without the assistance of IT staff. They know how the organization’s data is organized and want to “surf” this information themselves.

Business Analysts and Business Users

Analysts and other users may want to explore relationships between multiple variables, and discover patterns and insights using data from multiple sources including external data. This is a great way to gain new kinds of insight, previously unattainable by traditional analysis of enterprise data.

The security, scalability, and licensing models in WebFOCUS allow the content produced by the power users and analysts to be shared and viewed by everyone.

WebFOCUS InfoAssist

WebFOCUS InfoAssist is a robust web-based tool for query, reporting, drag-and-drop ad hoc analysis, and interactive InfoApp development. With InfoAssist, business users can rapidly create intricate InfoApps, reports, and dashboards in an interactive, fully customizable WYSIWYG (What You See Is What You Get) development environment.

Business analysts often need to perform in-depth analysis to determine why something has happened – why business is slow, why sales are going down, etc. InfoAssist enables a business user to report against any data source, drag and drop objects against predefined metadata, and answer questions about sales by region, by customer segments, or by different product categories.

InfoAssist extends the power of WebFOCUS, allowing business users to access data from multiple enterprise sources, including multidimensional sources. There is no need to worry about errors or data integrity caused by complex data structures.
Users can reach root causes by dragging and dropping objects, filtering, applying conditions, etc. Many of these reports run routinely to allow business managers to monitor performance and desired outcomes over time. Businesses are run based on measures. The ad hoc analysts create reports and dashboards around those measures. When an exception or change of trend occurs in the business, the analysts create additional reports and measures, so those new trends and outcomes can be measured and managed.

Users can take advantage of a high-quality HTML5 data visualization library, and easily progress to more sophisticated activities such as publishing and sharing documents. They can visualize measures and their performance in Active Dashboards, which can be easily shared with other users. Best of all, InfoAssist is user-friendly, with an easy-to-use ribbon-based interface that resembles the Microsoft Office tools that most users are already familiar with.

WebFOCUS Active Technologies

As more users throughout an organization require access to ever-increasing amounts of data, huge amounts of IT and end-user time and effort are spent struggling with spreadsheets. WebFOCUS Active Technologies was created to facilitate the sharing of the results of ad hoc analysis among business users in an interactive format. With Active Technologies, users can not only review measures and dashboards, but also continue with their own analysis and share again and again.

Over time, Active Technologies has evolved into a powerful visualization platform with a robust embedded data transformation engine.
Active Technologies combines data and interactive analytic capabilities into InfoApps, reports, and dashboards that can be accessed on the web, or delivered via e-mail or to any type of mobile device. Users can sort, filter, calculate, and interactively rearrange data in any way using the built-in visualization library to make comparisons and discover patterns and relationships.

Active Technologies also dynamically exploits the gestures, animations, and interactivity of any type of smartphone or tablet, without the need for additional development or specialized software. Whether users want HTML5 dashboards to run on their iPhones, or highly visualized interactive dashboards in PDF to access via their desktops, they are guaranteed a truly seamless experience.

This approach delivers the right information to the right person at the right time, while minimizing costs and eliminating the need for user training. Most importantly, Active Technologies helps organizations capitalize on their BI investments by allowing any user inside or outside the enterprise to readily analyze and explore data in real time to uncover critical trends and patterns.

Many users, especially business managers, rely on dashboards to support their day-to-day operations. Active Technologies has become an important interactive visualization platform because of its ease of use and its simplified and intuitive user experience. Active Technologies allows even non-analysts and non-technical business users to quickly and easily perform ad hoc analysis. Therefore, it has become the backbone for performing univariate analysis with advanced visualization.

WebFOCUS users can use Active Technologies to distribute InfoApps, reports, and dashboards to peers or users outside the firewall. Active Technologies comes with the analytics engine already embedded, allowing users to continue their own analysis, unlike other tools that only have static viewers.
New Advanced Analytic Components in InfoDiscovery

The advanced data visualization modules in WebFOCUS InfoDiscovery are built on Active Technologies, and give users a step-up option from InfoAssist to perform univariate and bivariate analysis.

Univariate analysis is the simplest form of quantitative (statistical) analysis. It describes a single variable in terms of the applicable unit of analysis. For example, if the variable “age” were the subject of the analysis, the researcher would look at how many subjects fall into given age categories. Univariate analysis contrasts with bivariate analysis (the analysis of two variables simultaneously) or multivariate analysis. Univariate analysis is commonly used in the first, descriptive stages of research, before being supplemented by more advanced, inferential bivariate or multivariate analysis.

WebFOCUS InfoDiscovery

The advanced data visualization modules in InfoDiscovery combine and extend the power of InfoAssist and the usability of Active Technologies. Business users who are already familiar with InfoAssist can follow the same familiar workflow they use for building reports, charts, and dashboards to create data visualization modules. Users can drag and drop fields, switch from a report to a chart in a single click, or select advanced data visualization types, such as maps, fisheye mekkos, histograms, paraboxes, and tree maps/heat maps, in addition to hundreds of chart types already available in the HTML5 chart library.

Each data visualization module can be dragged and snapped into a grid, taking advantage of auto resizing and auto layout to rapidly build an interactive dashboard. Selectors such as list boxes, sliders, and calendars come with built-in intelligence for associative coloring. Users can highlight the selected items, while unselected items are grayed out using a different color.
Related items can also be color-coordinated across the dashboard to provide an additional layer of visibility into data relationships. Automatic binding allows selectors to be bound together to create multiple conditions automatically. Additionally, selectors can be used to apply grouping, filtering, calculations, and selections of fields.

All chart types in the advanced data visualization modules have also been enhanced with additional interactivity such as scrolling, natural indicators, and automatic zoom based on the movement of the mouse over the chart. Users can categorize fields in the data visualization modules by group, field, color, or associated fields to provide them with different perspectives of data. Users can also apply filters from a selection in a chart (e.g., a bar or a slice of a pie), which will then filter everything in the dashboard. This is the same analytic interaction found in an Active Technologies dashboard, which analysts are already familiar with, so there is no learning curve required to interact with the data visualization dashboard.

Users can make selections on one or more components and engage in further analysis by comparing the highlighted data against the unselected items or against the overall information displayed in the dashboard. They can also apply recalculation to the measures on all other components to see the variations.

Data visualization uses charts, lines, geometric shapes, colors, and proximity to visually represent data. It helps analysts find deeper insights into the nature of a problem and discover new understanding that would normally go undetected in spreadsheet rows of data. Great visualizations allow users to look at vast amounts of data to quickly and easily identify outliers that stand out.
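The selected-versus-unselected comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions: the rows, the field names region and sales, and the helper compare_selection are hypothetical and do not describe how any WebFOCUS component is implemented.

```python
from statistics import mean


def compare_selection(rows, is_selected, measure):
    """Average a measure over selected rows, unselected rows, and all rows.

    rows        -- list of dicts, one per record
    is_selected -- predicate standing in for a chart selection (e.g., one bar)
    measure     -- name of the numeric field to average
    """
    selected = [row[measure] for row in rows if is_selected(row)]
    unselected = [row[measure] for row in rows if not is_selected(row)]

    def average(values):
        # An empty group has no average; report None instead of failing.
        return mean(values) if values else None

    return {
        "selected": average(selected),
        "unselected": average(unselected),
        "overall": average(selected + unselected),
    }


# Hypothetical dashboard data (illustrative values only).
rows = [
    {"region": "East", "sales": 100},
    {"region": "East", "sales": 200},
    {"region": "West", "sales": 50},
]

# Selecting the "East" bar splits every other measure into two groups.
result = compare_selection(rows, lambda row: row["region"] == "East", "sales")
print(result)
```

Returning all three averages together mirrors the comparison against both the unselected items and the overall dashboard totals.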
The advanced data visualization modules incorporate common interactions that users are already familiar with, such as dragging the mouse to group, and selecting a section of a chart (the unusual data or problem area) to highlight it or apply a filter. Unselected parts of the dashboard are grayed out automatically. Comparisons are highlighted by visually displaying the selected and unselected values to reveal patterns, trends, and correlations in the dashboard. Because the data visualization modules are seamlessly integrated with WebFOCUS, users can then pass all selections and the result of their analysis to another report or application and share with other users.

Next-Generation Multivariate Analysis

As analysts are working with larger data sets, there is a need to move from univariate and bivariate analysis to multivariate analysis.

By taking advantage of WebFOCUS RStat in the same BI product stack, the data set that the user has selected for advanced visualization can be automatically passed through a machine-learning algorithm behind the scenes. The algorithm determines the number of patterns or clusters, and labels each record based on which pattern or group it falls into. The patterns are then displayed on a special visual map that gives the user a high-level conceptual overview of the patterns, the variables that go into each, and their relative sizes.

Machine learning detects patterns, but does not explain them. Therefore, the user has to explore the patterns with advanced data visualization charts to infer the business meaning behind them. This is where the combination of machine learning and data discovery is particularly beneficial. Users can leverage the pattern map to navigate and filter the data discovery charts to understand their meaning.
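The labeling step in this section can be illustrated with a small k-means sketch in plain Python. It is an illustration under stated assumptions, not the algorithm WebFOCUS RStat uses (this paper does not name one). In particular, k-means needs the number of clusters to be supplied, whereas the process described here determines it automatically. The record values and the names kmeans and summarize are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(records, k, iterations=20, seed=0):
    """Label each record with the index of its nearest centroid.

    Assumption: k is chosen by the caller; this sketch does not estimate it.
    """
    rng = random.Random(seed)
    centroids = rng.sample(records, k)  # start from k distinct records
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid for every record.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(record, centroids[c]))
            for record in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [r for r, label in zip(records, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def summarize(records, labels, names):
    """Per-cluster size and variable means: the raw material for an overview map."""
    summary = {}
    for label in sorted(set(labels)):
        members = [r for r, l in zip(records, labels) if l == label]
        means = {n: sum(col) / len(members) for n, col in zip(names, zip(*members))}
        summary[label] = {"size": len(members), "means": means}
    return summary


# Hypothetical (income, spend) records, unscaled for simplicity.
records = [(20.0, 2.0), (22.0, 3.0), (21.0, 2.5), (80.0, 9.0), (85.0, 10.0), (82.0, 9.5)]
labels, centroids = kmeans(records, k=2)
for label, info in summarize(records, labels, ("income", "spend")).items():
    print(label, info)
```

The summary lists each cluster's size and average values but says nothing about why the groups differ, which is the interpretive work left to the analyst. A real data set would normally need its variables scaled to comparable ranges before distances are computed.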
Conclusion

Although many companies are seeking to derive value from standalone data discovery tools, there is a better, more effective way to enable efficient data analysis and visualization. By making visualization part of a broader-reaching BI platform, organizations can make visual analytics a crucial part of their decision-making culture and enterprise architecture.

Information Builders’ strategic approach offers advanced data visualization as a module within our comprehensive suite of web-based ad hoc tools. Users who are familiar with the tool workflows can easily leverage the advanced data visualization module with minimal training. This empowers organizations to combine data visualization with predictive modeling, ad hoc reporting, and dashboard creation to provide a more complete view of their business – from the discovery of critical insights to the development of strategies and tactics through the tracking of execution and outcomes – from a single BI platform.

About Information Builders

Information Builders helps organizations transform data into business value. Our software solutions for business intelligence and analytics, integration, and data integrity empower people to make smarter decisions, strengthen customer relationships, and drive growth. Our dedication to customer success is unmatched in the industry. That’s why tens of thousands of leading organizations rely on Information Builders to be their trusted partner. Founded in 1975, Information Builders is headquartered in New York, NY, with offices around the world, and remains one of the largest independent, privately held companies in the industry. Visit us at informationbuilders.com, follow us on Twitter at @infobldrs, like us on Facebook, and visit our LinkedIn page.
Corporate Headquarters: Two Penn Plaza, New York, NY 10121-2898, (212) 736-4433, Fax (212) 967-6406

informationbuilders.com

DN7507678.0514

Copyright © 2014 by Information Builders. All rights reserved. All products and product names mentioned in this publication are trademarks or registered trademarks of their respective companies.