The Power To Do More

Big Data at Ops Speed
Apps, Bits, and Ops
Data Ingestion without Ops Indigestion
Glen Campbell – Dell IT Summit 2015
April 9, 2015
Dell World Executive Summit

No… Not THAT Glen Campbell

Net Positive Effort…
Some people never open their mouths without subtracting from the sum of human knowledge.
Thomas Brackett Reed
I’m working not to resemble that statement…
#DellPTDM

Question #1 – What are You Trying to Do in BD Land?
Make Money?
Or SAVE Money?

Of prime consideration to many financial decision makers at organizations is how to tighten their CapEx:OpEx ratios.
…and then shrink their absolute CapEx costs.
Apps and Ops are part of the same conversation.

Uno, Dos, Tres (Catorce…)
1. Anecdotal or Scientific: Can we really ignore this?
2. Industry
   a. VMware’s BDE
   b. Microsoft’s HDInsight
3. Community
   a. OpenStack Sahara

Start the conversation in your Org
How Do These Things Relate?

Ops is the Reality of an Idea Over Time
~1/20th of a second
~7,000 / second
~7 billion CRUD activities / day

Venn Ops
What Should I Do?
What Can I Do?
Sometimes…
How Do I Do It?

Common Elements
Extract
Get, buy, steal every last piece of information you can. Even if you think it ISN’T related.

Common Elements
Transform
Perform, learn the business alchemy to transform that data into something known to be useful.

Common Elements
Load
Feed the business beast NEVER sated with enough information, nor enough of it, fast enough.

The Most Successful Pattern-Matching Species in History
(but we still need help)

The Elephants in the Room – VMware and Hadoop
Project Serengeti
Tadahari!

What’s New in the New Two Zoo?
› Exposed Cloudera Manager / Ambari endpoint configuration in UI
› HBase-only Clusters
  Allowing the integration with EXISTING HDFS
› Compute-only Clusters for Hadoop Data (worker) Nodes
  Allowing integration with an EXISTING phys/virt Hadoop implementation
› New Integration with Apache BigTop
  Excellent means of building / customizing / smoke-testing Hadoop builds for a customer-specific environment
› Upgrade Engine
  Allowing the upgrade of BDE from rev-to-rev leaving data undisturbed, configuration intact

Offering: Cloudera Hadoop
› Consumer VMware portal
  • Add Cloud users
  • Monitor resource consumption
  • Provision new Hadoop clusters
  • Scale up & down Hadoop nodes
› Portal & Cloudera Manager
  • Operate Cloudera Hadoop
  • RBAC to Hadoop / Datasets
  • Submit data
  • Submit processing jobs
› Toad or other Hadoop client tools
  • Submit data
  • Submit jobs
[Diagram labels: Alfa End User, Alfa Admin, Bravo End User, Bravo Admin; Toad, HUE, Portal; vRealize Automation Portal, VRO, BDE, vCenter, ESXi (CPU, NET, DSK), VM Nodes; Provider L3 IP; Logical]

Windows Azure HDInsight

HDInsight - Overview
Microsoft’s Hadoop Distribution in the Cloud
Offers Hadoop on Windows Platform
Tightly integrated with Microsoft Technology Stack
Based on Hortonworks Data Platform (HDP)

HDInsight - Architecture

Microsoft Data Platform and Enterprise BI Ecosystem

HDInsight Versions

COMPONENT                        VERSION 1.6  VERSION 2.1       VERSION 3.0       VERSION 3.1 (Current/Default)
Hortonworks Data Platform (HDP)  1.1          1.3               2.0               2.1.7
Apache Hadoop & YARN             1.0.3        1.2.0             2.2.0             2.4.0
Tez                              -            -                 -                 0.4.0
Apache Pig                       0.9.3        0.11.0            0.12.0            0.12.1
Apache Hive & HCatalog           0.9.0        0.11.0            0.12.0            0.13.1
HBase                            -            -                 -                 0.98.0
Apache Sqoop                     1.4.2        1.4.3             1.4.4             1.4.4
Apache Oozie                     3.2.0        3.3.2             4.0.0             4.0.0
Apache HCatalog                  0.4.1        Merged with Hive  Merged with Hive  Merged with Hive
Apache Templeton                 0.1.4        Merged with Hive  Merged with Hive  Merged with Hive
Ambari                           -            API v1.0          1.4.1             >=1.5.1
Zookeeper                        -            -                 3.4.5             3.4.5
Storm                            -            -                 -                 0.9.1
Mahout                           -            -                 -                 0.9.0
Phoenix                          -            -                 -                 4.0.0.2.1.7.0-2162

HDInsight Use Case - ETL Automation

HDInsight Use Case - BI Integration

Typical Implementation
[Diagram labels: Social, Web Logs, Clickstream, Transactional, Office 365 / SharePoint; Files (TXT, XML, JSON, ..); Azure Blob; Multi-Node HDInsight Cluster (MapReduce • Hive • Java); Warehouse; SSRS • Excel • Power BI; Collaboration; Reporting and Analytics]

Typical Implementation (Contd…)
PowerShell / SSIS / SQL Agent
Subscription & Cluster Management | Data Movement | Job Execution
[Diagram labels: Customers, E-Commerce, Team, Internal Systems; Web Logs, Azure Web Logs, OLTP, Transactional; Blob Storage; Sqoop Or AzCopy; Multi-Node HDInsight Cluster (MapReduce • Hive • Pig • Java • Python); Hive Metastore; Warehouse; SSRS • Excel • Power BI; Collaboration, Reporting, and Analytics]

With Open Source, if you’re USING the boat, you’re participating in how it moves.
The dynamic of Open Source software is one where participation in that community, not solely the usage of its technology, is part of the bargain.

The viability of many pieces of software is increasingly being dictated NOT merely by its functionality or the vendor, but by the community and ecosystem around it.

Multi-Voice, Multi-Need
The Community Approach

The Importance of EDP in the Open Source Cloud
The OpenStack Sahara Project

Whose Hadoop, What Versions, What Jobs…?
• Vanilla Apache Hadoop
  - 1.2.1, 2.3.0, and 2.4.1 (2.6 just out)
• Cloudera Distribution of Hadoop (CDH)
  - CDH5
• Hortonworks Data Platform (HDP)
  - 1.3.2 and 2.0.6
• Spark
  - 0.9.1, 1.0.0, 1.0.2…
Supported Job Types:
• Jar, Pig, Hive
Supported Workflows:
• Oozie
• Mistral…?
OpenStack “Ironic” Pending for Bare Metal
Plus
• Cloudera Manager
• Apache Ambari

Apps, Ops, and DATA are Elements of the SAME Conversation

Dell’s 360° In the Analytics and Big Data Ecosystem
Design, Analyse
Manage, Change
Pull, Push
Diagnose, Resolve
Us, Them
Confidential
Software Group

A Rich Portfolio of Software Assets to Drive Your Big Data Needs
• software.dell.com/Dell-Statistica
• software.dell.com/solutions/big-data-analytics
• software.dell.com/products/boomi-atomsphere/
• software.dell.com/products/toad-intelligencecentral/
• software.dell.com/products/toad-data-point/
• dell.com/bigdata
• dell.cloudera.com/
• dell.com/learn/us/en/555/solutions/hadoop-bigdata-solution
Talk to your Dell Account Team

Marry the Models
Consider Your BDaaS Self

Thank You!
Questions

SharePlex
Enterprise Technologists – Content with Context
Global Marketing