Big Data – New Reference Architectures for Information Management
Luis Campos
Big Data Solutions Lead, Oracle EMEA
@luigicampos
Copyright © 2013, Oracle and/or its affiliates. All rights reserved.

AGENDA
- The New Information
- From 3-Tier to N-Tier Architecture
- What about High Performance Computing?
- The New Reference Architectures
- New Technologies and the role of Oracle Corp.
- Challenges of the main industries

Where’s the New Information?
• New Sources from Any Data
• New Analytics on All Data
• New Integrations of Data
• New Orchestrations
• Any Computing model

What does “New Data” really mean?
Any Data, Any Source
Absorb All Dimensions of Data = 360º

What does “All Data” really mean?
Any Data, Any Source
Stop Throwing Data Away = Know More About What’s Going On in your Business

What does “Any Data” really mean?
Any Data, Any Source
Tap Any Data = New Revenue Streams

From 3-Tier to N-Tier Architecture
Tiers: Presentation Tier, Logic Tier, Data Tier
• Created partly to split the Presentation and Logic Layers
• Pushing Logic away from the Data created new challenges
• Data would need to be moved around in massive amounts, using a plethora of protocols and caching layers

What about High Performance Computing?
Does everyone have the need for Supercomputing?
• HPC: solving extraordinary real-life problems with extraordinary computing power
• Vertical Computing: Supercomputers
• Distributed Computing: Massive Computer Clusters
The New Reference Architectures
Emerging Challenges Call for New Solution Mix
• Low Latency Systems
• Pattern Recognition
• Data Science as a Service

Low Latency Systems
Real Time processing for the masses
• Mobile Computing
• Critical element in User Experience
• Element of responsiveness in any user interface
Users don’t need this message anymore: “Your request is being processed...”

Pattern Recognition
Predictive systems
• Act on patterns rather than react
• Elements:
  • Agents (Sensors)
  • Event Processing engine
  • Rules Engine
  • Action Broadcast system
• Self Learning mixed with Supervised Learning
Input: Lots of Low Density Data
Output: Immediate Actions inside a context
Examples: Guided navigation, While-you-browse recommendations, manufacturing lines, retail in-store promos

Data Science as a Service
Lend the power of science and technology to everyday problems
• Incorporate non-deterministic data
  When you can’t ask questions outside the function
• Generation G: “I need the system to tell them what I want”
Enterprise Applications:
• Government Intelligence
• Enterprise Security
• Fraud Detection

Oracle Big Data Reference Architecture
Diagram labels: Source Data Layer; Processes; COTS/ERP; External; Social/Text; Sensors; Streaming; Strongly Typed Data; Weakly Typed Data; Staging Data Layer; Data Quality; Foundation Layer; Enterprise Data with full history; Performance Layer; Embedded Data Marts; Knowledge Discovery Layer; Data Mining Sandbox; Information Access; Security and Metadata; Data Integration
Diagram labels (continued): Rapid Dev Sandbox; BI Abstraction & Query Federation; Enterprise Data Warehouse; Performance Management; Alerts, Dashboards, Reporting; Services; Information Discovery; Advanced Analysis & Data Science

Translated into Oracle Product Architecture
Diagram: the reference architecture layers, annotated with Oracle products:
• Oracle Database
  - Advanced Analytics & OLAP
  - Spatial and Graph
  - Industry Models
• Oracle NoSQL Database
• CDH
• Oracle Big Data Connectors
• Endeca Information Discovery
• Oracle BI Foundation

Translated into Oracle Engineered Systems
Diagram: the reference architecture layers, annotated with Oracle engineered systems and products:
• Oracle Exadata
• Oracle Big Data Appliance
• Big Data Connectors
• Endeca
• Oracle Exalytics

Big Data Appliance
Hadoop Ecosystem for the Enterprises
Oracle Big Data Appliance:
• Cloudera Dist. Hadoop
• Oracle NoSQL
• BD Connectors
Configurations shown:
• 18 Nodes: 648TB, 288 CPUs
• 12 Nodes (U)
• 6 Nodes: 216TB, 96 CPUs

Oracle’s Big Data Connectors
Unlock the power of Hadoop integration
Diagram: Hadoop, Oracle Big Data Connectors, Oracle Database
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.

(1) Oracle Data Integrator Application Adapters for Hadoop
Improving Productivity and Efficiency for Big Data
Diagram: Oracle Data Integrator transforms via MapReduce (Hive) and activates Oracle Loader for Hadoop, which loads Oracle Database
Benefits
• Consistent tooling across BI/DW, SOA, Integration and Big Data
• Reduce complexities: graphical tooling
• Improves productivity

(2) Oracle SQL Connector for Hadoop
Accessing HDFS Data from Oracle Database
Features
• Access or load into the database in parallel using external table mechanism
• Access and analyze data in place on HDFS
• Query and join data on HDFS with database resident data
• Load into the database using SQL if required
• Automatic load balancing to maximize performance
Diagram labels: HDFS; HDFS Client; ODCH; External Table; SQL Query; Oracle Database

(3) Oracle R Connector for Hadoop
R Analytics leveraging Hadoop and HDFS
• Linearly Scale a Robust Set of R Algorithms
• Leverage MapReduce for R Calculations
• Compute Intensive Parallelism for Simulations
Diagram labels: Oracle R Client; MAP; REDUCE; Hadoop; HDFS

What is ?
• Brings R’s statistical functionality to the Oracle Database
  • Eliminates R’s memory constraints
  • Allows R to run on very large data sets
• Oracle R is architected for enterprise production infrastructure
  • Automatically exploits database parallelism without requiring parallel R programming
• Oracle R leverages the latest R algorithms and packages
• R is an embedded component of the DBMS server
• Part of Oracle Advanced Analytics (+ODM)

Oracle R Architecture
Diagram labels: R workspace console; Function push-down – data transformation & statistics; Oracle statistics engine; OBIEE, Web Services
Stages: Development, Production, Consumption
• Leverages SQL for data prep, analysis and enhanced statistics engine
• R engine runs on database nodes for production enablement of R models
• Leverages Exadata: Oracle R workloads run in-database and can be bound to database nodes for workload isolation
• Enriches OBIEE dashboards with Oracle R statistics and analytics

Oracle Data Mining (ODM)
Data mining can answer questions that cannot be addressed through simple query and reporting techniques.
• Data Mining: Insight from discovering relationships
  • Knowledge about what happened in the past
  • Characterization, segmentation, comparisons, discrimination
  • Descriptive models of patterns
• Predictive Analytics: Making better decisions and forecasts
  • Knowledge about what is happening right now and in the future
  • Classification and prediction of patterns
  • Rule-and-model driven

Data Mining – Some Definitions
Supervised Learning

Problem Classification    Sample Problem
Classification            Predict customer response to an affinity card program
Regression                Predict customer’s age
Attribute Importance      Find the most significant predictors, data preparation

Data Mining – Some Definitions
Unsupervised Learning

Problem Classification    Sample Problem
Anomaly Detection         Identify customer purchasing behavior that is significantly different from the norm
Association Rules         Find the items that tend to be purchased together and specify their relationship (market basket analysis)
Clustering                Segment demographic data into clusters and rank the probability that an individual will belong to a given cluster
Feature Extraction        Group the attributes into general characteristics of the customers

Endeca Information Discovery
Sandbox and Production mode
• Endeca Information Discovery Studio
• Endeca MDEX Server
• Integration Suite

What is the world doing today
Large Spanish Clothes Manufacturer
• Automation
• Sensory Event Processing
• Quality Assurance

What is the world doing today
Second Largest Bank in United States of America
• Analysis of data xLoB: Loans, Insurance, on-line banking, card products
• Market assessment
• Risk Analysis
• Revenue lift for new & existing products
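
As an aside on the Association Rules definition in the data mining slides (market basket analysis), the idea of "items that tend to be purchased together" can be sketched with the two standard measures, support and confidence. This is illustrative Python only, not Oracle Data Mining's interface: the function name pair_rules, the sample baskets and the min_support threshold are all assumptions made for the example, and only item pairs are considered.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    support    = share of baskets containing both items
    confidence = share of baskets with the antecedent that also
                 contain the consequent
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to report
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return sorted(rules)


# Hypothetical shopping baskets, invented for illustration.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

rules = pair_rules(baskets, min_support=0.5)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

Raising min_support trims the rule list to more frequent pairs. A production association-rules tool would also handle item sets larger than pairs, which this sketch leaves out.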
Telco Industry
Deep, Big and Fast
Deep
• SNA*, Find Influencers, RA**
Big
• Network Optimization
• CDR Analysis
Fast
• Sentiment Analysis
• Location Based Services
• Click stream Analysis
* Social Network Analysis (Rate plan optimization)
** Revenue Assurance
© 2012 Oracle Corporation – Proprietary and Confidential

Retail Industry
Marketing, Merchandising and Supply Chain
Marketing
• In-store behaviour analysis
• Sentiment Analysis + Micro segmentation
Merchandising
• Assortment optimization
Supply Chain
• Distribution and logistics optimization
• Informing supplier negotiations

Oil and Gas Use Cases
Hadoop and Seismic Data Processing

Life Sciences / Pharmaceutical
Life Sciences
• DNA Sequencing, Diseases Correlation
Pharmaceutical
• Clinical Trial – meds simulation

Wrap Up
• New Challenges and the New Information
• N-Tier, HPC
• New Reference Architectures for New Data
• The role of Oracle Corp in developing New Technologies
• Challenges across all industries

Thank You
@luigicampos