© Copyright 2014 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
The innovative HP Big Data Technology stack
and the use cases
Realtime Analytics of extreme data
Helmut Schmitt
Sales Manager DACH
Big Data is a Massive Disruptor
“A 100 fold multiplication in the amount of data is a 10,000 fold
multiplication in the number of patterns we can see in that data.”
Philip Evans, Boston Consulting Group Fellow, TED Talk
Industry-leading breadth & depth of capabilities
Haven Big Data Platform
Contextual
Search
Data
Exploration
Core Big Data Business Capabilities
Access Explore Enrich Analyze Predict Serve Act
Image/Video
Analytics
Accelerated
Analytics
On-premise
Geospatial
Analytics
Sentiment
Analysis
SQL on
Hadoop
And more…
Predictive
Analytics
In the Cloud
DATA is an organization’s most strategic asset
Monetize
Differentiate
Personalize
Monitor
Meter
Optimize
Predict
…and more
…and its greatest risk
Regulate
Comply
Control
Secure
Address
Ensure
The Big Data Balance Sheet
Assets: Monetize, Differentiate, Personalize, Monitor, Meter, Optimize, Predict, …and more
Liabilities: Regulate, Comply, Control, Secure, Address, Ensure
We will be the trusted partner for every organization
Store
Serve
Explore
Protect
Govern
The Big Data flow
Store & Explore
Unstructured enterprise data repositories
Address business & operational objectives
Structured enterprise data repositories
Enterprise Content Management
Enterprise Search & Collaboration
Cloud-based repositories
Mobile & social media
Offsite or removable data repositories
Govern & Protect
Data
Legacy Data Cleanup
Address legal & compliance objectives
Information Archiving
eDiscovery
Legal Holds
Records Management
Address information management objectives
Backup & Recovery
Disaster Recovery
Business Resiliency
Long-Term Retention
Serve
Business resiliency
Operations Analytics
Predictive Maintenance
Smart Metering
Patient analytics
Fraud prevention
Records Management
Advertising analytics
Legal & Compliance
Vehicle Recognition
Used in association with ANPR (automatic number plate recognition)
• Match Make and/or Model
– Easy to train
– Real-time matching
• Alert or Search for Vehicle without registration
• Validate database using ANPR result to identify illegally plated vehicles
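The validation step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Sighting` type, the `registry` dictionary, and all plates and vehicles are hypothetical, and the recognition of plate, make, and model is taken as already done.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sighting:
    plate: str  # plate text as read by ANPR
    make: str   # make as classified from the image
    model: str  # model as classified from the image


def find_mismatches(sightings, registry):
    """Return (sighting, reason) pairs for plates that are unknown or whose
    recognized make/model disagree with the registered vehicle."""
    flagged = []
    for s in sightings:
        registered = registry.get(s.plate)
        if registered is None:
            flagged.append((s, "plate not in registry"))
        elif registered != (s.make, s.model):
            flagged.append((s, "make/model differs from registration"))
    return flagged


# Hypothetical registration database: plate -> (make, model)
registry = {
    "F-HP 2014": ("Volkswagen", "Touareg"),
    "M-AB 123": ("BMW", "320d"),
}

sightings = [
    Sighting("F-HP 2014", "Volkswagen", "Touareg"),  # consistent
    Sighting("M-AB 123", "Audi", "A4"),              # plate on the wrong car
    Sighting("B-XY 999", "Opel", "Corsa"),           # unknown plate
]

flagged = find_mismatches(sightings, registry)
for sighting, reason in flagged:
    print(sighting.plate, "->", reason)
```

The point of the sketch is that the plate alone is not trusted: the independently recognized make and model act as a second check against the database entry.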
Core Capabilities – Built for Speed
• We boost performance
What 1000x means:
Used to take            Now takes
1 hour                  3.6 seconds
8 hours (overnight)     Under 30 seconds
"When we did the first queries, they were done so fast, we thought they were broken."
- Michael Relich, Guess?
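The two before/after rows correspond to roughly a thousandfold reduction in run time. A small Python check of that arithmetic, using only the figures quoted on the slide:

```python
def speedup(before_seconds, after_seconds):
    """Ratio of the old run time to the new run time."""
    return before_seconds / after_seconds


one_hour = 60 * 60           # 3,600 seconds
overnight = 8 * 60 * 60      # 28,800 seconds

factor = speedup(one_hour, 3.6)
overnight_after = overnight / factor

print(f"1 hour -> 3.6 s is a speedup of about {factor:.0f}x")
print(f"8 hours at the same factor is about {overnight_after:.1f} s")
```

At the same factor, the overnight job lands just under the 30 seconds quoted.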
Secrets to Achieving Performance Increases
• Columnar Storage: speeds query time by reading only necessary data
• Compression: lowers costly I/O to boost overall performance
• MPP Scale-Out: provides high scalability on clusters with no name node or other single point of failure
• Distributed Query: any node can initiate the queries and use other nodes for work; no single point of failure
• Projections: combine high availability with special optimizations for query performance
[Diagram: three nodes, each with its own CPU, memory, and disk]
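To make the columnar storage and compression ideas concrete, here is a minimal Python sketch. It is not Vertica's storage format: the sample rows are hypothetical, and run-length encoding is used only as one simple example of a compression scheme that benefits from values being stored column by column.

```python
from itertools import groupby

# Hypothetical sales rows: (day, country, amount), grouped by country.
rows = [
    ("2014-11-03", "AT", 5),
    ("2014-11-04", "AT", 7),
    ("2014-11-03", "DE", 10),
    ("2014-11-04", "DE", 12),
    ("2014-11-05", "DE", 9),
    ("2014-11-03", "CH", 4),
]

# Column layout: one list per column instead of one tuple per row.
columns = {
    "day": [r[0] for r in rows],
    "country": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, length in runs for _ in range(length)]


def total_amount(cols):
    """An aggregate over one column touches only that column's data."""
    return sum(cols["amount"])


encoded_country = rle_encode(columns["country"])
print(encoded_country)          # three runs instead of six values
print(total_amount(columns))    # reads "amount" only, not "day" or "country"
```

Because a query such as the total above never touches the `day` or `country` lists, less data has to be read, and a column with long runs of repeated values shrinks further when encoded.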
Query Optimization Comparison
Traditional Materialized Views
• Are secondary storage
• Are rigid: practically limited to columns and query needs; more columns = more I/O
• Are mostly batch updated
• Provide high data latency
Traditional Indexes
• Are secondary storage pointing to base table data
• Support one clustered index at most, which is tough to scale out
• Require complex design choices
• Are expensive to update
• Provide high data latency
Vertica Projections
• Are primary storage – no base tables are required
• Can be segmented, partitioned, sorted, compressed and encoded to suit your needs
• Have a simple physical design
• Are efficient to load & maintain
• Are versatile – they can support any data model
• Allow you to work with the detailed data
• Provide near-real time low data latency
• Combine high availability with special optimizations for query performance
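One way to picture the projection idea from the comparison above is as a sorted, segmented copy of selected columns. The Python sketch below is an assumption-laden illustration, not Vertica's implementation: the rows, the function names, and the choice of CRC32 as the segmentation hash are all hypothetical.

```python
import bisect
import zlib

# Hypothetical fact rows: (customer, day, amount)
rows = [
    ("carl", "2014-11-05", 9),
    ("ann", "2014-11-03", 10),
    ("bob", "2014-11-04", 12),
    ("ann", "2014-11-04", 7),
    ("dora", "2014-11-03", 4),
]


def build_projection(rows):
    """Keep only (day, amount) and store the pairs sorted by day."""
    return sorted((day, amount) for _, day, amount in rows)


def range_sum(projection, first_day, last_day):
    """Sum amounts for a day range using binary search on the sort key."""
    days = [day for day, _ in projection]
    lo = bisect.bisect_left(days, first_day)
    hi = bisect.bisect_right(days, last_day)
    return sum(amount for _, amount in projection[lo:hi])


def segment(rows, node_count):
    """Assign each row to a node by hashing its customer key."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        key = row[0].encode("utf-8")
        nodes[zlib.crc32(key) % node_count].append(row)
    return nodes


by_day = build_projection(rows)
total = range_sum(by_day, "2014-11-03", "2014-11-04")
nodes = segment(rows, 3)
print(total)
print([len(n) for n in nodes])
```

Sorting lets a range query jump straight to the relevant slice, and hashing on a key keeps all rows for one customer on the same node so that work can be spread across nodes.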
Analytical Features of Vertica
Vertica SQL
Vertica Extended-SQL
Vertica Innovations
Standard SQL-99 Conventions
Advanced Analytics with SQL
Advanced Analytics using Custom Logic
Aggregate
Sessionization
Regression Testing
Analytical
Time Series
Statistical Modeling
• Window Functions
• Time slice
• Interpolation (Constant & Linear)
• Gap Filling
Aggregate
Event-based Windows
Classification Algorithms
• Conditional Change Event
• Conditional True Event
Graph
Event Series Joins
Page Rank
Monte Carlo
Social Media/Pulse
Text-mining
• Text Mining
• Patterns/Trends
Geospatial
Pattern Matching
Geospatial (Place)
• Match, Define, Pattern Keywords
• Funnel Analysis
Statistical
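The time series features listed above (time slices, gap filling, constant and linear interpolation) can be illustrated outside the database. The following is a plain Python sketch, not Vertica SQL; the function name `gap_fill`, its parameters, and the sample readings are assumptions made for illustration.

```python
def gap_fill(samples, step, mode="linear"):
    """Return one (t, value) pair per time slice between the first and last sample.

    samples: list of (t, value) pairs sorted by integer time t.
    mode "const" repeats the last known value; "linear" interpolates
    between the surrounding samples.
    """
    filled = []
    i = 0
    t = samples[0][0]
    end = samples[-1][0]
    while t <= end:
        # Advance so samples[i] is the last sample at or before t.
        while i + 1 < len(samples) and samples[i + 1][0] <= t:
            i += 1
        t0, v0 = samples[i]
        if mode == "const" or t == t0 or i + 1 == len(samples):
            filled.append((t, v0))
        else:
            t1, v1 = samples[i + 1]
            filled.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
        t += step
    return filled


# Hypothetical meter readings with a gap at t=1 and t=2.
readings = [(0, 10.0), (3, 16.0), (4, 20.0)]

linear = gap_fill(readings, step=1, mode="linear")
constant = gap_fill(readings, step=1, mode="const")
print(linear)
print(constant)
```

Constant interpolation suits values that hold until the next event (a status, a price), while linear interpolation suits continuously changing measurements such as sensor readings.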
Vertica User Defined Extensions
Analytics
• C++
• Java
• R
Connection
• ODBC/JDBC
• HIVE
• Hadoop
• Flex Zone
HP Vertica Distributed R
R-based Analytics
Challenge: Customers want to use R for analytics, but R's scalability is always a question
Solution: HP Distributed R
Benefits:
• Analyze data sets too large for standard R
• Perform complex analyses much more quickly (20x faster than Hadoop)
• Use the familiar R environment to explore data and to develop and execute algorithms
• Operate on the full data set (no down-sampling)
[Diagram: an R instance on each of three nodes, each node with its own CPU, memory, and disk]
Algorithm: Use cases
Linear Regression (GLM): Risk Analysis, Trend Analysis, etc.
Logistic Regression (GLM): Customer Response modeling, Healthcare analytics (Disease analysis)
Random Forest: Customer churn, Market campaign analysis
K-Means Clustering: Customer segmentation, Fraud detection, Anomaly detection
Page Rank: Identify influencers
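As an illustration of the last entry, here is a minimal single-process PageRank in Python. It is a textbook power-iteration sketch, not HP Distributed R code; the `follows` graph, the damping factor of 0.85, and the iteration count are assumptions chosen for the example.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict: node -> list of nodes it links to."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            targets = links.get(n, [])
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[n] / len(nodes)
                for m in nodes:
                    new_rank[m] += share
        rank = new_rank
    return rank


# Hypothetical "who follows whom" graph.
follows = {
    "ann": ["carl"],
    "bob": ["carl"],
    "carl": ["ann"],
    "dora": ["carl", "ann"],
}

ranks = page_rank(follows)
influencer = max(ranks, key=ranks.get)
print(influencer)
```

In this graph the account followed by most others ends up with the highest rank, which is the sense in which the algorithm identifies influencers.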
Introducing HP Vertica for SQL on Hadoop
• HP Vertica for SQL on Hadoop offers the only full-featured query engine on Hadoop
- Same Core Engine
- Hadoop Distribution Agnostic
- Enterprise-ready Solution
- World-class Enterprise Support and Services
- Open platform
- Ready for Haven
• Competitive price point
[Diagram: Data Exploration, Vertica ANSI SQL, Hadoop Storage]
One Query Engine to Serve it all
Store Data in HP Vertica or any Hadoop Distribution
• Query data in place in Hadoop Formats
• Co-Locate and leverage existing Hadoop infrastructure
• HP Vertica performance on lower-cost infrastructure
• Single query engine across diverse formats and infrastructure
Query Engine: HP Vertica ANSI SQL
Format: Vertica Optimized (ROS, Flex Tables); Hadoop (ORC, Parquet, et al.)
File System: Vertica (EXT4); Hadoop (HDP, CDH, MapR NFS)
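The "one engine, many formats" layering can be sketched as a single query function sitting on top of interchangeable format readers. The Python below is purely illustrative: CSV and JSON lines stand in for the storage formats named on the slide, and `READERS`, `query_total`, and the sample data are hypothetical names, not part of any HP Vertica API.

```python
import csv
import io
import json


def read_csv(text):
    """Yield rows from CSV text as dictionaries."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row


def read_json_lines(text):
    """Yield rows from JSON-lines text as dictionaries."""
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)


# One reader per storage format; the query layer does not care which is used.
READERS = {"csv": read_csv, "jsonl": read_json_lines}


def query_total(sources, column):
    """Sum one column across sources stored in different formats."""
    total = 0.0
    for fmt, text in sources:
        for row in READERS[fmt](text):
            total += float(row[column])
    return total


sources = [
    ("csv", "region,amount\nDE,10\nAT,5\n"),
    ("jsonl", '{"region": "CH", "amount": 7.5}\n'),
]

total = query_total(sources, "amount")
print(total)
```

Adding support for another format means registering one more reader; the query logic itself stays unchanged.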
Which Version Is Right for You?
HP Vertica for SQL on Hadoop
• Discover Data
• Control Costs
• Leverage Hadoop Infrastructure
• No Frills, No-Brainer
• For Hadoop environments only
• Full MPP SQL engine
• Includes JOINs, time series analysis and Key Value
• Management tools including workload management, database designer and backup and restore
• Hadoop Agnostic Compatibility
• Flex Zone
• Compression and Columnar Store
• Java UDx
HP Vertica Enterprise Edition (EE)
• Boost Performance
• Faster Analytics
• Deeper Analytics
• Customize Analytics Infrastructure
• All the bells and whistles
• Accelerated Analytics, Live Aggregate Projections, Geospatial and Sentiment Analysis
• Highly Optimized HP Vertica EXT4 file system
• C++ UDx / UDL
• High End Scalability
Think Big – Start Small
Vertica Community edition:
Up to 3 nodes
Up to 1 Terabyte
Free for production use
Scale up to Enterprise edition
Add nodes on the fly
Scale up to PB
Embed Hadoop
Leaders don’t make compromises
• Promotional Testing
• Behavior Analytics
• Claims Analyses
• Click Stream Analyses
• Patient Analyses
• Network Analyses
• Clinical data Analyses
• Customer Analytics
• Fraud Monitoring
• Compliance Testing
• Financial Tracking
• Loyalty Analysis
• Trading Analytics
• Marketing Analytics
HP Vertica’s Top Use Cases & Verticals
Communications, Media & Ent.
Consumer Web
Health & Life Sciences
Retail
Financial Services
Clickstream Analytics
✓
✓
✓
✓
✓
Customer Analytics
✓
✓
✓
✓
✓
Energy
Public Sector
✓
✓
✓
Hadoop Acceleration
EDW Modernization
Fraud Detection
✓
✓
✓
Transaction Analytics
✓
✓
Compliance
✓
✓
Security
✓
✓
Operations Analytics
✓
Sensor Data Analytics
✓
✓
✓
Big Data @ World Tour
Exhibition:
Presentations:
Transformation Experience Workshop:
Cape-to-Cape
14:20 - 14:50: Big Data use cases, tangible demos
Bernd Mußmann
Goals
Understanding what Big Data is
Defining customer’s Big Data challenges
Evaluating business and IT priorities
Introducing HP Big Data solutions
Building a Big Data transformation roadmap
HP Big Data reference architecture
HP Big Data for IT
HP Software technology stack:
HP Big Data Services
15:20 - 15:50: The innovative HP Big Data technology stack and its use cases
Helmut Schmitt
16:00 - 16:30: Big Data infrastructures and services for the data-driven enterprise
Philipp Koik & Jochen Mohr
16:40 - 17:10: Big Data as a Service, approaches and examples
Jens Scheffler
© Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Customer Benefits
Understand the benefits, scope, scale and critical
success factors
Leverage best practices
Gain stakeholder commitment
Establish a common understanding
Participants
C-level, senior staff/initiative owners (5- 8 persons)
2-3 Sr. HP consultants, HP Sales
HP Big Data Transformation Workshop
Goals
• Understanding what Big Data is
• Defining your Big Data challenges
• Evaluating business and IT priorities
• Introducing HP Big Data
• Building a customized Big Data transformation roadmap
Your Benefits
• Understand the benefits, scope, scale and critical success factors
• Leverage best practices
• Gain stakeholder commitment
• Establish a common understanding, consensus and alignment
Participants:
• C-level, senior staff/initiative owners (5-8 persons)
• Senior HP consultants, HP Sales
Location & Time Slots:
• Reception/Check-in desk for Big Data Transformation Workshop, Level 1, Kap Europa
• Information desk for Transformation Workshops, Level 4, entrance of exhibition hall
• Session options: 11:00 - 11:30, 12:00 - 12:30, 13:30 - 14:00, 14:20 - 14:50, 15:20 - 15:50, 16:00 - 16:30, 16:40 - 17:10
Record chaser Rainer Zietlow
Thank you