Design Patterns for Large- Scale Data Management Robert Hodges OSCON 2013

Design Patterns for LargeScale Data Management
Robert Hodges
OSCON 2013
©Continuent 2013
The Start-Up Dilemma
1. You are releasing Online Storefront V 1.0
2. It could be a complete bust
3. But it could be *really* big
How do you implement data management at
hacking velocity without compromising the future?
©Continuent 2013
2
The Answer: Data Fabric Design
App
Tier
App
Server
App
Server
App
Server
High Availability
Large Data Volumes
Data
Tier
Run Anywhere
Cross-Site Distribution
Polyglot DBMS
Data Fabric
©Continuent 2013
3
Technology Upgrades
Use App Design Principles for Data
Client Interface(s)
Responsibilities
Capabilities
DBMS Server
©Continuent 2013
Data Service
4
Compose Services via Design Patterns
Connector
Connector
Connector
Fabric
Connector
Transactional
Data Service
Bridge
Fault-Tolerant
Data Service
Real-Time
Data Bridge
Sharded
Data Service
Multi-Site
Data Service
©Continuent 2013
5
1. Transactional Data Service
•
Responsibility: Reliable and accessible data
storage
•
Capabilities:
©Continuent 2013
•
•
•
•
Well-de!ned client interface
Transactions and recovery
Backup and restore
Log for replication
6
MySQL: A Mature Building Block
Relational
modeling
Single
join
type
Sales Table
id
cust_id prod_id ...
1
335301
532
...
2
2378
6235
...
3
...
...
...
Prod_ID Index
Product Table
id
sku
type
prod_id
id
532
533
...
C00135
S09957
...
consumer
specialty
532
6235
...
1
2
...
Indexes speed
query, slow
updates
Single-threaded query
Transaction support
©Continuent 2013
Fast row
operations,
slow scans
of columns
7
MySQL Implementation
SQL Interface (MySQL Dialect)
MySQL Wire Protocol
MySQL 5.6/InnoDB
Percona
XtraBackup
MySQL Binlog
©Continuent 2013
8
Backup
Images
Comments and Trade-O!s
•
•
•
•
©Continuent 2013
Interface need not be SQL
Backups are critical to restore broken replicas
Pay close attention to DBMS con"guration
Design Key: Use ACID transactions (No
MyISAM!)
9
2. Fault-Tolerant Data Service
•
Responsibility: Continously available data
using shared-nothing replicas
•
Capabilities:
©Continuent 2013
•
•
•
Real-time replication between replicas
Provision/reprovision from backups
Failover and rolling maintenance
10
Replication Technology Overview
•
•
•
•
Synchronous vs. asynchronous
Master/slave vs. multi-master
Logical vs. binary
Trigger vs. log-based
Master/
Slave
©Continuent 2013
MultiMaster
11
Implementation using MySQL & MHA
Master
MHA
Manager
Daemon
Oracle
MySQL 5.6
(Performs
failover and
switch)
Async
Native
Replication
Oracle
MySQL 5.6
Oracle
MySQL 5.6
Slave
Slave
©Continuent 2013
12
Comments and Trade-O!s
•
•
•
•
©Continuent 2013
Design key: Pick the right replication method
Capabilities vary greatly
Failover is hard to implement from scratch
Design key: Address rolling maintenance
13
3. Fabric Connector
•
Responsibility: Connect applications
transparently to replicas
•
Capabilities:
©Continuent 2013
•
•
•
•
Encapsulation of DBMS server location
Transparent request proxying
Noti!cation protocol for failover/switch operations
Load balancing for performance
14
Lots of Places to Intercept Requests
Application Logic
Access Library
DBMS driver
Proxy Driver
Client proxy
Client Wire
Protocol
Level 7 router
Virtual IP Address
DNS name switching
Proxy DBMS server (to other DBMS
processes)
DBMS
Server
©Continuent 2013
Access Library Wrapper
15
Quorum and Split Brain
App
Server
App
Server
Master/Online
©Continuent 2013
App
Server
Slave/Online
16
App
Server
Slave/Offline
Library Wrapper + ZooKeeper
Application Logic
ZooKeeper Cluster
Access Library Wrapper
zk1
Java ORM
MySQL JDBC Driver
zk2
zk3
DBMS locations stored
in ZK directory; failover
through ZK event
notifications
Fault Tolerant
Data Service
©Continuent 2013
17
Comments and Trade-O!s
•
VIPs and DNS renaming are easy but
treacherous
•
•
Wire protocol translation is hard
•
Design key: balance between availability and
split brain
©Continuent 2013
Wrapper libraries are a good Goldilocks
solution
18
4. Sharded Data Service
•
Responsibility: Partition a large dataset
into manageable subsets
•
Capabilities:
©Continuent 2013
•
•
•
•
“Buckets” to contain groups of shards
Partitioning function to locate shards
Provisioning and shard migration
Intra- and cross-shard query
19
Easy vs. Hard Sharding Problems
Independent, small
subsets
©Continuent 2013
Very large subsets with
dependent data
20
Sharding Is Easy when It’s Easy
Application Logic
ZooKeeper Cluster
Access Library Wrapper
zk1
Java ORM
MySQL JDBC Driver
zk2
zk3
DBMS locations stored
in ZK directory; failover
through ZK event
notifications
Sharded
Data Service
©Continuent 2013
21
Comments and Trade-O!s
•
Design key: factor data model to enable
sharding
•
•
•
•
Sharding scheme must be simple to work
©Continuent 2013
Moving shards is di#cult
Push cross-shard queries into data warehouse
Some problems really need Cassandra or
HBase, not a SQL DBMS
22
5. Multi-Site Data Service
•
Responsibility: Spread data across multiple
geographic locations
•
Capabilities:
©Continuent 2013
•
•
•
Fast cross-site replication
Primary/DR topology for business continuity
Master/master topology to push writes closer to
users
23
Multi-Master vs. Primary/DR
Amazon East
EU West
US East
APAC Tokyo
Rackspace DFW
Write-anywhere model
No writes on DR sites
Applications w/ few con!cts
Any application
Simpler failover model
Easier to set up and manage
Problems with constraints,
large transactions
Does not a"ect SQL features
©Continuent 2013
24
Tungsten Primary/DR Service
App Server
+ Connector
App Server
+ Connector
San Jose
NYC
Slave
Master
©Continuent 2013
25
Comments and Trade-O!s
•
•
Primary/DR is robust, easy to add later
•
Design key: Keep data model simple to
enable multi-master
•
SQL DBMS do not support multi-master as
well as NoSQL (cf. Riak, Cassandra, HBase,...)
©Continuent 2013
Multi-master is a better model if you can
make it work
26
6. Real-Time Data Bridge
•
Responsibility: Transfer data in real-time
between di!erent DBMS types
•
Capabilities:
©Continuent 2013
•
•
•
•
Heterogeneous replication
[Near] Real-time
High performance without app changes
E#cient conversion between storage models
27
Why Polyglot Persistence?
Online Storefront Application
User Session
Service
Web Sales
Service
Sales Analytics
Service
Key-value store with
serialized user
sessions
Row store with
strong transactional
support
Column store w/
compression + time
series analysis
©Continuent 2013
28
Implementing Real-Time Data Bridge
Sharded
Data Service
Tungsten Master
Replication Services
Tungsten Slave
Replication Services
oltp1
oltp1
oltp2
oltp2
oltp3
Parallel loading via
CSV files into Vertica
oltp3
Serial extraction from
MySQL binlog
©Continuent 2013
29
Comments and Trade-O!s
•
Heterogeneous replication is time-consuming
to deploy
•
•
•
Can apply the pattern after the fact
©Continuent 2013
Avoid transformation as much as possible
Design key: Balance complexity of data
movement vs. getting data into the right silo
for processing
30
Data Fabric Design Pattern Recap
Connector
Connector
Connector
Fabric
Connector
Transactional
Data Service
Bridge
Fault-Tolerant
Data Service
Real-Time
Data Bridge
Sharded
Data Service
Multi-Site
Data Service
©Continuent 2013
31
Conclusion
•
Use application architecture principles to
create data services
•
Use design patterns to combine services into
data fabrics
•
•
Apply patterns incrementally
©Continuent 2013
Early focus on design keys pays o! big in the
long run
32
560 S. Winchester Blvd., Suite 500
San Jose, CA 95128
Tel +1 (866) 998-3642
Fax +1 (408) 668-1009
e-mail: [email protected]
Our Blogs:
http://scale-out-blog.blogspot.com
http://datacharmer.blogspot.com
http://www.continuent.com/news/blogs
Continuent Web Page:
http://www.continuent.com
Tungsten Replicator:
http://code.google.com/p/tungsten-replicator
©Continuent 2013