Backup of Distributed MySQL Applications
Taking a snapshot of a thousand dancing dolphins

Chander Kant, CEO
Paddy Sreenivasan, VP Engineering
www.zmanda.com
Twitter: @zmanda

Cloud Backup | Twitter @zmanda | Open Source Backup

Zmanda
• Worldwide Leader in Open Source Backup
• 500,000+ Protected Systems
• Open Source, Open APIs, Open Formats
• Smashes traditional backup business model
• MySQL Backup Specialist
• Zmanda Recovery Manager for MySQL
• Zmanda Cloud Backup

Protected by Zmanda
Subscribers of Enterprise Editions
• Web and Media
• Government
• Research & Education
• Telecom & Service Providers
• Manufacturing & Services

Top 5 MySQL Backup Requirements
• Back up a live database with minimal impact on application and users
• Versatile
  • Scale out = Multitude of servers
  • Scale up = Large databases with no increase in lock times
  • Backup of local or remote MySQL servers
• Intelligent Recovery
  • Precise restore to a particular point-in-time or database event
  • Fast restore in case of failure
• Global Enterprise Management
  • Manage all databases from a single entity
  • Backup automation from scheduling and monitoring to reporting
• Easy to Use and Secure

Zmanda Recovery Manager for MySQL
[Diagram labels: ZRM remote to MySQL; ZRM local to MySQL; ZRM to MySQL Cluster; Enterprise-wide MySQL backup]

Zmanda Recovery Manager (ZRM) for MySQL
As easy as What, Where, When and How.

Backups of MySQL Running on Amazon EC2
[Diagram labels: Zmanda Management Console; EC2; EBS; Backup Catalog; Incremental Backups (EBS); Full Backups (S3)]

Blazing Fast Snapshot based Full Backups
Scenario:
• 100+ GB of database growing into terabytes
• 24x7 application (i.e. no backup window)
• Active OLTP workload
• Need ability to restore to a specific database event
Solution:
• Storage Snapshot + MySQL Logs + Automated point-and-click restore
• Solaris 10 x86
• ZFS Snapshot
• MySQL Enterprise 5.0
• ZRM
• Raw copy speed of 500 GB/hr

Point-in-time Recovery
1. ZRM creates unified snapshots of data and the MySQL binary log
2. For point-in-time recovery between T2 and T3, ZRM reads data from snapshot T2 and replays transactions from Binlog T3 up to the RPO
3. Note that ZRM can treat an in-place snapshot as a backup (which is ideal for EBS Snapshots)

Take a Snapshot of a Thousand Dancing Dolphins

Backup & DR Needs for a large-scale MySQL Implementation
• Application managers desire a point-in-time restore that is coordinated across multiple servers
• IT managers want as identical a configuration as possible across all nodes, so that the process of replacing nodes becomes simple
• Depending on the application, the retention policy could be several years
• The overall application should be able to recover from multiple node failures, human errors or sabotage, and geographic problems (disaster, connectivity, etc.)

Coordinated Backups vs. Coordinated Restore
Coordinated Backups
• Back up all nodes consistent to a specific event
  • E.g. all rows are backed up until a specific Global Sequence Number (GSN), or a checkpoint event is created specifically for backup purposes
• Cleanest backup images, but periodic hiccups

Coordinated Backups vs. Coordinated Restore
Coordinated Restore
• Each individual node is backed up completely independently of the others
• No checkpoint event
• However, more processing is required at the time of recovery
• ZRM can be scripted to identify this point in the backed-up binary logs for every shard
• The Visual Log Analyzer feature of ZRM helps DBAs efficiently search for these points
• Clock synchronization helps

Recover Anytime, Anywhere.

Case Study: ZRM configuration with MySQL Shards
[Diagram labels: 100 database nodes; LVM Snapshots; ZRM servers; Consolidated Meta ZRM server; Converted full and incremental backups; NFS; Shared Storage (with Deduplication); Remote Data Center]

Case Study: Restoration Scenarios
• Recovery from application errors
  • Apply transactions for the node (or across nodes)
• Recovery from failed disk or node
  • Apply full backup and incremental backups to the latest checkpoint
• ZRM provides portable backup images

Backup images
• The local full backup image is an LVM snapshot on the local node
• The LVM snapshot is converted into regular backups on a weekly basis in the background
• The incremental backup data is available over NFS to the ZRM meta backup server
• The backup images and the catalog from shared storage are replicated to a remote datacenter

Backup policies
• The full and incremental backups are compressed
  • Unless deduplication-based storage is deployed
• The shared storage for backups can use deduplication

Restoration steps (Operator error)
• Identify the offending record change
  • Use the Visual Log Analyzer of ZRM on hosts for the record
  • Reasonable time synchronization is helpful here
• Identify the prior event for the key
  • Use Search in Zmanda Management Console
• Coordinated Restore Script
  • Application-level script takes input from ZRM and commits new records for all affected nodes

Restoration steps (Failed node)
• Restore the failed node to the last available backup
  • Use the Meta ZRM server for restoration
• If a checkpoint is present, use the Visual Log Analyzer of ZRM to identify the last restored checkpoint
• Call the application-level node synchronization procedure

Zmanda: Backup To Cloud

Zmanda Cloud Backup (For MySQL on Windows)
• Apps: Exchange, SQL Server, Oracle, SharePoint and MySQL
• Compliant with EU Data Protection Directive 95/46
• Network Drive support
• Logical full backups only
• Can back up remote MySQL databases

Zmanda Recovery Manager in Action
• More than one million new Athletes created every month
• Each with the ability to customize their avatars, accumulate game credits and buy virtual prizes
• The combination of users, identities, games-in-play, credits and prizes generates a lot of data at a very fast pace, all of which is core to the company's success
• Multiple Storage Engines: InnoDB, MyISAM and Archive
• In addition to regular full backups, the company must complete an incremental backup of MySQL every 15 minutes

Zmanda Recovery Manager in Action
“ZRM helps us formalize and automate the backup process for all our production data, and consolidates all backups from different systems into one consistent platform.... Furthermore, the ZRM platform greatly simplified our production systems' recovery scenarios by reducing the number of steps required in the data recovery process.”
Franck Leveneur, Senior Data Architect, Six Degrees Games, Inc
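Sketch: choosing the snapshot and binlog for a point-in-time restore

The Point-in-time Recovery slide says that for a target between T2 and T3, data comes from snapshot T2 and transactions are replayed from Binlog T3 up to the recovery point. The Python below is a minimal sketch of that selection step only. The `Backup` class and `plan_point_in_time_restore` function are hypothetical names invented for illustration and are not part of ZRM; the sketch also assumes that each backup carries the binary log written since the previous snapshot.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Backup:
    """Hypothetical record: a snapshot plus the binlog written since the previous one."""
    label: str
    taken_at: datetime


def plan_point_in_time_restore(
    backups: List[Backup], target: datetime
) -> Tuple[str, Optional[str]]:
    """Return (snapshot to restore, binlog to replay up to `target`).

    The binlog is None when the target coincides with a snapshot.
    """
    ordered = sorted(backups, key=lambda b: b.taken_at)
    times = [b.taken_at for b in ordered]
    # Index of the latest snapshot taken at or before the target.
    idx = bisect_right(times, target) - 1
    if idx < 0:
        raise ValueError("target precedes the earliest snapshot")
    base = ordered[idx]
    if base.taken_at == target:
        return base.label, None
    if idx + 1 >= len(ordered):
        raise ValueError("no backed-up binlog covers the target yet")
    # Assumption: the next backup holds the binlog spanning base..next.
    return base.label, ordered[idx + 1].label


backups = [
    Backup("T1", datetime(2024, 1, 1, 1, 0)),
    Backup("T2", datetime(2024, 1, 1, 2, 0)),
    Backup("T3", datetime(2024, 1, 1, 3, 0)),
]
plan = plan_point_in_time_restore(backups, datetime(2024, 1, 1, 2, 30))
print(plan)  # → ('T2', 'T3')
```

Raising an error for a target later than the last backup is a simplification; a real tool might instead replay from the live binary log.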
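Sketch: finding a common restore point across independently backed-up shards

The Coordinated Restore slide notes that when nodes are backed up independently, the restore point has to be located in the backed-up binary logs of every shard at recovery time. The Python below is a minimal sketch of that lookup, assuming the events have already been extracted from each shard's binary logs into time-ordered lists. `BinlogEvent` and `find_stop_points` are hypothetical names for illustration, not ZRM functions or output formats.

```python
from bisect import bisect_right
from typing import Dict, List, NamedTuple, Optional


class BinlogEvent(NamedTuple):
    """Hypothetical, simplified view of one binary log event."""
    timestamp: float  # seconds since epoch, as recorded by the shard's own clock
    log_file: str     # binary log file containing the event
    end_pos: int      # byte offset just past the event in that file


def find_stop_points(
    events_by_shard: Dict[str, List[BinlogEvent]], cutoff: float
) -> Dict[str, Optional[BinlogEvent]]:
    """For each shard, return the last event at or before `cutoff`.

    Assumes each shard's list is sorted by timestamp. A value of None
    means the shard has no backed-up event at or before the cutoff.
    """
    stop_points: Dict[str, Optional[BinlogEvent]] = {}
    for shard, events in events_by_shard.items():
        times = [e.timestamp for e in events]
        idx = bisect_right(times, cutoff) - 1
        stop_points[shard] = events[idx] if idx >= 0 else None
    return stop_points


events_by_shard = {
    "shard1": [
        BinlogEvent(100.0, "bin.000001", 120),
        BinlogEvent(200.0, "bin.000001", 340),
        BinlogEvent(300.0, "bin.000002", 90),
    ],
    "shard2": [BinlogEvent(250.0, "bin.000007", 500)],
}
stop_points = find_stop_points(events_by_shard, cutoff=220.0)
```

Because each timestamp comes from the shard's own clock, the result is only as consistent as the clocks are synchronized, which is the point the slide makes. A file and position found this way is the kind of value that could be handed to `mysqlbinlog` through its `--stop-position` option when replaying a shard's log.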
© Copyright 2024