SpotCheck: Designing a Derivative IaaS Cloud on the Spot Market
Prateek Sharma, Stephen Lee, Tian Guo, David Irwin, Prashant Shenoy
University of Massachusetts Amherst

IAAS CLOUDS
• Infrastructure as a Service (IaaS)
• Examples: Amazon EC2, Google Compute Engine, Rackspace
• IaaS rents out physical or virtual computing resources

Cloud servers have different cost and availability tradeoffs:

On-demand Servers:
• Fixed price per unit time
• Non-revocable

Spot Servers:
• Variable prices based on market conditions
• Revocable ⇒ lower availability
• Prices tend to be lower than those of on-demand servers

SPOT INSTANCES
• Allow the cloud provider to sell surplus capacity
• A second-price auction determines the spot price
• (Spot price > Bid) ⇒ termination
• Small termination warning (~2 minutes)

[Figure: Price ($/hr) vs. Time, showing the spot price, the bid price, and the point marked "Instance Terminated"]
[Figure: Availability CDF vs. spot-price / on-demand-price ratio for m3.medium, m3.large, m3.xlarge, and m3.2xlarge]

Spot instances are really cheap!

PROBLEM STATEMENT
Run interactive applications on a mix of spot and on-demand servers:
1. Ability to run interactive, disruption-intolerant applications
2. Not lose application state
3. Provide servers to customers at low cost
4. Not adversely impact application performance

OUR SOLUTION: DERIVATIVE CLOUDS
• System to manage a mix of spot and on-demand instances
• Intermediate layer between users and the IaaS cloud provider
• Provides an interface to users similar to the one provided by IaaS (virtual machines)
• Transparently migrates VMs between pools

[Figure: user VMs migrating between the on-demand pool and the spot instance pool]

OUR SYSTEM: SPOTCHECK
SpotCheck: a derivative cloud on spot and on-demand instances.
Run on spot when possible, move to on-demand when evicted.

BOUNDED TIME VM MIGRATION
Bounded time migration: the VM migrates within a specified time.
1. VM dirty memory pages are transferred to a backup server continuously, incrementally, and asynchronously
2. A backup server is able to support multiple (~50) VMs
3. The memory checkpoint is lazily restored from the backup server if the spot instance terminates: fetch page on first access

[Figure: a spot instance and an on-demand instance, each running a user VM on Xen Blanket over a Linux kernel (Dom 0); the spot instance writes dirty memory pages to the backup server, and the on-demand instance lazily restores VM pages from it]

PERFORMANCE & SCALABILITY
[Figure: SpecJBB throughput (bops) vs. number of VMs per backup server]
[Figure: TPC-W response time (ms) vs. number of VMs per backup server]

40 VMs can share one backup server.

COST
[Figure: average cost per hour ($) for the 1-Pool, 2-Pools, 4-Pools Equal Distributed, 4-Pools Cost, and 4-Pools Stability policies; series: Xen live migration, unoptimized full restore, SpotCheck with full restore, SpotCheck with lazy restore]

Expected Cost = 0.2 × On-Demand = $0.014 / hour
Save 80% on your EC2 bill.

AVAILABILITY
• Unavailability is due to migrations from spot to on-demand
• Small downtime (~20 seconds) during migration
• Downtime is due to the latency of IaaS operations: detaching and reattaching network and storage

[Figure: unavailability (%) for the 1-Pool, 2-Pools, 4-Pools Equal Distributed, 4-Pools Cost, and 4-Pools Stability policies; series: Xen live migration, unoptimized full restore, SpotCheck with full restore, SpotCheck with lazy restore]

99.9989% Availability
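The placement rule stated on the poster (run on spot when possible, move to on-demand when the spot price exceeds the bid) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not SpotCheck's actual policy code: the hourly price trace, the bid, the function name `simulate_placement`, and the choice to charge the market price (rather than the bid) for spot hours are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlacementResult:
    total_cost: float  # dollars paid over the whole trace
    spot_hours: int    # hours the VM spent on a spot instance
    migrations: int    # number of spot -> on-demand evictions


def simulate_placement(spot_prices, bid, on_demand_price):
    """Walk an hourly spot-price trace and apply the poster's rule:
    spot price > bid means the spot instance is terminated, so the VM
    runs on an on-demand server for that hour instead."""
    total_cost = 0.0
    spot_hours = 0
    migrations = 0
    on_spot = True  # assumption: the VM starts out on a spot instance
    for price in spot_prices:
        if price > bid:
            if on_spot:
                migrations += 1  # eviction: move the VM to on-demand
            on_spot = False
            total_cost += on_demand_price
        else:
            on_spot = True
            total_cost += price  # assumption: pay the market price, not the bid
            spot_hours += 1
    return PlacementResult(total_cost, spot_hours, migrations)


# Hypothetical five-hour trace with one price spike above the bid.
result = simulate_placement([0.01, 0.01, 0.09, 0.09, 0.01],
                            bid=0.07, on_demand_price=0.07)
print(result)
```

In this made-up trace the VM spends three of five hours on spot and is evicted once; the two spike hours are billed at the on-demand price.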
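The three steps listed under BOUNDED TIME VM MIGRATION (incremental checkpointing of dirty pages to a backup server, then fetching each page on first access after a termination) can be sketched as a toy model. The classes `BackupServer` and `VMMemory` are hypothetical names invented for illustration; the real system works at the hypervisor level and flushes pages asynchronously in the background, whereas this sketch flushes synchronously for simplicity.

```python
class BackupServer:
    """Holds the latest checkpointed copy of each page, keyed by (vm_id, page_no).
    One server can hold pages for several VMs."""

    def __init__(self):
        self.pages = {}

    def store(self, vm_id, page_no, data):
        self.pages[(vm_id, page_no)] = data

    def fetch(self, vm_id, page_no):
        return self.pages[(vm_id, page_no)]


class VMMemory:
    """Toy model of one VM's memory on a single host."""

    def __init__(self, vm_id, backup):
        self.vm_id = vm_id
        self.backup = backup
        self.resident = {}  # pages currently present on this host
        self.dirty = set()  # pages written since the last flush

    def write(self, page_no, data):
        self.resident[page_no] = data
        self.dirty.add(page_no)

    def flush_dirty(self):
        """Incremental checkpoint: send only pages changed since the last flush."""
        sent = len(self.dirty)
        for page_no in sorted(self.dirty):
            self.backup.store(self.vm_id, page_no, self.resident[page_no])
        self.dirty.clear()
        return sent

    def read(self, page_no):
        """Lazy restore: a page missing locally is fetched on first access."""
        if page_no not in self.resident:
            self.resident[page_no] = self.backup.fetch(self.vm_id, page_no)
        return self.resident[page_no]


backup = BackupServer()
spot_vm = VMMemory("vm-1", backup)
spot_vm.write(0, b"a")
spot_vm.write(1, b"b")
spot_vm.flush_dirty()      # first checkpoint sends both pages
spot_vm.write(1, b"c")
spot_vm.flush_dirty()      # second checkpoint sends only page 1

# Spot instance terminated: the VM resumes on an on-demand host with empty memory.
restored_vm = VMMemory("vm-1", backup)
print(restored_vm.read(1))  # page 1 is pulled from the backup server on first access
```

Because only the touched page is fetched, the restored VM can resume before its whole checkpoint has been copied, which is the point of the lazy variant compared with a full restore.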
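The headline numbers in the COST and AVAILABILITY sections can be checked with a few lines of arithmetic. The on-demand price of $0.07/hour used here is not printed on the poster; it is the value implied by its own equation ($0.014 / 0.2). Treating every outage as one ~20-second migration is likewise an assumption made only to get a rough count.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def expected_cost(on_demand_price, cost_ratio):
    """Expected hourly cost as a fraction of the on-demand price."""
    return on_demand_price * cost_ratio


def downtime_seconds_per_year(availability_percent):
    """Convert an availability percentage into seconds of downtime per year."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


on_demand = 0.07  # $/hour, implied by 0.014 / 0.2 on the poster
cost = expected_cost(on_demand, 0.2)
savings = 1 - cost / on_demand
downtime = downtime_seconds_per_year(99.9989)

print(f"expected cost: ${cost:.3f}/hour")
print(f"savings: {savings:.0%}")
print(f"downtime: {downtime:.0f} s/year (~{downtime / 60:.1f} minutes)")
print(f"~{downtime / 20:.0f} migrations/year at ~20 s each")
```

Under these assumptions, 99.9989% availability corresponds to just under six minutes of downtime per year, or on the order of 17 migrations of about 20 seconds each.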