Dive Into VM Live Migration
OpenStack Liberty Summit 2015 Vancouver
Michał Dulko, Michał Jastrzębski, Paweł Koniszewski

Why Bother?

Use Cases
o Imminent host failure
o Maintenance mode
o Optimal resource placement

Imminent Host Failure
o Cooling issues
o Storage problems
o Networking problems
o Your datacenter was struck by a flood

Maintenance Mode
o Firmware upgrades
o Hardware upgrades
o Kernel upgrades

Optimal Resource Placement
o Reduce costs
    o Move VMs closer to their storage to lessen network latency
    o Stack more VMs on hosts to save power
o Increase resiliency
    o Noisy neighbour separation
    o Spread VMs across more hosts

General Flow

Assumptions
o Live
o Consistent
o Transparent
o Minimal service disruption

Migrations in OpenStack
o Non-live migration (cold migration)
    nova migrate <server>
o True live migration (shared storage or volume-based)
    nova live-migration <server> [<host>]
o Block live migration
    nova live-migration --block-migrate <server> [<host>]

Compatibility (LM = live migration)

Migration type                    Local storage   Volumes   Shared storage
Block LM                          ✓               ✗         ✗
True LM                           ✗               ✓         ✓
Block LM with read-only devices   ✗               ✗         ✗
True LM with read-only devices    ✗               ✗         ✓

Live Migration Process
o Pre-migration
o Reservation
o Iterative pre-copy
o Stop and copy
o Commitment

Pre-migration
[Diagram: compute node A runs VM A (active); compute node B is empty.]
The VM is active on physical host A; host B is selected by the scheduler or preselected.

Reservation
[Diagram: compute node A runs VM A (active); compute node B holds VM A (reserved).]
Confirm availability of resources on host B; reserve a new VM.

Iterative pre-copy
[Diagram: memory is copied from compute node A, where VM A is active, to compute node B, where VM A is paused.]
Memory is transferred from A to B; pages dirtied during the transfer are then copied iteratively.

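The iterative pre-copy stage described above can be sketched as a small simulation. This is a simplified model for illustration only, not how QEMU, libvirt or Nova implement it: the function name `simulate_precopy` and all of its parameters are hypothetical, and it assumes a constant dirty rate and constant bandwidth. Each round resends what was dirtied during the previous round, and the VM is paused for the final copy once the remaining data fits within the allowed downtime.

```python
def simulate_precopy(total_mib, dirty_mib_s, bandwidth_mib_s,
                     max_downtime_s, max_rounds=30):
    """Toy model of iterative pre-copy (hypothetical helper).

    Returns (converged, rounds, transferred_mib, expected_downtime_s).
    `rounds` counts every pass over the data, including the final
    stop-and-copy pass when the migration converges.
    """
    remaining = float(total_mib)  # the first pass copies all guest memory
    transferred = 0.0
    for round_no in range(1, max_rounds + 1):
        # Time needed to send what is left if the VM were paused right now.
        expected_downtime = remaining / bandwidth_mib_s
        if expected_downtime <= max_downtime_s:
            # Stop and copy: pause the VM and send the remaining pages.
            transferred += remaining
            return True, round_no, transferred, expected_downtime
        # Live pass: the VM keeps running and dirties memory meanwhile.
        copy_time = remaining / bandwidth_mib_s
        transferred += remaining
        # The guest cannot dirty more memory than it has.
        remaining = min(total_mib, dirty_mib_s * copy_time)
    # Assumed cut-off: give up after max_rounds passes without converging.
    return False, max_rounds, transferred, remaining / bandwidth_mib_s


if __name__ == "__main__":
    # Hypothetical numbers: 16 GiB guest, 1000 MiB/s link, 0.5 s downtime.
    for dirty_rate in (100, 1200):
        print(dirty_rate, simulate_precopy(16384, dirty_rate, 1000, 0.5))
```

With these assumed numbers, a guest dirtying 100 MiB/s converges on the third pass, while a guest dirtying 1200 MiB/s (faster than the link) never converges and keeps resending its whole memory. That is the sense in which memory-intensive VMs are hard to migrate.
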
Stop and copy
[Diagram: VM A is paused on both compute node A and compute node B while the remaining state is copied from A to B.]
Suspend the VM and copy the remaining pages and CPU state.

Commitment
[Diagram: VM A is paused on compute node A and active on compute node B.]
Host B becomes the primary host for VM A.

Performance & reliability

Pitfalls
o OpenStack does not allow triggering any operations on a VM during a live migration (LM)
o VMs with memory-intensive workloads are hard to migrate
o LM generates heavy load on the network
o Migrations between compute nodes with different CPUs
o Memory oversubscription

Interacting With Live Migration
o OpenStack disallows any operation on an ongoing LM
o You can use virsh to interact with it instead

Diagnosis
o Information about an ongoing LM
    virsh domjobinfo <domain>

    Time elapsed        1918595 ms
    Data processed      410.137 GiB
    Data remaining      4.600 GiB
    Data total          16.008 GiB
    Constant pages      144658
    Normal pages        107307605
    Normal data         409.346 GiB
    Expected downtime   1023 ms

Forcing Migration Finish
o Cancel an ongoing LM
    virsh domjobabort <domain>
o Pause the VM during LM
    virsh suspend <domain>

Tuning Maximum Downtime
o QEMU
    virsh qemu-monitor-command --hmp <domain> migrate_set_downtime <time (sec)>
o libvirt
    virsh migrate-setmaxdowntime <domain> <time (ms)>

Auto Converge
o nova.conf setting
    live_migration_flag += VIR_MIGRATE_AUTO_CONVERGE

Tunneled Migration
o nova.conf setting
    live_migration_flag += VIR_MIGRATE_TUNNELLED
[Diagram: source host and destination host, each with a libvirt layer above a hypervisor layer; with tunnelling, migration data travels over the libvirt-to-libvirt connection.]

Tunneled Migration
o nova.conf setting
    live_migration_flag -= VIR_MIGRATE_TUNNELLED
[Diagram: source host and destination host, each with a libvirt layer above a hypervisor layer; without tunnelling, migration data travels directly between the hypervisors.]

Tuning Bandwidth
o libvirt
    virsh migrate-setspeed <domain> <speed (MiB/s)>
o nova.conf setting
    live_migration_bandwidth = <speed (MiB/s)>

XBZRLE Compression
o nova.conf setting
    live_migration_flag += VIR_MIGRATE_COMPRESSED
[Diagram: the source host keeps a cache of sent pages and delta-compresses each updated page against it; the destination host applies the delta to the received page to obtain the updated page.]

LM On Dedicated Network
o nova.conf
    o live_migration_uri = qemu+tcp://%s/system
[Diagram: compute node A (VM A active) and compute node B (VM A paused) connected only by the management network.]

LM On Dedicated Network
o nova.conf
    o live_migration_uri = qemu+tcp://%s-lm/system
    o Set up your DNS to resolve hostnames with the -lm suffix to IPs in your dedicated network.
[Diagram: compute node A (VM A active) and compute node B (VM A paused) connected by the management network and by a separate LM network.]

Different CPUs Between Compute Nodes
o The CPU instruction set of the source node needs to be a subset of the CPU instruction set of the destination node
[Diagram: compute node A supports MMX and AVX; compute node B supports MMX, SSE2 and AVX. Live migration from A to B passes; live migration from B to A fails.]

Different CPUs Between Compute Nodes
o This restriction can be sidestepped by explicitly setting the VM CPU model in nova.conf:
    o cpu_mode = custom
    o virt_type = kvm or virt_type = qemu
    o And then you can set cpu_model
    o The list of supported named CPUs is in libvirt/cpu_map.xml

Memory Oversubscription
o LM to a specific host does not use memory oversubscription
o ram_allocation_ratio
[Diagram: compute node A has 2 GB RAM and reports RAM = available - reserved. nova-conductor sees 2 GB; nova-scheduler, with ram_allocation_ratio = 2.0, sees 4 GB.]

Memory Oversubscription
o Work around it by setting
    o reserved_host_memory_mb=-2048
[Diagram: compute node A has 2 GB RAM and, with the negative reservation, reports 4 GB (available - reserved). nova-conductor sees 4 GB; nova-scheduler, with ram_allocation_ratio = 1.0, sees 4 GB.]

Secure Live Migration

Why Security Matters?
o Everything can be sniffed!
o Migrated machines can contain sensitive data
o Legal issues with unencrypted data transfer

Encryption
o Hypervisor native encryption
    o QEMU doesn’t support it
o libvirt tunneled transport
    o live_migration_uri = qemu+ssh://%s/system
    o live_migration_flag += VIR_MIGRATE_TUNNELLED
    o Uses only one core
o IPSec tunnel between hosts

Memory Access Is Critical
[Bar chart: transfer rate in GBps (axis from 0 to 3) for QEMU+SSH versus QEMU+TCP, measured on Intel(R) Xeon(R) CPU E5-2690 v2 and Intel(R) Xeon(R) CPU E5-2660 v3.]

Future Of Live Migration

Multithreaded Compression
o Compress every page sent during LM
o zlib used for compression
o Configurable:
    o Number of threads
    o Compression ratio

Post-copy Live Migration
o Move workload immediately to destination host
[Diagram: memory is still being copied from compute node A, where VM A is paused, to compute node B, where VM A is already active.]

Post-copy Live Migration
o Cheap solution to finish live migration in a finite time
o VM needs to be rebooted in case of failure
o Heavy performance impact

Active LM Monitoring In OpenStack
o Track memory transfer progress
o Detect possible problems and take actions

Actions On Ongoing Live Migration
o Pause VM
o Abort LM
o See progress
o Change configuration on the fly:
    o Maximum tolerable VM downtime
    o Transfer bandwidth

Your voice matters!
o Mailing lists:
    o [email protected]
    o [email protected]
o Win The Enterprise group:
    o [email protected] (IRC: pkoniszewski)
    o [email protected] (IRC: inc0)
    o [email protected] (IRC: dulek)

Q&A (& disclaimers)
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.
Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. © 2015 Intel Corporation.