Memory Deduplication in Virtualized Environments

Decreasing the Memory Footprint of Virtual Machines
Karim Elghamrawy, Diana Franklin, Fred Chong
VM Configurations
We use five Ubuntu virtual machines running commonly used cloud applications, as shown in the table below:

VM1: Ubuntu 14 running the Apache benchmark
VM2: Ubuntu 13 running the Apache benchmark
VM3: Ubuntu 14 compiling the Linux kernel
VM4: Ubuntu 14 running the Redis benchmark
VM5: Ubuntu 14 running sysbench
Workload Characterization
Our experiment uses the KVM hypervisor and eight workload mixes, each comprising two virtual machines. Figure 1 shows the number of identical and similar pages for the eight mixes, classified as stable, pseudo-stable, and unstable over a period of 10 minutes (a sketch of one possible classification rule follows the mix list below). The vast majority of identical pages remain stable, and the vast majority of similar pages are either stable or pseudo-stable. Figure 2 shows the number of similar pages with different divergence sizes (d). The eight mixes are:
Mix1: VM1 and VM3
Mix2: VM1 and VM4
Mix3: VM1 and VM5
Mix4: VM2 and VM3
Mix5: VM2 and VM4
Mix6: VM2 and VM5
Mix7: VM3 and VM4
Mix8: VM4 and VM5
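
As a minimal sketch of how such a classification could be made (the block size, sampling scheme, and divergence cut-off here are illustrative assumptions, not the experiment's parameters), a page sampled periodically over the 10-minute window could be classified as follows:

# Hypothetical offline classifier for the stable / pseudo-stable / unstable
# categories of Figure 1. Block size and cut-off are assumptions.
PAGE_SIZE = 4096
BLOCK_SIZE = 64                      # assumed block granularity
MAX_PSEUDO_STABLE_BLOCKS = 4         # assumed divergence cut-off

def diverging_blocks(page_a, page_b):
    """Count the blocks in which two snapshots of a page differ."""
    return sum(
        page_a[off:off + BLOCK_SIZE] != page_b[off:off + BLOCK_SIZE]
        for off in range(0, PAGE_SIZE, BLOCK_SIZE)
    )

def classify(snapshots):
    """Classify one page from periodic snapshots taken over the window."""
    baseline = snapshots[0]
    worst = max(diverging_blocks(baseline, s) for s in snapshots[1:])
    if worst == 0:
        return "stable"               # candidate for traditional page sharing
    if worst <= MAX_PSEUDO_STABLE_BLOCKS:
        return "pseudo-stable"        # candidate for the difference cache
    return "unstable"                 # ignored

# Example: a page whose content flips one 64-byte block during the window.
base = bytes(PAGE_SIZE)
touched = base[:128] + b"\xff" * BLOCK_SIZE + base[128 + BLOCK_SIZE:]
print(classify([base, touched, base]))   # -> pseudo-stable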
Unlike traditional page sharing, pages that are identical or similar share part of their physical address and differ only in the most significant bits (MSBs) of the address. Those differing bits allow the difference cache to store per-page differences among pages that start out as identical or similar.
Block Threshold Analysis
Efficiency (η) is defined as the ratio of the absolute memory saved to the difference cache size required to achieve that savings; Savings (S) is the absolute memory saved in megabytes.
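
Written out, with C denoting the difference cache capacity required for that savings (C is our shorthand here, not notation from the poster), the two metrics and the product plotted in Figure 3 are:

    η = S / C,        S · η = S² / C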
Difference Cache Design
Since many DRAM systems now support DRAM caching by attaching a cache to the DRAM, we implement our difference cache using a DRAM cache together with a redesigned memory controller. Our implementation requires no changes to the processor or to the operating systems running on top of the VMM.
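
As a behavioral sketch only (not the hardware design, and with structure and field names that are assumptions made for illustration), the read path through such a controller could look like this: blocks present in the difference cache override the corresponding blocks of the shared base page.

# Behavioral model, not the actual hardware: a read to a merged page is
# served from the difference cache when a differing block is present there,
# and from the shared base page otherwise. All names are illustrative.
class DiffCacheModel:
    def __init__(self):
        self.base_frame = {}    # merged page number -> shared base page number
        self.diff_blocks = {}   # (merged page number, block index) -> block bytes

    def read_block(self, dram, page, block):
        if page not in self.base_frame:               # not merged: normal access
            return dram[page][block]
        patch = self.diff_blocks.get((page, block))   # difference cache lookup
        if patch is not None:
            return patch                              # differing block
        return dram[self.base_frame[page]][block]     # fall back to base page

# Toy usage: page 7 is merged onto base page 3 and differs only in block 2.
dram = {3: [bytes([3])] * 64, 9: [bytes([9])] * 64}
ctrl = DiffCacheModel()
ctrl.base_frame[7] = 3
ctrl.diff_blocks[(7, 2)] = bytes([0xAB])
print(ctrl.read_block(dram, 7, 2))   # served from the difference cache
print(ctrl.read_block(dram, 7, 5))   # served from the shared base page
print(ctrl.read_block(dram, 9, 0))   # unmerged page, ordinary DRAM access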
The block threshold is the number of blocks per page that the difference cache can track. As the block threshold increases, S is expected to increase, but η is expected to decrease.
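
The measured data behind Figure 3 places the peak of S·η at about 4 blocks; the sketch below only illustrates the direction of the trade-off, using a made-up divergence histogram (none of these numbers are measurements).

# Illustrative sweep over the block threshold. A page can be merged only if
# it diverges in at most 'threshold' blocks, so raising the threshold admits
# more pages (S grows) while each admitted page leaves a larger difference in
# the cache (eta shrinks). The histogram is made up, not measured data.
PAGE_BYTES, BLOCK_BYTES = 4096, 64

# number of diverging blocks -> number of similar pages with that divergence
histogram = {1: 20_000, 2: 12_000, 4: 8_000, 8: 5_000, 16: 3_000, 32: 1_000}

for threshold in (1, 2, 4, 8, 16, 32):
    pages = sum(n for d, n in histogram.items() if d <= threshold)
    diff_bytes = sum(n * d * BLOCK_BYTES for d, n in histogram.items() if d <= threshold)
    saved_mb = pages * PAGE_BYTES / 2**20            # S: freed page frames
    cache_mb = diff_bytes / 2**20                    # difference cache needed
    eta = saved_mb / cache_mb                        # efficiency
    print(f"threshold {threshold:2d}: S = {saved_mb:6.1f} MB, "
          f"cache = {cache_mb:5.1f} MB, eta = {eta:5.1f}")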
Figure 3: The Savings-Efficiency product (S·η, in MB) for block thresholds of 1 to 32 and the different workload mixes.
Figure 3 plots the Savings-Efficiency product (S·η). For all workloads, the maximum product occurs at around 4 blocks. Figures 4 and 5 present, for each workload, the absolute memory savings and the difference cache size required at three block thresholds: 4 (the sweet spot), 8, and 32 (the maximum absolute savings achievable).
Abstract
Multiple virtual machines usually run on the same physical server.
These virtual machines often run similar operating systems and
applications resulting in duplicate memory data. Reducing the
memory footprint of those VMs decreases the energy consumed by
the memory system. Transparent page sharing is a memory
deduplication mechanism that detects duplicate memory pages and
keeps only one copy in the physical memory. We achieve more
memory savings by further storing page differences. To mitigate the
associated performance overhead, we try to predict the state of the
page. Stable pages are good candidates for traditional page sharing,
pseudo-stable pages are good candidates for storing the differences,
and unstable pages are ignored.
Figure 4: Absolute memory savings (S, in MB) for each workload mix at block thresholds of 4, 8, and 32.
Figure 5: Difference cache capacity (in MB) required for each workload mix at block thresholds of 4, 8, and 32.
Figure 1: The number of identical and similar pages for each workload mix.
Figure 2: The number of similar pages categorized by divergence size (d).
Performance Considerations
To reduce the performance overhead associated with merging pages that are highly dynamic, we try to predict the behavior of pages based on their Linux kernel page flags.
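
The poster does not list which flags the predictor uses; as one hedged illustration, a host-side tool could read the documented /proc/kpageflags interface and treat a few flags (dirty, anonymous, slab) as hints. The bit positions below follow the kernel's pagemap documentation, but the "likely stable" rule itself is an assumption made for illustration, not the predictor evaluated in Figures 6-9.

# Hedged illustration: reading per-frame flags from /proc/kpageflags (a
# documented, root-only Linux interface with one 64-bit entry per page frame)
# and applying a crude stability hint. The rule below is an assumption for
# illustration, not the predictor used in the poster.
import struct

KPF_DIRTY, KPF_SLAB, KPF_ANON = 4, 7, 12   # bit positions from the pagemap docs

def read_kpageflags(pfn):
    """Return the 64-bit flag word for one physical page frame number."""
    with open("/proc/kpageflags", "rb") as f:
        f.seek(pfn * 8)                     # one u64 entry per frame
        return struct.unpack("<Q", f.read(8))[0]

def likely_stable(flags):
    """Crude heuristic: treat clean, non-anonymous, non-slab frames as stable."""
    return not (flags & ((1 << KPF_DIRTY) | (1 << KPF_SLAB) | (1 << KPF_ANON)))

if __name__ == "__main__":
    pfn = 0x1000                            # arbitrary frame for the demo
    flags = read_kpageflags(pfn)            # needs root on a Linux host
    print(f"pfn {pfn:#x}: flags {flags:#018x}, likely stable: {likely_stable(flags)}")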
Figure 6: Flags breakdown of physical pages that start out as "identical" and are referenced by a "single" process
Figure 7: Flags breakdown of physical pages that start out as "similar" and are referenced by a "single" process
Figure 8: Flags breakdown of physical pages that start out as "similar" and are referenced by "many" processes
Figure 9: Flags breakdown of physical pages that start out as "identical" and are referenced by "many" processes