Memory compression and deduplication
Student: KonradWitaszczyk (def@FreeBSD.org)
Mentor: DavidChisnall (theraven@FreeBSD.org)
Project description
Thanks to swapping, FreeBSD can keep running processes even when there is no room to place new pages in memory. However, moving VM objects to and from secondary storage can be expensive. The goal of this project is to implement two features that allow the VM system to remove duplicated pages and to compress pages that compress well. Both optimizations should improve VM performance and, by reducing the amount of data written to swap, extend the lifetime of flash drives.
Approach to solving the problem
The project will focus on memory deduplication first. Afterwards I may work on memory compression, depending on time and needs.
Please note that this paragraph will be updated during my research.
There are three main problems in memory deduplication:
- Which pages should be compared?
- What structures should be used to store information about pages during the deduplication process?
- How should locks be managed so that deduplication neither breaks the VM system nor degrades performance?
In this section I would like to answer the above questions and present the idea behind my implementation.
Memory deduplication can improve VM performance in many cases. One of them is running multiple virtual machines with the same OS. From the VM system's perspective, the memory allocated by the virtual machines is opaque: it knows nothing about the source or content of those pages, so it cannot tell in advance which pages are identical.
Instead, a daemon can periodically check which pages can be merged. The VM system already implements its paging and swapping daemons using SYSINIT(9), and they are good models for a memory deduplication daemon.
Given two pages, the daemon can merge them when:
- They are active;
- They represent anonymous memory;
- They don't change frequently;
- Their content is the same.
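The merge conditions above can be sketched as a simple predicate. This is a minimal userspace illustration, not kernel code: `struct page_info`, `SKETCH_PAGE_SIZE`, `MAX_RECENT_WRITES`, and both function names are hypothetical and do not correspond to existing FreeBSD VM structures. How "doesn't change frequently" is measured is also an assumption here (a per-page write counter).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE 4096

/* Assumed threshold: a page written more often than this during
 * recent scans is treated as changing too frequently to merge. */
#define MAX_RECENT_WRITES 0u

/* Hypothetical per-page record; not a FreeBSD kernel structure. */
struct page_info {
	bool		 active;	/* page is in active use */
	bool		 anonymous;	/* page backs anonymous memory */
	uint32_t	 recent_writes;	/* writes observed in recent scans */
	const uint8_t	*data;		/* SKETCH_PAGE_SIZE bytes of content */
};

/* Cheap checks first: state flags and write frequency. */
static bool
page_is_candidate(const struct page_info *p)
{
	return (p->active && p->anonymous &&
	    p->recent_writes <= MAX_RECENT_WRITES);
}

/* Two distinct candidate pages can be merged only if their
 * contents are byte-for-byte identical. */
static bool
pages_can_merge(const struct page_info *a, const struct page_info *b)
{
	if (a == b)
		return (false);
	if (!page_is_candidate(a) || !page_is_candidate(b))
		return (false);
	return (memcmp(a->data, b->data, SKETCH_PAGE_SIZE) == 0);
}
```

The full content comparison runs last because it is the most expensive test. A real implementation would also have to hold the appropriate locks while comparing, which this sketch ignores.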
During the project I will check the following approaches:
- Deduplicate page table pages (PTP).
Deliverables
- Deduplication of identical pages in various scenarios.
- Memory compression using a memory pool and swap (optional).
Milestones
Start | End | Task |
May 25 | May 31 | Start of coding. |
June 1 | June 7 | Create structures. |
June 8 | June 14 | Deduplicate page table pages. |
June 15 | June 21 | Deduplicate page table pages. |
June 22 | June 26 | |
June 26 19:00 UTC | July 3 19:00 UTC | Mid-term Evaluations. |
July 4 | July 12 | |
July 13 | July 19 | |
July 20 | July 26 | |
July 27 | August 2 | |
August 3 | August 9 | |
August 10 | August 16 | Code review. |
August 17 | August 21 | End of coding (soft). |
August 21 19:00 UTC | | End of coding (hard). |
Test Plan
- Use bhyve to run several virtual machines with the same OS;
- Use the VM meter (sysctl vm.stats) to count duplicated pages and verify VM performance. The exact metrics depend on the final implementation.
The Code
Repository: https://reviews.freebsd.org/diffusion/DEFGSOC/
Project page in Phabricator: https://reviews.freebsd.org/project/profile/52/
Useful links
Design elements of the FreeBSD VM system - https://www.freebsd.org/doc/en_US.ISO8859-1/articles/vm-design/
Increasing memory density by using KSM - http://landley.net/kdocs/ols/2009/ols2009-pages-19-28.pdf
Security Implications of Memory Deduplication in a Virtualized Environment - http://www.cs.wm.edu/~hnw/paper/memdedup.pdf