Transparent Page Sharing (TPS) in hardware MMU systems

Products

VMware vSphere ESXi

Issue/Introduction

Transparent page sharing (TPS) is a method by which redundant copies of memory pages are eliminated. This frees host memory that a virtual machine would otherwise be using. Because of the way TPS works on systems with hardware-assisted memory virtualization, such as Intel EPT Hardware Assist and AMD RVI Hardware Assist, esxtop may show zero or few shared pages on these systems. Page sharing shows up in esxtop only when host memory is overcommitted. The rest of this article provides background and details.

Environment

VMware ESXi 4.1.x Installable
VMware ESX 4.0.x
VMware ESXi 4.0.x Embedded
VMware ESX 4.1.x
VMware ESXi 4.0.x Installable
VMware ESXi 4.1.x Embedded

Resolution

Transparent Page Sharing basics

In ESX, a 4KB page backing any guest page with content identical to another 4KB guest page can be shared regardless of when, where, and how those contents are generated. ESX periodically scans the content of guest physical memory for sharing opportunities. For each candidate page, a hash value is computed based on its content. The hash value is then used as a key for a lookup in a global hash table, in which each entry records a hash value and the physical page number of a shared page. If the hash value matches an existing entry, a full bit-by-bit comparison of the contents of the candidate page and the shared page is performed to rule out a false match. After a successful content match, the guest-physical to host-physical mapping of the candidate page is changed to point to the shared host-physical page, and the redundant host memory copy is reclaimed.
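
The scan, lookup, compare, and remap steps described above can be illustrated with a minimal Python sketch. This is a toy model, not ESX code: the class name PageSharer, its methods, the dictionary-based tables, and the choice of SHA-1 as the hash function are all assumptions made for illustration. Writes to shared pages (copy-on-write) are not modeled.

```python
import hashlib

PAGE_SIZE = 4096  # 4KB small page


class PageSharer:
    """Toy model of content-based page sharing (hypothetical, not ESX code)."""

    def __init__(self):
        self.host_pages = {}  # host page number -> page content
        self.hash_table = {}  # content hash -> host page number of a shared page
        self.mapping = {}     # (vm, guest page number) -> host page number
        self.next_hpn = 0

    def back_page(self, vm, gpn, content):
        """Back a guest page with its own private host page."""
        assert len(content) == PAGE_SIZE
        hpn = self.next_hpn
        self.next_hpn += 1
        self.host_pages[hpn] = content
        self.mapping[(vm, gpn)] = hpn
        return hpn

    def scan(self, vm, gpn):
        """Try to share one candidate page; return True if a host page was reclaimed."""
        hpn = self.mapping[(vm, gpn)]
        content = self.host_pages[hpn]
        # SHA-1 is an assumption here; the article does not name the hash function.
        key = hashlib.sha1(content).digest()
        shared = self.hash_table.get(key)
        if shared is None:
            # No entry yet: record this page as the sharing candidate for its hash.
            self.hash_table[key] = hpn
            return False
        if shared == hpn:
            return False  # the page is already the recorded shared page
        if self.host_pages[shared] != content:
            return False  # hash matched but the full comparison failed: false match
        # Remap the guest page to the shared host page and reclaim the duplicate.
        self.mapping[(vm, gpn)] = shared
        del self.host_pages[hpn]
        return True
```

In this sketch, two guest pages with identical content end up mapped to the same host page number after both have been scanned, and the duplicate host page is removed from host_pages.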

Transparent Page Sharing with large pages

In hardware-assisted memory virtualization systems, ESX preferentially backs guest physical pages with large host physical pages (2MB contiguous memory regions instead of 4KB regular pages) for better performance. If a contiguous 2MB memory region is not available on the host (for example, due to memory overcommitment or fragmentation), ESX still backs guest memory using small pages (4KB). ESX does not share large physical pages because:

The probability of finding two large pages that are identical is very low.
The overhead of performing a bit-by-bit comparison for a 2MB page is much higher than for a 4KB page.

However, ESX still generates hashes for the 4KB pages within each large page during page scanning.

When host memory is overcommitted, ESX may have to swap out pages. Because ESX does not swap out large pages, a large page is broken into small pages during host swapping. ESX tries to share those small pages using the pre-generated hashes before they are swapped out. The motivation for doing this is that the overhead of breaking a shared page is much smaller than the overhead of swapping in a page if the page is accessed again in the future.
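
The following Python sketch illustrates the idea of breaking a 2MB large page into 4KB small pages and trying to share them through pre-generated hashes before swapping. It is a simplified model under stated assumptions, not the ESX implementation: the function names precompute_hashes and break_and_share, the use of SHA-1, and the dictionary standing in for the global hash table are all hypothetical.

```python
import hashlib

SMALL_PAGE = 4 * 1024                       # 4KB
LARGE_PAGE = 2 * 1024 * 1024                # 2MB
PAGES_PER_LARGE = LARGE_PAGE // SMALL_PAGE  # 512 small pages per large page


def precompute_hashes(large_page):
    """Hash every 4KB chunk of a 2MB large page, as done during page scanning."""
    return [
        hashlib.sha1(large_page[i:i + SMALL_PAGE]).digest()  # hash choice is assumed
        for i in range(0, LARGE_PAGE, SMALL_PAGE)
    ]


def break_and_share(large_page, hashes, shared_pages):
    """Break a large page into small pages and share those that match.

    shared_pages maps a content hash to the content of a 4KB page that is
    available for sharing (a stand-in for the global hash table).
    Returns (reclaimed, remaining): the number of small pages whose host
    memory can be reclaimed, and the indices of small pages that still need
    their own host page and so remain candidates for swapping.
    """
    reclaimed = 0
    remaining = []
    for index, key in enumerate(hashes):
        chunk = large_page[index * SMALL_PAGE:(index + 1) * SMALL_PAGE]
        existing = shared_pages.get(key)
        if existing is not None and existing == chunk:
            reclaimed += 1  # full comparison succeeded: share instead of swapping
        else:
            # First page with this hash: keep it and record it for later matches.
            shared_pages.setdefault(key, chunk)
            remaining.append(index)
    return reclaimed, remaining
```

For example, a large page filled entirely with zeros breaks into 512 identical small pages, so in this model all but the first can be reclaimed through sharing rather than swapped out.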

Conclusion

Given the above implementation, one may see zero or very little page sharing in hardware MMU systems when host memory is undercommitted. Page sharing occurs only when the host starts to swap out pages due to very high memory pressure. Note that this behavior does not degrade the ability of ESX to reclaim memory through page sharing; ESX only defers sharing pages until host free memory is very low.