Evaluating the impacts of hugepage on virtual machines

Wang, Xiaolin; Luo, Taowei; Hu, Jingyuan; Wang, Zhenlin; Luo, Yingwei

doi:10.1007/s11432-015-0764-7

Evaluating the impact of hugepages on memory virtualization

Research Paper
Published: 29 November 2016

Volume 60, article number 012103, (2017)

Science China Information Sciences

Xiaolin Wang¹, Taowei Luo¹, Jingyuan Hu¹, Zhenlin Wang² & Yingwei Luo¹

Abstract

Modern applications often require large amounts of memory. Conventional 4 KB pages lead to large page tables and thus put high pressure on TLB address translation. This pressure is more pronounced in a virtualized system, which adds another layer of address translation. Page walks caused by TLB misses can incur significant performance overhead. One way to reduce this overhead is to use hugepages. The Linux kernel has supported transparent hugepages since 2.6.38, which provide an alternative, larger page size. In general, hugepages perform better on address translation and page table modification. This paper first analyzes the impact of hugepages on a native system and then compares their impact on different memory virtualization approaches: hardware-assisted paging (HAP), shadow paging, and para-virtualization. We observe that the current implementation of transparent hugepages is inefficient: it cannot exploit the full performance advantage of hugepages. Worse yet, the conservative strategy of transparent hugepages may conflict with existing OS functions, which can lead to performance degradation. We therefore propose a new memory allocation strategy, alignment-based hugepage (ABH), that promotes hugepage allocation. We apply ABH to different paging modes in virtualized systems. The results show that the new allocation strategy can significantly reduce TLB misses, cut page walk cycles caused by TLB misses by up to 90%, and thus improve the performance of real-world applications.
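To make the cost of the extra translation layer concrete, the following is a minimal sketch of the worst-case number of memory references in a page walk. It is not taken from the paper. It assumes x86-64 style radix page tables (4 levels for 4 KB pages, 3 levels when a 2 MB hugepage ends the walk one level early), hardware-assisted (nested) paging, and no hits in the TLB or paging-structure caches. The function names are hypothetical.

```python
def native_walk_refs(levels: int) -> int:
    """Worst-case memory references for one native page walk:
    one reference per page-table level."""
    return levels


def nested_walk_refs(guest_levels: int, host_levels: int) -> int:
    """Worst-case memory references for one two-dimensional (nested) walk.

    Each guest page-table pointer, plus the final guest physical address,
    must itself be translated through the host page table (host_levels
    references each), and each guest level costs one further reference
    to read the guest entry: (g + 1) * (h + 1) - 1 in total.
    """
    return (guest_levels + 1) * (host_levels + 1) - 1


if __name__ == "__main__":
    # Assumed level counts: 4 for 4 KB pages, 3 for 2 MB hugepages.
    print("native, 4 KB pages:", native_walk_refs(4))
    print("native, 2 MB pages:", native_walk_refs(3))
    print("nested, 4 KB guest and host:", nested_walk_refs(4, 4))
    print("nested, 2 MB guest and host:", nested_walk_refs(3, 3))
```

Under these assumptions a nested walk with 4 KB pages in both guest and host touches memory 24 times, against 15 when both layers use 2 MB pages, which is one reason hugepages matter more in virtualized systems than natively.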

Highlights

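Today, many applications require ever larger amounts of memory. Conventional 4 KB pages cause excessive address translation overhead. The problem is more pronounced in a virtualized environment, which adds an extra layer of address translation. One way to reduce address translation overhead is to use hugepages. In general, compared with regular 4 KB pages, hugepages perform better when accessing page tables and handling page faults. The Linux kernel has supported transparent hugepages since 2.6.38; they allow hugepages to be allocated to programs, improving performance without affecting user programs. However, transparent hugepages have a shortcoming: hugepages impose additional alignment requirements that the current implementation cannot satisfy.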

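This paper first analyzes memory performance under Linux and in virtualized environments, as well as the effect of transparent hugepages. It finds that, because of address alignment restrictions, the utilization of transparent hugepages is below 25% in many cases. It proposes an alignment-based memory management scheme that raises the proportion of hugepages in use and improves program performance.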
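The alignment restriction can be illustrated with a small sketch. This is not the paper's ABH algorithm and does not reproduce its measurements; it only shows, assuming 2 MB hugepages as on x86-64, how much of a virtual address range can be backed by hugepages at all, since only fully covered, 2 MB aligned chunks qualify. The function name is hypothetical.

```python
HUGEPAGE_SIZE = 2 * 1024 * 1024  # assumed 2 MB hugepage size (x86-64)


def hugepage_eligible_bytes(start: int, length: int,
                            huge: int = HUGEPAGE_SIZE) -> int:
    """Bytes of [start, start + length) that lie in fully covered,
    hugepage-aligned chunks and could therefore be mapped by hugepages."""
    end = start + length
    first_aligned = (start + huge - 1) // huge * huge  # round start up
    last_aligned = end // huge * huge                  # round end down
    return max(0, last_aligned - first_aligned)


if __name__ == "__main__":
    MB = 1024 * 1024
    # (start address, length) pairs chosen for illustration only.
    for start, length in [(0, 2 * MB), (1 * MB, 2 * MB), (1 * MB, 3 * MB)]:
        eligible = hugepage_eligible_bytes(start, length)
        print(f"start={start // MB} MB, length={length // MB} MB: "
              f"{eligible // MB} MB eligible")
```

A 2 MB region that starts on a 2 MB boundary is fully eligible, while the same region shifted by 1 MB contains no aligned chunk and cannot use a hugepage at all. An allocator that controls alignment can avoid the second case.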

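The new memory management scheme is evaluated under Linux and in several virtualized environments. The experimental results show that the new scheme can reduce page table access overhead by up to 90%. Among the virtualized environments, KVM's shadow paging mode performs best; Xen's shadow paging mode currently cannot use hugepages, but could achieve better performance by supporting them.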

Author information

Authors and Affiliations

School of Electronics Engineering and Computer Science, Peking University, Beijing, 100871, China
Xiaolin Wang, Taowei Luo, Jingyuan Hu & Yingwei Luo
Department of Computer Science, Michigan Technological University, Houghton, MI, 49931, USA
Zhenlin Wang

Corresponding author

Correspondence to Yingwei Luo.

About this article

Cite this article

Wang, X., Luo, T., Hu, J. et al. Evaluating the impacts of hugepage on virtual machines. Sci. China Inf. Sci. 60, 012103 (2017). https://doi.org/10.1007/s11432-015-0764-7

Received: 25 March 2016
Accepted: 10 May 2016
Published: 29 November 2016
DOI: https://doi.org/10.1007/s11432-015-0764-7
