Support MmapRegion with huge pages #118

Open · iulianbarbu opened this issue on Oct 5, 2020 · 10 comments

@iulianbarbu

Feature request

Add support for huge-page-backed guest memory. For a specific use case, please consult this issue.

@alexandruag (Collaborator)

Hi! Just to add a bit more context, MmapRegion objects backed by huge pages can be created using the MmapRegion::build constructor with a flags value that includes MAP_HUGETLB (there's also the option to use transparent huge pages, for example as discussed around #113). Using an externally managed mapping via MmapRegion::build_raw is also possible. What's currently missing is a way to request huge page support directly via the high-level GuestMemoryMmap constructors, which is something we're thinking about going forward.
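
For reference, that low-level path looks roughly like this (a minimal sketch, assuming the backend-mmap feature, the `MmapRegion::build(file_offset, size, prot, flags)` signature, and the libc crate; the kernel must have huge pages reserved, e.g. via /proc/sys/vm/nr_hugepages, or the mmap fails):

```rust
use vm_memory::MmapRegion;

fn main() {
    // One default-size huge page (2 MiB on x86_64). The huge page pool must
    // have pages available, or the mmap fails with ENOMEM.
    let size = 2 * 1024 * 1024;
    let region = MmapRegion::build(
        None, // anonymous mapping; no backing file
        size,
        libc::PROT_READ | libc::PROT_WRITE,
        libc::MAP_PRIVATE | libc::MAP_ANONYMOUS | libc::MAP_HUGETLB,
    )
    .expect("failed to mmap huge-page-backed region");
    println!("mapped {} bytes at {:p}", region.size(), region.as_ptr());
}
```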

@EmeraldShift

@iulianbarbu

Hi! @doracgp and I are students at the University of Texas at Austin, and we are interested in virtualization and hypervisor technology. We'd like to tackle this issue. Can it be assigned to us?

To be sure, what does this issue entail beyond presenting a clear interface to request huge pages, backed by the low-level functions @alexandruag describes?

We are also interested in using such a new API downstream in firecracker-microvm/firecracker#2139.

@rbradford (Contributor)

rbradford commented Oct 27, 2020

You'd want to use memfd_create() to create the huge page file descriptor rather than fiddling about with opening a hugetlbfs filesystem mount. This is how we do it in Cloud Hypervisor.
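
A rough sketch of that approach (the helper name is illustrative; assumes Linux, the libc crate, and a size that is a multiple of the huge page size):

```rust
use std::fs::File;
use std::io;
use std::os::unix::io::FromRawFd;

// Illustrative helper: create an fd backed by the default huge page pool,
// with no explicit hugetlbfs mount required.
fn create_hugepage_memfd(size: u64) -> io::Result<File> {
    let fd = unsafe {
        libc::memfd_create(
            b"guest_mem\0".as_ptr().cast(),
            libc::MFD_CLOEXEC | libc::MFD_HUGETLB,
        )
    };
    if fd < 0 {
        return Err(io::Error::last_os_error());
    }
    // Safety: we own the freshly created fd.
    let file = unsafe { File::from_raw_fd(fd) };
    // The length must be a multiple of the huge page size for hugetlb-backed fds.
    file.set_len(size)?;
    Ok(file)
}
```

The resulting file can then be mmap'd (or handed to MmapRegion::build with a FileOffset) like any other backing file.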

@jiangliu (Member)

There's no perfect solution for huge pages yet; both hugetlbfs and memfd have limitations. In our experience, memfd is preferred.

@EmeraldShift

> You'd want to use memfd_create() to create the huge page file descriptor rather than fiddling about with opening a hugetlbfs filesystem mount. This is how we do it in Cloud Hypervisor.

Is the intention here to fully encapsulate the use of memfd and present an "anonymous huge-page-backed" (and maybe also a "file huge-page-backed"?) constructor for GuestMemoryMmap? It looks like one way to go about this would be to either (1) add a bool huge to from_ranges indicating the desired page size, or (2) add a new constructor method that indicates a request for huge-page backing.

As for alignment requirements, are those the responsibility of the VMM or of this crate? I don't see eager checks on GuestAddress alignment, but I might just be missing them.

@doracgp

doracgp commented Nov 12, 2020

> There's no perfect solution for huge pages yet; both hugetlbfs and memfd have limitations. In our experience, memfd is preferred.

@jiangliu According to this article, it seems memfd_create makes use of hugetlbfs regardless of whether there was a prior, explicit mount. Could you elaborate on what you meant by limitations for each of them?

@nmanthey

Maybe the two issues (this one and #113) can be combined, so that, once configurable, users can choose either transparent huge pages or actual huge pages as a configuration option. That would let users pick either the very simple transparent variant or the more sophisticated, controlled variant. Thoughts are welcome!
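
For the transparent variant, the per-region work is essentially one madvise call on an existing anonymous mapping (a sketch, assuming the libc crate on Linux):

```rust
use std::io;

// Sketch: request transparent huge pages for an existing anonymous mapping.
// addr/len should be huge-page aligned for the hint to take full effect, and
// the kernel honors it only when THP is set to "madvise" or "always" in
// /sys/kernel/mm/transparent_hugepage/enabled.
unsafe fn advise_thp(addr: *mut libc::c_void, len: usize) -> io::Result<()> {
    if libc::madvise(addr, len, libc::MADV_HUGEPAGE) != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(())
}
```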

@jiangliu (Member)

> Maybe the two issues (this one and #113) can be combined, so that, once configurable, users can choose either transparent huge pages or actual huge pages as a configuration option. That would let users pick either the very simple transparent variant or the more sophisticated, controlled variant. Thoughts are welcome!

And there's another related PR: #120

@EmeraldShift

> And there's another related PR: #120

I think #120 looks similar to what we'd want to implement here, albeit with stronger coupling between the region's backing behavior and the values of "labels" on the region.

I'm envisioning a configuration option when creating a region that includes multiple options, such as (1) standard base pages, (2) transparent hugepages via madvise with MADV_HUGEPAGE, or (3) explicitly-managed hugepages via memfd_create with MFD_HUGETLB or mmap with MAP_HUGETLB or something else.
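
As a strawman, that configuration could be a small enum along these lines (the name and variants are hypothetical, not existing vm-memory API):

```rust
/// Hypothetical policy for how a region's backing memory is allocated.
/// Purely a sketch of the shape; none of these names exist in the crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PageSizePolicy {
    /// Standard base pages (the current default behavior).
    BasePages,
    /// Anonymous mapping plus madvise(MADV_HUGEPAGE).
    TransparentHugePages,
    /// memfd_create(MFD_HUGETLB) or mmap(MAP_HUGETLB).
    ExplicitHugePages,
}
```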

Then, a similar getter to the idea proposed in #120 can be used to query a region for the behavior it was configured with. Thus, VMMs can take advantage of this information (as in cloud-hypervisor/cloud-hypervisor#1909), and also have control over these features when allocating memory regions (via the configuration option).

Thoughts?

@jiangliu (Member)

> > And there's another related PR: #120
>
> I think #120 looks similar to what we'd want to implement here, albeit with stronger coupling between the region's backing behavior and the values of "labels" on the region.
>
> I'm envisioning a configuration option when creating a region that includes multiple options, such as (1) standard base pages, (2) transparent hugepages via madvise with MADV_HUGEPAGE, or (3) explicitly-managed hugepages via memfd_create with MFD_HUGETLB or mmap with MAP_HUGETLB or something else.

We currently create the memfd/hugetlbfs fd in the VMM, and it's reasonable to move this into vm-memory if it's common :)

> Then, a similar getter to the idea proposed in #120 can be used to query a region for the behavior it was configured with. Thus, VMMs can take advantage of this information (as in cloud-hypervisor/cloud-hypervisor#1909), and also have control over these features when allocating memory regions (via the configuration option).

Yes, we should combine these two PRs. One question is how to make the API platform-independent. Some mechanisms, such as hugetlbfs, only work on Linux, and I'm not familiar with Windows.
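
One way to keep the crate building everywhere would be to cfg-gate the Linux-only bits, e.g. (a sketch; the helper is hypothetical):

```rust
// Sketch: keep the Linux-only flag behind a cfg gate so other targets still
// build, falling back to plain base pages there.
#[cfg(target_os = "linux")]
fn explicit_hugepage_flags() -> i32 {
    libc::MAP_HUGETLB
}

#[cfg(not(target_os = "linux"))]
fn explicit_hugepage_flags() -> i32 {
    0 // no explicit huge page support on this platform
}
```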

