From: Robin H. <ho...@sg...> - 2008-01-10 13:16:10
|
On Thu, Jan 10, 2008 at 01:44:18PM +0200, Avi Kivity wrote: > Christoph Lameter wrote: >> On Wed, 9 Jan 2008, Andrea Arcangeli wrote: >> >> >>> This patch is a first basic implementation of the mmu notifiers. More >>> methods can be added in the future. >>> >>> In short when the linux VM decides to free a page, it will unmap it >>> from the linux pagetables. However when a page is mapped not just by >>> the regular linux ptes, but also from the shadow pagetables, it's >>> currently unfreeable by the linux VM. >>> >> >> Such a patch would also address issues that SGI has with exporting >> mappings via XPMEM. Plus a variety of other uses. Go ahead and lets do >> more in this area. >> >> Are the KVM folks interested in exporting memory from one guest to >> another? That may also become possible with some of the work that we have >> in progress and that also requires a patch like this. >> >> > > Actually sharing memory is possible even without this patch; one simply > mmap()s a file into the address space of both guests. Or are you referring > to something else? He is referring to the xpmem work SGI has pushed in the past. It was rejected precisely because this type functionality did not exist. We were trying to determine the cleanest yet smallest acceptable implementation when this suddenly sprang up. I would expect Dean Nelson or myself to repost the xpmem patch set again based upon this patche. > The patch does enable some nifty things; one example you may be familiar > with is using page migration to move a guest from one numa node to another. xpmem allows one MPI rank to "export" his address space, a different MPI rank to "import" that address space, and they share the same pages. This allows sharing of things like stack and heap space. XPMEM also provides a mechanism to share that PFN information across partition boundaries so the pages become available on a different host. This, of course, is dependent upon hardware that supports direct access to the memory by the processor. Thanks, Robin |