Re: Aperture mapping under GEM

Zhao, Chunfeng wrote:
> Hi Keith,
> Do we have a time line to merge DRM modesetting_GEM branch to upstream
> main line branch?
>
> Thanks!
>
> Chunfeng
>
> -----Original Message-----
> From: dri...@li...
> [mailto:dri...@li...] On Behalf Of Keith
> Packard
> Sent: Thursday, July 31, 2008 9:18 PM
> To: dri-devel
> Cc: ke...@ke...
> Subject: Aperture mapping under GEM
>
> Ok, we clearly need to deal with mapping subsets of the graphics
> aperture, both for discrete graphics cards and for 2D on tiled surfaces.
> Plus, there are reasons for using WC object mappings, which are easily
> done through the aperture.
>
> I haven't spent a huge amount of time thinking about this, but I figured
> I'd prod people into discussion to try and sort things out.
>
> First off, here's what I think I want.
>
> We expose mmap ioctls on the gem objects, and I'd like to use the same
> basic mechanism; when (if?) gem objects become "real" files, we would
> want to continue using the same interface. I suggest creating two mmap
> windows for main memory objects:
>
> 0x00000000-0x7fffffff: map the backing pages directly
> 0x80000000-0xffffffff: map the object through the aperture
>
> I don't quite know what to do with discrete card memory; suggestions
> here are welcome from people who've thought about this more than I.
>
> Using these two per-object windows means there isn't any need to manage
> a synthetic linear address space for some global object (like the DRM
> fd).
>
> Next, we need to hook the mmap path in the driver so that our code can
> get a chance to play. I attached something that might work.
>
> Once we've got an mmap request, here's what I think we want to do:
>
>      1. Detect an aperture mapping request (bit 31)
>      2. Map the object to the aperture (speculating that the app will
>         actually use it)
>      3. Initialize the vma to point at the aperture physical address
>         range
>
> If the object remains mapped to the GTT, there's nothing else to do
> until the unmap request comes along at which point we tear down the vma.
>
> If the object gets unmapped from the GTT, we need to go find every VMA
> mapping it and fix up their PTEs to be unreadable/unwritable. I'm hoping
> this won't kill performance, but I'm fairly sure this will require an
> IPI to get the TLBs flushed on every core. Right? At least there won't
> be a cache flush as well.
>
> Now, if the application touches any one of those pages, we should map
> the whole object back to the GTT and rewrite the PTEs again. We could do
> this a page at a time, but I don't see any real benefit as we have to
> allocate the aperture space anyways, and it shouldn't be that much more
> expensive to fix up a lot of PTEs than to fix up just one.
>
> I think that's the whole story here; am I missing any big pieces?
>
>   
Keith,

The description would be a little easier to follow if you didn't use the
term "map" both for mmap-ing and AGP binding.

Anyway, the above would probably work, but for Intel UMA only, as other
driver writers would have to deal with switching caching policy and VRAM
copies as well, and either not use shmem objects or short-circuit their
mapping / fault methods.

The Linux mm people are very strongly against having a driver manipulate
PTEs directly. For this reason, one could instead use
"unmap_mapping_range()" to invalidate all user PTEs pointing to a
particular range in the address space of an object, and that's why TTM
needs to manage a fake linear address space for the drm fd.

/Thomas
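
For readers following the thread, here is a minimal user-space sketch of the
scheme discussed above. It is not driver code and not taken from any patch
in this thread. It only models (a) decoding an mmap offset into one of the
two proposed per-object windows using bit 31, and (b) the bind / invalidate /
fault cycle for an aperture mapping. All names (gem_decode_mmap_offset,
gem_object_model, the model_* helpers) are hypothetical, and the choice to
reject offsets above 0xffffffff is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 31 selects the aperture window, per the proposal in the thread. */
#define GEM_MMAP_APERTURE_BIT UINT64_C(0x80000000)

enum gem_mmap_kind {
    GEM_MMAP_BACKING,   /* 0x00000000-0x7fffffff: backing pages directly */
    GEM_MMAP_APERTURE,  /* 0x80000000-0xffffffff: through the aperture */
    GEM_MMAP_INVALID    /* outside both windows (assumed to be rejected) */
};

struct gem_mmap_request {
    enum gem_mmap_kind kind;
    uint64_t obj_offset;    /* byte offset within the object */
};

/* Classify an mmap offset and strip the window-select bit. */
static struct gem_mmap_request gem_decode_mmap_offset(uint64_t offset)
{
    struct gem_mmap_request req = { GEM_MMAP_INVALID, 0 };

    if (offset > UINT64_C(0xffffffff))
        return req;

    req.kind = (offset & GEM_MMAP_APERTURE_BIT) ? GEM_MMAP_APERTURE
                                                : GEM_MMAP_BACKING;
    req.obj_offset = offset & (GEM_MMAP_APERTURE_BIT - 1);
    return req;
}

/* Toy state for one object that has a user aperture mapping. */
struct gem_object_model {
    bool bound_to_gtt;  /* object currently occupies aperture space */
    bool ptes_valid;    /* user PTEs currently point at the aperture */
    unsigned faults;    /* number of faults taken to rebind */
};

/* mmap of the aperture window: bind speculatively, point PTEs at it. */
static void model_mmap_aperture(struct gem_object_model *obj)
{
    obj->bound_to_gtt = true;
    obj->ptes_valid = true;
}

/*
 * Unbind from the GTT: every user mapping must be invalidated. In a real
 * driver this is where unmap_mapping_range() would come in, as suggested
 * in the reply, rather than rewriting PTEs by hand.
 */
static void model_unbind(struct gem_object_model *obj)
{
    obj->bound_to_gtt = false;
    obj->ptes_valid = false;
}

/* User touches a page: fault, rebind the whole object, restore PTEs. */
static void model_touch(struct gem_object_model *obj)
{
    if (!obj->ptes_valid) {
        obj->faults++;
        obj->bound_to_gtt = true;
        obj->ptes_valid = true;
    }
}
```

In this model a touch after model_unbind() costs exactly one fault and
rebinds the whole object, which matches the "whole object, not a page at a
time" choice argued for in the original mail.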