Question about adding flags to mmap system call / NVIDIA amd64 driver implementation
Robert Noland
rnoland at FreeBSD.org
Tue Apr 28 23:45:43 UTC 2009
On Tue, 2009-04-28 at 16:48 -0500, Kevin Day wrote:
> On Apr 28, 2009, at 3:19 PM, Julian Bangert wrote:
>
> > Hello,
> >
> > I am currently trying to work a bit on the remaining "missing
> > feature" that NVIDIA requires (http://wiki.freebsd.org/NvidiaFeatureRequests
> > or an earlier post on this ML): the improved mmap system call.
> > For now, I am trying to extend the current system call and
> > implementation to add cache control (the type of memory caching
> > used). This feature is inherently very architecture-specific, but
> > it can lead to enormous performance improvements for memory-mapped
> > devices (useful for drivers, etc.). I would do this on the user side
> > by adding 3 flags to the mmap system call (MEM_CACHE__ATTR1 to
> > MEM_CACHE__ATTR3) which together form a single octal digit
> > corresponding to the various caching options (Uncacheable,
> > Write Combining, etc.), with the same numbers as the PAT_* macros
> > from i386/include/specialreg.h, except that the value 0
> > (PAT_UNCACHEABLE) is replaced with value 2 (otherwise undefined),
> > whereas value 0 (all 3 flags cleared) is assigned the meaning
> > "feature not used, use default cache control". For each cache
> > behaviour there would of course also be a macro expanding to the
> > right combination of these flags for enhanced usability.
> >
> > The mmap system call would, if any of these flags are set, decode
> > them and get a corresponding PAT_* value, perform the mapping and
> > then call into the pmap module to modify the cache attributes for
> > every page.
>
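To make the proposed encoding concrete, here is a minimal sketch of how the three flags could be decoded. The PAT_* values match i386/include/specialreg.h but are redefined locally so the snippet builds anywhere. The bit position (MEM_CACHE_SHIFT), MEM_CACHE_MASK, the MEM_CACHE_WC / MEM_CACHE_UC convenience macros, the two negative return codes and mmap_flags_to_pat() are all made up for illustration; none of this is an existing interface.

```c
/* Memory-type encodings, same numbers as the PAT_* macros. */
enum {
	PAT_UNCACHEABLE     = 0x00,
	PAT_WRITE_COMBINING = 0x01,
	PAT_WRITE_THROUGH   = 0x04,
	PAT_WRITE_PROTECTED = 0x05,
	PAT_WRITE_BACK      = 0x06,
	PAT_UNCACHED        = 0x07
};

/* ASSUMPTION: the flags occupy three otherwise unused bits of the
 * mmap flags word; bit 24 is an arbitrary choice for this sketch. */
#define MEM_CACHE_SHIFT   24
#define MEM_CACHE__ATTR1  (1 << MEM_CACHE_SHIFT)
#define MEM_CACHE__ATTR2  (2 << MEM_CACHE_SHIFT)
#define MEM_CACHE__ATTR3  (4 << MEM_CACHE_SHIFT)
#define MEM_CACHE_MASK    (7 << MEM_CACHE_SHIFT)

/* Hypothetical convenience macros, one per cache behaviour. */
#define MEM_CACHE_WC      (MEM_CACHE__ATTR1)   /* digit 1 */
#define MEM_CACHE_UC      (MEM_CACHE__ATTR2)   /* digit 2 stands in for 0 */

/* Hypothetical return codes for "no PAT value". */
enum { MEM_CACHE_DEFAULT = -1, MEM_CACHE_INVALID = -2 };

/*
 * Decode the octal digit carried in the mmap flags:
 *   0 -> feature not used, keep the default cache control
 *   2 -> PAT_UNCACHEABLE (whose real value, 0, is taken by "default")
 *   3 -> not a defined memory type, reject
 *   anything else -> the digit is already the PAT_* value
 */
static int
mmap_flags_to_pat(int flags)
{
	int digit = (int)(((unsigned)flags & MEM_CACHE_MASK) >> MEM_CACHE_SHIFT);

	switch (digit) {
	case 0:
		return (MEM_CACHE_DEFAULT);
	case 2:
		return (PAT_UNCACHEABLE);
	case 3:
		return (MEM_CACHE_INVALID);
	default:
		return (digit);
	}
}
```

With this layout, passing MEM_CACHE_WC alongside the usual mmap flags decodes to PAT_WRITE_COMBINING, and a flags word with none of the three bits set leaves the mapping at its default caching.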
> Have you looked at mem(4) yet?
>
>      Several architectures allow attributes to be associated with ranges
>      of physical memory.  These attributes can be manipulated via ioctl()
>      calls performed on /dev/mem.  Declarations and data types are to be
>      found in <sys/memrange.h>.
>
>      The specific attributes, and number of programmable ranges may vary
>      between architectures.  The full set of supported attributes is:
>
>      MDF_UNCACHEABLE
>              The region is not cached.
>
>      MDF_WRITECOMBINE
>              Writes to the region may be combined or performed out of
>              order.
>
>      MDF_WRITETHROUGH
>              Writes to the region are committed synchronously.
>
>      MDF_WRITEBACK
>              Writes to the region are committed asynchronously.
>
>      MDF_WRITEPROTECT
>              The region cannot be written to.
>
> This requires knowledge of the physical addresses, but I believe
> that's probably already necessary for what it sounds like you're
> trying to accomplish.
>
> Back in the FreeBSD-3.0 days, I was writing a custom driver for an AGP
> graphics controller, and setting the MTRR flags for the exposed buffer
> was a definite improvement (200-1200% faster in most cases).
This is MTRR, which is what we currently do when we can. The issue is
that the BIOS often maps ranges in a way that prevents us from using
MTRR. When it works, MTRR is generally ideal for things like AGP and
framebuffers, since they have a specific physical range that you want
to work with.
With PCI(E) cards it isn't as cut and dried... In the ATI and Nouveau
cases, we map scatter-gather pages into the GART. Those pages are
generally allocated using contigmalloc behind the scenes, so MTRR can
also work in that case. Moving forward, we may actually be mapping
random pages into and out of the GART (GEM / TTM). In those cases we
really don't have a large contiguous range that we could set MTRR on.
Intel CPUs are also limited to 8 MTRRs for the entire system, so that
can become an issue quickly if you are trying to manipulate several
areas of memory. With PAT we can manipulate the caching properties at
the page level. PAT also allows for some overlap conditions that MTRR
won't, such as mapping a page write-combining on top of an
UNCACHEABLE MTRR.
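
To illustrate how quickly the register budget runs out, here is a rough sketch that counts how many ranges it takes to cover one region. It assumes each range must be a power of two in size, naturally aligned, and at least 4 KB, which is how variable-range MTRRs on x86 are specified. mtrr_ranges_needed() and the constants are hypothetical names for this sketch, and the greedy non-overlapping cover ignores tricks such as overlapping ranges of different types.

```c
#include <stdint.h>

#define MTRR_MIN_SIZE    4096u  /* smallest range assumed by this sketch */
#define MTRR_VAR_RANGES  8      /* system-wide budget mentioned above */

/*
 * Count the power-of-two, naturally aligned blocks needed to cover
 * [base, base + len).  Returns 0 if len is 0 or if base/len are not
 * multiples of MTRR_MIN_SIZE.
 */
static unsigned
mtrr_ranges_needed(uint64_t base, uint64_t len)
{
	unsigned n = 0;

	if (base % MTRR_MIN_SIZE != 0 || len % MTRR_MIN_SIZE != 0)
		return (0);

	while (len > 0) {
		/* Largest block the current base alignment allows:
		 * the lowest set bit of base (no limit when base is 0). */
		uint64_t block = base ? (base & (~base + 1))
		                      : ((uint64_t)1 << 63);

		/* Shrink it until it fits in what is left to cover. */
		while (block > len)
			block >>= 1;

		base += block;
		len -= block;
		n++;
	}
	return (n);
}
```

A 256 MB aperture at 0xd0000000 fits in a single range, but a 7-page buffer starting at 0x1000 already needs three, and every physically separate page needs a range of its own. A handful of scattered buffers would exceed MTRR_VAR_RANGES well before PAT, which works per page, runs into any such limit.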
jhb@ has started some work on this, since I've been badgering him about
this recently as well.
robert.
> -- Kevin
--
Robert Noland <rnoland at FreeBSD.org>
FreeBSD