From: Brian S. J. <br...@tu...> - 2001-11-19 01:18:15
On Mon, 19 Nov 2001, Rodolphe Ortalo wrote:

> For the moment, the accel buffer ring is primarily an (IMO efficient, at
> least not-so-bad) userspace->kernelspace communication mechanism. (Note
> that the arrow is oriented and unidirectional.)

Granted, for now that part can be hacked by including the byte ranges at
the top of the chunk, but the drivers need to be aware that this hack is
occurring, and with the planned API the application has to avoid stomping
on those fields. Where it is less than satisfactory for me is in the case
of drivers that are happy to treat the DMA buffers as a continuous table
full of commands and don't need markers the way WARP does -- the fewer
magic markers we have in the buffer in those cases, the easier it is to
use as a display list or as a simple continuous pipe.

> Hmmm... indeed, it seems to me that userspace<-kernelspace (in fact,
> bidirectional) communication is the source of your current problem. I
> still think that userspace *should* use a weakly synchronized approach.
> (At least, we should not assume a need for strong synchronization
> between kernel context and userspace context in our mechanisms. I think
> it will give bad results.)

Odds are most apps will simply block when buffers are full, but I'd like
the API to be able to present a non-blocking model. Someone will
eventually figure out an application that benefits from this.

> In short, I'd be happier with 2 mechanisms: one userspace->kernelspace
> and one the other way, rather than with a single bidirectional one
> (which I always envisage as too complex).

Well, basically I am not suggesting an r/w page, but rather two pages:
one for user<-kernel, the other for user->kernel. The user->kernel side
can be an ioctl as a temporary solution, but that will result in more
context switches.
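To make the two-page idea concrete, here is a minimal userspace sketch of
how the two unidirectional pages could stay unidirectional: each side
only ever writes its own page and reads the other's, and submission can
fail non-blockingly when the ring is full. All names and the layout are
illustrative assumptions, not the actual KGI structures:

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 64

/* Hypothetical layout: one page only the user writes (commands plus a
 * head index), one page only the kernel writes (a tail index).  Neither
 * side stores to the other's page, so each direction stays
 * unidirectional -- no r/w sharing. */
struct user_page {                /* user -> kernel */
	uint32_t head;            /* next free slot, advanced by user */
	uint32_t cmd[RING_SIZE];
};

struct kernel_page {              /* kernel -> user */
	uint32_t tail;            /* next unconsumed slot, advanced by kernel */
};

/* Non-blocking submit: returns 0 when the ring is full instead of
 * sleeping, so the application can choose to block or to do other work. */
static int submit(struct user_page *u, const struct kernel_page *k,
		  uint32_t c)
{
	uint32_t next = (u->head + 1) % RING_SIZE;

	if (next == k->tail)
		return 0;         /* full: caller may retry later */
	u->cmd[u->head] = c;
	u->head = next;
	return 1;
}
```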
That part amounts to manual control by userspace over which chunks are
mapped for writing, which ranges of bytes are marked for execution, and
which chunks are mapped such that they trigger execution of other chunks.

> Maybe the KGI system can simply annotate some information wrt the kernel
> context and publish it on request (but you will probably also require
> the driver to cooperate in updating this state). Would you be happy with
> this? Have you considered implementing this as a KGI_RT_COMMAND (display
> command) which would send back to userspace a struct of information
> which would be enough for the needs you envisaged?

I may.

> (First of all, we need support in LibKGI for mode setting and
> framebuffer access... That's the real starter, and I suppose that, on
> these, you do not need many new mechanisms in KGI, except a way to
> recover information from the driver on the board characteristics.)
>
> Have you considered using a userspace system for implementing such
> functionality? I can imagine a system where some threads, hidden by the
> library, manage (userspace) pre-compiled DMA buffers and multiplex them.
> Locking issues would be much simplified also. With the current KGI, this
> may involve some copying in userspace (from memory-based prepared DMA
> buffers to the real DMA ring). Later it may be done using a different
> system.

My general feeling towards threads is that they should always be created
at the application level, and that libraries should offer a synchronous
interface. This gives the best of both worlds, as the application can
simply spawn a thread that performs a trivial while loop around the
library's synchronous interface. IMO there is not much value added in a
lot of libraries by automatic thread creation, since a simple while loop
around a synchronous function call is in the end all that the thread ends
up doing.
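The "trivial while loop" point can be sketched in a few lines. Assume a
hypothetical synchronous library entry point (a stand-in, not a real
KGI/GGI call); the application-created thread is then nothing more than a
loop around it, and the library itself stays thread-free:

```c
#include <pthread.h>

/* Hypothetical synchronous library call: performs one unit of work and
 * returns how much remains.  Illustrative only -- not a real KGI entry
 * point. */
static int work_remaining = 3;

static int lib_process_one(void)
{
	return --work_remaining;
}

/* The application's thread body: just a trivial loop around the
 * synchronous interface.  The library never creates threads itself, so
 * single-threaded applications pay nothing for this. */
static void *pump(void *arg)
{
	(void)arg;
	while (lib_process_one() > 0)
		;
	return NULL;
}
```

An application that wants asynchronous behavior calls
pthread_create(&t, NULL, pump, NULL); one that does not simply calls the
synchronous function inline.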
The batchops have in them a method to check their current progress, if
the application wants to manage the display lists in a separate thread
that it has created. That keeps the LibGGI stuff simpler and eliminates
the need for a dependency on -lpthread.

> DMA)... But (as you) I'd happily save some time not implementing new
> things in KGI for LibKGI v0.1... :-)

Yes, I am just in a spin trying to figure out what compromises to make in
order to get the first working prototype together -- the more I make, the
more changes happen after the fact which must be propagated into code
that used the prototype. I suppose I should not be too concerned about
the volume of application code at this point, though :-).

> I've heard of it, but I do not know the exact details. What is this
> unmap() hook used for?

It used to be used by only one other thing in the kernel -- the basic
kernel filemap used it to sync disk buffers after an unmap. KGI was the
only other thing to ever use the hook, and it uses it to clean up its own
private records of the current mappings. We can do this cleanup after the
unmap when the application calls an ioctl to invalidate the (now
unmapped) mapping in KGI's internal records.

> > 4) (What I'm working on now) create a system for the informational
> > SHMEM resource and en-masse negotiation of resources. Needed for
> > LibGGI with or without LibKGI.
>
> Does it really need to be a SHMEM resource or can it take advantage of
> the KGI_RT_COMMAND resource type (which directly relies on ioctl())? No
> display command is really implemented yet, but all the infrastructure is
> in place IIRC. (I guess kgim_display_command() inside kgim-0.9.c should
> call appropriate functions in the driver module instead of returning an
> error code.)

This resource is going to be indeterminately large, due to the fact that
it can carry subsystem-private data. The ioctl mechanism's maximum buffer
size is 16 KByte, at least under Linux.
I can use an ioctl in the short term, but will need a SHMEM (or, if that
is not available on an OS, a fallback to read()/write() on the
filehandle). Using multiple ioctls would result in a situation where the
driver had to retain state for a request while waiting for the rest of
the ioctls to happen.

--
Brian
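To illustrate the state-retention problem: with a chunked-ioctl scheme
the driver must remember, between calls, how far the previous chunk got.
Here is a toy userspace model of that driver-side bookkeeping (structure
and names are illustrative assumptions, not real KGI code) -- exactly the
state a single SHMEM mapping would let the driver avoid keeping:

```c
#include <string.h>

#define IOCTL_MAX 16384   /* ioctl payloads top out around 16 KByte */

/* Toy model of the per-request state a chunked-ioctl scheme forces the
 * driver to hold between system calls. */
struct xfer_state {
	char   *dst;      /* where the assembled request accumulates */
	size_t  total;    /* announced request size */
	size_t  done;     /* how far the previous ioctl got */
};

/* Returns 1 when the whole request has arrived, 0 if more chunks are
 * pending, -1 on an oversized or overflowing chunk. */
static int xfer_chunk(struct xfer_state *s, const char *buf, size_t len)
{
	if (len > IOCTL_MAX || s->done + len > s->total)
		return -1;
	memcpy(s->dst + s->done, buf, len);
	s->done += len;
	return s->done == s->total ? 1 : 0;
}
```

A 20 KByte request needs two such calls, and the driver must keep the
struct alive in between; with SHMEM the whole request is visible at once
and no partial-transfer state exists.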