From: Sottek, M. J <mat...@in...> - 2001-12-03 18:10:31
>hey, it looks nice. i was kinda expecting something really bad to happen, with
>the whole removal of mmap et al. yet you provided a physical address, which is
>a good thing. hopefully this will be mappable somehow. (closer reading shows a
>xxx_fb_mmap function. good deal.)

This is actually something I've been debating. I originally had mmap in
there then removed it... obviously there were some remnants left.

Ioctl needs to go away for network transparency. Once you implement the
read/write command functions there is no reason to have ioctl too. The
local case doesn't need to be different from the remote case.

mmap is a different story. It may be a nice feature for applications, but
it causes a few problems that are very hard to overcome.

#1) If a client has a mmapped region of video memory and for some reason
this region is no longer valid, how do you prevent the client from drawing
there? We don't have some advanced event interface to tell clients how
to behave, and at any rate relying on the client to do the right thing isn't
acceptable. When you switch vt's or another client changes the mode...
the client with the mmapped region has to be altered. The only way to
do this is to remove the client's map, install a zero page fault handler
to pause the client when it tries to draw, and then add it back when the
vt returns. This behavior is shady but might be worth it.

#2) Not all memory regions are memory mappable. Most of the modes on Intel
hardware have a pitch that differs from width. This isn't a huge problem
since you can easily expose the pitch in the surface view. What about
when the "extra" memory between the width and pitch is being used for
something else? There will always be valid reasons why you don't want
to allow mmap. Therefore a client _has_ to have a read/write fallback
anyway; the value of mmap gets smaller when you have to write the harder
read/write code anyway.

#3) When two clients memory map certain types of memory, bad things could
happen.
What if it is some type of shared memory used for command buffers
or double buffered registers? Then you have to have some type of locking
on the memory mapped ranges like the dri does. At that point you have
given a client the ability to take the lock and keep it...basically
cooperative multitasking.

>one thing i don't like about the current methods is the lack of partially
>addressable screens (ie vesa1.2 banks). it would be nice if somewhere along
>the line someone put in an ADDRESSED_BANKED define and then had a size and
>pointer to the bank. it would make it really helpful when i develop on my 486
>(i kid you not, its 11 years old and works like a charm). this is more of a
>novelty feature though =)

If you look back at some of my posts you'll find that the reason I became
concerned about the future of the framebuffer is that the Intel 810 and
815 chipsets have only banked memory when the GTT isn't used. The way
to handle this isn't as you stated... to give the client an indication
that bank switching is necessary, and possibly an interface to switch
the banks. That is too complicated for the client. With this interface
there is no need to make the client aware of the banks. The driver can
always switch the banks on the fly. Mmap is difficult and messy, but
can be done with page faulting tricks.

>overall, it would be nice for an application to know if its connecting to a
>networked device rather than a local one. I still see no valid reason for mmap
>to die in the name of network transparency (every box has a local set of
>devices, not every box has a network greater than itself). im not sure of the
>internals of this, it may be possible already, and im just beating a dead horse.

Discussed above. I agree, it is a nice feature when it works. If it were
supported it would just go like this: try to mmap; if it fails you can't
have it, so try something else. This way all the reasons for mmap failing
are handled, not just network.
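[A minimal sketch of the "try to mmap, otherwise try something else" flow
described above. Every name here (fb_access, fb_access_init, fb_store,
fb_demo_open, the allow_mmap switch) is hypothetical and not part of the
proposal; a temporary file stands in for the device so the sketch can run
anywhere.]

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical client-side handle: a mapping if we got one, else just an fd. */
struct fb_access {
    int fd;             /* open device (here: a temp file) */
    unsigned char *map; /* NULL when mmap was refused or unavailable */
    size_t len;         /* surface size in bytes */
};

/* Stand-in for opening the device: an unlinked temp file of the given size. */
static int fb_demo_open(size_t len)
{
    char path[] = "/tmp/fbdemoXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Try mmap first. allow_mmap == 0 simulates a driver that refuses it. */
static void fb_access_init(struct fb_access *a, int fd, size_t len, int allow_mmap)
{
    void *p = MAP_FAILED;
    assert(a != NULL);
    a->fd = fd;
    a->len = len;
    if (allow_mmap)
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    a->map = (p == MAP_FAILED) ? NULL : p;
}

/* Store bytes at an offset: through the mapping if present, else pwrite(). */
static int fb_store(struct fb_access *a, size_t off, const void *src, size_t n)
{
    if (off > a->len || n > a->len - off)
        return -1;                    /* outside the surface */
    if (a->map) {
        memcpy(a->map + off, src, n); /* fast path */
        return 0;
    }
    return pwrite(a->fd, src, n, (off_t)off) == (ssize_t)n ? 0 : -1;
}
```

The caller never needs to know why the mapping is missing (remote device,
banked memory, driver policy); it only sees that fb_store() still works.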
>the text seems to have a small hint at keeping the device stable enough and
>protected from the user to keep things from going bad. im kinda iffy on things
>protecting the user in this respect (sometimes we learn best not by watching
>how things work, but how they break), but the feature would be good as well,
>depending on the power of the device. some kind of restore_sane_state function
>and corresponding interface would be handy (special key stroke on terminal 8 or
>a quadruple right mouse click or any kind of event that rarely happens (other
>than system reset)). a configurable sane state would be a nice addition to
>this too =)

Maybe a SysRq type thing. This shouldn't be necessary; it is perfectly
possible to make an interface from which bad things cannot happen. That was
one of my goals.

>we have seen in many cases where 'device specific' 'features' cause problems.
>the stuff here should be all standard

I'm sorry but absolutely not. The basic features of the fb should be
standard, everything else should not. The reason is simple. In order to wrap
differing functionality of hardware you have to add software. The more
complex the task the more software you need. Take 3d, what about chips
without 3d? Are you going to implement 3d in software in the kernel?
Certainly not. If you define some bit pattern that indicates if 3d is
supported or not, all you've done is make the client choose a rendering path
based on the driver. If you just used a user library in the first place this
wouldn't be necessary.

I think I wasn't clear in my document, but I'll state it here. NOTHING that
is device specific should ever be touched by a client directly. Only a
library should be touching those features. So in the 3d case the client uses
Mesa, it links against libGL and doesn't have to worry about anything else.
Mesa has a user-space driver which does hardware specific things and
dispatches, via the driver specific interfaces, to the driver.
>is cursor now controlled by fb, or console? (i read of a development to
>separate these, as the redundancy caused problems etc. dont recall which took
>the cursor though).

In this design the fb _draws_ the HW cursor if one is present. Otherwise
the client does it. Note that there is no _hot spot_ in my design. This
only has meaning to the client, not the fb.

>finally: i, personally, would really like to see a large kernel driver level
>integration of hardware acceleration. this would be ridiculously large and
>complex and everything, but when successfully completed, will allow any sort of
>acceleration from any kind of application.

I think you have a limited view of a "driver". "Really large" and kernel
should never be mixed. If you want a nice, use anywhere, driver interface
for an application you need a library of some sort. That is the only good
way.

There are three ways to do a driver but only one good way:

#1) The application writes directly to the hardware. This is how the old
DOS games worked. Today this is not technically feasible; it is a terrible
idea. You would have to run all applications with sufficient permission to
directly access hardware. In practice this means no security what-so-ever.

#2) You make a permissions enhanced driver that can be accessed by
permissions deprived clients with a defined interface. This is the kernel
driver model. The problem here is that the complexity of the interface, and
therefore driver code, increases exponentially with each feature added
(provided the feature isn't exactly the same on all hardware). So your
kernel driver gets huge quickly, and Linux kernel developers wouldn't allow
that.
There is another option here. Make the kernel driver small but only
implement a subset of available features, forcing the client to do the work.
This is how the OSS drivers work (sound). The client may want 24bit 44k
stereo, but when the client asks the kernel for it the answer may be NO!
In that case the client has to do rate conversion to a sound format that IS
supported. This is all fine and good except that now you have made ALL
clients responsible for rate conversion; most will do it badly, some not at
all.

#3) Implement this rule: "Get as close to the hardware format as is SAFELY
possible in user-space then do the rest in the kernel". This is the model
I was looking to do. To take the sound example above this would mean that
a sound library queries the sound driver to determine the supported formats
at startup. The client tells the library 24bit 44k stereo and the library
then determines that this isn't possible and converts without the client's
knowledge. The "driver" does the work, just as you wanted. But "driver"
does not mean kernel-space.

Why not do the whole thing in user-space? (XFree?) Because not everything
can be done that way. You cannot have correct device virtualization,
locking, etc. You also force all clients to use the single API you have
provided. In the split model you can have several libraries implementing
different API's without redoing the lowest level hardware drivers. These
only have to be done once and all clients, XFree, DRI, Directfb etc. can
use them.

One more point here. By standardizing the very basic parts of the API you
allow clients to use these features without the overhead of a library. Mode
setting, basic drawing etc. Very basic features that can be wrapped with
minimal code complexity that have obvious gains by being standard.

-Matt
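[A rough sketch of the library side of model #3 from the sound example
above: the library gets the list of formats the driver supports and picks
the closest one, so the client never sees the "NO". The struct, the cost
weights and the function names are all hypothetical, and "44k" is assumed
to mean 44100 Hz. The conversion itself is left out.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a sample format. */
struct snd_format {
    int bits;     /* bits per sample */
    int rate_hz;  /* sample rate */
    int channels; /* 1 = mono, 2 = stereo */
};

/* Rough cost of converting the client's format to a hardware format.
 * Losing bits or channels is weighted far above plain resampling. */
static long convert_cost(const struct snd_format *want,
                         const struct snd_format *hw)
{
    long cost = (long)want->rate_hz - (long)hw->rate_hz;
    if (cost < 0)
        cost = -cost;
    if (hw->bits < want->bits)
        cost += 100000L * (want->bits - hw->bits);
    if (hw->channels < want->channels)
        cost += 1000000L * (want->channels - hw->channels);
    return cost;
}

/* Return the index of the cheapest supported format, or -1 if none. */
static int pick_format(const struct snd_format *want,
                       const struct snd_format *supported, size_t n)
{
    int best = -1;
    long best_cost = 0;
    assert(want != NULL);
    for (size_t i = 0; i < n; i++) {
        long c = convert_cost(want, &supported[i]);
        if (best < 0 || c < best_cost) {
            best = (int)i;
            best_cost = c;
        }
    }
    return best;
}
```

With a driver that offers 16bit 44100 stereo, 16bit 48000 stereo and 8bit
22050 mono, a request for 24bit 44100 stereo lands on the first entry and
only the sample width has to be converted.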

From: <cw...@so...> - 2001-12-03 18:59:11
> This is actually something I've been debating. I originally had mmap in
> there then removed it... obviously there were some remnants left.
>
> mmap is a different story. It may be a nice feature for applications, but
> it causes a few problems that are very hard to overcome.
>
> #1) If a client has a mmapped region of video memory and for some reason
> this region is no longer valid, how do you prevent the client from drawing
> there? We don't have some advanced event interface to tell clients how
> to behave, and at any rate relying on the client to do the right thing isn't
> acceptable. When you switch vt's or another client changes the mode...
> the client with the mmapped region has to be altered. The only way to
> do this is to remove the client's map install a zero page fault handler
> to pause the client when it tries to draw, and then add it back when the
> vt returns. This behavior is shady but might be worth it.

yes, in dealing with modern fb, we have had this problem as well (drawing on
fb and switching to a different sized vt causes crazy behaviour). perhaps a
call to see if the context is still valid could be added, which user land
applications would use to see if they are still allowed to be drawing. this
doesnt provide any security though =( your approach would work in all cases,
but would be difficult to manage.

> #2) Not all memory regions are memory mappable. Most of the modes on Intel
> hardware have a pitch that differs from width. This isn't a huge problem
> since you can easily expose the pitch in the surface view. What about
> when the "extra" memory between the width and pitch is being used for
> something else? There will always be valid reasons why you don't want
> to allow mmap. Therefore a client _has_ to have a read/write fallback
> anyway, the value of mmap gets smaller when you have to write the harder
> read/write code anyway.

oo, are there cases where extra pitch memory is used for things?
the area is so broken up and in small chunks that i didnt think any hardware
developer would use it. but if intel's do that, i'd imagine there is
probably something there. this certainly causes problems i wasn't aware of.

> #3) When two clients memory map certain types of memory, bad things could
> happen. What if it is some type of shared memory used for command buffers
> or double buffered registers? Then you have to have some type of locking
> on the memory mapped ranges like the dri does. At that point you have
> given a client the ability to take the lock and keep it...basically
> cooperative multitasking.

perhaps making command buffers not mmapable? in my (limited) understanding,
they would function just as well ala read/write ? locking would also work.
i've heard of irix doing some kind of command locking or something, to keep
it clean.

> If you look back at some of my posts you'll find that the reason I became
> concerned about the future of the framebuffer is that the Intel 810 and
> 815 chipsets have only banked memory when the GTT isn't used. The way
> to handle this isn't as you stated... to give the client an indication
> that bank switching is necessary, and possibly an interface to switch
> the banks. That is too complicated for the client. With this interface
> there is no need to make the client aware of the banks. The driver can
> always switch the banks on the fly. Mmap is difficult and messy, but
> can be done with page faulting tricks.

i believe watcom and maybe other old dos compilers used page faults to
switch banks on the fly. is this possible/safe/reliable in linux? if so, a
manual bank switch interface would be useless. I had considered page faults,
but i was not sure how kernel handled them.

> Discussed above. I agree, it is a nice feature when it works. If it were
> supported it would just go like this. Try to mmap, if it fails you can't
> have it so try something else. This way all the reasons for mmap failing
> are handled.
> Not just network.

yes, thats the idea i have been going for (keep mmap in, and if it fails do
something else).

> Maybe a sysreq type thing. This shouldn't be necessary, it is perfectly
> possible to make an interface from which bad things cannot happen. That was
> one of my goals.

if command buffers can be locked, a severely broken application could cause
bad things to happen, and not have them unlock properly. this would probably
result in a failure in the driver code, as even a crashed user land
application is capable of cleaning itself up some. if the interface is
indeed clean, no sort of sane restore would be necessary. its all in how
reliable the drivers are =)

> I'm sorry but absolutely not. The basic features of the fb should be
> standard, everything else should not. The reason is simple. In order to
> wrap differing functionality of hardware you have to add software. The
> more complex the task the more software you need. Take 3d, what about
> chips without 3d? Are you going to implement 3d in software in the kernel?
> Certainly not. If you define some bit pattern that indicates if 3d is
> supported or not, all you've done is make the client choose a rendering
> path based on the driver. If you just used a user library in the first
> place this wouldn't be necessary.
>
> I think I wasn't clear in my document, but I'll state it here. NOTHING that
> is device specific should ever be touched by a client directly. Only a
> library should be touching those features. So in the 3d case the client
> uses Mesa, it links against libGL and doesn't have to worry about anything
> else. Mesa, has a user-space driver which does hardware specific things
> and dispatches, via the driver specific interfaces, to the driver.

Ahh, ok =) yes, from user-land, they should only see one interface. I was
under the impression that the client would have to account for all different
interfaces. thanks for clearing that up.

> I think you have a limited view of a "driver". "Really large" and kernel
> should never be mixed. If you want a nice, use anywhere, driver interface
> for an application you need a library of some sort. That is the only good
> way.
>
> #2) You make a permissions enhanced driver that can be accessed by
> permissions deprived clients with a defined interface. This is the kernel
> driver model. The problem here is that the complexity of the interface,
> and therefore driver code, increases exponentially with each feature added
> (provided the feature isn't exactly the same on all hardware). So your
> kernel driver gets huge quickly, Linux kernel developers wouldn't allow
> that.
> There is another option here. Make the kernel driver small but only
> implement a subset of available features, forcing the client to do the
> work. This is how the OSS driver work (sound). The client may want 24bit
> 44k stereo, but when the client asks the kernel for it the answer may be
> NO! In that case the client has to do rate conversion to a sound format
> that IS supported. This is all fine and good except that now you have made
> ALL clients responsible for rate conversion, most will do it badly, some
> not at all.
>
> #3) Implement this rule: "Get as close to the hardware format as is SAFELY
> possible in user-space then to the rest in the kernel". This is the model
> I was looking to do. To take the sound example above this would mean that
> a sound library queries the sound driver to determine the supported formats
> at startup. The client tells the library 24bit 44k stereo and the library
> then determines that this isn't possible and converts without the clients
> knowledge. The "driver" does the work, just as you wanted. But "driver"
> does not mean kernel-space.
>
> Why not do the whole thing in user-space? (XFree?) Because not everything
> can be done that way. You cannot have correct device virtualization,
> locking, etc. You also force all clients to use the single API you have
> provided.
> In the split model you can have several libraries implementing different
> API's without redoing the lowest level hardware drivers. These only have
> to be done once and all clients, XFree, DRI, Directfb etc. can use them.
>
> One more point here. By standardizing the very basic parts of the API you
> allow clients to use these features without the overhead of a library. Mode
> setting, basic drawing etc. Very basic features that can be wrapped with
> minimal code complexity that have obvious gains by being standard.

ahh, ok. i was visualizing a combination of #2 and #3 (#1 is bad for any
kind of modern non/embedded system). i dont know what i was thinking there.
sorry about that last part. libraries are indeed better in that situation.
i suppose the idea was to provide a sufficiently diverse interface to the
libraries that are going to be implementing the acceleration, but that seems
mostly, if not completely, done already.

thanks for the enlightening view of the driver/kernel-land/library model =)

chris
--- moc.lexiptfos@thgirwc ---

From: Sottek, M. J <mat...@in...> - 2001-12-03 19:41:57
>oo, are there cases where extra pitch memory is used for things? the area is
>so broken up and in small chunks that i didnt think any hardware developer
>would use it. but if intel's do that, i'd imagine there is probably something
>there. this certainly causes problems i wasn't aware of.

Well I don't use the extra, but I could :) I was looking for a cheap example
of when you wouldn't want mmap to happen. Some hardware have tiled memory
framebuffers that might not work correctly unless you knew about the
tiling. Intel's tiled memory isn't a problem, but I think some might be.

The extra pitch-width area isn't useless. You can put other surfaces there.
You think of the area as being broken up, but really it is just a thinner
area with the same pitch as the framebuffer. i.e. if your framebuffer is
800x600 at 16bit, on Intel hardware you have a fb pitch of 2048. That
is an area of 448x600 with a pitch of 2048 that can be used.

Maybe this reason works, what if the framebuffer isn't page aligned (both
the beginning and the end) then you'd get access to memory that you
shouldn't. Drivers could just be careful not to put things that you
shouldn't have access to in the overlap I guess.

The details aren't important (That's what you say when you can't find a
good reason) but just in case there are reasons why you couldn't do mmap,
we need to account for that fact.

> #3) When two clients memory map certain types of memory, bad things could
> happen. What if it is some type of shared memory used for command buffers
> or double buffered registers? Then you have to have some type of locking
> on the memory mapped ranges like the dri does. At that point you have
> given a client the ability to take the lock and keep it...basically
> cooperative multitasking.

Command buffers are just one example. The framebuffer works just as well
as an example although the results are not as catastrophic. Think of
the banked case again.
If two apps were trying to both write to the fb
at the same time the hardware would be stuck flipping banks all the time
and never get any work done. Or what if you were doing some interface
where multiple apps alpha blended with the framebuffer? You have to
read->blend->write but without mutual exclusion you can't be sure that
what you read is still there when you go to write.

Whenever you share resources between two processes you need a way to do
mutual exclusion. This is just hard with mmapped areas. Not impossible,
just hard. I'm just giving devil's advocate arguments as to why mmap may
be more work than it is worth. Without mmap lots of things become trivial
that would otherwise be very difficult. With mmap you get a speedup in
some operations, but is it worth the pain?

>i believe watcom and maybe other old dos compilers used page faults to switch
>banks on the fly. is this possible/safe/reliable in linux? if so, a manual
>bank switch interface would be useless. I had considered page faults, but i
>was not sure how kernel handled them.

I have it working in a test driver now. My issue isn't user apps doing mmap,
it is the kernel behaving as if it has direct access to my hardware when
it shouldn't. I'm not sure how popular my use of zap_page_range() is going
to be in a driver. We may need to look for another way.

>if command buffers can be locked, a severely broken application could cause bad
>things to happen, and not have them unlock properly. this would probably
>results as a failure in the driver code, as even a crashed user land
>application is capable of cleaning itself up some. if the interface is indeed
>clean, no sort of sane restore would be necessary. its all in how reliable the
>drivers are =)

Yes, the DRI suffers from this potential problem (although except for my
XvMC work and Mesa no one else uses the DRI). If the user-land driver acts
badly with the lock the whole system comes down.
I haven't spent a lot of time on it, but with the read/write interface and
mode switching in the kernel a lot of the reasons for holding the lock
outside the kernel go away.

>thanks for the enlightening view of the driver/kernel-land/library model =)

Glad you got something out of it. I would hardly call it enlightening,
more like rambling.

-Matt
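[A small sketch of the read->blend->write point made above: if the blend is
a single driver entry point, the lock is taken and released inside that
call and is never handed to a client. The 8-bit surface, the fb_blend()
name and the C11 atomic_flag spinlock are stand-ins chosen for
illustration; a real kernel driver would use the kernel's own locking
primitives.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define FB_SIZE 64

static uint8_t fb[FB_SIZE];                    /* stand-in 8-bit surface */
static atomic_flag fb_lock = ATOMIC_FLAG_INIT; /* held only inside fb_blend */

/* Blend src over the pixel at off with the given alpha (0..255).
 * Read, blend and write happen as one unit with respect to other callers. */
static int fb_blend(size_t off, uint8_t src, uint8_t alpha)
{
    if (off >= FB_SIZE)
        return -1;

    while (atomic_flag_test_and_set_explicit(&fb_lock, memory_order_acquire))
        ; /* spin until the previous blend has finished */

    uint8_t dst = fb[off];                                   /* read  */
    unsigned mixed = src * alpha + dst * (255u - alpha);     /* blend */
    fb[off] = (uint8_t)((mixed + 127u) / 255u);              /* write */

    atomic_flag_clear_explicit(&fb_lock, memory_order_release);
    return 0;
}
```

A client with the surface mmapped would have to do the same three steps
itself, and something else would have to guarantee that every client takes
and releases the lock correctly.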

From: Chris W. <cw...@so...> - 2001-12-03 21:20:01
> Well I don't use the extra, but I could :) I was looking for a cheap example
> of when you wouldn't want mmap to happen. Some hardware have tiled memory
> framebuffers that might not work correctly unless you knew about the
> tiling. Intel's tiled memory isn't a problem, but I think some might be.
>
> The extra pitch-width area isn't useless. You can put other surfaces there.
> You think of the area as being broken up, but really it is just a thinner
> area with the same pitch as the framebuffer. i.e. if your framebuffer is
> 800x600 at 16bit, on Intel hardware you have a fb pitch of 2048. That
> is an area of 448x600 with a pitch of 2048 that can be used.

true true, it would most likely be used as an off screen buffer. I am
simply used to having very small pitches (3 bytes or less from 24bpp
misalignment), but no reason very large ones couldnt exist. mmaping that
would be extremely insecure

> Maybe this reason works, what if the framebuffer isn't page aligned (both
> the beginning and the end) then you'd get access to memory that you
> shouldn't. Drivers could just be careful not to put things that you
> shouldn't have access to in the overlap I guess.

to me, that'd be a driver technicality, but it does provide an example of
where things can go wrong =)

> The details aren't important (That's what you say when you can't find a
> good reason) but just in case there are reasons why you couldn't do mmap,
> we need to account for that fact.

accounting for it is simply not putting it in, and coding functionality that
isnt dependent upon it. as a feature, rather than a guaranteed item, it's
just nice to keep around. good applications would/already should account for
mmap failing, and if they dont, they would simply segfault.

> Command buffers are just one example. The framebuffer works just as well
> as an example although the results are not as catastrophic. Think of
> the banked case again.
> If two apps were trying to both write to the fb
> at the same time the hardware would be stuck flipping banks all the time
> and never get any work done. Or what if you were doing some interface
> where multiple apps alpha blended with the framebuffer? You have to
> read->blend->write but without mutual exclusion you can't be sure that
> what you read is still there when you go to write.

I did several starfield demos back in the day for my vesa1.2 card. bank
switching all over the place =). the alpha blending context problem is a
real problem though. that would cause _all kinds_ of errors. drat, yet
another reason for mmap to go away. and command buffers would almost
certainly need to be read/write only (no mmap), but i may be wrong with
that. mutexing it would work as well, but it would probably be more complex,
and im not sure if mmaping command buffers grants the speed that mmaping fb
does. (haven't done any heavy non-fb graphical code).

> Whenever you share resources between two processes you need a way to do
> mutual exclusion. This is just hard with mmaped areas. Not impossible,
> just hard. I'm just giving devils advocate arguments as to why mmap may
> be more work than it is worth. Without mmap lots of things become trivial
> that would otherwise be very difficult. With mmap you get a speedup in
> some operations, but is it worth the pain?

to me, the speed up would be a very large asset, esp in regards to raw
framebuffer data reads/writes. this day and age, hardware is so ridiculously
fast that we can realistically say 'hey, lets throw out memory access
completely, and let kernel handle it' and not expect our systems to be
rendered unusable. but there is a large base of systems that take huge
performance hits; not good if linux is to grow off of the server and onto
the desktop of real people, not just linux geeks =) though the applications
that actually use fb could probably be counted on two hands.
the techniques for getting it to work, as you have stated, are completely
beyond my understanding (i am but a mere user-land programmer for the most
part :), but if they are possible, they might be helpful in the long run.
*shrugs* wish i knew more.

> I have it working in a test driver now. My issue isn't user apps doing mmap,
> it is the kernel behaving as if it has direct access to my hardware when
> it shouldn't. I'm not sure how popular my use of zap_page_range() is going
> to be in a driver. We may need to look for another way.

where is your source (if any is available online)? i'd be interested
in taking a look sometime. you've done far more than i have most likely.

> Yes, the DRI suffers from this potential problem (Although except for my
> XvMC work and Mesa no one else uses the DRI) If the user-land driver acts
> badly with the lock the whole system comes down. I haven't spent a lot of
> time on it, but with the read/write interface and mode switching in the
> kernel a lot of the reasons for holding the lock outside the kernel go
> away.

i suppose it is the responsibility of the driver writer(s) to make sure the
driver is safe. but if it can be kept safe, and keep the system from going
down, its ok by me :)

chris
--- moc.lexiptfos@thgirwc ---

From: Geert U. <ge...@li...> - 2001-12-04 15:35:24
On Mon, 3 Dec 2001, Sottek, Matthew J wrote:
> >oo, are there cases where extra pitch memory is used for things? the area is
> >so broken up and in small chunks that i didnt think any hardware developer
> >would use it. but if intel's do that, i'd imagine there is probably something
> >there. this certainly causes problems i wasn't aware of.
>
> Well I don't use the extra, but I could :) I was looking for a cheap example
> of when you wouldn't want mmap to happen. Some hardware have tiled memory
> framebuffers that might not work correctly unless you knew about the
> tiling. Intel's tiled memory isn't a problem, but I think some might be.
>
> The extra pitch-width area isn't useless. You can put other surfaces there.
> You think of the area as being broken up, but really it is just a thinner
> area with the same pitch as the framebuffer. i.e. if your framebuffer is
> 800x600 at 16bit, on Intel hardware you have a fb pitch of 2048. That
> is an area of 448x600 with a pitch of 2048 that can be used.
There exist PowerMac graphics cards that put the hardware cursor in that area.
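[A worked check of the 800x600 example quoted above, as a sketch. The
struct and function names are made up for illustration. Note that the 448
comes out in bytes per line, which at 16bit is 224 pixels.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical surface description: visible width plus the line pitch. */
struct surface {
    unsigned width_px;     /* visible pixels per line */
    unsigned height;       /* lines */
    unsigned bytes_per_px; /* 2 for 16bit */
    unsigned pitch_bytes;  /* distance between line starts, in bytes */
};

/* Byte offset of pixel (x, y): lines are pitch apart, not width apart. */
static size_t pixel_offset(const struct surface *s, unsigned x, unsigned y)
{
    return (size_t)y * s->pitch_bytes + (size_t)x * s->bytes_per_px;
}

/* Bytes on each line that lie between the visible width and the pitch. */
static unsigned spare_bytes_per_line(const struct surface *s)
{
    assert(s->pitch_bytes >= s->width_px * s->bytes_per_px);
    return s->pitch_bytes - s->width_px * s->bytes_per_px;
}
```

For 800x600 at 16bit with a pitch of 2048: 800 * 2 = 1600 visible bytes per
line, leaving 2048 - 1600 = 448 spare bytes on each of the 600 lines, which
is the strip a driver could use for a cursor image or another surface.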
Gr{oetje,eeting}s,
Geert
P.S. Matt: I'm still reading your document (I'm halfway). I'll post my list of
questions and comments after I've finished reading the whole document. So
far it looks nice!
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@li...
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

From: Sottek, M. J <mat...@in...> - 2001-12-03 21:53:08
>where is your source (if any is available online)? i'd be interested
>in taking a look sometime. you've done far more than i have most likely.

It isn't available online yet. I need it in more of a workable state before
I publish it as a beta. I am tracking down a kernel wedge when I call
info->changevar(con); I've tracked it into fb_setup() and possibly
vc_resize_con(). Maybe one of the regulars knows of something down this path
that is going to try to touch something it shouldn't.

Once I get that done I'll clean it up and post it just for review. It still
won't allow read/write on the device file and the logo code can't work. Both
of those require fixes outside the driver. I'm looking for the least
invasive solution for 2.4.

-Matt

From: Petr V. <VAN...@vc...> - 2001-12-03 22:25:19
On 3 Dec 01 at 16:21, Chris Wright wrote:
> > The extra pitch-width area isn't useless. You can put other surfaces there.
> > You think of the area as being broken up, but really it is just a thinner
> > area with the same pitch as the framebuffer. i.e. if your framebuffer is
> > 800x600 at 16bit, on Intel hardware you have a fb pitch of 2048. That
> > is an area of 448x600 with a pitch of 2048 that can be used.
>
> true true, it would most likely be used as an off screen buffer. I am
> simply used to having very small pitches (3 bytes or less from 24bpp
> misalignment), but no reason very large ones couldnt exist. mmaping that
> would be extremely insecure
No PC hardware I know (*) interlaces framebuffer with some non-fb related
data. And if an fbdev driver decides to put some special data inside, then
it is a buggy fbdev driver.
> > Maybe this reason works, what if the framebuffer isn't page aligned (both the
> > beginning and the end) then you'd get access to memory that you shouldn't.
> > Drivers could just be careful not to put things that you shouldn't have
> > access to in the overlap I guess.
>
> to me, that'd be a driver technicality, but it does provide an example of
> where things can go wrong =)
You cannot address devices at a smaller granularity than a page, so I do not
understand how you would construct such a device. I know of no device
which does that... And nobody forces you to have the framebuffer aligned on
a page - at least matroxfb on Millennium I happily did it in the past.
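[A sketch of the page-granularity arithmetic under discussion: a mapping
has to start and end on page boundaries, so an unaligned framebuffer is
reached through an offset into a slightly larger window. The 4096-byte
page size, the struct and the function name are assumptions made for the
example.]

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u /* assumption: 4 KiB pages */

/* Hypothetical result: the page-aligned window that covers a framebuffer. */
struct map_window {
    uint64_t start;     /* window start, rounded down to a page */
    uint64_t len;       /* window length, rounded up to whole pages */
    uint64_t inner_off; /* where the framebuffer begins inside the window */
};

static struct map_window page_window(uint64_t fb_start, uint64_t fb_len)
{
    const uint64_t mask = PAGE_SIZE_ASSUMED - 1;
    struct map_window w;

    assert(fb_len > 0);
    w.start = fb_start & ~mask;
    w.inner_off = fb_start & mask;
    w.len = (w.inner_off + fb_len + mask) & ~mask;
    return w;
}
```

A 0x1000-byte framebuffer at 0xE0000800 needs a window of 0x2000 bytes
starting at 0xE0000000, so the mapping also exposes 0x800 bytes before the
framebuffer and 0x800 bytes after it. Whether anything sensitive lives in
those bytes is up to the hardware and the driver.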
> > be more work than it is worth. Without mmap lots of things become trivial
> > that would otherwise be very difficult. With mmap you get a speedup in
> > some operations, but is it worth the pain?
>
> to me, the speed up would be a very large asset, esp in regards to raw
> framebuffer data reads/writes. in this day and age, hardware is so
> ridiculously fast that we can realistically say 'hey, lets throw out memory
> access completely, and let kernel handle it' and not expect our systems to
> be rendered unusable. but there is a large base of systems that take huge
> performance hits; not good if linux is to grow off of the server and onto
No. All OSes I know - although they maybe started with some bright ideas -
- sooner or later just gave direct access to videoram for apps - DirectX
for Windows, DGA for XFree.
What I see here is that useful parts of fbdev are removed in the name of
supporting some obsolete or nonfunctional hardware, or in the name of
supporting remote fbdev. Well, make mmap optional, but do not remove it.
I need 80MBps throughput of images from CPU to videoram - it is just
impossible to pass such stream from CPU to main memory, and then from
main memory to videocard with current 33MHz PCI bus and host bridges. After
I'll have 1Gbps network card in each computer around, with 64bit 66MHz PCI,
maybe I'll think that I may need remote fbdev. But until then I'll leave
remote access on app level - I'll run mplayer remote, instead of feeding
uncompressed pictures over the wire.
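For a sense of scale, a back-of-the-envelope sketch of the theoretical peak rates involved. These are nominal figures (bus width times clock, or line rate divided by eight) and ignore arbitration, burst setup and protocol overhead, so sustained throughput is lower in practice. The function names are illustrative.

```c
/* Theoretical peak of a parallel bus in MB/s: bytes per clock times MHz. */
static unsigned bus_peak_mb_per_s(unsigned width_bits, unsigned clock_mhz)
{
    return (width_bits / 8) * clock_mhz;
}

/* Raw line rate of a serial link in MB/s, given its rate in Mbit/s. */
static unsigned link_peak_mb_per_s(unsigned mbit_per_s)
{
    return mbit_per_s / 8;
}
```

With nominal clocks, 32-bit/33MHz PCI peaks at 132 MB/s, 64-bit/66MHz PCI at 528 MB/s, and a 1Gbps link at 125 MB/s, so an 80MBps stream already takes well over half of the 32-bit bus before any overhead.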
And now few comments to Framebuffer Interface proposal, draft 1, from
Matt Sottek:
fb_display_info - FB_DISPLAY_*SYNC_HIGH - I did not found into which
field they belong - into flags? And there is missing 'Use COMPOSITE SYNC'
and 'Use SYNC ON GREEN'? How do you query these capabilities for different
modes?
fb_mode_info - How you came to idea that both RGBRGBRGBRGB and
RGBARGBARGBA should use 24BIT, while you do not use 15BIT for 1:5:5:5
16bpp videomode. There is also no definition for BGR layout - what
was wrong with r/g/b/a offset/size?
fb_surface_info - Note that existence/non-existence of FB_SURFACE_TYPE_CURSOR
does not tell you anything about existence of hardware cursor - there
is hardware which does not store cursor body in framebuffer.
Having physical address here is wrong - what is reason for having it here?
Cannot it be relative to something else? And as you do not tell anything
about contents of these areas, what is pitch for?
I do not agree with semantics of /dev/gfx/fb0/command. It should act
as normal pipe - no seeks, you can queue commands and then get command
results back.
fb_set_interface - pass bitmaps through, so userspace APP can offer
which versions are supported, and kernel can reply back which version
was negotiated.
I see no format info on any fb_put() op - either driver or kernel.
Same for fb_get().
What is behind di* interface, except slowing things down due to
additional level of indirection?
Thanks,
Petr Vandrovec
van...@vc...
(*) In 1985 there was (czecho)slovak computer PMD85 which interlaced
picture with system variables - but even at that time if you'd replace
OS, you could use this extra pitch for whatever you want (e.g. for nothing).
|
|
From: Sottek, M. J <mat...@in...> - 2001-12-04 01:22:37
|
>No PC hardware I know (*) interlaces framebuffer with some non-fb related
>data. And if fbdev driver decides to put some special data inside - it is
>buggy fbdev driver then.
You are correct, I was searching for an example of when mmap wouldn't work.
I think for the most part I've given up this search :)
>You cannot address devices on smaller granularity than page, so I do not
>understand how you would like to construct such device. I know no device
>which does that... And nobody forces you to have framebuffer aligned on
>page - at least matroxfb on Millennium I happily did it in the past.
Right, so if your framebuffer isn't page aligned (both begin and end) can't
the client run off of the framebuffer memory into any other data in the
same page? So if you had a command buffer that started right after the
framebuffer you might be able to accidentally hit it by running off the
end of the framebuffer pointer. This is an easy fix for the driver...
just don't do that.
I am backing off all assertions that there are good reasons not to do mmap
of the framebuffer. The only reason to not allow mmap is that it is hard
to force apps to stop writing when they shouldn't.
>No. All OSes I know - although they maybe started with some bright ideas -
>- sooner or later just gave direct access to videoram for apps - DirectX
>for Windows, DGA for XFree.
Agreed. Although DGA isn't a good success story.
>I need 80MBps throughput of images from CPU to videoram - it is just
>impossible to pass such stream from CPU to main memory, and then from
>main memory to videocard with current 33MHz PCI bus and host bridges.
If you are decoding video you'll have a working copy in RAM anyway; the
kernel can read directly out of that copy and into the framebuffer. Not
as fast as mmap for some things but not as bad as copying from
memory to a temp space then into the framebuffer.
I agree, mmap should probably stay. We'll just have to work out a way
to force apps to behave.
I haven't given up on the fault handler idea.
>fb_display_info - FB_DISPLAY_*SYNC_HIGH - I did not found into which
>field they belong - into flags? And there is missing 'Use COMPOSITE SYNC'
>and 'Use SYNC ON GREEN'? How do you query these capabilities for different
>modes?
Yea, flags. I didn't go through the trouble of working out all the
#defines that would be needed. I was more concerned with looking at
the interface from a different point of view. There are lots of flags
that would need to be added; I was pretty sure they could be
accommodated with the flags bits.
You shouldn't need to query for capabilities. You set them and if the
driver comes back with them unset then it wasn't possible. Just like
refresh or anything else, try it and see if it works.
>fb_mode_info - How you came to idea that both RGBRGBRGBRGB and
>RGBARGBARGBA should use 24BIT, while you do not use 15BIT for 1:5:5:5
>16bpp videomode. There is also no definition for BGR layout - what
>was wrong with r/g/b/a offset/size?
I meant RGBnothing, not RGBalpha. RGBA is 32bpp. These were just examples
of grouping unique ids so that they have some derived meaning. The
examples provided are a 5 minute guess at a grouping.
I didn't like the r/g/b/a offset/size because it doesn't handle non-RGB
formats and makes it difficult for a driver writer to determine what the
type is exactly. My way it is a simple case statement. The complicated
rendering code that needs the r/g/b/a offset stuff can look it up in
userspace. Using unique ids can support any color type. Right now if a
driver does its own character rendering it would have to not only check
bpp but either handle all the weird offsets or check to make sure they
are sane. It is just to make the kernel side easy and leave the hard
stuff to user-land.
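A minimal sketch of the "unique id plus userspace lookup" idea, assuming a handful of made-up format ids. The enum names and values are hypothetical and are not taken from the draft; only the shape (a switch on the id that yields r/g/b/a offset and length) is the point.

```c
#include <stdint.h>

/* Hypothetical format ids; the draft's real names are not reproduced here. */
enum pix_fmt { PIX_RGB565, PIX_ARGB1555, PIX_XRGB8888, PIX_BGR888 };

struct channel { uint8_t offset, length; };  /* bit offset from LSB, width */
struct pix_layout { uint8_t bpp; struct channel r, g, b, a; };

/* Userspace-side lookup: returns 0 and fills *out for a known id,
 * -1 for an id this table does not know. */
static int pix_layout_for(enum pix_fmt id, struct pix_layout *out)
{
    switch (id) {
    case PIX_RGB565:
        *out = (struct pix_layout){ 16, {11, 5}, {5, 6}, {0, 5}, {0, 0} };
        return 0;
    case PIX_ARGB1555:
        *out = (struct pix_layout){ 16, {10, 5}, {5, 5}, {0, 5}, {15, 1} };
        return 0;
    case PIX_XRGB8888:
        *out = (struct pix_layout){ 32, {16, 8}, {8, 8}, {0, 8}, {0, 0} };
        return 0;
    case PIX_BGR888:
        *out = (struct pix_layout){ 24, {0, 8}, {8, 8}, {16, 8}, {0, 0} };
        return 0;
    }
    return -1;
}
```

A driver would only compare ids; rendering code that needs the bit positions does the lookup once per mode change.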
>fb_surface_info - Note that existence/non-existence of FB_SURFACE_TYPE_CURSOR
>does not tell you anything about existence of hardware cursor - there
>is hardware which does not store cursor body in framebuffer.
The surfaces are just "views" of data. When you are looking for the
physical address of the fb you can query surface #0 to get it, checking
that surface #0 has a type that indicates it is a fb. If you are looking
for a hardware cursor you would just query all the surfaces until you
found one that said it was a cursor. It doesn't have to be in video
memory at all. You just need to know what surface # it is.
If you want to set up a cursor you call fb_set_cursor(); if it returns
ENODEV, you've got no cursor. You only need to look for the surface when
you want to write to the cursor, after you've negotiated the type and
stuff.
>Having physical address here is wrong - what is reason for having it here?
>Cannot it be relative to something else? And as you do not tell anything
>about contents of these areas, what is pitch for?
I was against the physical address showing up anywhere too but others were
not. If you wanted to make a frame grabber write to memory directly you
could use the physical address from the surface 0. (or if you had a
surface for an overlay)
Pitch has no meaning to anyone except those who are accessing the memory
directly. Read/write etc. are only referencing the actual visible memory
so this information doesn't belong in fb_mode_info.
In short, the surface view helps in getting the data needed for direct
access via mmap or some other hardware. For most writable surface types
it is either of device dependent format (command buffers, depth buffer)
or it is defined somewhere else (cursor, framebuffer). Surfaces are just
the information about the memory. How big, where? for what? arrangement?
dirty?
>I do not agree with semantics of /dev/gfx/fb0/command.
>It should act as normal pipe - no seeks, you can queue commands and then
>get command results back.
Can you write this out completely? I see what you mean about queuing
commands but I fail to see how return values and return data would
happen as the pipe would be somewhat asynchronous and one direction.
>fb_set_interface - pass bitmaps through, so userspace APP can offer
>which versions are supported, and kernel can reply back which version
>was negotiated.
Could be done that way too but I see a couple issues. First a bitmap
has no priority, how do you know what the client really wanted? My
way the client can just poll until they get one they can live with.
In practice this isn't going to happen this way at all. A client is
going to be written for ONE interface and is just telling the driver
how to behave. If the driver can't do it, the app doesn't run. That
is what would happen in your case too. An app written today supports
the last 5 versions in its bitmap, 6 years from now the driver
doesn't support any of those so it doesn't work.
Old app + new driver = New driver acts old
New app + old driver = The driver probably doesn't have the features
anyway.
>I see no format info on any fb_put() op - either driver or kernel.
>Same for fb_get().
Correct, as with sound, there is no way the kernel should do
converting of any kind. If you can't put() what the driver is using
then you need to do it in a library, not the kernel. It is too slim
a line to say RGB16->RGB24 isn't hard so I'll do that in the kernel
but YUV4:2:0->YV12 is too hard so I won't do that.
>What is behind di* interface, except slowing things down due to
>additional level of indirection?
There is nothing in there of any speed concern. First, the reason
for the context stuff at all was to keep things that the driver
shouldn't have access to out of its hands. Things that belong
only to the di layer. Separation is good, it prevents accidents.
(I think that wasn't your point)
I think you were asking about setting the fb_mode_info and such
which DO belong to the driver. The reason for using the interface
is that the di layer is helping the driver out by always keeping
the most up-to-date copy. This way fb_get_* doesn't have to go
down to the driver level. Otherwise you have to have a resource
that is shared. All shared resources need mutual exclusion. What
happens when the driver is updating the fb_display_info structure
and another process queries the structure? If it is shared you
could get invalid results. Then you could add a mutex so this
doesn't happen, but someone will forget and now you've got problems.
This could also happen via a hotplug interrupt so your di layer
can't just provide mutual exclusion at the interface.
With the difb*() everything is easy. Let the di layer do all the hard
parts for you. The tiny loss of speed isn't going to be detectable
anyway. Nothing in there happens on a regular basis.
-Matt
|
|
From: Sven <lu...@dp...> - 2001-12-04 07:55:22
|
On Mon, Dec 03, 2001 at 05:22:27PM -0800, Sottek, Matthew J wrote:
>
> >No PC hardware I know (*) interlaces framebuffer with some non-fb related
> >data. And if fbdev driver decides to put some special data inside - it is
> >buggy fbdev driver then.
>
> You are correct, I was searching for an example of when mmap wouldn't work.
> I think for the most part I've given up this search :)
>
> >You cannot address devices on smaller granularity than page, so I do not
> >understand how you would like to construct such device. I know no device
> >which does that... And nobody forces you to have framebuffer aligned on
> >page - at least matroxfb on Millennium I happily did it in the past.
>
> Right so if your framebuffer isn't page aligned (both begin and end) can't
> the client run off of the framebuffer memory into any other data in the
> same page? So If you had a command buffer that started right after the
> framebuffer you might be able to accidentally hit it by running off the
> end of the framebuffer pointer. This is an easy fix for the driver...
> just don't do that.
I guess you are speaking about dma buffers here, are you ?
mmm, do all boards work this way ? In the case of the permedia3 and i guess
most other 3dlabs boards, the dma buffers get copied directly from the main
memory to the chip's command pipeline, and never use any framebuffer memory.
But then, this is nice hardware, others may not be doing things this cleanly.
> I am backing off all assertions that there are good reasons not to do mmap
> of the framebuffer. The only reason to not allow mmap is that it is hard
> to force apps to stop writing when they shouldn't.
You mean, you can write (or read) data past the one you are supposed to own
in the framebuffer ?
From my experience (but again, i know only the 3dlabs boards) the only real
problem here is when two different apps use the framebuffer, and use two
separate spaces of framebuffer memory as private offscreen memory, and
should not interact.
This usually doesn't work that way: mainly (in the case of fbcon and X i am
familiar with), each app saves the framebuffer data before releasing the
chip, and i guess it could also clear it so no other app has access to it,
and restores it when they get the chip back. This may not be the fastest
thing around, but i guess such switches are not done that often, and can
accommodate a little delay; there will mostly be a resync anyway for
changing video modes.
That said, maybe i am missing the point entirely, so please tell me so, or
maybe there are some applications you have in mind that cause a real problem
here ? Maybe these applications may be changed in order to work better for
this.
> >No. All OSes I know - although they maybe started with some bright ideas -
> >- sooner or later just gave direct access to videoram for apps - DirectX
> >for Windows, DGA for XFree.
>
> Agreed. Although DGA isn't a good success story.
What about DRI then ?
> >I need 80MBps throughput of images from CPU to videoram - it is just
> >impossible to pass such stream from CPU to main memory, and then from
> >main memory to videocard with current 33MHz PCI bus and host bridges.
>
> If you are decoding video you'll have a working copy in RAM anyway, the
> kernel can read directly out of that copy any into the framebuffer. Not
> as fast as MMap for some things but not as bad as copying from
> memory to a temp space then into the framebuffer.
not sure, there were some discussions on some list (maybe X-devel, maybe
debian-ppc, don't remember) about using the fb memory directly through some
AGP trick. Current Xv implementation does a copy from app user space to Xv
user space memory, which then initiates the transfer either directly or
through some accel.
> I agree, Mmap should probably stay. We'll just have to work out a way
> to force apps to behave. I haven't given up on the fault handler
> idea.
>
> >fb_display_info - FB_DISPLAY_*SYNC_HIGH - I did not found into which
> >field they belong - into flags? And there is missing 'Use COMPOSITE SYNC'
> >and 'Use SYNC ON GREEN'? How do you query these capabilities for different
> >modes?
>
> Yea, flags. I didn't go through the trouble of working out all the
> #defines that would be needed. I was more concerned with looking at
> the interface from a different point of view. There are lots of flags
> that would need to be added, I was pretty sure they could be
> accommodated with the flags bits.
>
> You shouldn't need to query for capabilities. You set them and if the
> driver comes back with them unset then it wasn't possible. Just like
> refresh or anything else, try it and see if it works.
>
> >fb_mode_info - How you came to idea that both RGBRGBRGBRGB and
> >RGBARGBARGBA should use 24BIT, while you do not use 15BIT for 1:5:5:5
> >16bpp videomode. There is also no definition for BGR layout - what
> >was wrong with r/g/b/a offset/size?
>
> I meant RGBnothing not RGBalpha. RGBA is 32bpp.These were just
No, you can even use RGBA 5551, which is much better than RGB 565 since you
get uniform grays, and you can use the 1bit alpha channel for some nice
tricks (like doing masks in video playback for some GUI elements or such).
Friendly,
Sven Luther
|
|
From: <cw...@so...> - 2001-12-04 03:57:31
|
> No PC hardware I know (*) interlaces framebuffer with some non-fb related
> data. And if fbdev driver decides to put some special data inside - it is
> buggy fbdev driver then.
true. though some other application might be using it, or the card might
have alignment information or something. just a thought. (agreed, no
modern hardware would do this).
> No. All OSes I know - although they maybe started with some bright ideas -
> - sooner or later just gave direct access to videoram for apps - DirectX
> for Windows, DGA for XFree.
>
> What I see here is that useful parts of fbdev are removed in the name of
> supporting some obsolete or nonfunctional hardware, or in the name of
> supporting remote fbdev. Well, make mmap optional, but do not remove it.
exactly. it may be optional, but the added benefits of it are so immense
that any other way is just a limit to app developers. removing that just
for the sake of simplicity over a network connection is, imho, a junk idea.
there is no shorter way to put it, it just isn't realistic any other way =(
some clever middle steps would be interesting and useful for day to day
usage, but any useful interface will just hand out a pointer and a
configuration structure.
network transparency neatly divides this topic in two, one side which is
really neat to use over networks, but sucks all around (speedwise), and one
that is not network transparent, but usable on the local machine. trying to
combine these seems more and more mutually exclusive, without putting some
responsibility on the client.
in some cases it is difficult to add support for older devices (which do
not support linear addressing of video memory). middle ground which supports
these devices would be great for people who have such devices, but a way to
implement that functionality w/o making it harder for modern hardware is
difficult. this further complicates the problem.
chris
--- moc.lexiptfos@thgirwc ---
|
|
From: Petr V. <VAN...@vc...> - 2001-12-04 10:16:36
|
On 3 Dec 01 at 17:22, Sottek, Matthew J wrote:
> >You cannot address devices on smaller granularity than page, so I do not
> >understand how you would like to construct such device. I know no device
> >which does that... And nobody forces you to have framebuffer aligned on
> >page - at least matroxfb on Millennium I happily did it in the past.
>
> Right so if your framebuffer isn't page aligned (both begin and end) can't
> the client run off of the framebuffer memory into any other data in the
> same page? So If you had a command buffer that started right after the
> framebuffer you might be able to accidentally hit it by running off the
> end of the framebuffer pointer. This is an easy fix for the driver...
> just don't do that.
Yes. It would be buggy driver... On MillenniumI I have nothing before,
and nothing after (as MillenniumI/II does not have cursor image stored
in framebuffer), so only thing user can see is garbage from other VT.
But as we do not clear this anyway on VT switch, it should not matter.
> I am backing off all assertions that there are good reasons not to do mmap
> of the framebuffer. The only reason to not allow mmap is that it is hard
> to force apps to stop writing when they shouldn't.
Why to stop them? If we do not provide full virtualization for them,
we should not put any additional policy on their behavior.
> >Having physical address here is wrong - what is reason for having it here?
> >Cannot it be relative to something else? And as you do not tell anything
> >about contents of these areas, what is pitch for?
>
> I was against the physical address showing up anywhere too but others were
> not. If you wanted to make a frame grabber write to memory directly you
> could use the physical address from the surface 0. (or if you had a
You need busaddress for this. Bus address != physical address != virtual
address, see PPC.
> >I do not agree with semantics of /dev/gfx/fb0/command. It should act
> >as normal pipe - no seeks, you can queue commands and then get command
> >results back.
>
> Can you write this out completely? I see what you mean about queuing
> commands but I fail to see how return values and return data would
> happen as the pipe would be somewhat asynchronous and one direction.
Bidirectional pipe. As pipes are unidirectional, it must be either
character device, or (maybe better, as it has needed semantic) UNIX datagram
socket.
You write some command into socket. You must write command in one chunk -
- use writev if you need scatter/gather. write either fails, or completely
succeeds (due to datagram socket nature). If write fails, well, you are done,
command failed and you'll not hear from fbdev driver back.
If write succeeded, you can read one datagram from socket. This chunk will
contain result of your command, plus optionally data returned by command.
It has nice feature that you have automatically queue in the kernel,
if you'll access it through
fd = socket(PF_UNIX, SOCK_DGRAM, 0);
connect(fd, (struct sockaddr*)&(struct sockaddr_un){AF_UNIX, 20, "/dev/gfx/fb0/command"}, 128);
you can use same access for networked gfx, just using network socket
instead of local unix socket, and datagrams are by definition atomic,
and every client has its own socket, so if one fails to read answers,
it does not affect others.
If you can trust the apps, using SOCK_STREAM is definitely better, but
in such case you must use some length-prefixed encoding for commands,
as request boundaries are not implicitly passed through communication
channel. On other side you are not limited by 30-60KB kernel limit for
datagram size.
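A small sketch of the one-datagram-out, one-datagram-back exchange described above. The command and reply structs are hypothetical, and the "driver" here is a stand-in function on the other end of a socketpair(AF_UNIX, SOCK_DGRAM) rather than a real endpoint at /dev/gfx/fb0/command, which does not exist today.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical wire format, for illustration only. */
struct fb_cmd   { unsigned int opcode; unsigned int arg; };
struct fb_reply { int status; unsigned int value; };

/* Client: the whole command goes out as a single datagram. */
static int fb_send_cmd(int fd, const struct fb_cmd *cmd)
{
    return send(fd, cmd, sizeof(*cmd), 0) == (ssize_t)sizeof(*cmd) ? 0 : -1;
}

/* Client: exactly one datagram comes back per accepted command. */
static int fb_recv_reply(int fd, struct fb_reply *reply)
{
    return recv(fd, reply, sizeof(*reply), 0) == (ssize_t)sizeof(*reply)
               ? 0 : -1;
}

/* Stand-in for the driver side: read one command, answer with arg + 1. */
static int fake_driver_serve_one(int fd)
{
    struct fb_cmd cmd;
    struct fb_reply reply;

    if (recv(fd, &cmd, sizeof(cmd), 0) != (ssize_t)sizeof(cmd))
        return -1;
    reply.status = 0;
    reply.value = cmd.arg + 1;
    if (send(fd, &reply, sizeof(reply), 0) != (ssize_t)sizeof(reply))
        return -1;
    return 0;
}
```

Because datagram boundaries are preserved, neither side needs a length prefix; with SOCK_STREAM the same structs would need explicit framing, as noted above.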
> >fb_set_interface - pass bitmaps through, so userspace APP can offer
> >which versions are supported, and kernel can reply back which version
> >was negotiated.
>
> Could be done that way too but I see a couple issues. First a bitmap
> has no priority, how do you know what the client really wanted? My
> way the client can just poll until they get one they can live with.
Highest one.
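As a sketch of the "highest one" rule, assuming bit n of each mask means "interface version n is supported" (that encoding is an assumption, the draft does not define it):

```c
#include <stdint.h>

/* Returns the highest version both sides support, or -1 if none. */
static int negotiate_version(uint32_t app_mask, uint32_t kernel_mask)
{
    uint32_t common = app_mask & kernel_mask;
    int version = -1;

    while (common) {        /* index of the highest set bit */
        version++;
        common >>= 1;
    }
    return version;
}
```

An app offering versions 1-4 against a kernel offering 0-2 ends up on version 2; with no overlap the result is -1 and the app cannot run.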
> In practice this isn't going to happen this way at all. A client is
> going to be written for ONE interface and is just telling the driver
> how to behave. If the driver can't do it, the app doesn't run. That
> is what would happen in your case too. An app written today supports
> the last 5 versions in it's bitmap, 6 years from now the driver
> doesn't support any of those so it doesn't work.
> Old app + new driver = New driver acts old
No. Kernel driver should provide only one API, unless new one is not
superset of old API.
> >I see no format info on any fb_put() op - either driver or kernel.
> >Same for fb_get().
>
> Correct, as with sound, there is no way the kernel should do
> converting of any kind. If you can't put() what the driver is using
> then you need to do it in a library, not the kernel. It is too slim
> a line to say RGB16->RGA24 isn't hard so I'll do that in the kernel
> but YUV4:2:0->YV12 is too hard so I won't do that.
Hardware can do that. We are not talking about RGB16->RGB32, but (mainly)
about monochrome -> current fb type for painting characters on the
screen. And couple of hardware can do YUV <-> RGB on the fly.
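For reference, a software sketch of the monochrome -> fb-type expansion mentioned here, for a 16bpp target. MSB-first bit order within each glyph byte is an assumption, and the function name is illustrative; hardware with a color-expanding blit would do the same job without the CPU loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand one row of a 1bpp glyph (MSB first) into 16bpp pixels. */
static void expand_mono_row16(const uint8_t *bits, size_t width_px,
                              uint16_t fg, uint16_t bg, uint16_t *dst)
{
    for (size_t x = 0; x < width_px; x++) {
        int on = (bits[x / 8] >> (7 - (x % 8))) & 1;
        dst[x] = on ? fg : bg;
    }
}
```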
> I think you were asking about setting the fb_mode_info and such
> which DO belong to the driver. The reason for using the interface
> is that the di layer is helping the driver out by always keeping
> the most up-to-date copy. This way fb_get_* doesn't have to go
> down to the driver level. Otherwise you have to have a resource
Hm, maybe I understand.
> that is shared. All shared resources need mutual exclusion. What
> happens when the driver is updating the fb_display_info structure
> and another process queries the structure. If it is shared you
> could get invalid results. Then you could add a mutex so this
> doesn't happen, but someone will forget and now you've got problems.
But someone else can do it properly finegrained...
And when I thought about it during night, how this API address when
two displays are provided from one memory pool? Will di layer just
provide dummycon for VT which is incompatible with current hardware
configuration?
Petr
|
|
From: Sottek, M. J <mat...@in...> - 2001-12-04 17:15:08
|
>>The only reason to not allow mmap is that it is hard to force apps
>>to stop writing when they shouldn't.
>Why to stop them? If we do not provide full virtualization for them,
>we should not put any additional policy on their behavior.
I'm ok with leaving out policy, but the problem is that we have no
way to notify the client in a timely manner either. They would
have to poll the surface to look at the status.
So what should be done on a vt switch?
I think it may be ok to zap() someone's mmap during a vt switch but
otherwise make them work out their own policy. vt switches are
slow anyway.
I am unsure of how XFree handles this. Does the X server trap the
vt switch sequence, call leave_vt() then switch the vt?
>You need busaddress for this. Bus address != physical address
>!= virtual address, see PPC.
Just copying what was there for sysmem_start. Bus address is confusing
to platforms who don't have them just like physical address is
confusing to you.
>fd = socket(PF_UNIX, SOCK_DGRAM, 0);
>connect(fd, (struct sockaddr*)&(struct sockaddr_un){AF_UNIX, 20,
>"/dev/gfx/fb0/command"}, 128);
While I do like the features of this one better, it scares me
a little. I don't like having network behavior different than
local, it should be transparent, and the complexity/overhead
is a little higher. This could be bad for small
fb_set_cursor_pos type commands.
>> Could be done that way too but I see a couple issues. First a bitmap
>> has no priority, how do you know what the client really wanted? My
>> way the client can just poll until they get one they can live with.
>Highest one.
I still don't like the "bitmap" idea. It is just a matter of preference
and really doesn't change much functionally.
>No. Kernel driver should provide only one API, unless new one is not
>superset of old API.
I don't agree. The kernel needs to provide a consistent API. For
ioctls this means you really should not be changing them after they
become widely used. Since we are moving to a more advanced command
interface we have a little more flexibility. We can support consistent
interfaces while still accommodating design changes. Therefore the
kernel provides whatever API the client expects. An API that is
sufficiently unused due to age could be dropped.
This certainly doesn't give anyone license to abandon good interface
design. Kernel interfaces should change very rarely and really
should use a superset interface whenever possible. BUT, when you
just really need to change the behavior of a command this would
allow you to do so without breaking apps.
>Hardware can do that. We are not talking about RGB16->RGB32, but
>(mainly) about monochrome -> current fb type for painting
>characters on the screen. And couple of hardware can do YUV
><-> RGB on the fly.
I have to think about this. YUV<->RGB is probably going to need
some other interface that goes with overlays. Color expanding
blits are a valid thing that I missed.
I really don't like the idea of put() passing a format and expecting
the kernel to handle it. If we exposed some list of valid formats
in fb_surface_info(?) it might be acceptable.
>> that is shared. All shared resources need mutual exclusion. What
>> happens when the driver is updating the fb_display_info structure
>> and another process queries the structure. If it is shared you
>> could get invalid results. Then you could add a mutex so this
>> doesn't happen, but someone will forget and now you've got problems.
>But someone else can do it properly finegrained...
I agree, but the circumstances of when you call a difb*() are
when you need to set something out of the normal flow. Like
setting the display_info when you are in a set_mode_info()
function. It just doesn't happen on a regular basis, so the most
simple correct solution doesn't give up much (I'm sure it is
not even measurable).
>And when I thought about it during night, how this API address when
>two displays are provided from one memory pool? Will di layer just
>provide dummycon for VT which is incompatible with current hardware
>configuration?
I'm not sure what you mean about dummycon for VT?
One framebuffer which feeds two displays with their own timings means
the driver just has two fb_display_infos. Notice how all the
*display_info() functions have an index so you can change the
displays independently.
Two framebuffers with two (or more) displays has two fb contexts. Just
like now... you do the whole thing as two drivers, two privs and all.
One framebuffer with two displays with the same timings (a-la the i810
with TV or LCD) has to multiplex the two displays into one
fb_display_info, probably using device private flags to mark valid
heads. This behavior is just too different to wrap in a sane way, a
mode setting app will just have to be smart enough to allow hardware
specific extensions.
|
|
From: Michel <mic...@ii...> - 2001-12-04 17:29:04
|
On Tue, 2001-12-04 at 18:15, Sottek, Matthew J wrote:
> So what should be done on a vt switch?
> I think it may be ok to zap() someone's mmap during a vt switch but
> otherwise make them work out their own policy. vt switches are
> slow anyway.
>
> I am unsure of how XFree handles this. Does the X server trap the
> vt switch sequence, call leave_vt() then switch the vt?
More or less, yes. It has a semaphore to guard hardware access.
-- 
Earthling Michel Dänzer (MrCooper)/ Debian GNU/Linux (powerpc) developer
XFree86 and DRI project member / CS student, Free Software enthusiast
|
|
From: Geert U. <ge...@li...> - 2001-12-05 09:56:47
|
On Tue, 4 Dec 2001, Sottek, Matthew J wrote:
> >>The only reason to not allow mmap is that it is hard to force apps
> >>to stop writing when they shouldn't.
>
> >Why to stop them? If we do not provide full virtualization for them,
> >we should not put any additional policy on their behavior.
>
> I'm ok with leaving out policy, but the problem is that we have no
> way to notify the client in a timely manner either. They would
> have to poll the surface to look at the status.
>
> So what should be done on a vt switch?
> I think it may be ok to zap() someone's mmap during a vt switch but
> otherwise make them work out their own policy. vt switches are
> slow anyway.
That's why for 2.5.x we wanted to disable VT switching for a VT that has its
/dev/fb* opened by some application.
> I am unsure of how XFree handles this. Does the X server trap the
> vt switch sequence, call leave_vt() then switch the vt?
The X server indeed installs a VT switch handler, and releases access to the
hardware and does the VT switch.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@li...
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
|
|
From: Sven <lu...@dp...> - 2001-12-05 17:49:38
|
On Wed, Dec 05, 2001 at 10:55:55AM +0100, Geert Uytterhoeven wrote:
> On Tue, 4 Dec 2001, Sottek, Matthew J wrote:
> > >>The only reason to not allow mmap is that it is hard to force apps
> > >>to stop writing when they shouldn't.
> >
> > >Why to stop them? If we do not provide full virtualization for them,
> > >we should not put any additional policy on their behavior.
> >
> > I'm ok with leaving out policy, but the problem is that we have no
> > way to notify the client in a timely manner either. They would
> > have to poll the surface to look at the status.
> >
> > So what should be done on a vt switch?
> > I think it may be ok to zap() someone's mmap during a vt switch but
> > otherwise make them work out their own policy. vt switches are
> > slow anyway.
>
> That's why for 2.5.x we wanted to disable VT switching for a VT that has its
> /dev/fb* opened by some application.
Does that mean we can no longer switch away from X to the console?
Friendly,
Sven Luther
>
> > I am unsure of how XFree handles this. Does the X server trap the
> > vt switch sequence, call leave_vt() then switch the vt?
>
> The X server indeed installs a VT switch handler, and releases access to the
> hardware and does the VT switch.
|
|
From: Sven <lu...@dp...> - 2001-12-05 17:36:26
|
On Tue, Dec 04, 2001 at 09:15:01AM -0800, Sottek, Matthew J wrote:
>
> >>The only reason to not allow mmap is that it is hard to force apps
> >>to stop writing when they shouldn't.
>
> >Why to stop them? If we do not provide full virtualization for them,
> >we should not put any additional policy on their behavior.
>
> I'm ok with leaving out policy, but the problem is that we have no
> way to notify the client in a timely manner either. They would
> have to poll the surface to look at the status.
>
> So what should be done on a vt switch?
> I think it may be ok to zap() someone's mmap during a vt switch but
> otherwise make them work out their own policy. vt switches are
> slow anyway.
>
> I am unsure of how XFree handles this. Does the X server trap the
> vt switch sequence, call leave_vt() then switch the vt?
Well, X handles it in two ways:
1) X is not fbdev aware ...
=> it simply uses VESA to save the text screen status and content, and hopes
that fbdev saved its own data beforehand.
(note that the glint driver is not fbdev aware for boards other than the pm2
ones, but X cohabits nicely with pm3fb in this way)
2) X is fbdev aware, and using fbdevhw.
=> again, fbdev is supposed to save its stuff when leaving the vt.
When leaving X again, it is X's turn to save its stuff.
Some syncing is involved, which slows things down.
Friendly,
Sven Luther
|
|
From: Petr V. <VAN...@vc...> - 2001-12-04 17:47:56
|
On 4 Dec 01 at 9:15, Sottek, Matthew J wrote:
>
> I'm ok with leaving out policy, but the problem is that we have no
> way to notify the client in a timely manner either. They would
> have to poll the surface to look at the status.
>
> So what should be done on a vt switch?
The app uses all (or whichever it needs) of the VT_LOCKSWITCH/VT_UNLOCKSWITCH
ioctls + the VT_SETMODE ioctl (fields acqsig, relsig) if it needs direct
access to the fb.
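A minimal sketch of the VT_SETMODE part of this, assuming a Linux system with <linux/vt.h>. The helper names (make_process_mode, claim_vt_switching, allow_switch_away, acknowledge_switch_back) and the choice of SIGUSR1/SIGUSR2 are illustrative only and do not come from this thread; the ioctls and the struct vt_mode fields are the standard console ones.

```c
#include <linux/vt.h>
#include <signal.h>
#include <string.h>
#include <sys/ioctl.h>

/* Build a vt_mode request that asks the kernel to signal us on VT switches
 * instead of switching on its own. */
struct vt_mode make_process_mode(int relsig, int acqsig)
{
    struct vt_mode m;

    memset(&m, 0, sizeof m);
    m.mode = VT_PROCESS;        /* switches are negotiated with this process */
    m.relsig = (short)relsig;   /* sent when the user wants to switch away */
    m.acqsig = (short)acqsig;   /* sent when the VT is handed back to us */
    return m;
}

/* Install the mode on an open console fd; returns 0 on success, -1 on error. */
int claim_vt_switching(int tty_fd)
{
    struct vt_mode m = make_process_mode(SIGUSR1, SIGUSR2);

    return ioctl(tty_fd, VT_SETMODE, &m);
}

/* Call after the release signal, once all drawing has stopped. */
int allow_switch_away(int tty_fd)
{
    return ioctl(tty_fd, VT_RELDISP, 1);
}

/* Call after the acquire signal, before drawing again. */
int acknowledge_switch_back(int tty_fd)
{
    return ioctl(tty_fd, VT_RELDISP, VT_ACKACQ);
}
```

The signal handlers themselves are left out; they would typically only set a flag that the drawing loop checks before touching the framebuffer.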
> I think it may be ok to zap() someone's mmap during a vt switch but
> otherwise make them work out their own policy. vt switches are
> slow anyway.
Why do you think that VT switches are slow? And tearing down existing
mappings is not trivial.
> I am unsure of how XFree handles this. Does the X server trap the
> vt switch sequence, call leave_vt() then switch the vt?
LOCKSWITCH + SETMODE to deliver signals on console switches.
> >You need busaddress for this. Bus address != physical address
> >!= virtual address, see PPC.
>
> Just copying what was there for sysmem_start. Bus address is confusing
> to platforms who don't have them just like physical address is
> confusing to you.
You need physical address for mmap on /dev/kmem, while you need busaddress
for doing DMA transfers from other PCI busmasters to videoram.
> >fd = socket(PF_UNIX, SOCK_DGRAM, 0);
> >connect(fd, (struct sockaddr*)&(struct sockaddr_un){AF_UNIX, 20,
> >"/dev/gfx/fb0/command"}, 128);
>
> While I do like the features of this one better, It scares me
> a little. I don't like having network behavior different than
> local, it should be transparent, and the complexity/overhead
> is a little higher. This could be bad for small
> fb_set_cursor_pos type commands.
You can use UDP if you want the same behavior locally - but I do not
think that you'll ever get remote behavior the same as local. For remote
access you must do authentication, authorization, encryption and a couple
of other things you do not require on a local system, because someone
else already took care of auth*, and you do not need encryption because
nobody without root privileges can see what you are doing.
> >No. Kernel driver should provide only one API, unless new one is not
> >superset of old API.
>
> I don't agree. The kernel needs to provide a consistent API. For
> ioctls this means you really should not be changing them after they
> become widely used. Since we are moving to a more advanced command
It is userspace library business to do these conversions. Kernel should
support one, at most two for transition period, interfaces. If you are
upgrading kernel, upgrade your userspace too. If you are not going to
upgrade kernel - no problem, kernel API does not change.
> >(mainly) about monochrome -> current fb type for painting
> >characters on the screen. And couple of hardware can do YUV
> ><-> RGB on the fly.
>
> I have to think about this. YUV<->RGB is probably going to need
> some other interface that goes with overlays. Color expanding
> blits are a valid thing that I missed.
They are same from my (and from G400) viewpoint. I have byte array,
source color organization, and framebuffer color organization. And
either hardware can do that conversion, or it cannot.
> >two displays are provided from one memory pool? Will di layer just
> >provide dummycon for VT which is incompatible with current hardware
> >configuration?
>
> I'm not sure what you mean about dummycon for VT?
>
> Two framebuffers with two (or more) displays has two fb contexts. Just
> like now... you do the whole thing as two drivers, two privs and all.
You have two framebuffers, two displays, two fb contexts, and only one
memory they share. Now you do
VT #1 belongs to fb #0
VT #2 belongs to fb #0
VT #11 belongs to fb #1
Your videocard has 8MB total. Now you select 1024x512/32bpp on all
VTs, each uses 2MB of videoram. VT#1 and VT#11 are visible. Now you change
VT#1 size to 1024x1536/32bpp - consuming 6MB + 2MB. Now you switch to VT#2,
and resize #11 to 1024x1536/32bpp - this is again valid resolution, as
2MB + 6MB <= 8MB. But now, when you switch back to VT#1 - oops, VT#1's
resolution needs 6MB, VT#11 needs 6MB, total 12MB. Impossible...
So *_set_var, or whatever it is named in your API, fails, and the VT layer
must be able to cope with that. Temporarily moving the VT from fbcon to
dummycon, or refusing to switch VT, are two possible solutions.
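The arithmetic in this example can be checked with a small sketch. The function names mode_bytes and modes_fit are hypothetical, and the calculation assumes the pitch equals the width, which is not true on all hardware.

```c
#include <stdint.h>

#define VRAM_BYTES (8u * 1024u * 1024u)   /* the 8MB card from the example */

/* Bytes needed for one mode, assuming pitch == width (an assumption). */
uint32_t mode_bytes(uint32_t width, uint32_t height, uint32_t bpp)
{
    return width * height * (bpp / 8u);
}

/* Nonzero when two visible modes fit into the shared memory pool together. */
int modes_fit(uint32_t a_bytes, uint32_t b_bytes, uint32_t vram_bytes)
{
    return (uint64_t)a_bytes + b_bytes <= vram_bytes;
}
```

With these, 1024x512/32bpp comes to 2MB and 1024x1536/32bpp to 6MB, so either large mode fits next to a small one, but the two large modes together need 12MB and do not fit.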
Petr Vandrovec
van...@vc...
|
|
From: Sottek, M. J <mat...@in...> - 2001-12-04 19:12:53
|
>App uses all (or needed of) VT_LOCKSWITCH/VT_UNLOCKSWITCH ioctls +
>VT_SETMODE (fields acqsig, relsig) ioctl if it needs direct access to
>fb.
Ah, but we have an issue here. The X server is a trusted binary,
therefore we can rely on it to behave. Fb apps are not trusted
therefore we have to make sure they behave. Just like the DRI only
allows access when you have the lock, and you can't get the lock
when you are switched away.
Two apps on one VT is policy; two apps on different VTs are supposed
to behave like two apps on different computers. That isn't policy,
it is kernel enforced.
I am willing to admit that we may _have_ to fall back on the
"everyone please behave" method, but I would like to be sure we've
exhausted all other methods first.
>Why do you think that VT switches are slow? And tearing down existing
>mappings is not trivial.
A vt switch console<->XFree takes a couple seconds on some hw,
plus you may have to resync the display anyway. zap_page_range()
happens very fast in comparison, not trivial but shouldn't be
eliminated on grounds of slowness. Perhaps eliminated because
it is scary, but not slowness :)
>You need physical address for mmap on /dev/kmem, while you
>need busaddress for doing DMA transfers from other PCI
>busmasters to videoram.
Right, but we don't need to tell anyone where to mmap on /dev/kmem.
If they want to mmap they should use the fb. We only need to
provide them a way to do DMA... so that's a bus address for PPC
and a physical address on x86. We can just call it dma_address.
>You can use UDP if you want same behavior on local - but I do not
>think that you'll get remote behavior same as local anytime. For remote
>access you must do authentication, authorization, encryption and couple
>of other things you do not require on local system because of someone
>else already took care of auth*, and you do not need encryption because
>of nobody without root priviledges can see what you are doing.
This all depends on what Network Transparent means to you. Doing
read/write on a character file works over NFS and then you don't
need to worry about authentication. Network connections from an
untrusted machine need some other layer of authentication but that
should happen in some root daemon. Assuming that some filesystem
can be network exported you should be able to use the device from
another trusted machine, that's all I was trying to achieve.
>They are same from my (and from G400) viewpoint. I have byte array,
>source color organization, and framebuffer color organization. And
>either hardware can do that conversion, or it cannot.
Right, I need to think of something for this.
Your last part provides a good corner case that can be summed up
as follows:
>What happens when you are supporting two fb's with one memory space,
>and you get into a situation where two modes are valid on their
>own (or with a small "partner") but they don't fit at the same
>time.
I would say that the best thing to do depends on how the driver
wants to behave. I would use two private structures, where each
one holds the context # for the "other" context. Like:
struct my_priv {
    /* stuff... */
    u32 context;
    u32 other_context;
};
When you want to set a mode you have to see if it fits, including
the information about what the other context is using up. If you
can't fit the new mode you do one of two things. Either don't
set the mode and tell the user, or using the "other_context"
with difb*() change the other guy's mode so that you can now
fit. See how the context makes things nice? Just call
difb_set_mode(other_context...) and the other fb gets a call
from the di layer and has no idea that it really came from the
other context.
In the case where this happens without the user requesting it
(VT switching to the two "large" modes), you'll get a fb_set_mode()
from the di layer. The driver can either change the requested
mode and apply it, or change the "other_context" and set the
requested mode. Either way one VT is altered... can't make
them both happy.
The driver is supposed to return success at all costs when it
gets a fb_set_mode(), even if it has to alter the mode greatly.
The driver gets to decide what is best, if the user wants to
be sure, it should test the mode first.
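One way a driver might "alter the mode" so that the call still succeeds is sketched here. This is only an illustration: struct fb_mode_req and clamp_mode_to_memory are hypothetical names, the policy of shrinking only the height is an assumption, and pitch is again taken to equal width.

```c
#include <stdint.h>

/* Hypothetical mode request; not a structure from the proposed API. */
struct fb_mode_req {
    uint32_t width;
    uint32_t height;
    uint32_t bpp;
};

/* Reduce the requested height until the mode fits into free_bytes.
 * The width and depth are left alone in this sketch. */
struct fb_mode_req clamp_mode_to_memory(struct fb_mode_req req,
                                        uint32_t free_bytes)
{
    uint32_t line_bytes = req.width * (req.bpp / 8u);
    uint32_t max_rows = line_bytes ? free_bytes / line_bytes : 0;

    if (req.height > max_rows)
        req.height = max_rows;  /* altered mode, but the call can succeed */
    return req;
}
```

A real driver would more likely pick the nearest mode from its own mode table; the point is only that the caller gets back a usable mode rather than an error.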
-Matt
|
|
From: Petr V. <VAN...@vc...> - 2001-12-04 19:47:04
|
On 4 Dec 01 at 11:12, Sottek, Matthew J wrote:
> >App uses all (or needed of) VT_LOCKSWITCH/VT_UNLOCKSWITCH ioctls +
> >VT_SETMODE (fields acqsig, relsig) ioctl if it needs direct access to
> >fb.
>
> Ah, but we have an issue here. The X server is a trusted binary,
> therefore we can rely on it to behave. Fb apps are not trusted
> therefore we have to make sure they behave. Just like the DRI only
> allows access when you have the lock, and you can't get the lock
> when you are switched away.
Anything that can mess up my console must be trusted. More or less.
If the application does not have these rights, the kernel must not postpone
the VT switch until the app answers, so only unmapping & delivering a signal
(SIGBUS) to the app is possible.
> Two apps on one VT is policy, two apps of different VT's is supposed
> to behave like two apps on different computers. It isn't policy,
> it is kernel enforced.
It is impossible. VTs are virtual terminals, fully virtualized, while
there is only one underlying fb - so apps have to know about each
other, and have to know about VT concept, if they want to support it.
If they do not -> suid & LOCKSWITCH.
> >Why do you think that VT switches are slow? And tearing down existing
> >mappings is not trivial.
>
> A vt switch console<->XFree takes a couple seconds on some hw,
> plus you may have to resync the display anyway. zap_page_range()
> happens very fast in comparison, not trivial but shouldn't be
> eliminated on grounds of slowness. Perhaps eliminated because
> it is scary, but not slowness :)
It is bug in X, not anything else. It is true that XFree mga driver
reads whole 64KB EEPROM through i2c on startup - but it is just
silly driver. I can switch from fbtv to another console in less than
20ms (== sooner than TV field finishes).
> >You need physical address for mmap on /dev/kmem, while you
> >need busaddress for doing DMA transfers from other PCI
> >busmasters to videoram.
>
> Right, but we don't need to tell anyone where to mmap on /dev/kmem.
> If they want to mmap they should use the fb. We only need to
> provide them a way to do DMA... so that's a bus address for PPC
> and a physical address on x86. We can just call it dma_address.
It is bus address on x86. It is just happy coincidence that bus address ==
== physical address. It is bus address by definition...
> This all depends on what Network Transparent means to you. Doing
> read/write on a character file works over NFS and then you don't
> need to worry about authentication. Network connections from an
> untrusted machine need some other layer of authentication but that
> should happen in some root daemon. Assuming that some filesystem
> can be network exported you should be able to use the device from
> another trusted machine, that's all I was trying to achieve.
And who is listening on the other end of the wire if you use the NFS
read+lseek+write solution? How will you ensure that a change is propagated
through the wire, and not cached back? Besides that, no system I have
interconnected offers NFS for writing - it is far too dangerous without
hardcoded ARP tables.
> When you want to set a mode you have to see if it fits, including
> the information about what the other context is using up. If you
> can't fit the new mode you do one of two things. Either don't
It fits. Only VT level knows resolution of other VTs. And if fbdev
should know this itself - then do not change API at all... I see no
reason for change then...
> In the case where this happens without the user requesting it.
> (Vt switching to the two "large" modes) you'll get a fb_set_mode()
> from the di layer. The driver can either change the requested
> mode and apply it, or change the "other_context" and set the
> requested mode. Either way one VT is altered... can't make
> them both happy.
Driver will refuse to set mode you requested on console switch.
> The driver is supposed to return success at all costs when it
> gets a fb_set_mode(), even if it has to alter the mode greatly.
> The driver gets to decide what is best, if the user wants to
> be sure, it should test the mode first.
He tested it. It worked. Then he changed something else, and now
kernel refuses previously working mode. It is possible to split
memory on driver load for 24MB for fb0 & 8MB for fb1 - but why if
you can do it dynamically. At least I hoped that it will be possible
with new APIs.
And no, I cannot change mode greatly, current console will not survive
if you'll change vc_cols/vc_rows in console switch, besides that it
violates layering.
Best regards,
Petr Vandrovec
van...@vc...
|
|
From: Sottek, M. J <mat...@in...> - 2001-12-05 16:38:50
|
>That's why for 2.5.x we wanted to disable VT switching for a VT
>that has its /dev/fb* opened by some application.
Geert,
I agree that just disabling VT switching makes things very easy, but
you are eliminating behavior that I would say is very required. It
just isn't a good trade.
Lots of people run an X server on one virtual terminal and leave the
other terminals as consoles, or run two X servers at different depths.
You can't do this if you can't VT switch while an application is
running.
And what happens when an app locks up with the device open? Forget
VT switching.
And a fb based installer that needs kernel messages on another
terminal plus a console just in case.
And an embedded controller that runs a diagnostic app on a different
VT from the main display without shutting down the main display.
I can not see how this is a good idea.
>> I am unsure of how XFree handles this. Does the X server trap the
>> vt switch sequence, call leave_vt() then switch the vt?
>The X server indeed installs a VT switch handler, and releases
>access to the hardware and does the VT switch.
That's what I thought, but that doesn't work for our needs because
VT switch handlers are for trusted apps. The X server runs as root,
so do the old svgalib apps. The point is, that something with direct
hardware access needs to be trusted to leave the hardware in a known
safe state when leaving. That isn't what we have here.
We are eliminating the need for direct hardware access. We are
virtualizing the graphics device and providing safe interfaces for
user applications that are not trusted. If you want to trust user
apps then just run them as root and forget the kernel.
The best alternative I see at this time is to, on a VT switch, remove
any mappings that clients may have and install a zero page fault
handler that blocks applications that attempt to write until the
VT is switched back. This only impacts mmaped clients as the other
interfaces are easy to block. We could also have the client request
to be signaled so they can do something other than block, and are
notified on return so they can refresh... but the signal handling
has to be optional.
-Matt
|
|
From: Petr V. <VAN...@vc...> - 2001-12-05 17:54:50
|
On 5 Dec 01 at 8:38, Sottek, Matthew J wrote:
> And a fb based installer that needs kernel messages on another
> terminal plus a console just in case.
> And an embedded controller that runs a diagnostic app on a different
> VT from the main display without shutting down the main display.
>
> I can not see how this is a good idea.
Fb apps can close /dev/fb* when they receive switch info. But then
we are again where we were before - you must be trusted app to
catch VT switch.
> That's what I thought, but that doesn't work for our needs because
> VT switch handlers are for trusted apps. The X server runs as root,
> so do the old svgalib apps. The point is, that something with direct
> hardware access needs to be trusted to leave the hardware in a known
> safe state when leaving. That isn't what we have here.
No. fbdev app can leave hardware in any state. You can always run fbset
after it exits. Problem is when fbdev app running on background still
fills framebuffer with black color again and again - and I think
that it is user problem. Only thing we need is to execute
'fuser -k /dev/fb*' and 'chmod 600 /dev/fb*' on console logout, and
then each user can screw only itself.
> The best alternative I see at this time is to, on a VT switch, remove
> any mappings that clients may have and install a zero page fault
> handler that blocks applications that attempt to write until the
> VT is switched back. This only impacts mmaped clients as the other
> interfaces are easy to block. We could also have the client request
Such as busmaster clients? :-)))
> to be signaled so they can do something other than block, and are
> notified on return so they can refresh... but the signal handling
> has to be optional.
I think that replacing mapping with /dev/null one (and delivering
signal) is better - app has to query framebuffer state again after
it gets control back, and it also has to repaint screen, so an
additional mremap during these actions is not a big problem.
Only problem is that write(fd, framebuffer, sizeof(framebuffer))
may yield incorrect (black) result - but it is what you get if
you have shared resource, and you refuse to serialize accesses due to
security concerns.
I'd vote for suspending app when kernel is able to restore screen
contents, mapping and fb state after fb is available again, but as
it does not do that, there is no reason for suspend - you'll get invalid
data anyway, only difference is when write() finishes.
Petr Vandrovec
van...@vc...
|
|
From: Sottek, M. J <mat...@in...> - 2001-12-05 18:43:26
|
>No. fbdev app can leave hardware in any state.
Right, but the fbdev driver needs to save the full context between
VT's. The di portion saves all the mode info but you may have to
save a bunch of other context information if you are doing accel.
The dri works this way already but on a per client basis, we need
it on a per VT basis.
>I think that replacing mapping with /dev/null one (and delivering
>signal) is better
So you are saying, instead of blocking the app when it writes, just
map it over to /dev/null and allow the app to continue.
I like this better than blocking the app. Blocking can cause
unexpected behavior since almost no app is expecting to block on
a memory write.
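A userspace analogy of that idea, as a rough sketch only: an existing mapping is replaced in place with throwaway memory using MAP_FIXED, so writes keep succeeding but no longer reach the "framebuffer". The kernel would do this from the driver side, not the app; here private anonymous memory stands in for the "/dev/null mapping", an ordinary file descriptor stands in for the fb device, and the names map_fb, detach_fb and reattach_fb are made up for the illustration.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define FB_SIZE 4096

/* Map FB_SIZE bytes of fd as a shared, writable "framebuffer" view. */
unsigned char *map_fb(int fd)
{
    void *p = mmap(NULL, FB_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    return p == MAP_FAILED ? NULL : (unsigned char *)p;
}

/* Replace the view at addr with private anonymous memory. Writes through
 * addr still succeed afterwards but no longer reach the backing object. */
int detach_fb(unsigned char *addr)
{
    void *p = mmap(addr, FB_SIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    return p == (void *)addr ? 0 : -1;
}

/* Put the real mapping back at the same address. */
int reattach_fb(unsigned char *addr, int fd)
{
    void *p = mmap(addr, FB_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, 0);

    return p == (void *)addr ? 0 : -1;
}
```

Anything written while detached is simply lost when the real mapping comes back, which is why the app still has to repaint after the switch.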
-Matt
|
|
From: Petr V. <VAN...@vc...> - 2001-12-05 19:10:28
|
On 5 Dec 01 at 10:43, Sottek, Matthew J wrote:
> The dri works this way already but on a per client basis, we need
> it on a per VT basis.
All commands your fbdev API presents are not context sensitive. At least
it looks that way to me. After console switch all on-screen and off-screen
memory is lost.
Only thing we can do is that if app told us that it uses new semantic
(/dev/null mmap replacing, f.e.), after framebuffer returns back it
must confirm it with some action (mmaping of /dev/fb, or some fb API
action), and until it confirms it, decline to execute any of commands
submitted, as it is 100% sure that app did not restore its context yet.
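The confirm-before-commands rule could be tracked with a small per-client state machine. This is a sketch under assumptions: the state names and functions below are hypothetical and are not part of any existing fbdev interface.

```c
#include <stdbool.h>

/* Hypothetical per-client state kept by the driver. */
enum fb_client_state {
    FB_CLIENT_ACTIVE,       /* commands are executed */
    FB_CLIENT_DETACHED,     /* VT switched away, mapping replaced */
    FB_CLIENT_UNCONFIRMED   /* VT is back, app has not confirmed yet */
};

struct fb_client {
    enum fb_client_state state;
};

/* Console switched away from this client's VT. */
void fb_client_vt_leave(struct fb_client *c)
{
    c->state = FB_CLIENT_DETACHED;
}

/* Console switched back; the app must still confirm. */
void fb_client_vt_return(struct fb_client *c)
{
    if (c->state == FB_CLIENT_DETACHED)
        c->state = FB_CLIENT_UNCONFIRMED;
}

/* App confirmed (for example by mapping again or sending a command). */
void fb_client_confirm(struct fb_client *c)
{
    if (c->state == FB_CLIENT_UNCONFIRMED)
        c->state = FB_CLIENT_ACTIVE;
}

/* Commands are declined unless the client is fully active. */
bool fb_client_may_submit(const struct fb_client *c)
{
    return c->state == FB_CLIENT_ACTIVE;
}
```

A confirmation sent while the VT is still switched away is ignored here, so a client cannot re-enable itself early.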
> >I think that replacing mapping with /dev/null one (and delivering
> >signal) is better
>
> So you are saying, instead of blocking the app when it writes, just
> map it over to /dev/null and allow the app the continue.
Yes.
Petr Vandrovec
van...@vc...
|
|
From: Sottek, M. J <mat...@in...> - 2001-12-05 19:36:44
|
>> The dri works this way already but on a per client basis, we need
>> it on a per VT basis.
>All commands your fbdev API presents are not context sensitive.
>At least it looks that way to me. After console switch all
>on-screen and off-screen memory is lost.
I wasn't really talking about my API, but rather _any_ future API.
Specifying context isn't necessary as it is with the dri because
we only save state on a VT basis. The di knows which VT you are on
and tells the driver to restore the correct settings. If we are only
talking about mode/display settings the di layers of both my API and
all current API's already send enough information to accomplish this
task. If we are talking about advanced acceleration then another VT
private data structure needs to be sent as well.
On screen memory isn't required to be restored in most
virtualizations, as long as the "rules" include that the application
may be required to refresh the entire screen at any time. Offscreen
memory can either obey the same "rules" or be allocated to specific
VT's. It is easiest to say that all memory on or off screen has to be
refreshed if it is "dirty". My API allows the client app to detect
that the surfaces are dirty and the client would then refresh them.
We would need to add in the signals so that the app doesn't have to
poll the status.
>Only thing we can do is that if app told us that it uses new semantic
>(/dev/null mmap replacing, f.e.), after framebuffer returns back it
>must confirm it with some action (mmaping of /dev/fb, or some fb API
>action), and until it confirms it, decline to execute any of commands
>submitted, as it is 100% sure that app did not restore its context yet.
Good idea, once the app gets the signal it should send a command
down to tell the driver that it has populated all of the "content".
Note the difference between "Context" and "Content". The driver can
restore the context. That includes the mode/display settings, the
size/location of command buffers, and the cursor settings (and if you
want to include advanced acceleration in the mix, it could include the
size/location of textures, video surfaces, etc).
The problem is that while we can restore all of the "Context", the
"Content" is something the driver would not have. "Content" is the
contents of the framebuffer, cursor, and offscreen textures or scratch
memory. The app has to put that all back.
|