From: Antonino D. <ad...@po...> - 2002-08-08 21:51:18
On Fri, 2002-08-09 at 02:51, Petr Vandrovec wrote:
>
> Message from Antonio Daplas
> (http://www.geocrawler.com/lists/3/SourceForge/9276/0/9249087/)
> says:
> 2.5 old (with offscreen buffers) 10.708
> 2.5 new 4.378
> 2.4 2.098
>
> His first message
> (http://www.geocrawler.com/lists/3/SourceForge/9276/25/9237029/)
> listed 13.586 for old 2.5 code.
>
> So you are right, old code was not 1000% slowdown, only 500%. But main
> problem is not speed of old code, but speed of new code. And if numbers
> are right, new code is still 100% slower than 2.4.x code was.
> Petr Vandrovec

The numbers are correct. However, I'm only talking about software
drawing here. With a few more optimizations to the code, the scroll
time was further cut down to 3.780s. Also, 16bpp and 32bpp are now
faster in 2.5 than in 2.4, although 24bpp is still a bit slower because
of its weird alignment.

However, ALL hardware accelerated code is much faster than the old one,
and will be much, much faster if hardware sync on demand is
implemented. (I really want this, James :)

The extra processing of the font bitmap in putcs() outweighs the
benefit of "bulk" writing the data in 8bpp, but becomes insignificant
as we go to higher color depths, or as we take advantage of hardware
acceleration.

I'm attaching diffs for cfbimgblt.c, cfbfillrect.c, cfbcopyarea.c and
fbcon-accel.c. This is against vanilla 2.5.27.

fbcon-accel.c: process 4 characters at a time, if possible, to squeeze
out a few more CPU cycles.

cfbimgblt.c: divided into fast_imageblit (for 8, 16, 32 bpp),
slow_imageblit (24 bpp) and bitwise_imageblit (default).
slow_imageblit involves packing 4 pixels (or 8 if we have color
depths > 32), which are written as double words (1 - 8bpp, 2 - 16bpp,
3 - 24bpp).

cfbcopyarea.c: uses fast_memmove and fb_memmove for 24 bpp. Is there
anything wrong with these fb string functions? I don't seem to see any
performance degradation from using them.

cfbfillrect.c: similar concept to slow_imageblit. It packs 4 pixels at
24 bpp, which are written as 3 double words to the framebuffer.

Also, is double-word access alignment a strict or an optional
requirement?

Any comments?

Tony
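
To illustrate the 24 bpp packing mentioned above (4 pixels written as
3 double words), here is a minimal sketch. It is not the code from the
attached diffs. The names pack24x4 and fill24 are hypothetical, and the
sketch assumes a little-endian framebuffer, a 4-byte-aligned
destination, and a run length that is a multiple of 4 pixels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the 3 double words that hold 4 copies of
 * one 24-bit pixel. Assumes little-endian byte order, so the 12 bytes
 * in memory are c0 c1 c2 repeated four times, where c0 is the low
 * byte of the color.
 */
static void pack24x4(uint32_t color, uint32_t out[3])
{
    color &= 0x00ffffffu;                  /* keep only the 24 color bits */
    out[0] = color | (color << 24);        /* bytes: c0 c1 c2 c0 */
    out[1] = (color >> 8) | (color << 16); /* bytes: c1 c2 c0 c1 */
    out[2] = (color >> 16) | (color << 8); /* bytes: c2 c0 c1 c2 */
}

/*
 * Hypothetical helper: fill `groups` runs of 4 pixels starting at dst.
 * Assumes dst is 4-byte aligned; unaligned heads and tails would need
 * separate byte-wise handling, which is omitted here.
 */
static void fill24(uint32_t *dst, size_t groups, uint32_t color)
{
    uint32_t w[3];

    assert(dst != NULL);
    pack24x4(color, w);
    while (groups--) {
        *dst++ = w[0];
        *dst++ = w[1];
        *dst++ = w[2];
    }
}
```

The pattern repeats every 12 bytes, so the three words are computed
once per fill and the inner loop is just three aligned stores per
group of 4 pixels.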