From: <gb...@di...> - 2000-09-18 22:38:54
Hi,

[Direct Addressing]
> > Implemented. And indeed, what a nice piece of improvement: near twice
> > the speed!

In the event that a temporary frame buffer is needed in DGA mode, the
user will notice that this mode will be up to 50% slower than windowed
mode. Despite all my efforts, I could only get direct addressing + DGA +
temporary frame buffer to within 3% (slower) of memory banks + DGA with
no temporary frame buffer (the original B2 without my patches), whereas
windowed mode with direct addressing is 50-60% faster than without
direct addressing.

I told you earlier that I would try direct addressing for Solaris. I
tried and it failed :-/ Actually, under Solaris, the only reasonable
mappable region is above 0xE0000000. That's not enough to "triple
allocate" the Mac address space. By "triple allocation" I mean
allocating RAM, ROM and VID so that the gaps between them in the host
address space are the same as those in the Mac address space. In fact,
triple allocation could come down to "double allocation" since the ROM
is relocatable and can be placed, for example, next to the RAM region.

A solution for Solaris and other systems that don't allow such a mapping
is to use a table of offsets. That table is only 16 KB long if I take
1 MB pages and a host system with 32-bit addressing (4096 entries of
4 bytes each).

I would like to commit the patches but be advised that:

- I altered main.cpp so that the RAM and ROM base addresses are known
  before invocation of VideoInit(). This is enabled only in emulated_68k
  mode. That code comes from uae_cpu/basilisk_glue.cpp as you suggested.

- I added a few lines in rom_patches and rsrc_patches for the ScratchMem
  hook in direct addressing. This works great, thanks Christian for that
  trick!

- I have an extra file (video_vosf.cpp) for Video on SEGV signals that I
  will probably split into two files: video_vosf.h and blitters.h. Those
  files are only meant to be included from video_x.cpp.

- I have specialised VideoRefresh functions to handle different sorts of
  display updates.
  I also had a genvideo module that would generate specialised C
  handlers according to depth, xshm presence, static/dynamic updates of
  the screen, etc. I removed the genvideo thing because the benefits
  were not that impressive.

There are more and more #defines :-/
- enable_vosf for Video on SEGV signals
- direct_addressing as its name indicates
- maybe direct_addressing_lt (lite) for the hack with a table of offsets
- have_siginfo_t
- have_sigcontext_subterfuge

The last two macros are defined in acconfig.h and set at configure time
for the config.h file. Besides, those tests are quite huge: ~45 lines
each in configure.in...

> > 4. Reversed Address Space
> > [...]
> > Note however that I didn't checked all the patches for "correctness"
> > yet, especially audio that makes strange sounds.
>
> It may not be obvious, but part of the intention behind the Mac2Host_memcpy()
> etc. functions in cpu_emulation.h was to allow for a reversed address space.
> It's not used consequently in all modules, however.

The reason why I chose the prepare, reverse, reverse-back approach is
that the reversal can take place without any extra memory. I reckon that
the Mac2Host_* functions are cleaner. Maybe the next step would be to
use them in all places that call Mac2HostAddr() and assume a forward
address space.

The point with direct addressing under Linux/i386 is that, in some
conditions, it could turn into real addressing. Then the JIT compiler
would only have to generate loads/stores without any additional offset...
I am pretty sure that this is where the major speed increase will come
from. Even more so if no byteswapping at all is needed.

Bye.

-- 
Gwenolé Beauchesne
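
P.S. To make the "table of offsets" idea for Solaris a bit more concrete,
here is a minimal sketch. It is not code from the patches: the names
OffsetTable, map() and to_host() are hypothetical, and it only assumes
1 MB pages and a 32-bit Mac address space as described above. Each table
entry holds the value to add to a Mac address to obtain the host address
for that page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; this is an illustration, not Basilisk II code.
constexpr unsigned PAGE_BITS = 20;  // 1 MB pages
constexpr std::size_t NUM_PAGES = std::size_t(1) << (32 - PAGE_BITS);  // 4096

class OffsetTable {
public:
    // All pages start with a zero offset; this sketch does not detect
    // accesses to unmapped pages.
    OffsetTable() : offsets_(NUM_PAGES, 0) {}

    // Map the Mac range [mac_base, mac_base + size) onto host memory that
    // starts at host_base. mac_base must be aligned on a page boundary and
    // the range must not wrap past the end of the 32-bit address space.
    void map(uint32_t mac_base, uint8_t *host_base, uint32_t size) {
        assert((mac_base & ((1u << PAGE_BITS) - 1)) == 0);
        assert(size > 0);
        // Unsigned arithmetic wraps, so the offset is valid even when the
        // host region lies below the Mac address.
        const uintptr_t offset =
            reinterpret_cast<uintptr_t>(host_base) - mac_base;
        const uint32_t first = mac_base >> PAGE_BITS;
        const uint32_t last = (mac_base + (size - 1)) >> PAGE_BITS;
        for (uint32_t page = first; page <= last; ++page)
            offsets_[page] = offset;
    }

    // Translate a Mac address: one table lookup and one addition.
    uint8_t *to_host(uint32_t mac_addr) const {
        return reinterpret_cast<uint8_t *>(
            offsets_[mac_addr >> PAGE_BITS] + mac_addr);
    }

private:
    std::vector<uintptr_t> offsets_;  // one entry per 1 MB page
};
```

With 4-byte entries (32-bit host) the table takes 4096 * 4 = 16 KB; on a
64-bit host the same layout would take 32 KB. The cost compared to plain
direct addressing is one extra load per memory access.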