From: Christian B. <cb...@st...> - 2001-03-18 16:06:26
Hi!

On Sat, Feb 24, 2001 at 10:06:50AM +0100, Gwenole Beauchesne wrote:
> - Teach Linux/NetDriver to work^Wcompile with newer kernels (2.4+)

This is fixed now.

> - Clean up siginfo subterfuges in configure by possibly sharing a common
>   header between configure programs and B2 ?

Desirable, but not release critical.

> - Fix the DGA2 bug

I'll try to have a look at this.

> - Classic Emulation. I don't think it is possible to do Classic
>   emulation in direct addressing mode since we are always keeping the
>   whole 32 bits from the address.

Putting in a note that the program had to be recompiled to enable Classic emulation would be fine for me. It's not used by that many people.

> I don't really want to have to build multiple binaries for B2.
> i.e. { DGA1, DGA2 } x { Classic, Quadra } emulation is starting to be
> big in both compilation time and space.

Maybe we should use loadable CPU emulation plugins?

> The following is just convenience for me so that I may not have to trash
> compiler.{h,cpp} and fpu files whenever I update my JIT tree ;-)
> - Remove uae_cpu/compiler.{h,cpp}
> - Move fpp.cpp, fpu_x86* into uae_cpu/fpu/

Just do it. :-)

Bye,
Christian

--
/ Coding on PowerPC and proud of it
\/ http://www.uni-mainz.de/~bauec002/

From: Christian B. <cb...@st...> - 2001-03-12 22:06:31
Hi!

A Web interface to the B2 CVS repository based on ViewCVS is now available at:

  http://down.physik.uni-mainz.de/cgi-bin/viewcvs.cgi/BasiliskII/?cvsroot=cebix

Using ViewCVS instead of CVSWeb should make it more robust, and it also adds nice features like syntax coloring. If nobody reports problems with the ViewCVS interface I will remove the CVSWeb-based interface soon, so update your bookmarks.

Bye,
Christian

--
/ Coding on PowerPC and proud of it
\/ http://www.uni-mainz.de/~bauec002/

From: Christian B. <cb...@st...> - 2001-03-06 18:46:49
Hi!

On Thu, Mar 01, 2001 at 02:14:21PM -0800, Brian J. Johnson wrote:
> The subjective performance increase was very noticeable as well.

Same here.

> Which raises another question: why isn't one-bit "color"
> supported in Quadra (as opposed to Classic) mode?)

Because this would require 2 and 4 bpp modes as well (MacOS needs contiguous bit depths) which in turn would require more blitting cases.

(I'm somewhat inclined to dump much of the code in video_x.cpp, add 1/2/4 bpp support to SDL and do all video output via SDL. If we're going to implement run-time depth switching we would need some kind of "blit any color depth to any other color depth" feature anyway.)

> In addition, I tried adding an XSync call at the end of every video
> frame

This seems to fix the annoying mouse pointer flicker as well.

> I'd imagine that minimal updates (the existing BII code) would give
> better performance for non-local displays

I'm not so much concerned about this. The VOSF itself should avoid most unnecessary image transfers.

I've added your patches to the CVS without any #ifdef's. Nice work.

Bye,
Christian

--
/ Coding on PowerPC and proud of it
\/ http://www.uni-mainz.de/~bauec002/

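To make the "more blitting cases" point concrete, here is a minimal sketch of one such case: expanding a row of packed 1, 2 or 4 bpp pixel indices to one byte per pixel. The function name `expand_to_8bpp` is hypothetical and does not come from video_x.cpp, and the sketch assumes the leftmost pixel sits in the most significant bits of each byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Basilisk II: expand a row of packed
// 1, 2 or 4 bpp pixel indices into one byte per pixel.
// Assumption: the leftmost pixel is stored in the most significant bits.
std::vector<uint8_t> expand_to_8bpp(const uint8_t *src, size_t pixels, int depth)
{
    std::vector<uint8_t> dst(pixels);
    const int per_byte = 8 / depth;                 // pixels packed into one byte
    const uint8_t mask = static_cast<uint8_t>((1u << depth) - 1);
    for (size_t i = 0; i < pixels; i++) {
        const uint8_t byte = src[i / per_byte];
        const int slot = static_cast<int>(i % per_byte);
        const int shift = 8 - depth * (slot + 1);   // MSB-first ordering
        dst[i] = static_cast<uint8_t>((byte >> shift) & mask);
    }
    return dst;
}
```

A generic "any depth to any depth" blitter would need one such unpacking step per source depth plus a conversion per destination format, which is where the number of cases grows.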
From: Brian J. J. <bjj...@ya...> - 2001-03-01 22:12:57
Folks,

When looking over the window-mode VOSF screen update code, it struck me that maintaining the_buffer_copy and calculating the minimal rectangle to update was a rather expensive operation. It involves scanning many kilobytes of memory on every screen update, from both the_buffer and the_buffer_copy... if we're going to scan the entire page anyway, why not just blit the whole thing to the screen (after all, DGA and XShm blits are essentially memory copy operations), and bypass all the overhead of maintaining the_buffer_copy?

So I ran some experiments.

The machine: my dual-processor SGI Octane, IRIX 6.5.11. BII compiled with SGI's MIPSPro compilers with "-Ofast" optimization (massive interprocedural analysis), output to the local display (DISPLAY set to ":0", which should allow XShm)

BII version: CVS current as of around January 2000

BII video mode: 30 Hz, 800x600, window video (DGA doesn't work on IRIX. I should really write an OpenGL BII video driver, since OpenGL is the fast path to the screen under IRIX. One of these decades....) Bit depths as described below.

The test: boot MacOS 7.6.1, start Speedometer, run 3 iters. of video test, save, quit and shut down. Speedometer doesn't do 24bpp, so the test for that depth was: boot 7.6.1, start the game Continuum (a real video hog), play 1 level running into the far wall.

I applied the following patch, which bypasses the the_buffer_copy code, assuming that all bytes in pages flagged by VOSF are modified:

--- video_vosf.h 2001/02/10 15:29:01 1.14
+++ video_vosf.h 2001/03/01 03:17:34
@@ -234,6 +234,7 @@
     int x1, x2, width;
     if (depth == 1) {
+#ifdef MIMIMAL_VOSF_REDRAWS
       x1 = VideoMonitor.x - 1;
       for (j = y1; j <= y2; j++) {
         uint8 * const p1 = &the_buffer[j * bytes_per_row];
@@ -245,7 +246,11 @@
           }
         }
       }
+#else
+      x1 = 0;
+#endif
+#ifdef MIMIMAL_VOSF_REDRAWS
       x2 = x1;
       for (j = y2; j >= y1; j--) {
         uint8 * const p1 = &the_buffer[j * bytes_per_row];
@@ -257,18 +262,24 @@
           }
         }
       }
+#else
+      x2 = (((VideoMonitor.x>>3) - 1) << 3) + 7;
+#endif
       width = x2 - x1 + 1;
       // Update the_host_buffer and copy of the_buffer
       i = y1 * bytes_per_row + (x1 >> 3);
       for (j = y1; j <= y2; j++) {
         Screen_blit(the_host_buffer + i, the_buffer + i, width >> 3);
+#ifdef MIMIMAL_VOSF_REDRAWS
         memcpy(the_buffer_copy + i, the_buffer + i, width >> 3);
+#endif
         i += bytes_per_row;
       }
     } else {
+#ifdef MIMIMAL_VOSF_REDRAWS
       x1 = VideoMonitor.x * bytes_per_pixel - 1;
       for (j = y1; j <= y2; j++) {
         uint8 * const p1 = &the_buffer[j * bytes_per_row];
@@ -280,8 +291,12 @@
           }
         }
       }
+#else
+      x1 = 0;
+#endif
       x1 /= bytes_per_pixel;
+#ifdef MIMIMAL_VOSF_REDRAWS
       x2 = x1 * bytes_per_pixel;
       for (j = y2; j >= y1; j--) {
         uint8 * const p1 = &the_buffer[j * bytes_per_row];
@@ -293,6 +308,9 @@
           }
         }
       }
+#else
+      x2 = VideoMonitor.x * bytes_per_pixel - 1;
+#endif
       x2 /= bytes_per_pixel;
       width = x2 - x1 + 1;
@@ -300,7 +318,9 @@
       i = y1 * bytes_per_row + x1 * bytes_per_pixel;
       for (j = y1; j <= y2; j++) {
         Screen_blit(the_host_buffer + i, the_buffer + i, bytes_per_pixel * width);
+#ifdef MIMIMAL_VOSF_REDRAWS
         memcpy(the_buffer_copy + i, the_buffer + i, bytes_per_pixel * width);
+#endif
         i += bytes_per_row;
       }
     }

I used SGI's Speedshop profiler to measure the time the video update thread spent in the routine update_display_window_vosf and its children (I began the Speedshop experiment with the MacOS boot, after dismissing the BII configuration GUI.) In the tables below, "perf" is the result of 3 iterations of Speedometer's video test (higher is better), %self is the percent of the video update thread's time spent in update_display_window_vosf, and %incl is the percent of its time spent in update_display_window_vosf and its children. On all runs, the great majority of the update thread's time was spent in nanosleep(), as it should be. I can send full profile output if anyone's interested.

Command line:

  setenv _SPEEDSHOP_INIT_DEFERRED_SIG 17 ; setenv _SPEEDSHOP_DEBUG_NO_SIG_TRAPS ; ssrun -totaltime ./BasiliskII

Vanilla BasiliskII (MINIMAL_VOSF_REDRAWS turned on):

          perf   %incl   %self
   8bpp   .46    11.8     8.4    m48592 (288 samples)
  15bpp   .40    13.0    12.1    m48442 (745 samples. More time blitting)
  15bpp   .45     5.5     5.0    m50604 (289 samples. 20 sec. shorter)
  24bpp    -     29.9    28.0    m49782 (1030 samples)

With MINIMAL_VOSF_REDRAWS turned off:

          perf   %incl   %self
   8bpp   .52    12.4     0.0    m48456 (Only 34 samples. Not much time spent in the video code!)
  15bpp   .53     4.1     0.0    m49772 (145 samples)
  24bpp    -      3.2     0.0    m51065 (84 samples)

So at 8bpp, Speedometer measured a performance increase of 13%, and at 15bpp, an increase of 33% or 18%, depending on the run.

On all settings I got occasional freezes, although they seemed less prevalent with the modified code and the lower bit depths. They probably affected the measurements, e.g. the difference in the two 15bpp vanilla runs. (The hangs look like an IRIX pthreads bug: the video thread gets stuck in nanosleep(). SGI's internal bug database suggests that there's an unwholesome interaction among pthreads, signals, and nanosleep, and that pthreads_sv_timedwait can be used instead of nanosleep as a workaround. I'll have to give that a try.)

The subjective performance increase was very noticeable as well. With my modifications, BII "felt" like a real Mac. Continuum was actually playable in 8bpp mode with MINIMAL_VOSF_REDRAWS turned off, unlike in any other mode! (It was still slightly sluggish, but then, it's slightly sluggish on a real '030 Mac in anything but black-and-white mode. Which raises another question: why isn't one-bit "color" supported in Quadra (as opposed to Classic) mode?)

In addition, I tried adding an XSync call at the end of every video frame, to make sure that the X server keeps caught up with all the data BII is throwing at it. This made a tremendous improvement in a game I ported to an HP Bobcat (68020!) workstation ages ago, so I thought it might help BII:

diff -u -r1.36 video_x.cpp
--- video_x.cpp 2001/01/28 14:05:19 1.36
+++ video_x.cpp 2001/03/01 03:17:34
@@ -2053,6 +2053,9 @@
         LOCK_VOSF;
         update_display_window_vosf();
         UNLOCK_VOSF;
+#ifndef NO_FRAME_SYNC
+        XSync(x_display, false); // Let the server catch up
+#endif
       }
     }
   }

This patch noticeably improved the smoothness of the video, and also seemed to reduce the hangs I was seeing. In fact, the _only_ way I've been able to run BII successfully on my SGI O2 workstation (as opposed to the Octane) is with the XSync call, and DISPLAY set to localhost:0 (which presumably defeats XShm.) And it runs quite well in that configuration.

I'd imagine that minimal updates (the existing BII code) would give better performance for non-local displays, where the cost of shipping the pixmaps to the server is much greater, so perhaps MINIMAL_VOSF_REDRAWS should be a prefs item instead of a compile-time option. Ideas?

I'd be very interested in hearing how these patches affect BII performance on other platforms, especially those with DGA (similar hacks would need to be made to update_display_dga_vosf(), of course.)

Thanks,

=====
Brian J. Johnson

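The trade-off discussed in this message can be sketched in isolation. The following is a simplified illustration, not the code from video_vosf.h: it works on whole byte columns rather than pixels, and the names `ByteRange`, `minimal_range`, `full_range` and `blit_rows` are hypothetical. One strategy scans the frame buffer against its copy to find the smallest changed column range, the other simply declares the full row width dirty.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Inclusive byte-column range; first > last means "nothing changed".
struct ByteRange { int first; int last; };

// Strategy 1 (minimal redraws): compare the buffer with its copy and find
// the leftmost and rightmost changed byte columns within rows y1..y2.
ByteRange minimal_range(const uint8_t *buf, const uint8_t *copy,
                        int bytes_per_row, int y1, int y2)
{
    int first = bytes_per_row;   // leftmost changed column found so far
    int last = -1;               // rightmost changed column found so far
    for (int y = y1; y <= y2; y++) {
        const uint8_t *p1 = buf + static_cast<size_t>(y) * bytes_per_row;
        const uint8_t *p2 = copy + static_cast<size_t>(y) * bytes_per_row;
        for (int x = 0; x < first; x++)
            if (p1[x] != p2[x]) { first = x; break; }
        for (int x = bytes_per_row - 1; x > last; x--)
            if (p1[x] != p2[x]) { last = x; break; }
    }
    return ByteRange{first, last};
}

// Strategy 2 (full redraws): skip the scan and assume the whole row changed.
ByteRange full_range(int bytes_per_row)
{
    return ByteRange{0, bytes_per_row - 1};
}

// Copy the chosen column range of rows y1..y2 to the host buffer and
// return the number of bytes transferred.
size_t blit_rows(uint8_t *host, const uint8_t *buf, int bytes_per_row,
                 int y1, int y2, ByteRange r)
{
    if (r.first > r.last)
        return 0;
    const size_t width = static_cast<size_t>(r.last - r.first + 1);
    size_t copied = 0;
    for (int y = y1; y <= y2; y++) {
        const size_t off = static_cast<size_t>(y) * bytes_per_row + r.first;
        std::memcpy(host + off, buf + off, width);
        copied += width;
    }
    return copied;
}
```

The minimal strategy transfers fewer bytes but reads both buffers first (and must keep the copy up to date afterwards); the full strategy transfers more but does no scanning, which is roughly the balance the profile figures above are measuring.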
From: <gb...@di...> - 2001-02-24 09:01:20
Hi,

There are a few things I think are worth fixing before releasing B2 0.9. IMHO, the following are not new features but bugfixes:

- Teach Linux/NetDriver to work^Wcompile with newer kernels (2.4+)

- Clean up siginfo subterfuges in configure by possibly sharing a common header between configure programs and B2 ?

- Fix the DGA2 bug

- Classic Emulation. I don't think it is possible to do Classic emulation in direct addressing mode since we are always keeping the whole 32 bits from the address.

I don't really want to have to build multiple binaries for B2. i.e. { DGA1, DGA2 } x { Classic, Quadra } emulation is starting to be big in both compilation time and space. So, we should find a decent solution.

The following is just convenience for me so that I may not have to trash compiler.{h,cpp} and fpu files whenever I update my JIT tree ;-)
- Remove uae_cpu/compiler.{h,cpp}
- Move fpp.cpp, fpu_x86* into uae_cpu/fpu/

Since no features must be added for that release, I won't commit the new fpu core ("ieee") from my tree.

I would have appreciated support for FreeBSD 4.1 in direct addressing mode. Unfortunately, it is quite weird there: SIGBUS is used instead of SIGSEGV and, most importantly, the test program that used to work alone will lock inside the configure script... I will try to have a closer look at it this weekend.

From: Christian B. <cb...@st...> - 2001-02-17 16:56:02
Hi!

On Fri, Feb 16, 2001 at 12:58:20PM +0100, Gwenole Beauchesne wrote:
> Christian, I hope you don't mind if I release it as version 0.9.

I've just released a new snapshot bearing 0.9 as well. :-)

As such, the thing is now feature frozen for the 0.9 release.

Bye,
Christian

--
/ Coding on PowerPC and proud of it
\/ http://www.uni-mainz.de/~bauec002/

From: <gb...@di...> - 2001-02-17 08:07:09
Hi,

I thought XFree 4.X servers could support old DGA 1.X applications. Unfortunately, that doesn't appear to be the case on my system (Debian 2.2, XF4.0.1, glibc 2.1). i.e. XF86DGAQueryDirectVideo() will not return the proper XF86DGADirectPresent bit, and therefore B2 would run in (small) windowed mode.

Does anyone know a fix for that instead of adding DGA2-specific support functions ?

From: <gb...@di...> - 2001-02-16 11:51:17
Hi,

Christian, I hope you don't mind if I release it as version 0.9. It is just a side effect from synchronizing with the latest CVS sources ;-)

What's new

* Windows port
  - First port to Windows (experimental)
  - Fixed 24 bpp screen blitter (Windows seems to use packed BGR888)
  - Started synchronization with B2 0.9 [*]

* 680x0 core and JIT compiler
  - Fixed some extended-precision floating-point computations
  - Fixed scrollbar bug in MacOS. It was due to an incorrect code generator for FMOVEM instructions
  - Fixed stupid bug introduced earlier this year in CINV and CPUSH instructions. The translation cache was simply not flushed as requested. BBEdit Lite 4.1 crashes are solved and probably others...
  - Added emulation of CMOV instructions for processors that don't support them (e.g. AMD K6-2). The "jitwantcmov" prefs item is now deprecated since the code generator uses native CMOV instructions if possible, and emulates them otherwise.

[*] The common sources are now really common between the Windows port and the standard distribution. I probably broke one or two features from the initial Windows port, though (keyboard type, "get hardware volume"). I also introduced two other system-dependencies (WIN32) in emul_op.cpp and extfs.cpp

Actually, I am saying that it may now be possible to reintegrate the Windows port back into CVS, if Lauri wishes it. It should be just a matter of importing the Windows directory and patching the Makefile so that it uses the files from the "old" (interpretive) cpu core. Note that I don't provide any sources from the rest of the Windows distribution, i.e. drivers, GUI tool.

BTW, could someone please patch the GUI prefs editor for Windows so that it takes into account the JIT-specific prefs items ? GCC/MinGW doesn't support MFC or it might rely on really experimental foreign {DLL,LIB} imports.

Bye,
Gwenolé.

From: Christian B. <cb...@st...> - 2001-02-11 17:49:27
Hi!

On Sat, Feb 10, 2001 at 01:07:34PM +0100, Gwenole Beauchesne wrote:
> That was too easy for you to find it out. ;-) Did I miss an Inside Mac ?

It's the result of countless hours spent with a disassembler and a Mac ROM. :-)

There was only once an article in Apple's "develop" magazine that mentioned UniversalInfoPtr:

  http://www.mactech.com/articles/develop/issue_18/108-113_Puzzle_Page.html

Nothing much helpful there, though.

> Subsidiary question: why did the problem show up only when B2 was run as
> root ?

I don't know. Maybe B2 was able to initialize more (sound, CD-ROM, SCSI) because it had more permissions?

Bye,
Christian

--
/ Coding on PowerPC and proud of it
\/ http://www.uni-mainz.de/~bauec002/

From: <gb...@di...> - 2001-02-10 12:02:58
Hi,

> Direct addressing requires VOSF and therefore the configure script
> checks if the system is capable of that.

Initially, it was possible to do without VOSF but it would have required the same amount of work (blitters, arrangements to the static screen updates, to the DGA code). Actually, it is faster to go with VOSF because we have a chance to know exactly which pages of the framebuffer get modified.

> * TECH file: I think it is worth mentioning those
> - direct addressing
> - real addressing now works under Linux/i386 provided that the
> above-mentioned requirements are met.

Done.

> - VOSF

I don't know where to insert that one. It is partially linked to the modes of operation since it helps to convert the Mac framebuffer to the host framebuffer, in direct or real addressing. In virtual addressing, the "Mac" frame buffer is the host framebuffer because specific memory access functions are used to handle the different layouts.

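As an illustration of how a fault handler can tell "exactly which pages of the framebuffer get modified", here is a minimal POSIX sketch of the general technique: the frame buffer is mapped read-only, the first write to a page raises a fault, and an extended (siginfo_t) handler marks the page dirty and makes it writable again. This is an assumption-laden sketch, not the Basilisk II implementation: all names (`g_fb`, `vosf_init`, `vosf_collect`) are hypothetical, at most 64 pages are tracked, and calling mprotect() from a signal handler is common practice on Linux and the BSDs but is not on the POSIX list of async-signal-safe functions.

```cpp
#include <cassert>
#include <csignal>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// All names below are hypothetical and only serve this sketch.
static uint8_t *g_fb = nullptr;              // page-aligned frame buffer
static size_t g_page = 0;                    // host page size
static size_t g_pages = 0;                   // number of tracked pages
static volatile sig_atomic_t g_dirty[64];    // one dirty flag per page

// Extended signal handler: si_addr tells us which address faulted.
static void fault_handler(int, siginfo_t *info, void *)
{
    const uintptr_t addr = reinterpret_cast<uintptr_t>(info->si_addr);
    const uintptr_t base = reinterpret_cast<uintptr_t>(g_fb);
    if (addr < base || addr >= base + g_pages * g_page)
        _exit(1);                            // a genuine crash, not our page
    const size_t page = (addr - base) / g_page;
    g_dirty[page] = 1;
    // Unprotect the page so the faulting write is retried and succeeds.
    mprotect(g_fb + page * g_page, g_page, PROT_READ | PROT_WRITE);
}

// Map a read-only frame buffer and install the handler for SIGSEGV and
// SIGBUS (some systems report protection faults as SIGBUS).
bool vosf_init(size_t pages)
{
    if (pages == 0 || pages > 64)
        return false;
    g_page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    g_pages = pages;
    void *p = mmap(nullptr, pages * g_page, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return false;
    g_fb = static_cast<uint8_t *>(p);
    struct sigaction sa = {};
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = fault_handler;
    sa.sa_flags = SA_SIGINFO;
    return sigaction(SIGSEGV, &sa, nullptr) == 0 &&
           sigaction(SIGBUS, &sa, nullptr) == 0;
}

// Once per video frame: count the dirty pages, clear their flags and
// write-protect them again so the next write faults anew.
size_t vosf_collect()
{
    size_t n = 0;
    for (size_t i = 0; i < g_pages; i++) {
        if (g_dirty[i]) {
            n++;
            g_dirty[i] = 0;
            mprotect(g_fb + i * g_page, g_page, PROT_READ);
        }
    }
    return n;
}
```

A video refresh loop would call something like `vosf_collect()` each frame and blit only the flagged pages; each page costs one fault per frame at most, no matter how many writes hit it.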
From: <gb...@di...> - 2001-02-10 12:00:57
Hi,

> The UniversalInfo contains as its first element (decoderInfoPtr) a pointer
> to a list of hardware base addresses. The AddrMapFlags (which also comes from
> the UniversalInfo) contains one bit for each of these addresses, signifying
> its validity.

That was too easy for you to find it out. ;-) Did I miss an Inside Mac ?

Subsidiary question: why did the problem show up only when B2 was run as root ?

> So you can replace hardware addresses by patching the DecoderInfo of the
> UniversalInfo used in the ROM (for example at the beginning of
> rom_patches.cpp/ patch_rom_32()).

Done.

> For a production version of B2 it seems better to silently ignore illegal
> accesses (and have a "debug" option to turn on strict checking).

I set up a global variable for that (PatchHWBases) because you may want to add a command-line switch to disable the patch ?

Bye,
Gwenolé.

From: Christian B. <cb...@st...> - 2001-02-01 17:32:49
Hi!

On Tue, Jan 30, 2001 at 04:42:28PM +0100, Gwenole Beauchesne wrote:
> Therefore, I traced down changes to those LowMem globals (0x1d4, 0x1d8,
> and 0x1dc) and the most "frequent" ones are those initialized by some
> code located at ROMBase + 0x92a. The code there is a little bit
> obfuscated

Let's see...

SetupHWBases
A00910: MOVE.L   ($DD0),D0           ;AddrMapFlags (basesValid)
A00914: MOVEA.L  ($DD8),A0           ;UniversalInfoPtr
A00918: ADDA.L   (A0),A0             ;decoderInfoPtr
A0091A: LEA      ($A0094A,PC),A2
A0091E: MOVE.W   (A2)+,D3            ;Offset in DecoderInfo (/4)
A00920: BMI      $A00930
A00922: MOVEA.W  (A2)+,A3            ;Address of LM global
A00924: BTST     D3,D0               ;Base valid?
A00926: BEQ      $A0091E
A00928: LSL.W    #2,D3               ;Yes, set LM global
A0092A: MOVE.L   (0,A0,D3.W),(A3)
A0092E: BRA      $A0091E
...
A0094A: 0002 01D4   ;VIA 1 Base
A0094E: 0003 01D8   ;SCC Read Base
A00952: 0004 01DC   ;SCC Write Base
A00956: 0011 01D8   ;SCC IOP Read
A0095A: 0011 01DC   ;SCC IOP Write
A0095E: 0005 01E0   ;IWM Base
A00962: 0010 01E0   ;SWIM IOP
A00966: 0006 0B0A   ;PWMBuf1
A0096A: 0006 0312   ;PWMBuf2
A0096E: 0007 0266   ;SoundBase (RAMSndBuf)
A00972: 0008 0C00   ;SCSI Base
A00976: 0009 0C04   ;SCSI DMA
A0097A: 000A 0C08   ;SCSI Hsk
A0097E: 000B 0CEC   ;VIA 2 Base
A00982: 000C 0CC0   ;ASC Base
A00986: 000D 0CEC   ;RBV Base
A0098A: 000F 0C00   ;SCSI Base
A0098E: 000F 0C04   ;SCSI DMA
A00992: 000F 0C08   ;SCSI Hsk
A00996: 0012 0CEC   ;OSS Base
A0099A: FFFF        ;End

The UniversalInfo contains as its first element (decoderInfoPtr) a pointer to a list of hardware base addresses. The AddrMapFlags (which also comes from the UniversalInfo) contains one bit for each of these addresses, signifying its validity.

So you can replace hardware addresses by patching the DecoderInfo of the UniversalInfo used in the ROM (for example at the beginning of rom_patches.cpp/ patch_rom_32()). IIRC, ShapeShifter on the Amiga does this in exactly this way (all addresses except for the ASC base are redirected to ScratchMem).

I chose not to do this in Basilisk II because it was more helpful during development to have Mac hardware accesses cause a segfault (so it drops into a debugger and you can see what went wrong). For a production version of B2 it seems better to silently ignore illegal accesses (and have a "debug" option to turn on strict checking).

Bye,
Christian

--
/ Coding on PowerPC and proud of it
\/ http://www.uni-mainz.de/~bauec002/

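The table walk in the listing above can be modelled in a few lines. This is a sketch under stated assumptions, not code from rom_patches.cpp: Mac memory is replaced by a `std::map`, the DecoderInfo by a vector of 32-bit base addresses, and the names `HWBaseEntry`, `setup_hw_bases` and `redirect_to_scratch` are hypothetical. Only the index/LowMem pairs are taken from the ROM table quoted in the message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// One row of the ROM table: DecoderInfo index and LowMem global address.
struct HWBaseEntry { uint16_t index; uint16_t lowmem; };

// Index/LowMem pairs as listed at A0094A..A00996 in the message above.
static const HWBaseEntry kHWBaseTable[] = {
    {0x0002, 0x01D4}, {0x0003, 0x01D8}, {0x0004, 0x01DC}, {0x0011, 0x01D8},
    {0x0011, 0x01DC}, {0x0005, 0x01E0}, {0x0010, 0x01E0}, {0x0006, 0x0B0A},
    {0x0006, 0x0312}, {0x0007, 0x0266}, {0x0008, 0x0C00}, {0x0009, 0x0C04},
    {0x000A, 0x0C08}, {0x000B, 0x0CEC}, {0x000C, 0x0CC0}, {0x000D, 0x0CEC},
    {0x000F, 0x0C00}, {0x000F, 0x0C04}, {0x000F, 0x0C08}, {0x0012, 0x0CEC},
};

// Model of the SetupHWBases loop: for every table row whose validity bit is
// set in AddrMapFlags, copy the DecoderInfo entry into the LowMem global.
// Assumes decoder_info has an entry for every index used in the table.
void setup_hw_bases(uint32_t addr_map_flags,
                    const std::vector<uint32_t> &decoder_info,
                    std::map<uint16_t, uint32_t> &lowmem)
{
    for (const HWBaseEntry &e : kHWBaseTable) {
        if (addr_map_flags & (1u << e.index))   // BTST D3,D0
            lowmem[e.lowmem] = decoder_info[e.index];
    }
}

// Model of the suggested patch: point every valid DecoderInfo entry at a
// scratch region, optionally leaving one index (e.g. the ASC base) alone.
void redirect_to_scratch(uint32_t addr_map_flags,
                         std::vector<uint32_t> &decoder_info,
                         uint32_t scratch_mem, int keep_index)
{
    for (size_t i = 0; i < decoder_info.size() && i < 32; i++) {
        if ((addr_map_flags & (1u << i)) && static_cast<int>(i) != keep_index)
            decoder_info[i] = scratch_mem;
    }
}
```

Patching the DecoderInfo before the ROM runs this loop means every LowMem global derived from it picks up the scratch address automatically, which is why this is more robust than fixing up individual globals afterwards.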
From: Gwenole B. <Gwe...@en...> - 2001-01-30 15:42:43
Hi,

I committed a patch to emul_op.cpp so that VIA, SCCRd and SCCWr base addresses may get faked to ScratchMem. Actually, the patch occurs in EMUL_OP_INSTALL_DRIVERS because it seemed more appropriate than EMUL_OP_PATCH_BOOTGLOBS.

Unfortunately, this is not the safest way to go and I would like to attack the problem right at its roots. Therefore, I traced down changes to those LowMem globals (0x1d4, 0x1d8, and 0x1dc) and the most "frequent" ones are those initialized by some code located at ROMBase + 0x92a. The code there is a little bit obfuscated and I don't think that replacing it with NOPs would do any good.

I think it would be best to replace the MOVE.L (A3,D3.W*1,$00), (A3) instruction (4 bytes) with an EmulOP and a NOP. The suggested EmulOp is EMUL_OP_OVERWRITE_VIA_SCC. Its purpose would be to set the referenced value to ScratchMem if A3 references one of the above-mentioned LowMem globals. Otherwise, it would just do what the instruction was supposed to do.

Do you have any other suggestions ?

PS: I had to commit that patch because B2 would crash in some cases in real or direct addressing mode. More strangely, without that patch, B2 would run fine in "user" mode whereas it would crash as "root" just before the extensions are loaded. Both prefs files (in user or root) are identical. How could such a thing happen ? i.e. switching to root made the bug more visible.

Bye,
Gwenole.

From: Christian B. <cb...@st...> - 2001-01-28 17:53:49
Hi!

On Sat, Jan 27, 2001 at 09:38:25AM +0100, Gwenole Beauchesne wrote:
> Depending on exactly when you intend to release version 0.9

If the source could be frozen next week, that would be fine.

> * Should we detail configure options in the INSTALL file ?

Only if the average user would have a reason to specify them.

> * TECH file: I think it is worth mentioning those
> - direct addressing
> - real addressing now works under Linux/i386 provided that the
> above-mentioned requirements are met.
> - VOSF

Indeed...

Bye,
Christian

--
/ Coding on PowerPC and proud of it
\/ http://www.uni-mainz.de/~bauec002/

From: <gb...@di...> - 2001-01-27 08:32:57
Hi Christian,

> I'd like to make a 0.9 release of Basilisk II this week, since the last
> "official" release is now a year old. So if anyone has objections, he should
> speak up now. Also, each contributor should read the README, INSTALL, TODO
> and TECH files and check them for errors and outdated information.

Depending on exactly when you intend to release version 0.9 and how long the testing process will take, I would have liked to rearrange the blitters in the VOSF code so that the right RGB mask values could be taken into account. I could work on it only that weekend, however.

About the documentation:

* Should we detail configure options in the INSTALL file ?

--enable-addressing=<mode>
  where mode is:
    fastest   Try to auto-detect the best mode available
    real      Real Addressing
    direct    Direct Addressing
    banks     Banked Memory Addressing

Notes:

1. Real addressing works only for Linux/i386 insofar as Basilisk II is able to map the whole 680x0 address space to 0x00000000 in the host address space. Otherwise, Basilisk II may fail due to another LowMem global (or any other variable) that we still don't patch correctly, thus making B2 address something outside the valid memory regions...

2. Direct addressing should work on any platform that supports extended signal handlers (a signal handler that takes a siginfo_t structure as parameter) or on any platform for which we provide a subterfuge. Direct addressing requires VOSF and therefore the configure script checks if the system is capable of that. If VOSF can't be enabled, you will have to switch to banked memory addressing mode. In that case, please tell the authors about it (operating system used, kernel version, etc.)

* TECH file: I think it is worth mentioning those
  - direct addressing
  - real addressing now works under Linux/i386 provided that the above-mentioned requirements are met.
  - VOSF
  [I probably won't have the time to add them, however]

Bye;

From: Christian B. <cb...@st...> - 2001-01-22 16:40:16
Hi!

I'd like to make a 0.9 release of Basilisk II this week, since the last "official" release is now a year old. So if anyone has objections, he should speak up now. Also, each contributor should read the README, INSTALL, TODO and TECH files and check them for errors and outdated information.

As the JIT emulation now seems pretty stable, there could be a 1.0 release soon afterwards (I intend to make more frequent releases).

Bye,
Christian

--
/ Coding on PowerPC and proud of it
\/ http://www.uni-mainz.de/~bauec002/

From: Christian B. <cb...@st...> - 2001-01-13 18:02:11
Hi!

On Fri, Jan 12, 2001 at 08:06:49PM +0100, Gwenole Beauchesne wrote:
> A solution may be to set up VIA to be right in the ScratchMem region.
> But where could that be achieved ?

EMUL_OP_PATCH_BOOT_GLOBS?

> Alternative 1: skip the instruction that caused the access to an invalid
> memory region.

SheepShaver also does this kind of thing (but skipping instructions on PPC is easy, they are all 4 bytes).

Bye,
Christian

--
/ Coding on PowerPC and proud of it
\/ http://www.uni-mainz.de/~bauec002/

From: Christian B. <cb...@st...> - 2001-01-13 15:13:36
Hi!

On Fri, Jan 05, 2001 at 11:56:29PM +0100, Gwenole Beauchesne wrote:
> Indeed, whenever I set up a breakpoint, gdb would not stop and finally
> makes B2 to exit with a (-1) return value. Therefore, I am currently
> condemned to read and re-read the source code and experiment a few things
> with runtime disassemblers and some other techniques I am not proud of...

Have you tried (cx)mon? Maybe breakpoint support can be added to it.

> Actually, the JIT compiler normally does 68020 up to 68040. the original
> compiler was to support 68000 only. Is it worth "supporting" that CPU ?

People using the Classic emulation are probably not in need of speed...

Bye,
Christian

--
/ Coding on PowerPC and proud of it
\/ http://www.uni-mainz.de/~bauec002/

From: <gb...@di...> - 2001-01-12 19:01:13
Hi,

It was brought to my attention that MacMinix wouldn't run under UNIX builds of Basilisk II in Direct Addressing mode. The problem is that MacMinix tries to read the VIA Interrupt Flag Register located at (VIABase + 0x1a00). But in direct addressing, this is just outside any valid memory region. The Windows port doesn't suffer from that problem because it simply skips the instruction that caused the illegal memory access.

A solution may be to set up VIA to be right in the ScratchMem region. But where could that be achieved ?

Note: MacMinix really does access the VIA by itself. i.e. It doesn't seem to be caused by a MacOS support routine. Therefore, a runtime patch like those in rsrc_patches.cpp is not possible.

Alternative 1: skip the instruction that caused the access to an invalid memory region. I am a bit worried because this is too specific to the system.

Alternative 2: just don't support MacMinix! ;-) Anyway, I am sure the VIA registers issue will not be the sole problem with MacMinix. I am in favor of that solution...

From: <gb...@di...> - 2001-01-07 22:29:17
Hi,

[New direct addressing scheme for Windows]
> In fact: I tested under OSR2/VMware and Win98 only after doing this
> aforementioned change, so it might be that you fixed it :)
> Would you like to try it out?

Sure, I would like to try it out but my Win98 is definitely dead now. I ran scandisk and then it "fixed" the problems. And as their fixes tend to remove things, I am good to reinstall Windows now. I will try to have it up this week.

My new 20 GB disk needs some operating systems. ;-) I have just installed FreeBSD 4.1 and intend to port JIT and support direct addressing for that OS too. That shouldn't be difficult, though.

> >or simply implement the ScratchMem subterfuge as
> >well.
>
> What's that? I have probably missed something.

That's the trick Christian uses under AmigaOS in order to prevent MacOS from writing to ROM or some other places. You can check it from rom_patches.cpp and rsrc_patches.cpp.
- ROM_IS_WRITE_PROTECTED is set to 0
- USE_SCRATCHMEM_SUBTERFUGE is set to 1

The hack is to patch the faulty handle so that it "points" to a scratch region (64 KB) then just let MacOS write to it as usual.

As nobody else seems to complain, I will drop support of banked addressing mode if B2 is compiled with the JIT compiler. Direct addressing is such a big deal, even with a pure interpretive core, that it should be implemented on as many operating systems as possible. The only problem is what you mentioned: a badly-written MacOS program could make B2 crash by writing to a valid host address.

There is an intermediate solution with the baseaddr[] trick as any invalid bank would get mapped to a scratch memory area. That technique associated with the JIT compiler is twice as slow as with the direct addressing mode but it is, in my opinion, the fastest "safe" way to operate.

OK, I have yet to fix the other bug (spcflags and "get out of compiled code" process) then, we can start porting the JIT compiler to Windows again.

Talking about that, I won't be able to use your assembly optimized core right now. Indeed, when an instruction is too complex to be compiled (or no template is implemented yet), we just call the "cpuemu*" handler and, as we don't return from an instruction handler in your code, some changes would have to be done to the core. On one hand, I am afraid those changes would probably slow down compiled code execution. On the other hand, calls to "cpuemu*" handlers within compiled code tend to be less and less frequent now, so that would probably not slow down execution that much...

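The "invalid bank goes to scratch memory" idea can be sketched as follows. This is a hypothetical model rather than the UAE/Basilisk II baseaddr[] code: the class name `BankedMemory`, the 64 KB bank size and the byte-only accessors are assumptions chosen to keep the example short (wider accesses that straddle a bank boundary are not handled).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of per-bank address translation with a scratch bank.
class BankedMemory {
public:
    static constexpr uint32_t kBankShift = 16;             // 64 KB banks
    static constexpr uint32_t kBankSize = 1u << kBankShift;
    static constexpr uint32_t kNumBanks = 1u << (32 - kBankShift);

    // The first ram_banks banks are backed by RAM; every other bank of the
    // 32-bit address space is aliased to one shared 64 KB scratch area.
    explicit BankedMemory(uint32_t ram_banks)
        : ram_(static_cast<size_t>(ram_banks) * kBankSize),
          scratch_(kBankSize),
          base_(kNumBanks)
    {
        for (uint32_t b = 0; b < kNumBanks; b++) {
            base_[b] = (b < ram_banks)
                ? ram_.data() + static_cast<size_t>(b) * kBankSize
                : scratch_.data();
        }
    }

    // Translate a Mac address to a host pointer: one table lookup, no
    // range check, and never a pointer outside memory we own.
    uint8_t *host(uint32_t mac_addr)
    {
        return base_[mac_addr >> kBankShift] + (mac_addr & (kBankSize - 1));
    }

    uint8_t get8(uint32_t mac_addr) { return *host(mac_addr); }
    void put8(uint32_t mac_addr, uint8_t v) { *host(mac_addr) = v; }

private:
    std::vector<uint8_t> ram_;
    std::vector<uint8_t> scratch_;
    std::vector<uint8_t *> base_;   // per-bank host base pointer
};
```

A stray write to, say, a hardware register address lands in the scratch area instead of corrupting host memory, at the cost of one extra table lookup per access compared with direct addressing.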
From: Lauri P. <lpe...@ni...> - 2001-01-06 05:53:33
|
On Fri, 5 Jan 2001 18:59:46 +0100, you wrote:

Hi Gwenolé,

>3) Forbid usage of banked memory addressing with the JIT compiler. I
>will tend to this solution since direct addressing is so much faster
>than banked memory addressing. Both with the JIT compiler, of course.

I think number 3) makes sense...

>Lauri, is it really impossible to do Direct Addressing under Win 9X ?

Not at all. Your post inspired me to test it after a very long time, and it seems to work ok now. I'm not sure when it got fixed, but in theory, it should have been working all along.

Writes to random memory crash Windows 9x a lot more easily than NT. At least for me, MacOS writes to illegal addresses during the boot and later on too, not only to ROM. Most of the illegal writes are caught by the exception handler, but if it happens to write to a legal (but fatal) host address, say, the B2 stack, heap or static data, anything can happen. I suspect that something like this was the reason.

It's a bit scary that things can break this way. The only way (that I know of) to make sure it won't happen is to precede every memory access with a "bound" assembly instruction, but it's slow.

>As far as the Unix ports are concerned, I really don't have to do
>"triple_allocation".
>
>Rough process:
>
>- allocate RAM + ROM at the same time but keep their respective base
>address properly page-aligned
>
>- ROMBaseMac seems to be relocateable. Therefore, it is simply
>  RAMBaseMac + aligned_ram_size;
>
>- Init MEMBaseDiff to be RAMBaseHost - RAMBaseMac. As RAMBaseMac is 0,
>MEMBaseDiff simply turns out to be RAMBaseHost.
>
>- MacFrameBaseMac seems to be relocateable too. Then, when it is time to
>initialize VideoMonitor.mac_frame_base, you simply assign the result of
>the call to Host2MacAddr(the_host_screen_buffer);
>
>It will work because we have distinct regions and offsets between a
>virtual (Mac) address and a native (host) address is constant.

Conceptually this is pretty much the same as what I have been doing. But when I started to write the triple allocation thing, I assumed that it is not safe to relocate anything. Given the 1GB/2GB address space I had, I quickly realized that I must relocate *something* and it worked, but I tried to keep the relocations at an absolute minimum.

But I already changed the code to reflect what you said, allocating only one memory block. It makes the code cleaner; and either one relocates something or doesn't -- you can't be almost dead either now can you :)

In fact: I tested under OSR2/VMware and Win98 only after making this change, so it might be that you fixed it :) Would you like to try it out?

>- As for write-protecting the ROM region, you can stick with you
>step_over() method

The stepping is needed not only because of the ROM, but because of other illegal memory accesses as well.

>or simply implement the ScratchMem subterfuge as
>well.

What's that? I have probably missed something. I grepped the old mails and found some hits, but none of them explained what that is all about.

>Conclusion:
>
>What should I do ? Struggling myself to make the JIT compiler "fast" in
>banked memory addressing and therefore fix the problems I related
>hereabove ? You know, current results shows that banked memory
>addressing is twice as slow as direct addressing... I am just wondering
>if it is worth the effort ;-)

As far as the Windows port is concerned, banks are needed only under Classic emulation. I don't think that those addresses can be relocated, although I have never tried. Classic emulation has been broken for some time in the Windows port, but I'm hoping to get it working again some day. I have no real use for it, but some other people do. And anything I might want to run under Classic emulation is more than fast enough already.

>PS: I placed a new source tarball on my website.
>
>Bye.

Lauri
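[Editorial sketch] The idea of guarding every memory access, mentioned above in connection with the x86 "bound" instruction, can be illustrated in portable C++. This is only a minimal sketch under stated assumptions: GuestMemory, in_bounds and checked_put_long are hypothetical names made up for illustration and are not Basilisk II code. It shows why the check is safe (an illegal write never reaches the host stack, heap or static data) and why it is slow (a compare and branch on every store).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical guest memory descriptor (illustrative, not from Basilisk II).
struct GuestMemory {
    std::vector<uint8_t> bytes;   // host backing store for the guest address space
    uint32_t writable_size;       // guest addresses at or above this are not writable
                                  // (assumed to be <= bytes.size())
};

// True if the 'len' bytes starting at guest address 'addr' may be written.
// The sum is computed in 64 bits so that addr + len cannot wrap around.
bool in_bounds(const GuestMemory &mem, uint32_t addr, uint32_t len) {
    return static_cast<uint64_t>(addr) + len <= mem.writable_size;
}

// Checked 32-bit big-endian store. Returns false and writes nothing
// when any byte of the access would fall outside the writable region.
bool checked_put_long(GuestMemory &mem, uint32_t addr, uint32_t value) {
    if (!in_bounds(mem, addr, 4))
        return false;
    mem.bytes[addr + 0] = static_cast<uint8_t>(value >> 24);
    mem.bytes[addr + 1] = static_cast<uint8_t>(value >> 16);
    mem.bytes[addr + 2] = static_cast<uint8_t>(value >> 8);
    mem.bytes[addr + 3] = static_cast<uint8_t>(value);
    return true;
}
```

With direct addressing the check is exactly what gets dropped in exchange for speed, which is why stray writes then have to be caught by an exception handler instead.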
From: <gb...@di...> - 2001-01-05 22:51:46
|
Hi,

> On Mon, Dec 04, 2000 at 07:55:38PM +0100, Gwenole Beauchesne wrote:
> > Talking about PPC Emulation, I realized when I finally set up a PPC
> > cross-compilation system that PSIM was only twice as slow as UAE on the
> > benchtest program that comes with PSIM.
>
> Hm... Makes me wonder how fast it can get just by throwing out unnecessary
> stuff (memory management).

For what it's worth, here are the configure options I used for the fastest PSIM build I ran:

  --enable-sim-cflags="-g0 -O2" \
  --enable-sim-decode-mechanism=padded-switch \
  --enable-sim-icache=1024 \
  --enable-sim-filter="-f 64" \
  --enable-sim-inline \
  --enable-sim-bswap \
  --enable-sim-endian=big-endian \
  --enable-sim-model=MODEL_ppc604 \
  --disable-sim-model-issue \
  --disable-sim-regparm \
  --disable-sim-smp \
  --disable-sim-jump \
  --disable-sim-monitor \
  --disable-sim-stdcall \
  --disable-sim-xor-endian \
  --disable-sim-trace \
  --disable-sim-assert

Yes, the "padded-switch" implementation yielded better results than a direct threaded core on my K6-2...
From: <gb...@di...> - 2001-01-05 22:51:42
|
Hi,

> > PS: Christian, could you please rerun your tests for "Julia's Dream" ?
>
> This seems to work fine now, but I get lots of random crashes with the
> emulator. The average runtime is about 30 seconds before it crashes
> (usually "do_handle_screen_fault: unhandled address", but also "Illegal
> instruction"). This is on a PIII with Debian 2.2 and gcc 2.95.2.

Yes, I also noticed that for small translation cache sizes, e.g. 256 KB or 512 KB. With a much larger cache, say 8 MB, things are more stable and also much faster.

As for the cause of the problem, I am pretty sure it is related to the way I handle spcflags and the condition/process to get out of compiled code once a real ("hard") translation cache flush has occurred. I had experimented with a few other approaches, but to no avail :-/ It's probably not related to spcflags after all.

You know, I find it pretty hard to debug B2 because I can't use gdb. Indeed, whenever I set a breakpoint, gdb does not stop and finally makes B2 exit with a (-1) return value. Therefore, I am currently condemned to read and re-read the source code and experiment with runtime disassemblers and some other techniques I am not proud of...

> > PPS: I removed compiler.{cpp,h} in my sources. Do you still want those
> > for future normal B2 releases (CVS) ?
>
> Anything that is unneeded can go.

Actually, the JIT compiler normally does 68020 up to 68040. The original compiler was to support 68000 only. Is it worth "supporting" that CPU ? How many people really emulate a Classic Mac ? ;-)

> > B2-JIT only does 68040 though it should also work for 68020 and 68030
> > cpus, but actually doesn't...
>
> Caching issues?

Yes, probably. I already had that problem in the past and fixed it in a previous release. The problem is that I don't remember how, since I was fixing other bugs I thought unrelated to that problem. I.e. some day, I wanted to try 68030 mode again and it automagically worked... But there is definitely a cache problem, since the JIT compiler also gets disabled forever once the Desktop shows up.
From: <gb...@di...> - 2001-01-05 17:54:48
|
Hi,

Facts: using the JIT compiler when banked memory addressing is used is awfully slow. It is actually slower than without JIT compilation at all!

Current solution to make things faster: the "baseaddr" table hack. That table contains offsets to add to a virtual Mac address so that it yields the native (host) equivalent. A table of MEMBaseDiff values, if you prefer. That table is only used in compiled code because we still need the old memory access routines to know whether it is possible to use baseaddr[] or not. The cases where baseaddr[] can't be used are those where we access the frame buffer and some conversions are required (e.g. for an RGB565 layout).

Sure, some of you will tell me that I could use the VOSF "technology". That's true, but if we can use it, that means that we could use direct or real addressing mode as well.

Unfortunately, "knowing whether it is possible to use baseaddr[] or not" is pure guesswork! Indeed, paraphrasing Bernie's words, we actually make the assumption that any given instruction will either always access real memory, or always access memory that needs some post- or pre-processing.

Let us take an example: say I have a basic block of instructions whose purpose is to copy some data. Hmm, say this is the _BlockMove trap. We are likely to compile this piece of code with the assumption that only "real" memory is accessed, i.e. the host address is determined through the baseaddr[] table. Unfortunately, what happens if _BlockMove is used to move data from a screen buffer to the MacFrameBuffer ? Well, if the screen depth is greater than 8bpp, this simply draws horrible portions of screen.

Solutions:

1) Use _BlockMove and _BlockMoveData trap replacements. That will probably not work since there are, for example, custom-made memcpy() implementations, and the same problem is bound to persist.

2) Use the blitters from VOSF mode. Making it portable would require using the old memcmp() method to determine the screen regions to update. That's not a problem but just makes things slower. The other major problem is that we will also lose the benefit of DGA, as in VOSF mode. Well, in a way, it's better slower than buggy.

3) Forbid usage of banked memory addressing with the JIT compiler. I will tend to this solution since direct addressing is so much faster than banked memory addressing. Both with the JIT compiler, of course.

Lauri, is it really impossible to do Direct Addressing under Win 9X ? As far as the Unix ports are concerned, I really don't have to do "triple_allocation".

Rough process:

- allocate RAM + ROM at the same time but keep their respective base address properly page-aligned

- ROMBaseMac seems to be relocateable. Therefore, it is simply RAMBaseMac + aligned_ram_size;

- Init MEMBaseDiff to be RAMBaseHost - RAMBaseMac. As RAMBaseMac is 0, MEMBaseDiff simply turns out to be RAMBaseHost.

- MacFrameBaseMac seems to be relocateable too. Then, when it is time to initialize VideoMonitor.mac_frame_base, you simply assign the result of the call to Host2MacAddr(the_host_screen_buffer);

It will work because we have distinct regions and offsets between a virtual (Mac) address and a native (host) address is constant.

- As for write-protecting the ROM region, you can stick with you step_over() method or simply implement the ScratchMem subterfuge as well.

Conclusion:

What should I do ? Struggling myself to make the JIT compiler "fast" in banked memory addressing and therefore fix the problems I related hereabove ? You know, current results shows that banked memory addressing is twice as slow as direct addressing... I am just wondering if it is worth the effort ;-)

PS: I placed a new source tarball on my website.

Bye.
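[Editorial sketch] The difference between the "baseaddr" table lookup and direct addressing with a single constant MEMBaseDiff, as well as the one-block RAM + ROM layout from the rough process above, can be sketched in C++ as follows. This is a hedged illustration, not the Basilisk II source: the Layout and BankTable types, the 64 KB bank size, the 4096-byte page size and all function names are assumptions, and a std::vector stands in for the real page-aligned allocation (a real port would obtain aligned memory from the OS).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Round 'size' up to a multiple of 'page' (page must be a power of two).
uint32_t align_up(uint32_t size, uint32_t page) {
    return (size + page - 1) & ~(page - 1);
}

// Hypothetical single-block layout: RAM at Mac address 0, ROM right after it.
struct Layout {
    std::vector<uint8_t> block;   // one host allocation holding RAM + ROM
    uint32_t ram_base_mac = 0;    // RAMBaseMac
    uint32_t rom_base_mac = 0;    // ROMBaseMac = RAMBaseMac + aligned_ram_size
    uintptr_t mem_base_diff = 0;  // MEMBaseDiff = RAMBaseHost - RAMBaseMac
};

Layout make_layout(uint32_t ram_size, uint32_t rom_size, uint32_t page = 4096) {
    Layout l;
    const uint32_t aligned_ram = align_up(ram_size, page);
    l.block.resize(static_cast<size_t>(aligned_ram) + align_up(rom_size, page));
    l.rom_base_mac = l.ram_base_mac + aligned_ram;
    l.mem_base_diff = reinterpret_cast<uintptr_t>(l.block.data()) - l.ram_base_mac;
    return l;
}

// Direct addressing: one constant offset for every access.
uint8_t *mac2host(const Layout &l, uint32_t mac) {
    return reinterpret_cast<uint8_t *>(l.mem_base_diff + mac);
}

uint32_t host2mac(const Layout &l, const uint8_t *host) {
    return static_cast<uint32_t>(reinterpret_cast<uintptr_t>(host) - l.mem_base_diff);
}

// Banked addressing with a baseaddr-style table: one offset per bank,
// so every access pays for a shift and a table load first.
constexpr unsigned kBankShift = 16;  // assumed 64 KB banks

struct BankTable {
    std::vector<uintptr_t> baseaddr; // offset to add to a Mac address, per bank
};

uint8_t *banked_mac2host(const BankTable &t, uint32_t mac) {
    return reinterpret_cast<uint8_t *>(t.baseaddr[mac >> kBankShift] + mac);
}
```

When every bank maps with the same offset, the table lookup returns the same host pointer as the direct form, which is the situation in which the table is pure overhead.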
From: Christian B. <cb...@st...> - 2001-01-02 21:10:28
|
Hi!

There's now a "basilisk-cvs" mailing list on SourceForge that receives all CVS commit messages. The list archives seem to be non-operational, but the list itself should work. So, if you'd like to be informed about updates to the Basilisk II CVS, head to
http://lists.sourceforge.net/lists/listinfo/basilisk-cvs

Bye,
Christian

--
  / Coding on PowerPC and proud of it
\/ http://www.uni-mainz.de/~bauec002/