
#324 2K page size is inefficient

Milestone: future
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2015-05-04
Created: 2015-05-04
Private: No

I'm raising this as a bug, even though it could be argued that it should be a feature request, because it's a performance issue.

With the switch to 2 KB pages (arguably needed for the Opus, but also needed for the Mirage Microdriver, Currah µSpeech, and Didaktik 80 (and 40)), we incur a greater performance cost when swapping pages.
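To make the cost concrete, here is a simplified model of a flat page table, not Fuse's actual code. The names `remap_bank` and `BANK_SIZE` are hypothetical, and the assumption is that the cost of a bank switch scales with the number of table entries rewritten:

```c
#include <stddef.h>
#include <stdint.h>

#define BANK_SIZE 0x4000u   /* a 16 KB bank, as paged on the 128K */

/* Point every page-table entry covering the 16 KB region at `base` to the
   matching slice of `bank`. Returns the number of entries rewritten. */
size_t remap_bank( uint8_t **table, size_t page_size, uint16_t base,
                   uint8_t *bank )
{
  size_t first = base / page_size;
  size_t count = BANK_SIZE / page_size;

  for( size_t i = 0; i < count; i++ )
    table[ first + i ] = bank + i * page_size;

  return count;
}
```

Under this model, paging a bank in at 0xc000 rewrites eight entries with 2 KB pages, against a single entry with 16 KB pages.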

The ZX Spectrum 48K, 128K and +2/+3 are the main concern for those running on mobile devices, where this overhead matters most. I therefore think we should revert to a 16 KB page size, but add a mechanism for handling subpages.

I could be talked into going with an 8 KB page size for the Timex machines but feel this would probably not be worth the slowdown to 48K/128K emulation.

I'm not sure how severe this is, as I don't know of any popular pathological cases of page swapping on the 128K/+2/+3 to test against. Even with 2K pages, Fuse still only uses 2.5% CPU at the 128K menu on my own machine, so I don't know how much the change to 2K pages has actually cost. Whilst I don't currently see this as a huge issue, I'd like to know if I'm mistaken in that view.

Once we have a subpage mechanism, we should be able to freely switch back to 4K, 8K, 16K or even 32K pages without breaking emulation.

I imagine we would divide pages recursively into groups of four or so subpages, for example:

0x0000-0x3fff [0]: Opus ROMCS
    0x0000-0x0fff [0,0]: Lower Opus ROM
    0x1000-0x1fff [0,1]: Upper Opus ROM
    0x2000-0x2fff [0,2]: RAM/MMIO
        0x2000-0x23ff [0,2,0]: Lower Opus RAM
        0x2400-0x27ff [0,2,1]: Upper Opus RAM
        0x2800-0x2bff [0,2,2]: WD FDC MMIO
        0x2c00-0x2fff [0,2,3]: WD FDC MMIO
    0x3000-0x3fff [0,3]: MMIO/floating bus
        0x3000-0x33ff [0,3,0]: 6821 MMIO
        0x3400-0x37ff [0,3,1]: 6821 MMIO
        0x3800-0x3bff [0,3,2]: Floating bus
        0x3c00-0x3fff [0,3,3]: Floating bus
0x4000-0x7fff [1]: RAM page 5
0x8000-0xbfff [2]: RAM page 2
0xc000-0xffff [3]: RAM page 0
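
A minimal sketch of how a lookup through such a tree might work, assuming a strict four-way split at every level (16 KB, then 4 KB, then 1 KB). The `page_node` type and `page_lookup` function are hypothetical names for illustration, not existing Fuse code:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node: either a leaf mapping a whole region, or an interior
   node split into four equal subpages. */
typedef struct page_node {
  const char *name;            /* label for a leaf mapping */
  uint8_t *data;               /* backing store, NULL for MMIO etc. */
  struct page_node *sub[4];    /* all NULL for a leaf */
} page_node;

/* Walk from the top-level 16 KB page down to the leaf covering `address`.
   `*offset` receives the offset of `address` within that leaf. */
const page_node *
page_lookup( const page_node *top[4], uint16_t address, uint16_t *offset )
{
  const page_node *node = top[ address >> 14 ];
  unsigned shift = 14;                  /* log2 of the current region size */
  uint16_t rel = address & 0x3fff;

  while( node->sub[0] ) {
    shift -= 2;                         /* each level splits into four */
    node = node->sub[ rel >> shift ];
    rel &= ( 1u << shift ) - 1;
  }

  *offset = rel;
  return node;
}
```

With the example map above, address 0x2400 would resolve through [0], [0,2] and [0,2,1] to the upper Opus RAM at offset 0, while 0xc001 would stop immediately at [3] with offset 1.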

With such a scheme in place we would no longer need peripheral-specific code in writebyte_internal() or in readbyte().
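
One way the generic read path could look, as a sketch only: each leaf carries either a memory pointer or a read callback, so the reader never branches on which peripheral is attached. The `leaf_mapping` type, the `mmio_read_fn` callback and the example device are all hypothetical, and returning 0xff for unmapped reads is an assumption (real floating-bus behaviour is more involved):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read callback for memory-mapped I/O regions. */
typedef uint8_t (*mmio_read_fn)( uint16_t offset, void *ctx );

/* Hypothetical leaf mapping: plain memory, or a callback if memory is NULL. */
typedef struct leaf_mapping {
  const uint8_t *memory;
  mmio_read_fn read;
  void *ctx;                   /* passed through to the callback */
} leaf_mapping;

/* Generic read: no peripheral-specific branches. */
uint8_t leaf_read( const leaf_mapping *leaf, uint16_t offset )
{
  if( leaf->memory ) return leaf->memory[ offset ];
  if( leaf->read ) return leaf->read( offset, leaf->ctx );
  return 0xff;                 /* assumption: unmapped reads return 0xff */
}

/* Illustrative handler for a made-up device: four registers, mirrored
   throughout the region it is mapped into. */
uint8_t example_device_read( uint16_t offset, void *ctx )
{
  const uint8_t *regs = ctx;
  return regs[ offset & 3 ];
}
```

A peripheral would then register its handler when it pages itself in, and the write path could mirror this with a corresponding write callback.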

The only problem is that I'm sketchy on the details of how to handle this efficiently. I imagine we would throw away our old page map entirely when updating our page tables, and not bother coalescing subpages when a fine-grained mapping is overridden by a coarser one.
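
A rough sketch of one possible approach, not a worked-out design. It assumes a fixed node pool that is simply reset to discard the old map, a four-way split at each level, and 1 KB as the finest granularity. All names (`page_map`, `page_map_reset`, `page_map_set`) are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 128          /* assumption: fixed pool; at most 4+16+64 used */
#define MIN_PAGE  0x400u       /* assumption: 1 KB finest granularity */

typedef struct map_node {
  uint8_t *data;               /* leaf backing store, NULL if unmapped */
  struct map_node *sub[4];     /* all non-NULL once split into quarters */
} map_node;

typedef struct page_map {
  map_node pool[MAX_NODES];
  size_t used;
  map_node *top[4];            /* four 16 KB top-level pages */
} page_map;

static map_node *node_alloc( page_map *map )
{
  map_node *n = &map->pool[ map->used++ ];
  n->data = NULL;
  for( int i = 0; i < 4; i++ ) n->sub[i] = NULL;
  return n;
}

/* Throw away the old map: back to four unsplit, unmapped 16 KB pages. */
void page_map_reset( page_map *map )
{
  map->used = 0;
  for( int i = 0; i < 4; i++ ) map->top[i] = node_alloc( map );
}

/* Apply mapping [start, end) to `node`, which covers [base, base + size). */
static void map_range( page_map *map, map_node *node, uint32_t base,
                       uint32_t size, uint32_t start, uint32_t end,
                       uint8_t *data )
{
  uint32_t quarter = size / 4;

  if( end <= base || start >= base + size ) return;    /* no overlap */

  if( !node->sub[0] ) {
    if( start <= base && end >= base + size ) {        /* leaf fully covered */
      node->data = data ? data + ( base - start ) : NULL;
      return;
    }
    /* Partially covered leaf: split it; children inherit the old mapping. */
    for( uint32_t i = 0; i < 4; i++ ) {
      node->sub[i] = node_alloc( map );
      node->sub[i]->data = node->data ? node->data + i * quarter : NULL;
    }
  }

  /* Interior node: a coarse mapping just overwrites each existing subpage;
     nothing is merged back together. */
  for( uint32_t i = 0; i < 4; i++ )
    map_range( map, node->sub[i], base + i * quarter, quarter,
               start, end, data );
}

/* Map `length` bytes at `start` to `data`. Returns 0, or -1 on bad input. */
int page_map_set( page_map *map, uint32_t start, uint32_t length,
                  uint8_t *data )
{
  if( length == 0 || ( start | length ) % MIN_PAGE ) return -1;
  if( start >= 0x10000u || length > 0x10000u - start ) return -1;

  for( uint32_t i = 0; i < 4; i++ )
    map_range( map, map->top[i], i * 0x4000u, 0x4000u,
               start, start + length, data );
  return 0;
}
```

Mapping 2 KB of Opus RAM at 0x2000 splits page 0 twice. Mapping a full 16 KB over it afterwards leaves those subpages in place and only repoints them, and a later `page_map_reset()` is what actually discards them.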

For ROMs, we might want to consider achieving mirroring by duplicating the ROM image itself, to avoid using subpages.
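
For instance, a ROM smaller than the page it occupies could be copied repeatedly into a page-sized buffer when it is loaded. A minimal sketch, where the function name `rom_fill_mirrored` is hypothetical and the ROM size is assumed to divide the page size exactly:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill a page-sized buffer with back-to-back copies of a smaller ROM image,
   so that mirroring needs no subpages. Returns 0 on success, or -1 if the
   ROM size does not divide the page size exactly. */
int rom_fill_mirrored( uint8_t *page, size_t page_size,
                       const uint8_t *rom, size_t rom_size )
{
  if( rom_size == 0 || page_size % rom_size != 0 ) return -1;

  for( size_t off = 0; off < page_size; off += rom_size )
    memcpy( page + off, rom, rom_size );

  return 0;
}
```

The trade-off is a little extra memory per mirrored ROM in exchange for keeping the whole region as a single coarse mapping.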

Discussion

  • Stuart Brady - 2015-05-04
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -19,7 +19,7 @@
                 0x2000-0x23ff [0,2,0]: Lower Opus RAM
                 0x2400-0x27ff [0,2,1]: Upper Opus RAM
                 0x2800-0x2bff [0,2,2]: WD FDC MMIO
    -            0x2c00-0x2fff [0,2,3]: FDC MMIO
    +            0x2c00-0x2fff [0,2,3]: WD FDC MMIO
             0x3000-0x3fff [0,3]: MMIO/floating bus
                 0x3000-0x33ff [0,3,0]: 6821 MMIO
                 0x3400-0x37ff [0,3,1]: 6821 MMIO
    
     
  • Stuart Brady - 2015-05-04
    • summary: Hierarchical page handling --> 2K page size is inefficient
     
