Menu

#4345 Loading large Smacker movies is slow

All Games
closed-fixed
Graphics (902)
5
2009-06-07
2009-05-20
No

Current SVN snapshot

Loading a large Smacker movie, such as the Broken Sword 2 intro, is pretty slow at the moment. (By slow, I mean there's a one-second delay or so on my fairly new computer.) It looks to me like most of the delay is in the BigHuffmanTree::decodeTree() function, but I'm not sure exactly what it's doing so I can't suggest any way of speeding it up.

Discussion

  • kirben

    kirben - 2009-05-20

    This problem is even more noticeable in the opening sequences of The Feeble Files, which uses several video files.

    The Feeble Files demo (which includes opening sequence) could be used for testing, if anyone doesn't have the game.

     
  • Max Horn

    Max Horn - 2009-05-20

    Our smacker decoding code looks a lot less efficient then what FFmpeg has (see <http://git.ffmpeg.org/?p=ffmpeg;a=blob;f=libavcodec/smacker.c>, their SVN repository seems to be gone?!). In particular, we do bad things like using Common::Array where they had a simple fixed size array, etc.

    Assuming that the FFmpeg code is performant / efficient, I think it should be possible to improve the speed of our code to similar levels.

     
  • Torbjörn Andersson

    The header has fields for mMapSize, mClrSize, fullSize and typeSize. They can probably be used to calculate the size of the array that's needed. If you multiply them by four you get *almost*, but not quite, the final size of the _tree arrays. At least for the few cases I've tried.

     
  • Thomas Fach-Pedersen

    The BigHuffmanTree uses a prefix lookup table to speed up decodes, but the SmallHuffmanTrees do not. I didn't do that at first because the small trees are only used during setup and I never got around to it because it wasn't necessary for me. I would suggest that adding this, and figuring out if memory can be pre-allocated, be the first approach to speeding up initial setup.

    The bigtree prefix tables use 8 bits of lookup, some googling suggests 9 bits is recommended. This might be a secondary improvement.

     
  • Max Horn

    Max Horn - 2009-05-25

    Also note that our Common::Array code is riddled with bounds checking and other assert() statements, which won't help speed either.

    Anyway, I just reactivated my old 400 Mhz G4 Mac, which should serve as a great testbed for profiling the smacker code on a low end device (and a big endian one at that, though I'd hope the code is already endian safe ;)

     
  • Torbjörn Andersson

    I timed it a bit more carefully, and on my computer (which claims to have a 3.4 GHz CPU), the delay before the Broken Sword 2 intro was actually a little over 3 seconds.

    There were some optimizations made earlier today (pre-allocating a little bit of memory for the trees, and fixes to the Array class, I believe), which has brought the time down to a little over 100 milliseconds, which certainly is good enough on my hardware. I'd be curious to hear the before/after figures for Fingolfin's Mac, though.

     
  • Thomas Fach-Pedersen

    I implemented prefix lookup for the SmallHuffmanTree class like it's done already for the BigHuffmanTree.

    I can't attach to this bug report so I pasted it here:
    http://scummvm.pastebin.com/f1aa950e7

    Please test if it improves the speed of initialization.

     
  • Torbjörn Andersson

    That patch brings down the load time for the Broken Sword 2 intro from a little over 100 ms to a little over 40 ms on my computer.

     
  • Thomas Fach-Pedersen

    This patch implements both the prefix lookup for the SmallHuffmanTree class and it preallocates the tree storage as plain c++ arrays.

    It's been tested (lightly) on the BS1 intro sequence.

    http://scummvm.pastebin.com/f5195a3b8

     
  • Thomas Fach-Pedersen

    I apparently forgot to delete[] the _tree-member from the SmallHuffmanTree. That's what you get for switching to plain arrays :-)

    I don't have a source tree on this machine (or even svn) so if anybody applies this patch, please add a destructor to the SmallHuffmanTree-class that delete[]s the _tree-member.

     
  • Thomas Fach-Pedersen

    Here's another patch against trunk.

    http://scummvm.pastebin.com/f7671c765

    This changes the BitStream class to not require a two byte look-ahead, the _trees are now fixed sized, and there's some cleanup.

    As before, it works on the BS1 intro sequence. I have not tested it on anything else.

     
  • Torbjörn Andersson

    The latest patch brings the time down to a bit less than 30 ms. (Actually, I've only been timing the creation of the four big Huffman trees, since that was the bottleneck before.) I no longer trust the values to be exact, but it's at least an indication that the patch speeds things up a bit further.

    I'd still be curious to hear what difference the patch and the recent changes make on low-end hardware, though.

     
  • Filippos Karapetis

    • assigned_to: nobody --> thebluegr
    • status: open --> closed-fixed
     
  • Filippos Karapetis

    Applied madm00se's patch with some minor modifications. I've tested this with the Smacker files of BS1, BS2, FF, FTA2 and Dinotopia, and they all seem to be working, with 0 loading times, so the slow loading issue seems to be gone with this patch.

    Thanks for your work once again, madm00se :)

    Closing

     
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.