
#556 support parallel compress-uncompress (multi threaded)

Milestone: 3.0 Series
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2024-02-22
Created: 2020-04-07
Private: No

Support parallel compress-uncompress (multi-threaded) in LZMA or another useful algorithm, to take advantage of the multi-CPU environments we all have today...

Discussion

  • Hamish McIntyre-Bhatty

    I would definitely like this as well.

    For my use case, which involves using a bundled Cygwin environment to run a graphical program (and is hence quite large), it would make my installer development/test cycle a lot quicker if I could do parallel compression.

     
  • Sergio Torres Soldado

    ditto

     
  • Johan Compagner

    Johan Compagner - 2021-07-16

    Yes, we have the same problem. I don't fully understand why this isn't a much higher priority.
    I don't care too much about compression (our build server makes the installers, and that doesn't happen all the time).
    But installation is very slow for our customers. If I put exactly the same data in a 7zip archive and extract it with 7zip, it is so much faster..

     

    Last edit: Johan Compagner 2021-07-16
    • Anders

      Anders - 2021-07-16

      Things take time because we are a really small team, essentially two people. A patch is more likely to make it in than a request.

       
  • Gad Hayisraeli

    Gad Hayisraeli - 2022-09-08

    @anders_k thanks for all your efforts! I suggest you replace the current NSIS compression methods with a well-known external compression library, such as an LZMA DLL, so you can pass it parameters like the number of parallel jobs. That way you won't have to implement this yourself, since the official LZMA SDK already supports it.

     
  • forgexin

    forgexin - 2023-04-14

    It takes too much time to compress a large file, which may be 1GB in size. I hope the File instruction can be executed in parallel.

     
  • Jasper van Baten

    Love the software. Parallel compression would certainly cut down installer compilation time a lot! The implementation could be very simple: compress each file in a different thread, using a thread pool or a parallel-for construct. This would not tackle the case of one big file, but it would cover the probably more common case of many files.
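The per-file idea above can be sketched in a few lines. This is only an illustration in Python, not NSIS or NSISBI code: the function name `compress_files_in_parallel` and the sample payload are made up for the example, and it relies on CPython's `lzma` module releasing the GIL while liblzma is working.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor


def compress_files_in_parallel(files, max_workers=4):
    """Compress each file independently on a thread pool.

    `files` maps a file name to its raw bytes. The result maps the same
    names to LZMA-compressed bytes, so every file stays its own block.
    """
    names = list(files)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps the input order, so names and results line up.
        compressed = list(pool.map(lzma.compress, (files[n] for n in names)))
    return dict(zip(names, compressed))


# Hypothetical in-memory payload, used only for illustration.
sample_files = {
    "readme.txt": b"Parallel compression example.\n" * 500,
    "license.txt": b"Permission is hereby granted...\n" * 500,
    "app.bin": bytes(range(256)) * 200,
}
packed = compress_files_in_parallel(sample_files)
```

As noted in the post, this helps with many files but does nothing for a single large file, which would still be compressed by one worker.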

     
  • Jason

    Jason - 2023-12-30

    For anyone interested, I took the multithread code from my fork (NSISBI) and made a patch for NSIS 3.09. The code does compile using VC6, and the resulting ANSI installers can be run on Windows 9x / ME.

    Note: the underlying format for the compressed data has changed slightly, therefore all 3rd party apps for extracting compressed data will fail to decompress these installers.

     
  • Johan Compagner

    Johan Compagner - 2024-01-02

    Thx Jason. First question: can you now really merge your stuff into the main product (before we get all kinds of forks)? Maybe have a discussion with the team?

    Second, I moved to your fork for our product, and compressing and building the installer is way faster..

    Instead of +/-370 seconds, it now takes +/-115 seconds. That is a nice improvement, but for me not the most important one (that is extraction), and the size did increase: it went up from 569MB (56.6%) to 585MB (58.2%).
    Because of the big improvement in compression, I immediately started testing decompression, which I personally find far more important. Compression is done once a day on our build server (a bit more often when we are making releases), so that time is not so important. The time it takes to install our product is, because our customers do that over and over again.

    But decompression didn't gain that much :(

    With the old one it was +/-43 seconds, and with the new (bigger) one 36 seconds, so not the 1/3 that I hoped for..

    The thing is that 7zip can be so much faster. If I just take a zip (not 7zip) of the same install and extract that: 11 seconds..

    If I make a real 7zip file from that install and extract it: 9 seconds...

    So why there is still so much time difference, I don't know.

     

    Last edit: Johan Compagner 2024-01-02
  • Jason

    Jason - 2024-01-03

    I've had my fork since 2016, and yes I have pushed some of my code back into nsis over the years. The devs are well aware of what I'm doing.

    The size increase is expected: the lower-level format changed slightly and now uses 1MB blocks for the data, so the compression ratio suffers from the small block size.
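A rough way to see why independent blocks cost size is to compress the same data once as a single stream and once block by block. The sketch below is Python with the standard `lzma` module, not the NSISBI format. The helper names are hypothetical, and the payload is deliberately extreme (one incompressible block repeated), so the gap is far larger than the few percent reported in this thread.

```python
import hashlib
import lzma


def pseudo_random_bytes(n_bytes, seed=b"nsis-demo"):
    """Deterministic, practically incompressible bytes built from SHA-256."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])


def compress_in_blocks(data, block_size):
    """Compress every block on its own, with no shared history between blocks."""
    return [lzma.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


BLOCK = 64 * 1024                         # small stand-in for the 1MB blocks
payload = pseudo_random_bytes(BLOCK) * 8  # redundancy that spans block borders

whole_size = len(lzma.compress(payload))
blocks = compress_in_blocks(payload, BLOCK)
blocks_size = sum(len(b) for b in blocks)
print(f"one stream: {whole_size} bytes, independent blocks: {blocks_size} bytes")
```

The single stream can refer back to earlier copies of the data, while each independent block has to encode its content from scratch. That independence is also what lets the blocks be handed to separate threads.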

    The main problem with nsis vs a zip file, is that all the data is stored sequentially in a zip file, and it's much faster to decompress when all the data is in one stream; whereas in nsis, each file is stored as a separate compressed data block, so it can't utilize previous data to help decompression speed. I only implemented threading for each file, not for the data as a whole like a zip file does.

    You can try solid compression, and use a neat trick to decompress the whole installer first before installing: move your .onInit function to the end of the script and add a dummy file that you extract. This forces the installer to decompress the whole installer on startup just to get that file.

     
    • Aloft

      Aloft - 2024-02-18

      Thanks for the great work.

      The latest version of NSISBI supports multithreaded compression, but the size increases significantly (10~25% in my cases). How can I make the installer as small as possible?

      Or is it possible to disable multithreading? I have tried setting SetCompressorNumThreads to 1, but it doesn't change the size.

       
  • Jason

    Jason - 2024-02-18

    That's something I haven't looked into yet. What type of content (text, video, other compressed data, etc.) are you compressing to see that much of a size change? And how big are those files?

    I have already optimized files smaller than 1MB to be compressed on the main thread without starting the threading manager, which is a speed tweak, not a size tweak (since the size remains the same in this case).

    At the moment the underlying format is the reason why the size doesn't change regardless of thread count, so I suggest if you want size, use the official nsis version, or an older version of my fork without threading for the time being. My project is still in development and I appreciate the feedback.

    There is always going to be a compromise between speed and size; this is inherent to any compression that goes from single thread to multithread. Really, lzma should be the only codec used now, since it has the highest compression ratio. It's also particularly tricky since this isn't a standalone file: there is also the overhead of the installer, and it works differently from file-based compression (i.e. .7z files).

     
  • Aloft

    Aloft - 2024-02-20

    Most files are text, with some binary executable files.

    I have tried different settings with the previous version of NSISBI and found that the major factor is the /SOLID option. When I remove the /SOLID option in the previous version, the installer is larger than the one built by the multithreaded NSISBI version with /SOLID. It seems the effect of solid compression is significantly reduced in the new version, as shown below.

    Method                  file1      file2      file3      file4
    NSISBI 3.08 w/ SOLID    55377KB    181075KB   196457KB   277085KB
    NSISBI 3.08 w/o SOLID   82148KB    229624KB   246605KB   364585KB
    NSISBI 3.09 w/ SOLID    81701KB    215292KB   233265KB   352166KB
    NSISBI 3.09 w/o SOLID   83079KB    232803KB   250401KB   371446KB
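To put a number on "significantly reduced", the figures in the table above can be turned into the percentage saved by /SOLID for each version. This is a small Python sketch, and only the four rows of the table are used as input. The function name is made up for the example.

```python
# Installer sizes in KB, copied from the table above (file1..file4).
sizes_kb = {
    ("3.08", "solid"):     [55377, 181075, 196457, 277085],
    ("3.08", "non-solid"): [82148, 229624, 246605, 364585],
    ("3.09", "solid"):     [81701, 215292, 233265, 352166],
    ("3.09", "non-solid"): [83079, 232803, 250401, 371446],
}


def solid_saving_percent(version):
    """Percent size reduction from /SOLID, per file, for one NSISBI version."""
    solid = sizes_kb[(version, "solid")]
    non_solid = sizes_kb[(version, "non-solid")]
    return [round(100 * (1 - s / n), 1) for s, n in zip(solid, non_solid)]


for version in ("3.08", "3.09"):
    print(version, solid_saving_percent(version))
```

For these four files, /SOLID saves somewhere between a fifth and a third of the size in 3.08, and well under a tenth in 3.09.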
    

    BTW, I found a bug: the file name is missing from the log file, e.g.:

    File: wrote 963232 to ""
    File: wrote 11218944 to ""
    File: wrote 8704 to ""
    

    If I change the code in exehead/exec.c from

            log_printf3(_T("File: wrote %d to \"%s\""),ret,buf0);
    

    to

            log_printf3(_T("File: wrote %d to \"%s\""),1,buf0);
    

    the file name is correctly recorded in the log file. This issue appears in both 3.08 and 3.09.

     

    Last edit: Aloft 2024-02-20
  • Jason

    Jason - 2024-02-21

    If you are using lzma, then yes it's expected. lzma uses a dictionary to help compression ratio. Without solid compression, the compressor has to make a new dictionary for every file it compresses; whereas with solid compression, it's one whole chunk of data so it can use a single dictionary for all of it.
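The dictionary effect described here is easy to reproduce outside NSIS. The following Python sketch uses the standard `lzma` module and a hypothetical set of small, similar text files. It compares compressing each file on its own (non-solid) with compressing them as one stream (solid), and does not use the NSIS container format.

```python
import lzma

# Hypothetical installer contents: many small text files sharing boilerplate.
shared = b"".join(
    b"setting_%03d = default value for option %03d\n" % (i, i) for i in range(200)
)
files = {
    f"conf/module_{i:02d}.ini": b"[module %02d]\n" % i + shared for i in range(30)
}

# Non-solid: every file is compressed alone and starts with an empty dictionary.
non_solid_size = sum(len(lzma.compress(data)) for data in files.values())

# Solid: one stream, so later files can reference data from earlier files.
solid_size = len(lzma.compress(b"".join(files.values())))

print(f"non-solid: {non_solid_size} bytes, solid: {solid_size} bytes")
```

How large the gap is depends entirely on how much the files have in common, so real installers will show different ratios.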

    The problem is that dictionary based compressors are inherently hard to multithread, because the dictionary is built up from the compressed output data, data that doesn't exist when you add more than one chunk/thread. This is reflected in your tests, because I'm using chunks to just wrap the existing compression method. Zlib and bzip2 have little effect on ratio, because they don't use a dictionary (or use a preset dictionary).

    That log bug is interesting, I didn't change the logging code from official nsis. I had a look, turns out I did change the type for the return value to a 64 bit type, which isn't handled automatically by the print function. Try the attached patch which just truncates sizes over 4GB. I'll have to do a proper fix another time.
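One plausible reading of the empty file name, sketched in Python with `struct`: if a 64-bit value is passed where the format string expects a 32-bit `%d`, the following `%s` ends up reading the upper half of that value instead of the string pointer. The stack layout (32-bit, little-endian, arguments packed back to back) and the pointer value are assumptions for illustration, not taken from the NSIS sources.

```python
import struct

# Assumed 32-bit little-endian calling convention: variadic arguments sit
# back to back, 4 bytes per int or pointer and 8 bytes per 64-bit integer.
bytes_written = 963232        # first size from the log excerpt above
name_pointer = 0x0040A000     # hypothetical address of the file-name buffer

# What the caller passes once the return value is a 64-bit type.
stack = struct.pack("<qI", bytes_written, name_pointer)

# What a "%d ... %s" format consumes: one 32-bit int, then one 32-bit pointer.
size_read, pointer_read = struct.unpack_from("<iI", stack)

print(size_read)              # low half of the 64-bit value
print(hex(pointer_read))      # high half of the 64-bit value, not the buffer
```

Under these assumptions the size still prints correctly for anything below 4GB, while the pointer slot holds zero, which would fit both the log excerpt and the observation that passing a plain `1` makes the name reappear.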

     
  • Aloft

    Aloft - 2024-02-21

    Thanks Jason. I got it.

     
  • Johan Compagner

    Johan Compagner - 2024-02-21

    I would love to use solid compression, and then I don't really care if the compression speed is not so high..

    If that would help the download size and maybe also the install speed, that would be great.

    The only problem is that there is no option to support solid compression and still have one (exe) file, so that is kind of a bummer..

     
  • Jason

    Jason - 2024-02-21

    You can use solid compression in my fork (use 'OutFileMode aio' followed by SetCompressor to access solid compression), it's just not as effective as official nsis is. You'll find that solid compression is even faster than non-solid compression, my tests show up to 40% faster in some cases.

    I might consider doing some background decompression for solid installers, while the pages are being displayed. This is independent of the codecs, so official nsis can benefit from it as well if I get a nice enough implementation.

     
  • Johan Compagner

    Johan Compagner - 2024-02-21

    aio is the default, right? (So auto is aio?)
    Because I get this:

    SetCompressor: ignoring /SOLID flag due to OutFileMode auto (:6)

    The problem is that I can't easily set that, because I use mvn and this plugin:
    https://github.com/DigitalMediaServer/nsis-maven-plugin

    so I can't easily just set random stuff, I think :(

     
  • Jason

    Jason - 2024-02-21

    No, auto is the default. You can put it in the nsisconf.nsh file; this way it will always be run for every installer.

    If my fork ever gets merged into nsis, the default will change to 'aio' so that it doesn't break existing scripts. The reason auto is the default is so the compiler will always be successful in making an installer, no matter what the size is (if you have the space for it of course). Solid compression is limited to aio files only, which is why it's not the default. It's a 'user first' thing, usually people discover my fork because they run into an error about size in official nsis, and I don't want them to also get an error on my fork as well, so that they keep using an nsis based install system.

     
  • Johan Compagner

    Johan Compagner - 2024-02-22

    Thx, it took a while to figure it out and try a few runs (you can't mix and match command line arguments with what is in the config file).
    So I don't pass any compression-related arguments and have them all in the nsisconf.nsh file:

    OutFileMode aio

    SetCompressor /FINAL /SOLID lzma

    Then it works, and it is much faster at both compressing and decompressing.

    The problem is that this works on Windows:

    [INFO] [MAKENSIS] Processing config: xxxxxx\target\nsis\nsisconf.nsh
    [INFO] [MAKENSIS] OutFileMode: aio
    [INFO] [MAKENSIS] SetCompressor: /FINAL /SOLID lzma

    but when I commit those same changes and let our build server, which is Linux, build it (it also has the makensis executable for Linux),
    it doesn't say "Processing config" at all, as if it just doesn't see that file..

    I call it with the same mvn command, so I don't know yet why that is :(

     
  • Jason

    Jason - 2024-02-22

    It should be here: /etc/nsisconf.nsh . My Ubuntu 22.04 VM has a different install path: /home/jason/nsis/install/etc/nsisconf.nsh , and it sees it just fine. If you see the line !define: "MUI_INSERT_NSISCONF"="" in the output, then you know the config file is being found and processed.

     
  • Johan Compagner

    Johan Compagner - 2024-02-22

    Yeah, I fixed it by just setting the NSISCONFDIR environment variable to the dir (which is also where the executable is..).

    The problem was that it is not installed in the system; I just extract a tar file into a mvn target dir and run it directly from there.
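For a build script that runs an unpacked makensis directly, the fix above amounts to passing NSISCONFDIR in the child process environment. Below is a minimal Python sketch. The function names, paths, and script name are hypothetical, and it assumes nsisconf.nsh sits in the same directory as the makensis binary, as described in the post.

```python
import os
import subprocess


def makensis_invocation(nsis_dir, script, base_env=None):
    """Build the command and environment for a makensis unpacked into nsis_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["NSISCONFDIR"] = nsis_dir  # directory that contains nsisconf.nsh
    command = [os.path.join(nsis_dir, "makensis"), script]
    return command, env


def run_makensis(nsis_dir, script):
    """Run makensis with NSISCONFDIR set; raises if the build fails."""
    command, env = makensis_invocation(nsis_dir, script)
    return subprocess.run(command, env=env, check=True)


# Hypothetical paths, for illustration only (nothing is executed here).
command, env = makensis_invocation(
    "target/nsis", "installer.nsi", base_env={"PATH": "/usr/bin"}
)
```

Checking the build output for the "Processing config" line afterwards confirms that the file was picked up.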

     
