#198 merge from background job could be way more efficient

workingwiki
open
None
5
2013-07-27
2012-08-02
Lee Worden
No

Merging a background job that creates a whole lot of files can take a really long time. This is because the merge process is:

  • copy all the updated and created files into the persistent directory, while leaving the rest behind.
  • remove the background session directory.

This algorithm has the advantage that if the merge fails, the background session is still there and you can try again. The obvious disadvantage is that it's really slow and requires a lot of disk space, compared to just moving the relevant files. Also, it's not all that "safe", because if the merge fails halfway through, the persistent directory will be left with only half the files copied into it. Should I just switch to using mv? It would probably be much faster!
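For illustration only, here is a rough sketch of what a move-based merge might look like. It is written in Python and is not the WorkingWiki code; the function name `merge_by_move` and its arguments are hypothetical, and it assumes the list of updated and created files is already known.

```python
import os
import shutil


def merge_by_move(session_dir, persistent_dir, changed_files):
    """Move each changed file into persistent_dir, then drop the session.

    changed_files holds paths relative to session_dir (a hypothetical input;
    how the real merge decides which files changed is not shown here).
    Returns the list of relative paths that were moved.
    """
    moved = []
    for rel_path in changed_files:
        src = os.path.join(session_dir, rel_path)
        dst = os.path.join(persistent_dir, rel_path)
        # Make sure the destination subdirectory exists.
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        # shutil.move renames when it can, and falls back to copy-then-delete
        # when the two directories are on different filesystems. Whether an
        # existing destination file is overwritten follows os.rename semantics.
        shutil.move(src, dst)
        moved.append(rel_path)
    # Everything left behind was not selected for merging; discard it.
    shutil.rmtree(session_dir)
    return moved
```

Like the copy-based version, this can still stop halfway, and in that case the files already moved are no longer in the session directory, so a retry would have to cope with that.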

Discussion

  • Jonathan Dushoff

    Does this issue also apply to preview sessions?

     
  • Jonathan Dushoff

    I would say it's worth trying. Using "mv" doesn't seem a particularly risky move.

     
  • Lee Worden

    Lee Worden - 2012-08-08

    I guess it's the same for preview merges, except that if it fails someone might try to do another preview in it, which could be very chaotic with a bunch of project files missing. Maybe not though.

    There's a related issue (whether it uses cp or mv), which is that the MW save operation happens in two steps - first the ?title=Page_Title&action=submit, which does the saving to the database and outputs a redirect, and second, the target of the redirect, which is the now-saved page (?title=Page_Title) - and the controversial stuff happens in the first step, which gives me no place to put error messages where the user will see them. That should eventually be addressed, though not in this tracker item. The connection is that if a preview merge fails, the user won't be told. (Preview merges "fail" all the time, in that I discard all the preview project files if anything has happened in the persistent project that threatens to make them anachronistic - but failing after merging half the files would be different.)

    Honestly, I don't imagine these potential failures are a big deal. The main way it would happen is if the disk partition becomes full - and that is much less of a danger when moving files than copying... and lots of other failures would happen in people's working directories as well...
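As a rough illustration of the disk-space point: a move only avoids using extra space when the session directory and the persistent directory are on the same filesystem, since it is then a plain rename. A minimal Python sketch of that check follows; the helper name `same_filesystem` is made up for this example and is not part of WorkingWiki.

```python
import os


def same_filesystem(path_a, path_b):
    """Return True if both existing paths report the same device ID."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

If the check returns False, a move degrades to copy-then-delete and needs as much free space as a copy would.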

     
  • Jonathan Dushoff

    It's hard ever to be sure, but I'm feeling like slow preview merges are an issue in my life.

    I think it's best to go ahead with mv; the way to make things more robust, imo, would be to improve the archived project file mechanism, so that valuable files could be protected and other files cleared when necessary.

     
