TAbZipper.AddFiles() is not thread safe.
When calling TAbZipper.AddFiles() from separate instances of TAbZipper from separate threads the process of zipping the files can fail. This is because of the internal calls to ChDir that are used through the source code. Usages of ChDir() can be found in: AbZipPrc.pas, AbTarTyp.pas, AbGzTyp.pas, AbArcTyp.pas.
The problem occurs because Thread #1 will come change the current working directory with a call to ChDir() and if Thread #2 comes and does the same thing right after thread #1 does, Thread #1 will now be operating on a directory that it is not supposed to be working on because thread #2 changed the working directory before thread #1 had a chance to do any work.
Symptoms to the problem will be that the application will have directories locked after the two threads get done doing their work.
This could be fixed by adding a critical section into the calls to ChDir(), however I feel that this would be poor mans solution as it would introduce a bottle neck. I feel that a better approach would be to remove the calls to ChDir() and keep track of the current directory being operated on by using a local variable.
With that said... I am wondering how willing you would be to accept a patch from me to resolve the issue? Some open source projects are very picky as to receive a patch from a first-time poster. I have to fix the problem regardless as this is causing problems in my multi-threaded server application. I would also be ok with me if someone other than me resolved the issue.
Hi Rick,
I haven't had time to work on Abbrevia in depth in a long time, and would absolutely welcome outside patches. I will take a look at it with a critical eye, but your posting history won't have any effect on that.
I agree with your analysis, both the cause and the proposed fix. From what I remember, the ChDir calls are all related to the BaseDirectory handling, both when compressing and extracting, and do affect how the resulting filenames are stored in the archives. The test cases do use BaseDirectory for simple cases, but it's not trying to stress that specifically. If you want to verify behavior, expanding the test framework would be very welcome too.
If you have any questions, please don't hesitate to ask.
Hi Craig, that sounds great. I will be working on building a patch for this. What is your preferred method for delivering a patch? I am familiar with the GitHub approach of using pull requests and was wondering what you usually use here on sourceforge.
Thanks,
Rick
Unfortunately things are a lot more low-tech here with Subversion. Best is to just attach the changes to this bug report, either full copies of the files or patches against trunk. I'll use Beyond Compare to compare the changes, so either is fine. There isn't a way to credit your account to the commit like Github, but if that's something you want I can give you access and let you check in the changes yourself after I have a look at them.
Sounds good.
Just wanted to give an update....
I have started the process of fixing the components that I found to NOT be thread safe. My changes so far fix more than just the TAbZipper component.
So far I have been running in production for a couple of months with the new changes and I am no longer experiencing issues in multi threaded environments. I have temporarily posted my changes to a temporary GitHub repository:
https://github.com/rburgstaler/tpabbrevia
Even though the changes have been in production for a couple of months, some of the tests that worked before my changes are now failing after the changes. My goal is to fix the tests before posting the updates to you here at SourceForge.
One other thing that I just noticed last week is that the Cab file extractor is not thread safe either. I will be attempting to fix that in the future as well.
Thanks for the update on your progress. From a cursory look, one thing I think will be a problem is the changes to CreateItem()'s initialization of DiskFilename, which will affect TAbUnzipper/TAbZipKit more than TAbZipper. In previous builds you could load an archive and extract it to multiple places by changing BaseDirectory in between each extraction call. I'm not sure if that's part of the unit tests, but I think that would fail now.
Adding BaseDirectory should probably be done when using DiskFilename. Maybe add an accessor method for "DiskFilename" and add it automatically whenever it's called? Since TAbArchiveItem doesn't currently have a reference back to the containing TAbArchive that might not be trivial; I haven't looked into it myself. Alternatively, add a "TAbArchive.GetDiskFilename(aItem: TAbArchiveItem): string" method instead.