#10 Fix Output File Checking in CmdLineUtil.java

closed-fixed
nobody
None
3
2010-07-09
2010-06-30
James Kosin
No

This patch fixes the file checking on the output to be sure we have write permissions on an existing file. I'm guessing on the previous comments in the source that this was correct. The code was commented out earlier.... not sure why?

This addes the needed checks for the existance and that is is a file.

Discussion

1 2 > >> (Page 1 of 2)
  • Joern Kottmann
    Joern Kottmann
    2010-06-30

    Yeah File.canWrite only returns true if the file exists and if it can be written. Thats why the out commented code does not work properly.

    Before a often long running training is started it should be validated if the model can be written at the end.

    Do you know if it is possible to check if a file can be written without creating it ?

    Chec

     
  • Joern Kottmann
    Joern Kottmann
    2010-06-30

    Ups, accidentally hit update. We then have to add to all training tools a check prior training if the model can be written.

    Jörn

     
  • James Kosin
    James Kosin
    2010-06-30

    Joem,

    Hmm... I'll have to check and see if we have a way of unlinking (or removing) after the check. I tested with a READ-Only file and the code does bauk at the file and terminates the session.
    I'll also check to see if there is a way to get the permissions on a directory. That would allow us to check the permissions there first.

    Actually, my code would first check for the existance of the file before checking the writability. Which should keep from creating the file if it doesn't exist. But, this won't catch the case were we don't have permission to create the file that doesn't exist.

    James

     
  • Joern Kottmann
    Joern Kottmann
    2010-06-30

    "Actually, my code would first check for the existance of the file before
    checking the writability."

    Yes, that must be checked, like you did in the patch. In the case
    the existing file is not write able the check fails.

    "But, this won't catch the case were we don't have
    permission to create the file that doesn't exist."

    Yes, that case is not checked at all and I am not sure if it
    is possible to check that without creating the file prior training.

    Creating the file prior training also avoids the concurrency issue we have with
    the current checks, e.g. after the check and during training the condition might
    change and the file is not longer be write able.

    Jörn

     
  • James Kosin
    James Kosin
    2010-06-30

    Thanks.

    Would it be better than to create the file then in the check?
    Then we have the possibility of creating a model (at least by name) without actually writing any data; ie: if the training failed.

    James

     
  • Joern Kottmann
    Joern Kottmann
    2010-06-30

    Yes I think so.

    We must look at these two cases, the output file exists and the file does not exists.

    First case, the file should be opened and locked. When training was successful the model
    should be written, if it fails the output file should not be changed at all.

    In the other case, the model file is created prior training and if training fails the model is invalid, it might be nice to delete it on a best we can do basis. If deleting the model file is not possible and a user assumes its valid he will fail
    when he tries to load it.

    Jörn

     
  • James Kosin
    James Kosin
    2010-07-01

    createNewFile() will return true if it can create the file for writing.. I just have to catch any exceptions from this that would cause it to fail. The only other failure would be returning false however that would be caught by the previous else if clause.
    James

     
  • Joern Kottmann
    Joern Kottmann
    2010-07-01

    We should change the method to open and lock the output file. If that fails it should print some kind of meaningful error message. The caller also needs to know if the output file was created or not.

    Maybe the method should return an object which contains an Output Stream and a flag indicating if the file was created or not. Then we can have one generic methods which handles an error e.g. when an exception is thrown based on that object.

    What do you think ?

    Jörn

     
  • James Kosin
    James Kosin
    2010-07-01

    Sounds good; however, it goes at bit beyond... and we run the risk of having a file handle lost or worse if the JVM crashes while we are training.
    I'm updating the patch to at least check the thrid case of the file not existing and creating to be sure we can write the file.

    James

     
  • James Kosin
    James Kosin
    2010-07-01

    I'm not able to test the file being non-existant and the file not being able to be created and writable. Under Linux it is a bit easier to perform this test than here with windows....

    James

     
  • James Kosin
    James Kosin
    2010-07-01

    Joem,

    Right now it looks like we only test the output file just before writing the model out. We would have to change a lot to get the file validated before training in all the other classes, as they pass a file handle into the writeModel method....
    Currently, there is a split instant when we check the file and we attempt to write to the file. The writes will always fail, if it can't create or otherwise overwrite the file. This way we have the flexability of checking before; however, as is the patch doesn't add anything we don't have with the current situation... Other than maybe catching the fact we can't write to the file before we attempt to write to the file for real.

    James
    James

     
  • James Kosin
    James Kosin
    2010-07-01

    • priority: 5 --> 3
     
  • James Kosin
    James Kosin
    2010-07-01

     
    Attachments
  • James Kosin
    James Kosin
    2010-07-01

    Joem,

    This fourth patch cleans up the code a bit to make more readable. I'll look into the other option of checking the file before training and how difficult it would be to do that.

    James

     
  • James Kosin
    James Kosin
    2010-07-01

    Patch to check output file before training

     
  • James Kosin
    James Kosin
    2010-07-01

    Joem,

    The patches to check first were easy enough. The patch below for CheckOutputFirst.patch patches the training file for the command line. This allows us to check the file before training.
    The only ohter change would be you have to set checkOutputFile() public instead of private. I wasn't able to get a patch of that; because the other changes were too close to the function name. I would have produced a really bad patch-file had I removed the other lines.
    I tested and everything appears to be working as expected.

    I'll have to try a quick test for not being able to create the file.
    James

     
  • James Kosin
    James Kosin
    2010-07-01

    Joem,

    I was able to create a situation where it wasn't able to create the output file. This also works and generates the proper response.

    James

     
  • Joern Kottmann
    Joern Kottmann
    2010-07-01

    CheckOutputFirst.patch is applied. Thanks for the patch James.

    Just checking if writing to the model file before training is possible is fine for me. It might turn out to be no longer the case after training, but then the actual write fails and the users gets an appropriate error message.
    The check in the beginning is then more like a hint to tell the user that something is obviously wrong, on a best we can do basis.

    This check in the beginning should not create the output file, because then it would be better in my opinion to just directly open it like I explained in an earlier post.

    To make sure an non-existing file can be written we should test that the parent directory has write permissions, then we are allowed to create a file in it (does that also works for windows?).

    In the other patch, we also have to ensure that if a file exists isDirectory and isFile might both return false. Please have a look at the code I committed based on your patch.

    What do you think ?

    Jörn

     
  • James Kosin
    James Kosin
    2010-07-01

    Jorn,

    I'll have to check out tonight. I'm on EDT, at work.

    James

     
  • James Kosin
    James Kosin
    2010-07-01

    Jorn,

    Looks good; however, the test for the directory being writable doesn't work with Windows. The test always succeeds even if the user doesn't have permissions to do so.

    James

     
  • James Kosin
    James Kosin
    2010-07-02

    Patch to fix Permission Check on writting file

     
    Attachments
  • James Kosin
    James Kosin
    2010-07-02

    Newly added patch file fixes the file for checking permissions in Windows.
    James.

     
  • Joern Kottmann
    Joern Kottmann
    2010-07-02

    Can you please explain what you did exactly on windows to make the code fail ?
    Would like to add a comment into the source code which explains why we create the file.

    Jörn

     
  • James Kosin
    James Kosin
    2010-07-03

    Jorn,

    Okay, here is my best.

    (a) Create a directory, in my case under the project where I do my training of the models.
    (b) Under windows, (at least Windows 7), putting read-only permissions on a directory isn't enough using attrib. I tried and was able to create and write the resulting model file every time.
    (c) Change the permissions by left clicking on the directory under a windows file manager, my favorate is explorer.
    (d) Change the permissions for everyone, under security and use the edit button, on the directory to deny everything... don't worry, you will still be able to regain control if you still allow special permissions... If you click on Full Control check box for deny, it will select the appropriate checkboxes down the list. NOTE: Don't do this in the Advanced button. You may loose complete control and may not be able to remove or regain the directory at all! You have been warned.
    (e) Now, do the normal training; however specify a new file in the directory that is now created that shouldn't allow any writting to the directory... In fact you should not be able to do much with the directory in this state.

    I'm sure linux has a different security model and you should be able to perform the same things if you remove write permissions on the directory. Windows has something simular with the security model; however the test for the directory writing still succeed even after the permissions have been set to deny. Could even be a bug in Windows...

    Sorry this is so long. The file gets created because it is the only way to verify that we have permission to create a file. The code does currently remove or delete the newly created file. I tested this by only performing the test and not actually writing to the model file after training.

    I'm guessing that Java may be seeing the directory and may not be checking the correct permissions when determining if we can write to the directory. Or like I said maybe a bug with windows not reporting the correct permissions. Either way, the test for writing succeeds regardless in Windows.

    I'll verify this again, and write back.

    James

     
  • James Kosin
    James Kosin
    2010-07-03

    Naw....

    It is a bit more conveluted than that. Under Windows the isWritable() method for Java isn't working correctly, or it is getting or being told wrong information by windows.

    I singled out only Writing and Java failed to catch this, even with an existing model in the directory that was now not writable (maybe the read only attribute and not the permissions for this case).

    I think I'm going to have to dig on this a bit more and see if we can get the permissions for the directory and file in a more intelegent way.

    James

     
1 2 > >> (Page 1 of 2)