Menu

Search folder names only and copy contents

2019-11-26
2020-01-20
  • Richard Gralnik

    Richard Gralnik - 2019-11-26

    Hi, I need to identify folders on an 8 TB drive where the folder name contains one of a list of keywords then copy that folder's entire contents to another drive. There are 8.5 million files and folders on the drive so I need to be able to do this in a batch mode. Can SmartCopyTool do this?

     
  • Richard Gralnik

    Richard Gralnik - 2019-12-03

    Simon,

    Would you be interested in a quick contract to write such a script? (I don't know Python and don't have time on this project to learn it on the job.). Basically I would need to treewalk a directory structure for folders with names that contain one of a list of keywords (as a substring or a whole word if the folder name contains whitespace). If such a folder is found its entire contents would need to be copied to another drive preserving timestamps and the entire directory structure from the drive letter (this is for a windows machine) down to the lowest file copied. Please let me know. Thanks, Richard

     
  • Simon Booth

    Simon Booth - 2019-12-07

    No contract necessary, good opportunity to re-acquaint myself with Python:

    https://github.com/machinewrapped/Toybox/blob/master/Python%20Scripts/CopyFoldersBasedOnKeyword.py

    To run it you'll need to install Python 3.8 (https://www.python.org/downloads/windows/) and edit the source, target and keyword lines to suit your needs then run the script.

    You can use Idle, which comes with Python, to edit and run the script.

    Let me know if you have any issues ... it works OK in my tests but they weren't exactly exhaustive :)

     
    • Richard Gralnik

      Richard Gralnik - 2019-12-09

      Simon,
      Wow.  Thank you VERY much.  I was starting toteach myself Python so I could write the programmyself but it takes more time than I really have sothis is a huge help.
      Not to look a gift horse in the mouth or seem
      ungrateful, but could I ask for a couple of what I
      hope are simple enhancements? 

      First, I need to preserve the original timestampsin the copied data.  I was going to use a call torobocopy with the options /COPY:DAT and /DCOPY:T.Is there an equivalent in python?
      Second, this project is so massive (8 TB of data and8.5 million files and folders to go through) that I cannotdo it in one pass.  Also it is possible that the terms I useto identify folders to copy may overlap causing the samedata to be copied twice.  Can you add an option to the copythat will prevent copying something again if it's already on
      the target drive?
      Again, thank you.
      Richard

      On Saturday, December 7, 2019, 10:56:33 AM PST, Simon Booth <mrsdbooth@users.sourceforge.net> wrote:
      

      No contract necessary, good opportunity to re-acquaint myself with Python:

      https://github.com/machinewrapped/Toybox/blob/master/Python%20Scripts/CopyFoldersBasedOnKeyword.py

      To run it you'll need to install Python 3.8 (https://www.python.org/downloads/windows/) and edit the source, target and keyword lines to suit your needs then run the script.

      You can use Idle, which comes with Python, to edit and run the script.

      Let me know if you have any issues ... it works OK in my tests but they weren't exactly exhaustive :)

      Search folder names only and copy contents

      Sent from sourceforge.net because you indicated interest in https://sourceforge.net/p/smartcopytool/discussion/general/

      To unsubscribe from further messages, please visit https://sourceforge.net/auth/subscriptions/

       
      • Simon Booth

        Simon Booth - 2019-12-09

        I've updated the Python script to handle directories that already exist at the destination more intelligently - it'll just copy any files that are missing, rather than re-copying everything.

        Ironically that's exactly the sort of thing SmartCopyTool handles nicely ... or would, if it had filtering by keyword :)

        The script uses shutils.copy2 to copy files and folders (shutil.copytree uses copy2 internally by default) ... copy2 says it tries to preserve file metadata where possible ... apparently on windows that includes modification time but not creation time :-/

        Copying files with the original creation time looks like it would be significantly harder, unfortunately... would need to copy files individually, and then transfer the creation timestamp separately:

        https://subscription.packtpub.com/book/networking_and_servers/9781783987467/1/ch01lvl1sec13/copying-files-attributes-and-timestamps

        Afraid I'll have to leave it as an exercise for the reader ;-)

         
        • Richard Gralnik

          Richard Gralnik - 2019-12-10

          Simon,
          Thank you for the update.  I'm still trying
          to understand some of the commands inthe original script so this helps a lot.
          If you don't mind, how does this statementrun:
                  if any(word in folder for word in keywords):

          I don't see word or folder defined or used beforethis and they aren't reserved words.  Am I supposedto substitute them for something?

          Richard

          On Monday, December 9, 2019, 3:08:33 PM PST, Simon Booth <mrsdbooth@users.sourceforge.net> wrote:
          

          I've updated the Python script to handle directories that already exist at the destination more intelligently - it'll just copy any files that are missing, rather than re-copying everything.

          Ironically that's exactly the sort of thing SmartCopyTool handles nicely ... or would, if it had filtering by keyword :)

          The script uses shutils.copy2 to copy files and folders (shutil.copytree uses copy2 internally by default) ... copy2 says it tries to preserve file metadata where possible ... apparently on windows that includes modification time but not creation time :-/

          Copying files with the original creation time looks like it would be significantly harder, unfortunately... would need to copy files individually, and then transfer the creation timestamp separately:

          https://subscription.packtpub.com/book/networking_and_servers/9781783987467/1/ch01lvl1sec13/copying-files-attributes-and-timestamps

          Afraid I'll have to leave it as an exercise for the reader ;-)

          Search folder names only and copy contents

          Sent from sourceforge.net because you indicated interest in https://sourceforge.net/p/smartcopytool/discussion/general/

          To unsubscribe from further messages, please visit https://sourceforge.net/auth/subscriptions/

           
          • Simon Booth

            Simon Booth - 2019-12-10

            Hah, yes sorry that is an obscure bit of code, and very 'Python'. It breaks down to something like:

            any(...) ... is the list formed by this expression non-empty
            [word in folder] ... true if the string 'folder' contains the string 'word' ...
            [for word in keywords] ... where we will use the name 'word' to refer to every item in the list 'keywords'

            It's basically equivalent to something that would look more like this in other languages:

            foreach (string word in keywords)
            {
                if (folder.Contains(word))
                {
                       // Do the things
                      break;
                }
            }
            
             
            • Richard Gralnik

              Richard Gralnik - 2019-12-10

              Simon,
              Thanks for the explanation. 

              Richard

              On Tuesday, December 10, 2019, 11:07:37 AM PST, Simon Booth <mrsdbooth@users.sourceforge.net> wrote:
              

              Hah, yes sorry that is an obscure bit of code, and very 'Python'. It breaks down to something like:

              any(...) ... is the list formed by this expression non-empty
              [word in folder] ... true if the string 'folder' contains the string 'word' ...
              [for word in keywords] ... where we will use the name 'word' to refer to every item in the list 'keywords'

              It's basically equivalent to something that would look more like this in other languages:
              foreach (string word in keywords)
              {
              if (folder.Contains(word))
              {
              // Do the things
              break;
              }
              }

              Search folder names only and copy contents

              Sent from sourceforge.net because you indicated interest in https://sourceforge.net/p/smartcopytool/discussion/general/

              To unsubscribe from further messages, please visit https://sourceforge.net/auth/subscriptions/

               
    • Richard Gralnik

      Richard Gralnik - 2020-01-20

      Hi Simon,
      I finally learned enough Python to write the programs.It turned out to be more complicated than I originallythought (that's never happened before) and I endedup writing one program to identify folders to copy
      based on the presence of keywords in the path thena second program to actually copy each folder and allof its subdirectories.  I also added logic to stop thecopy after a certain amount of data was copied withthe ability to pick up where it left off on another run(there are 8 TB of data to filter).

      I'd be interested in your opinion of the code and anysuggestions on how I could have done it better if Iactually knew what I was doing if you're interested.
      Richard

      On Saturday, December 7, 2019, 10:56:33 AM PST, Simon Booth <mrsdbooth@users.sourceforge.net> wrote:
      

      No contract necessary, good opportunity to re-acquaint myself with Python:

      https://github.com/machinewrapped/Toybox/blob/master/Python%20Scripts/CopyFoldersBasedOnKeyword.py

      To run it you'll need to install Python 3.8 (https://www.python.org/downloads/windows/) and edit the source, target and keyword lines to suit your needs then run the script.

      You can use Idle, which comes with Python, to edit and run the script.

      Let me know if you have any issues ... it works OK in my tests but they weren't exactly exhaustive :)

      Search folder names only and copy contents

      Sent from sourceforge.net because you indicated interest in https://sourceforge.net/p/smartcopytool/discussion/general/

      To unsubscribe from further messages, please visit https://sourceforge.net/auth/subscriptions/

       

Log in to post a comment.

MongoDB Logo MongoDB