Easy everyday parallelism with a file tree abstraction.

Read a directory structure as a Julia data structure, (lazy-)load the files, and apply map and reduce operations on the data while staying within available memory where possible. Make up a file tree in memory, create some data to go with each file (in parallel), and write the tree to disk (in parallel).

FileTrees is a set of tools to lazy-load, process and save file trees. Built-in parallelism allows you to max out all threads and processes that Julia is running with. Files and subtrees in a file tree can have any value attached to them; you can map and reduce over these values, or combine them by merging or collapsing trees or subtrees. When computing lazy trees, these values are held in distributed memory and operated on in parallel.
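As a rough illustration of the read, lazy-load, map and reduce workflow, here is a minimal sketch. The directory layout and file contents are made up for the example, and it assumes that `FileTree`, `FileTrees.load`, `mapvalues`, `reducevalues`, `exec` and `path` behave as described in the package documentation.

```julia
using FileTrees

# Build a small throwaway directory so the sketch is self-contained.
# The layout (2019/a.txt, 2020/b.txt) is purely illustrative.
root = mktempdir()
mkpath(joinpath(root, "2019"))
mkpath(joinpath(root, "2020"))
write(joinpath(root, "2019", "a.txt"), "1\n2\n3\n")
write(joinpath(root, "2020", "b.txt"), "4\n5\n")

# Read the directory structure; file contents are not loaded yet.
tree = FileTree(root)

# Attach a lazily loaded value (a vector of integers) to every file.
loaded = FileTrees.load(tree; lazy=true) do file
    parse.(Int, readlines(path(file)))
end

# Map over the attached values, then reduce them to a single number.
# Both steps stay lazy until exec is called.
sums  = mapvalues(sum, loaded)
total = exec(reducevalues(+, sums))
```

Because the values are lazy, the files are only read when `exec` runs, which is what lets the work be scheduled across the available threads and processes.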
Features
- Tree operations such as map, filter, mv, merge and diff are non-mutating: each returns a new tree and leaves its input unchanged
- Nothing is written to disk until save is called, so restructuring a tree is cheap and fast
- Files and subtrees in a file tree can have any value attached to them
- When computing lazy trees, these values are held in distributed memory and operated on in parallel
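
The points about non-mutating operations and deferred writes can be sketched as follows. This is a hedged example, not the package's canonical usage: the tree name `out`, the file names and the attached values are hypothetical, and it assumes `maketree`, `FileTrees.load`, `mv`, `FileTrees.save`, `name`, `path` and `get` work as in the package documentation (in particular, that `mv` matches paths relative to the tree root and that `save` creates the directories it needs).

```julia
using FileTrees

result = cd(mktempdir()) do
    # Make up a tree in memory; nothing is created on disk at this point.
    t1 = maketree("out" => ["a.txt", "b.txt"])

    # Attach a value to each file (here simply its upper-cased name).
    t2 = FileTrees.load(file -> uppercase(name(file)), t1)

    # Restructure: move every .txt file into a "text" subdirectory.
    # mv returns a new tree; t2 is expected to stay as it was.
    t3 = mv(t2, r"^(.*)\.txt$", s"text/\1.txt")

    still_in_t2    = get(t2["a.txt"])   # the original tree still has a.txt
    on_disk_before = isdir("out")       # checked before save is called

    # Only now is anything written: save calls the function once per file.
    FileTrees.save(t3) do file
        write(path(file), get(file))
    end

    (still_in_t2 = still_in_t2,
     on_disk_before = on_disk_before,
     saved = read(joinpath("out", "text", "a.txt"), String))
end
```

Since restructuring only touches the in-memory tree, steps like the `mv` above can be chained freely before a single `save` at the end.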