From: Patrick F. <fus...@gm...> - 2011-09-30 07:44:57
On 09/28/2011 07:26 PM, Kristofer Pettijohn wrote:
> GFS2 in Google was redesigned for smaller files. Multi-master design is needed, but that is a huge overhaul and a lot of work to complete.
>
> Ask and beg for it; you might see it some day.

Those are interesting points: MooseFS has an architecture like GoogleFS, and Google now has GFS2, aka Colossus. Colossus is designed for smaller files and has a distributed master design. Maybe that is what MooseFS 2 will work to emulate as well.

> On Sep 28, 2011, at 8:55 PM, Ken wrote:
>
>> Distribute filesystem always design for huge space. Waste often exist. eg:
>> Haystack in facebook, GFS in google never recycling space of delete
>> files, they mark flag for deleted status.

It isn't true that all distributed file systems are designed for huge files. Lustre, for instance, uses the block size of the underlying file system. I disagree that the concept of distributed file systems is synonymous with large files. That doesn't strike me as a valid reason to dismiss the idea of variable block sizes at compile time.

>> Much small size files put into moose filesystem cause master server
>> memory bottleneck.
>> IMHO, space saving will never be main target in these systems.

My servers can support 148GB of RAM, which is enough for hundreds of millions of files. That would give our site years of growth. I'm not as worried about that as I am about the fact that we only have 10TB of unused space on the web farm that I want to use with MooseFS. With 64KB blocks we will run out of that space well before we reach a hundred million files. With 3 copies of the data we'd be out already with just the 50 million files we currently have.

>> If we must handle much small files, just like photo files, should
>> bundle them into a big file(s). And use URL locate content, like
>> '/prefix/bundle_filename/offset/length/check_sum.jpg'.
That is an interesting idea and I'm not against it if you can tell me what tools will do that and allow me to present it as a standard POSIX filesystem path. Seems to me though that a smaller block size for this awesome filesystem is still the better fix.
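
For anyone who wants to check the space figures in this reply, here is a rough back-of-the-envelope sketch. It assumes every file occupies at least one 64KB block, every block is stored once per copy, and "10TB" means decimal terabytes. The function names are mine, chosen for illustration, and the 4KB block size at the end is a hypothetical comparison value, not a MooseFS option.

```python
KIB = 1024
TB = 10**12  # assumption: decimal terabytes

def min_space_bytes(num_files, block_size, copies):
    """Lower bound on raw space if each file takes at least one block."""
    return num_files * block_size * copies

def max_files(capacity_bytes, block_size, copies):
    """Most one-block files that fit in the given raw capacity."""
    return capacity_bytes // (block_size * copies)

if __name__ == "__main__":
    files = 50_000_000
    capacity = 10 * TB

    # 64KB blocks with 3 copies, the case described in this reply
    needed = min_space_bytes(files, 64 * KIB, 3)
    print(f"64KB blocks, 3 copies: {needed / TB:.2f} TB for {files:,} files")
    print(f"files that fit in 10 TB: {max_files(capacity, 64 * KIB, 3):,}")

    # Hypothetical 4KB blocks, for comparison only
    needed_small = min_space_bytes(files, 4 * KIB, 3)
    print(f"4KB blocks, 3 copies: {needed_small / TB:.2f} TB for {files:,} files")
```

Under those assumptions the 50 million files need about 9.83 TB at 3 copies, which is essentially all of the 10TB available. Files larger than one block would only push the number higher.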
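
To make the bundling suggestion concrete, here is a minimal sketch of what packing small files into one big file and locating them by '/prefix/bundle_filename/offset/length/check_sum.jpg' could look like. This is only my reading of the idea, not an existing tool. The checksum algorithm is not specified in the suggestion, so CRC32 in hex is an assumption, and the helper names and the "photos" prefix are hypothetical.

```python
import os
import zlib

def _crc_hex(data):
    # Assumption: CRC32 rendered as 8 hex digits stands in for "check_sum"
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")

def append_to_bundle(bundle_dir, bundle_name, data, prefix="photos"):
    """Append data to a bundle file and return its locator path."""
    path = os.path.join(bundle_dir, bundle_name)
    with open(path, "ab") as f:
        f.seek(0, os.SEEK_END)  # make sure we record the true end offset
        offset = f.tell()
        f.write(data)
    return f"/{prefix}/{bundle_name}/{offset}/{len(data)}/{_crc_hex(data)}.jpg"

def read_from_bundle(bundle_dir, locator):
    """Return the bytes a locator points at, verifying length and checksum."""
    _prefix, bundle_name, offset, length, leaf = locator.strip("/").split("/")
    checksum = leaf.rsplit(".", 1)[0]
    with open(os.path.join(bundle_dir, bundle_name), "rb") as f:
        f.seek(int(offset))
        data = f.read(int(length))
    if len(data) != int(length) or _crc_hex(data) != checksum:
        raise ValueError("bundle entry is truncated or corrupt")
    return data
```

Note that this only covers lookup by URL. It does not validate the bundle name against path tricks, it never reclaims space from deleted entries, and it does not give applications a standard POSIX path, which is the part I would still need some other layer for.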