From: Kristofer P. <kri...@cy...> - 2011-09-29 02:26:09
GFS2 at Google was redesigned for smaller files. A multi-master design is needed, but that is a huge overhaul and a lot of work to complete.

Ask and beg for it; you might see it some day.

On Sep 28, 2011, at 8:55 PM, Ken wrote:

> Distributed filesystems are always designed for huge amounts of space, so
> waste often exists. E.g. Haystack at Facebook and GFS at Google never
> recycle the space of deleted files; they only flag them as deleted.
>
> Putting many small files into MooseFS causes a memory bottleneck on the
> master server.
> IMHO, saving space will never be the main goal of these systems.
>
> If we must handle many small files, such as photo files, we should
> bundle them into one or more big files and use a URL to locate the
> content, like '/prefix/bundle_filename/offset/length/check_sum.jpg'.
>
>
> Best Regards
> -Ken
>
>
> On Thu, Sep 29, 2011 at 4:55 AM, Patrick Feliciano <fus...@gm...> wrote:
>>
>> I'd like to start with how very impressed I am with the MooseFS features
>> and architecture. I even prepared a presentation to sell the benefits
>> of MooseFS for our web services to management. It is the only thing
>> I've found that is easy to manage, easily extendible, with good
>> documentation, has automated replication, fault tolerance, self healing,
>> and POSIX (a requirement of our design). Only one problem: many of
>> our files are approx. 4KB, so the average space used on MooseFS for that
>> class of files is in excess of 12 times the expected.
>>
>> Now, before you reply with the same response I've read in the FAQ and
>> seen in the mailing list archives: I understand that MooseFS was written
>> for large files and that is what it is used for by Gemius. And I've
>> seen that others point to other systems that can handle small files.
>>
>> However, none of the systems pointed to have the same feature set as
>> MooseFS. Even if they have extendibility and fault tolerance, none I've
>> seen also present a POSIX file system like we need.
>>
>> Also, I agree that the block size should not be a configurable of the
>> compiled FS. There are too many pieces to manage to be worried that you
>> set the right block size configurable on each chunk server and add extra
>> code to deal with variable block sizes in the master etc. Ugh. Mess, I
>> totally agree.
>>
>> But how about at compile time, as an option to ./configure? How about I
>> pick the block size then and compile a complete set of master, metalogger,
>> chunk, and client apps and/or RPMs that all have the hardcoded block
>> size I picked. I would think this change would be much easier to
>> implement. I imagine that a constant would need to be changed somewhere.
>>
>> This would be very good for the spread and reputation of MooseFS,
>> enabling its wider use and adoption as a general purpose DFS, adaptable
>> to suit individual application needs. Also, we'd be able to add our
>> website with millions of users to the "Using MooseFS" list. :)
>>
>> So unless someone can point me to something else that REALLY has all of
>> MooseFS's features, including POSIX... Well then, I think it is simply
>> cruel to limit such an amazing tool and exclude those of us who could
>> make such wonderful use of it.
>>
>> Of course, I have the source code and I can try to figure it out myself,
>> but it would be much easier going with your cooperation and guidance. I
>> would be willing to do the implementation myself and contribute it back.
>>
>> Please truly consider this, and if not, please consider at least
>> pointing me to the right places in the source code where I should look to
>> implement the changes myself.
>>
>> Thank you very much,
>>
>> Patrick Feliciano
>> Systems Administrator
>> Livemocha, Inc.
>>
>> _______________________________________________
>> moosefs-users mailing list
>> moo...@li...
>> https://lists.sourceforge.net/lists/listinfo/moosefs-users
>
> -Ken
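
For anyone who wants to try Ken's bundling suggestion, here is a minimal Python sketch of the idea: small files are appended to one large bundle file, and each one is addressed by a path of the form '/prefix/bundle_filename/offset/length/check_sum.jpg'. This is only an illustration under stated assumptions, not anything from MooseFS, Haystack, or GFS. The function names, the "/photos" prefix, and the choice of CRC32 (hex) as the checksum are all hypothetical; the thread does not say which checksum Ken had in mind.

```python
import os
import zlib
from typing import NamedTuple


class Locator(NamedTuple):
    """Fields recovered from '/prefix/bundle/offset/length/checksum.ext'."""
    bundle: str
    offset: int
    length: int
    checksum: str
    ext: str


def _crc_hex(data: bytes) -> str:
    # Assumption: CRC32 rendered as 8 hex digits stands in for "check_sum".
    return format(zlib.crc32(data), "08x")


def append_to_bundle(bundle_path: str, data: bytes,
                     prefix: str = "/photos", ext: str = "jpg") -> str:
    """Append one small file to the bundle and return its locator path."""
    with open(bundle_path, "ab") as f:
        f.seek(0, os.SEEK_END)      # make the current end-of-file explicit
        offset = f.tell()
        f.write(data)
    name = os.path.basename(bundle_path)
    return f"{prefix}/{name}/{offset}/{len(data)}/{_crc_hex(data)}.{ext}"


def parse_locator(url: str) -> Locator:
    """Split a locator path into its parts (the prefix is ignored)."""
    bundle, offset, length, leaf = url.rsplit("/", 4)[1:]
    checksum, _, ext = leaf.partition(".")
    return Locator(bundle, int(offset), int(length), checksum, ext)


def read_from_bundle(bundle_dir: str, url: str) -> bytes:
    """Read the bytes a locator points at and verify length and checksum."""
    loc = parse_locator(url)
    with open(os.path.join(bundle_dir, loc.bundle), "rb") as f:
        f.seek(loc.offset)
        data = f.read(loc.length)
    if len(data) != loc.length or _crc_hex(data) != loc.checksum:
        raise ValueError(f"corrupt or truncated record: {url}")
    return data
```

With this layout the filesystem only sees one large file per bundle, so the per-file block overhead and the per-file metadata on the master apply to the bundle rather than to each 4KB object. The cost is what Ken already points out: deleted entries are not reclaimed unless the bundle is rewritten, and access goes through the application rather than through POSIX paths. The sketch also does no locking, so concurrent writers to the same bundle would need their own coordination.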