From: Robert S. <rsa...@ne...> - 2011-08-26 12:46:46
Hi Davies,

Our average file size is around 560 kB and it grows by approximately 100 kB per year. Our hot set is around 14 million files taking slightly less than 8 TB of space. Around 1 million files are added and removed per week. The number of hot files is also growing, doubling every 2 years. Ideally I would have a two-tier storage arrangement with faster storage for the hot files, but that is not a choice available to me.

I have experimented with storing the files in a database and it has not been a great success. Databases are generally not optimized for storing large blobs, and many databases simply won't store blobs bigger than a certain size.

Beansdb looks like something I have been looking for, but the lack of English documentation is a bit scary. I did look at it through Google Translate, and even then the documentation is a bit on the scarce side.

Robert

On 8/26/11 3:25 AM, Davies Liu wrote:
> Hi Robert,
>
> Another hint to make mfsmaster more responsive is to put metadata.mfs
> on a disk separate from the change logs, such as a SAS array; you
> would have to modify the mfsmaster source code to do this.
>
> PS: what is the average size of your files? MooseFS (like GFS) is
> designed for large files (100 MB+); it does not serve large numbers of
> small files well. Haystack from Facebook may be a better choice. We
> (douban.com) use MooseFS to serve 200+ TB (1M files) of offline data
> and beansdb [1] to serve 500 million small files online, and it
> performs very well.
>
> [1]: http://code.google.com/p/beansdb/
>
> Davies
>
> On Fri, Aug 26, 2011 at 9:08 AM, Robert Sandilands
> <rsa...@ne...> wrote:
>> Hi Elliot,
>>
>> There is nothing in the code to change the priority.
>>
>> Taking virtually all other load off the chunk and master servers
>> seems to have improved this significantly.
>> I still see timeouts from mfsmount, but not enough to be
>> problematic.
>>
>> To try to optimize the performance I am experimenting with accessing
>> the data using different APIs and block sizes, but this has been
>> inconclusive. I have tried the effect of posix_fadvise(), sendfile()
>> and different-sized buffers for read(). I still want to try mmap().
>> sendfile() did seem to be slightly slower than read().
>>
>> Robert
>>
>> On 8/24/11 11:05 AM, Elliot Finley wrote:
>>> On Tue, Aug 9, 2011 at 6:46 PM, Robert Sandilands
>>> <rsa...@ne...> wrote:
>>>> Increasing the swap space fixed the fork() issue. It seems that
>>>> you have to ensure that the memory available is always double the
>>>> memory needed by mfsmaster. None of the swap space was used over
>>>> the last 24 hours.
>>>>
>>>> This did solve the extreme comb-like behavior of mfsmaster. It
>>>> still does not resolve its sensitivity to load on the server. I am
>>>> still seeing timeouts on the chunkservers and mounts on the hour,
>>>> due to the high CPU and I/O load when the metadata is dumped to
>>>> disk. It did however decrease significantly.
>>> Here is another thought on this...
>>>
>>> The process is niced to -19 (very high priority) so that it has
>>> good performance. It forks once per hour to write out the metadata.
>>> I haven't checked the code for this, but is the forked process
>>> lowering its priority so it doesn't compete with the original
>>> process?
>>>
>>> If it's not, it should be an easy code change to lower the priority
>>> in the child process (the metadata writer) so that it doesn't
>>> compete with the original process at the same priority.
>>>
>>> If you check into this, I'm sure the list would appreciate an
>>> update.
>>> :)
>>>
>>> Elliot
>
> _______________________________________________
> moosefs-users mailing list
> moo...@li...
> https://lists.sourceforge.net/lists/listinfo/moosefs-users
>
> --
> - Davies