From: Wang J. <jia...@re...> - 2012-04-24 18:04:55
On 2012/4/25 0:40, Allen, Benjamin S wrote:
> Have you looked at Facebook's Haystack? It's not publicly available, but it's likely a very good model to base your infrastructure on:
>
> http://static.usenix.org/event/osdi10/tech/full_papers/Beaver.pdf
>
> Why not push your images to Amazon's CloudFront or a similar CDN service?
>
> Another option: LiveJournal built MogileFS for this specific purpose.
>
> Using a sparse file or an FS-in-a-file approach on top of MooseFS is a bit silly, since you're stacking three filesystems. Make sure you look at the administration overhead of the solutions as well.
>
> MooseFS, as the authors freely admit, was not designed for small files. It was designed to serve big media files. I'd suggest picking something that was written to solve your problem rather than shoehorning MooseFS.

Facebook's Haystack and Taobao's TFS both pack small files into large files; that is their answer to managing a titanic number of small files. Their main difference is how they store the meta-meta information. Haystack, according to the published docs, encodes the location information in the filename, i.e. the big file's location plus the small file's offset and length within the pack, whereas TFS stores that meta-meta information in a database. By comparison, Haystack goes further in reducing meta-lookup overhead.

But under the hood, both take two steps when accessing a small file:

1. Translate the small file name to the big file's location, plus the offset and length within it.
2. Read the relevant part of the big file.

Whether the first step is built into the big-file storage engine or kept separate from it, there is no intrinsic performance difference; how the first step is done is what matters. The solution Ken and I are referring to is a separate meta service that translates small file names into big file names, offsets, lengths, etc. MooseFS can serve as the low-level big-file store, but other engines, like Ceph, should work too.

> Ben
>
> On Apr 24, 2012, at 4:25 AM, Wang Jian wrote:
>
>> It's not that simple.
>>
>> In your scenario, one FS image can only be mounted once, which means a single server and thus a single point of failure. More seriously, losing one chunk of the FS image (I mean a real loss, which happens when two or more hard disks fail at the same time) corrupts the whole image, and then you lose all the data inside it.
>>
>> Another problem is metadata overhead: your mounting system will do filesystem journaling, and below it, MooseFS will do its own filesystem journaling.
>>
>> DB? I'm curious whether any company really did that, and is still doing it, at really large scale (tens or hundreds of 12 TB servers).
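
P.S. The two-step access described above can be sketched in a few lines. This is only an illustration of the idea, not Haystack's or TFS's actual code; the class and method names (`PackStore`, `put`, `get`) are mine, the in-memory dict stands in for the separate meta service, and the pack file stands in for a big file stored on MooseFS or similar:

```python
import os
import tempfile


class PackStore:
    """Sketch: pack small files into one large file, with a separate
    name -> (offset, length) index playing the role of the meta service."""

    def __init__(self, pack_path):
        self.pack_path = pack_path
        self.index = {}  # small file name -> (offset, length)

    def put(self, name, data):
        # Append the small file to the pack and record where it landed.
        with open(self.pack_path, "ab") as f:
            offset = f.tell()
            f.write(data)
        self.index[name] = (offset, len(data))

    def get(self, name):
        # Step 1: translate small file name -> (offset, length).
        offset, length = self.index[name]
        # Step 2: read only that slice of the big file.
        with open(self.pack_path, "rb") as f:
            f.seek(offset)
            return f.read(length)


pack = os.path.join(tempfile.mkdtemp(), "photos.pack")
store = PackStore(pack)
store.put("a.jpg", b"hello")
store.put("b.jpg", b"world!")
```

Haystack-style, step 1 disappears from the server side because `(big file, offset, length)` is baked into the name the client already holds; TFS-style, step 1 is a database lookup. Either way, step 2 is the same seek-and-read against the big-file store.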