Re: [Kosmosfs-devel] [Kosmosfs-users] Announcement: new release kfs-0.5
Status: Alpha
From: John P. <jpl...@ac...> - 2010-06-10 21:56:47
A couple of questions:

1. The chunks have a logical size of 64MB, but they seem to have a variable real size. GFS2 is moving to a smaller chunk size; is there any reason not to allow the chunk size to vary, either fixed on a per-file basis or varying on a block-to-block basis? This would have some implications for seeking within a file where records were being appended to multiple blocks simultaneously, but perhaps the benefits outweigh the problems.

2. Given that it is now possible to append "records" atomically, is there any interest in providing the record size and/or schema (protobufs/avro/etc.) as metadata?

3. If there were records, what about variable-size records a la protobufs, avro, etc.?

4. If there were variable-size records, would there be interest in having an index which would permit seeking to different record numbers?

5. If there were variable-size records and an index, would there be any interest in being able to write/update records which are variable size and still preserve the ability to seek/read to other records (in other words, change the chunk size to handle the new record size)?

It seems like this would be a logical extension of the latest work (which looks really nice, BTW).

john

On 6/7/2010 10:23 PM, Sriram Rao wrote:
> Hi,
>
> We are happy to provide a new release of KFS (kfs-0.5). This release
> adds new features (particularly, atomic record append) as well as
> stability/performance improvements over previous releases. In a bit of
> detail:
>
> 1. Add support for atomic record append. This capability enables multiple
> writers to append records to a file. Writers can be writing to the same
> chunk of a file or to different chunks of a file. The system guarantees
> that records will not be split across chunk boundaries.
> The support for atomic record append entails three parts: (1) the metaserver
> allocates chunks for append, (2) the chunkserver receives data from multiple
> clients and interleaves the data, and (3) the client constructs records and
> sends them to the chunkserver. To limit the number of concurrent writers to
> a chunk, we employ a space reservation policy: clients reserve space on the
> chunkserver; if the reservation fails, the client will interface with the
> metaserver, ask for a new allocation, and retry. The atomic record append
> operation can be used, for instance, to do log aggregation in a cluster:
> logger processes on individual nodes in a cluster open a file in KFS and
> atomically append log records to the file.
> 2. Add reliability support in record append. To the writer, the reliability
> protocol provides exactly-once semantics. If the writer can't determine
> whether the write is committed at the multiple servers, the write will fail.
> 3. Add support for computing adler-32 using Intel's IPP (Integrated
> Performance Primitives).
> 4. Add support for a "chunk coalesce" operation: data written to chunks in
> different files can be coalesced into a single file. That is, a container
> file can be created and content from different files can be appended to the
> container.
> 5. Add support for async read/write in the KFS client. With async
> read/write, the app can issue reads of data from multiple files/chunks
> concurrently; the client code multiplexes I/O from multiple chunkservers
> concurrently.
> 6. Add a rebalancer tool: the tool takes as input chunk sizes/locations and
> then constructs a plan (which lists which chunk needs to be moved where);
> the plan should then be uploaded to a running metaserver, which then
> executes the plan.
> 7. Modifications to the write protocol for reliability.
> On each write sync, the client sends the adler-32 over the data that it has
> sent; the chunk master and the replicas should agree on the checksums;
> otherwise, the write is failed, and the client will retry.
> 8. Performance tweaks to the metaserver for scale.
>
> Acknowledgments: The code for atomic record append and the other features
> that comprise the 0.5 release was funded solely by Quantcast Corp. It
> was work done by Mike Ovassianikov and Sriram Rao. Thanks, Quantcast!
>
> Sriram
>
> _______________________________________________
> Kosmosfs-users mailing list
> Kos...@li...
> https://lists.sourceforge.net/lists/listinfo/kosmosfs-users
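For readers skimming the thread: the two mechanisms the announcement describes — the space-reservation/retry loop that keeps records from splitting across chunk boundaries, and the adler-32 the client sends on each write sync — can be illustrated with a toy model. This is a hedged sketch, not the actual KFS client API: the `Chunk` and `Appender` classes and their methods are hypothetical stand-ins for the metaserver/chunkserver protocol; only `zlib.adler32` is a real library call.

```python
import zlib

CHUNK_SIZE = 64 * 1024 * 1024  # KFS logical chunk size (64MB)

class Chunk:
    """Toy stand-in for one chunkserver-side chunk (hypothetical, not KFS code)."""
    def __init__(self):
        self.data = bytearray()

    def reserve(self, nbytes):
        # Reservation fails if the whole record would not fit in this chunk;
        # this is what guarantees records never split across chunk boundaries.
        return len(self.data) + nbytes <= CHUNK_SIZE

    def append(self, record):
        offset = len(self.data)
        self.data.extend(record)
        # The client sends adler-32 over the data it wrote; the chunk master
        # and replicas must agree on this checksum or the write fails.
        return offset, zlib.adler32(record)

class Appender:
    """Toy client-side append loop: reserve space, else get a new chunk and retry."""
    def __init__(self):
        self.chunks = [Chunk()]  # pretend the metaserver allocated chunk 0

    def atomic_append(self, record):
        if len(record) > CHUNK_SIZE:
            raise ValueError("record can never fit in a single chunk")
        chunk = self.chunks[-1]
        if not chunk.reserve(len(record)):
            chunk = Chunk()            # reservation failed: ask the metaserver
            self.chunks.append(chunk)  # for a fresh chunk, then retry there
        return len(self.chunks) - 1, chunk.append(record)
```

Under these assumptions, a record that would straddle the 64MB boundary is redirected whole to a new chunk, which matches the "records will not be split across chunk boundaries" guarantee described above.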