From: Michał B. <mic...@ge...> - 2010-09-02 06:24:57
Hi!

Below are the answers to your questions.

From: Reed Hong [mailto:fis...@gm...]
Sent: Thursday, August 26, 2010 4:39 AM
To: moo...@li...
Subject: [Moosefs-users] I want to know more detail about w/r operation

Hi:

I am very concerned about some problems with write/read operations:

1. What is the write operation waiting for when it returns?

[MB] The "write" call itself doesn't wait for anything. Only "fsync" or "close" waits for confirmation of all writes. If "fsync" or "close" finishes successfully, it means that all writes on all chunkservers also finished successfully.

2. With goal = 3 (client <=> cs1 <=> cs2 <=> cs3), if writing to cs2 or cs3 fails, how is that dealt with?

[MB] Mfsmount takes care of this. It retries write operations a given number of times (default: 30). If the errors still exist after these retries, then "write", "fsync" or "close" returns EIO (input/output error).

3. What is the consistency level: strong consistency, weak consistency or eventual consistency (see http://en.wikipedia.org/wiki/Eventual_consistency)?

[MB] A successful write (where fsync/close has finished with success) guarantees data consistency. If a write operation has not finished for some reason (it returned EIO, or the client machine was restarted while saving), then the data of the file being written would not be consistent.

4. How to ensure data consistency?

[MB] For write operations that finished successfully, consistency is guaranteed. For other files there is no consistency guarantee. You need to delete such files and write them again, or make a successful copy. In the future we will probably prepare a module for testing data consistency. The question is what to do if it finds files whose copies are not consistent. Probably a tool like "mfsfilerepair" should ask the user which copy to keep.
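
To illustrate answers 1 and 2 from the application side: since "write" does not wait for confirmation, an error may only show up at "fsync" or "close", so all three results need to be checked. Below is a minimal sketch in C using only standard POSIX calls. The helper name save_file_checked is made up for this example and is not part of MooseFS.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of MooseFS): write the whole buffer,
 * then fsync and close, checking every result.
 * Returns 0 on success, or the errno value of the first failing call. */
int save_file_checked(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        return errno;
    }

    const char *p = buf;
    size_t left = len;
    int err = 0;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR) {
                continue;          /* interrupted by a signal, try again */
            }
            err = errno;           /* e.g. EIO after the retries ran out */
            break;
        }
        p += n;                    /* short write: continue with the rest */
        left -= (size_t)n;
    }

    /* fsync waits for confirmation of all earlier writes. */
    if (err == 0 && fsync(fd) != 0) {
        err = errno;
    }
    /* close can also report a delayed write error; keep the first error. */
    if (close(fd) != 0 && err == 0) {
        err = errno;
    }
    return err;
}
```

If the helper returns a non-zero value (for example EIO), the file should be treated as possibly inconsistent and written again, as described in answer 4.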

If you obey the rules given here: http://www.moosefs.org/moosefs-faq.html#wriiten you should have your data consistent.

We hope this information gives you more insight into read/write operations.

Kind regards
Michał Borychowski
MooseFS Support Manager
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01

I've read almost every page on moosefs.org and every mail in the mailing lists, but found little information about the details of w/r operations. I found some information on the mailing list:

"With goal=3 data transmission looks like this:
client <=> cs1 <=> cs2 <=> cs3
The client waits for the write operation to end until cs1 finishes writing, and cs2 finishes writing, and cs3 finishes writing. In this case, when the client finishes writing, MFS has 3 copies of the data.

Your point B is closer to the real writing process. But the client doesn't wait with sending data. It sends new data before it receives confirmation that the previous data was written. Data is removed from the write queue only after the write is confirmed."

I also read the source with the help of SourceInsight and traced the code from mfs_write() ---> write_data(). In write_data() I see:

    if (status==0) {
        if (offset+size>id->maxfleng) { // move fleng
            id->maxfleng = offset+size;
        }
        id->writewaiting++;
        while (id->flushwaiting>0) {
            pthread_cond_wait(&(id->writecond),&glock);
        }
        id->writewaiting--;
    }
    pthread_mutex_unlock(&glock);
    if (status!=0) {
        return status;
    }

It then calls write_block() to send the write operation to jqueue, and the write_worker() thread sends the real data. Will the write operation wait on the pthread_cond_wait() above?

In the mfs_read() function I find many calls related to writing, such as write_data_flush(), write_data_end and write_data_flush_inode. This confuses me.

Would you please provide more documentation about write/read operations? Thanks a lot!

--
---------------------------------------------------------------
by fishwarter
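
Read on its own, the write_data() excerpt quoted above suggests that a writer sleeps on the condition variable only while flushwaiting is greater than zero, and passes straight through otherwise. The following standalone C sketch reproduces just that pattern. All names in it (write_gate, gate_init, gate_writer_pass, gate_flush_begin, gate_flush_end) are hypothetical, and the flush side is an assumption made for illustration, not code taken from MooseFS.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-file state used in the excerpt. */
typedef struct {
    pthread_mutex_t lock;      /* plays the role of glock */
    pthread_cond_t  writecond; /* writers sleep here while a flush is pending */
    int flushwaiting;          /* number of flushes currently in progress */
    int writewaiting;          /* number of writers parked at the gate */
} write_gate;

void gate_init(write_gate *g)
{
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->writecond, NULL);
    g->flushwaiting = 0;
    g->writewaiting = 0;
}

/* Writer side: the same loop as in the excerpt. It returns without
 * sleeping when no flush is pending. */
void gate_writer_pass(write_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->writewaiting++;
    while (g->flushwaiting > 0) {
        pthread_cond_wait(&g->writecond, &g->lock);
    }
    g->writewaiting--;
    pthread_mutex_unlock(&g->lock);
}

/* Flush side (assumed for this sketch): announce a flush so that new
 * writers wait at the gate. */
void gate_flush_begin(write_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->flushwaiting++;
    pthread_mutex_unlock(&g->lock);
}

/* Flush side (assumed for this sketch): when the last flush ends,
 * wake every writer that is parked at the gate. */
void gate_flush_end(write_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->flushwaiting--;
    if (g->flushwaiting == 0 && g->writewaiting > 0) {
        pthread_cond_broadcast(&g->writecond);
    }
    pthread_mutex_unlock(&g->lock);
}
```

In this sketch a call to gate_writer_pass() made between gate_flush_begin() and gate_flush_end() on another thread would block until the flush ends. Whether MooseFS pairs the counters in exactly this way is not confirmed by the excerpt.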