From: Goswin v. B. <gos...@we...> - 2010-08-09 22:20:30
James Rhodes <jr...@ro...> writes:

> On Sun, Aug 8, 2010 at 4:55 AM, Goswin von Brederlow <gos...@we...> wrote:
>> James Rhodes <jr...@ro...> writes:
>>
>>> Yeah, essentially each block on the filesystem has an originality
>>> mark, which is set to 0x00000000 if the segment is unoriginal,
>>> 0xFFFFFFFF to indicate it's an original block with no non-original
>>> counterpart (i.e. it's been untouched since it was marked as
>>> original), otherwise it contains the address of the non-original block
>>> (so that all references to the original block will be forwarded to the
>>> non-original block without having to change addresses in other
>>> blocks). In addition, the filesystem information block (which is not
>>> yet implemented) will contain a field which indicates whether or not
>>> the package contains an original state.
>>>
>>> When you reset the filesystem (if the info block states that there is
>>> an original state), it will delete any segments whose mark is
>>> 0x00000000, and reset all others to 0xFFFFFFFF. That way all the space
>>> is freed and there are no more block redirections.
>>
>> That means that you need to overwrite original blocks with the same data
>> but altered headers. If the system crashes mid write then the original
>> data may become corrupted. Not the best design.
>
> No, you only need to modify the single field. I don't rewrite out
> entire blocks when changing a single field, I just change that field.

But you are writing to a BLOCK device. It will write the block as one
unit.

>>> When you make a change to a directory (such as adding a new entry to
>>> the directory), the directory inode is duplicated with the original
>>> now having its originality mark pointing to the duplicated block and
>>> the duplicated block having its originality mark set to 0x00000000.
>>>
>>> When you make a change to file data, it only duplicates the segments
>>> that would be changed by the write operation. This means that you
>>> reduce the storage requirements because if you change 1 character in a
>>> 1MB file, you only duplicate 500 bytes (512 byte blocks minus segment
>>> headers), instead of the entire file. The originality marking is the
>>
>> That makes it bad for mmap. Binaries and files that are mapped into
>> memory are mapped in pages (4096 bytes usually). You now need to read 9
>> blocks and copy 4096 bytes from them in 500 byte chunks. So you will
>> always have bad alignment and partial blocks. Most reads/writes will also
>> be page aligned and inefficient because you need to do a
>> read-modify-write cycle.
>
> Yeah, but how else are you meant to store data when the entire block must be
> 512 bytes minus headers? Either way, file segments still need their own header
> information which means you can never use the entire 512 (or 4096) bytes for
> the data.

Don't forget the new disks that have 4096 byte blocks. Working in 512
byte units will seriously degrade performance and even the lifetime of
the disks.

Usually what filesystems do is separate data and metadata and store them
in different blocks. That means data can use the full 4096 bytes of a
block. Since metadata items are smaller than a block, multiple items are
often combined into the same block. E.g. in ext2 inodes are usually
128-512 bytes, so 8-32 fit in a 4k block.

MfG
        Goswin
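
To make the alignment argument concrete, here is a rough sketch of the arithmetic. It is only an illustration: the 12-byte header is an assumption inferred from the "500 bytes" figure quoted above (512 - 500), and every name in it is hypothetical rather than taken from the filesystem under discussion.

```python
# Sketch of the block arithmetic from the thread. All names are hypothetical.
# HEADER_SIZE is an assumption inferred from "500 bytes" of payload in a
# 512-byte block; the real header layout is not given in the thread.

BLOCK_SIZE = 512
HEADER_SIZE = 12                          # assumed: 512 - 500
PAYLOAD_SIZE = BLOCK_SIZE - HEADER_SIZE   # file data bytes per block
PAGE_SIZE = 4096                          # typical mmap page size


def blocks_touched(offset, length, payload=PAYLOAD_SIZE):
    """Count the blocks that hold file bytes [offset, offset + length)."""
    if length <= 0:
        return 0
    first = offset // payload
    last = (offset + length - 1) // payload
    return last - first + 1


def is_partial(offset, length, payload=PAYLOAD_SIZE):
    """True if the range starts or ends inside a block's payload,
    so that a write would need a read-modify-write cycle."""
    return offset % payload != 0 or (offset + length) % payload != 0


def items_per_block(item_size, block_size=PAGE_SIZE):
    """How many fixed-size metadata items fit in one block."""
    return block_size // item_size


if __name__ == "__main__":
    # Page-aligned accesses against 500-byte payloads.
    for page in range(6):
        off = page * PAGE_SIZE
        print(page, blocks_touched(off, PAGE_SIZE), is_partial(off, PAGE_SIZE))
    # Same access when data blocks carry a full page of payload.
    print(blocks_touched(0, PAGE_SIZE, payload=PAGE_SIZE))
    # Metadata packing for 128-byte and 512-byte items in a 4k block.
    print(items_per_block(128), items_per_block(512))
```

With 500-byte payloads a page-aligned 4096-byte access always spans 9 or 10 blocks and never ends on a block boundary. With data kept in full 4096-byte blocks and headers stored elsewhere, the same access maps to exactly one block.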