From: Alexander A. <akh...@ri...> - 2012-05-21 09:13:52
Hi Deon!

It was a new pool for chunks, without any data, when I turned dedup on. And you are right: on that pool, when I copy a normal file more than once, the free space shown by df does not decrease.

======================================================

Hi Alex,

Are you turning on dedup for a new ZFS filesystem or an existing one? (It doesn't matter whether the zpool is existing or not.) ZFS uses in-line dedup, which means that if you try to dedup an existing ZFS filesystem, only newly written blocks will be deduped; existing blocks will not be. If you are indeed trying this on a new ZFS filesystem, try making 10 copies of a large file.

Deon

On Mon, May 21, 2012 at 8:21 PM, Alexander Akhobadze <akh...@ri...> wrote:

Hi Michal!

I tested turning on deduplication on a ZFS chunk storage but unfortunately did not get any benefit :-( I thought that the chunk file format prevents ZFS from finding duplicates. Maybe I am mistaken... correct me if so.

wbr
Alexander

======================================================

Hi!

Users of MooseFS may now be interested in a new feature of ext4 called "bigalloc", introduced in the 3.2 kernel. According to http://lwn.net/Articles/469805/:

"The "bigalloc" patch set adds the concept of "block clusters" to the filesystem; rather than allocate single blocks, a filesystem using clusters will allocate them in larger groups. Mapping between these larger blocks and the 4KB blocks seen by the core kernel is handled entirely within the filesystem."

Setting a 64KB cluster size may make sense, as MooseFS operates on 64KB blocks. We have not tested it, but we can expect it may give some performance boost. It would also depend on the average size of the files in your system. And as MooseFS doesn't support deduplication by itself, you can also consider using the dedup functionality in ZFS.

Kind regards
Michal Borychowski

From: Allen, Benjamin S [mailto:bs...@la...]
Sent: Friday, May 18, 2012 5:30 PM
To: moo...@li...
Subject: Re: [Moosefs-users] Best FileSystem for chunkers.

My chunkservers are on top of ZFS pools on Solaris. Using gzip-1 I get 2.32x, which is along the lines of the compression ratio I get with similar systems serving NFS. Note, my data is inherently well compressible. With 2x Intel X5675, load is never an issue. As you up the level of gzip you'll see diminishing returns, and pretty heavy hits on CPU load.

I'd also suggest using ZFS for RAID if you care about single-stream performance. Serve up one or two big zpools per chunkserver to MFS. Keep in mind the size of your pool, however, as having MFS fail off that HD can take ages. Also, of course, you'll lose capacity in this approach to the parity of RAIDZ or RAIDZ2, and then again to MFS' goal > 1 if you want high availability.

If you're thinking of using ZFS, I'd highly suggest using one of the Illumos-based OSes instead of the FreeBSD or Linux variants. The Linux port is still pretty young in my opinion. I'd suggest Illumian, http://illumian.org/, which grew out of Nexenta Core. By the way, MFS is the only distributed FS that I know of that compiles and runs well on Solaris.

I've found small-file performance isn't all that great in this setup. Sub what NFS can do on a similar ZFS pool, so I wouldn't get your hopes up much for it to solve this issue. You could perhaps throw a good number of small SSD drives at ZFS' ZIL to improve synchronous write speeds, but when using a ZIL you're funneling all your synchronous writes through the ZIL devices. So while using two SSDs will likely give you a touch better latency, it will kill your throughput compared to a full chassis of drives.

I've also tested use of L2ARC on MLC SSDs for read cache. If it's affordable for you, I'd suggest throwing RAM in the box for the L1 ARC instead. At least in my workload, I see very few L2ARC hits. Most hits (90%) come from the L1 ARC in memory in my chunkservers that have 96GB.
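[A rough sketch of the kind of layout Ben describes. The pool name, device names, and device counts are hypothetical, not taken from the thread; this is an illustration of the technique, not his actual configuration.]

```shell
# Hypothetical sketch of a ZFS chunkserver pool along the lines Ben describes.
# The pool name "chunks" and device names (c0t0d0 ...) are assumptions.

# One big RAIDZ2 pool per chunkserver, served whole to MFS
zpool create chunks raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0

# Light gzip compression (Ben reports ~2.32x with gzip-1 on compressible data)
zfs set compression=gzip-1 chunks

# Mirrored SSD slog (ZIL) for synchronous writes -- note that all sync
# writes then funnel through these two devices
zpool add chunks log mirror c1t0d0 c1t1d0

# MLC SSD L2ARC read cache (Ben saw few hits here with 96GB of RAM)
zpool add chunks cache c1t2d0

# Check the achieved compression ratio afterwards
zfs get compressratio chunks
```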
The next series of systems I buy will have 128G, and I'll cut my L2ARC SSDs to less than half those of my current systems (2.4T -> 960G). I'm guessing I could actually remove the L2ARC altogether and not see a performance hit, but I haven't done enough benchmarking to prove that one way or another.

Ben

On May 17, 2012, at 2:22 PM, Steve Wilson wrote:
On 05/17/2012 04:17 PM, Steve Wilson wrote:
On 05/17/2012 04:05 PM, Atom Powers wrote:
On 05/17/2012 12:44 PM, Steve Wilson wrote:
On 05/17/2012 03:26 PM, Atom Powers wrote:

* Compression, 1.16x in my environment

I don't know if 1.16x would give me much improvement in performance. I typically see about 1.4x on my ZFS backup servers, which made me think that this reduction in disk I/O could result in improved overall performance for MooseFS.

Not for performance, for disk efficiency. Ostensibly those 64MiB chunks won't always use 64MiB with compression on, especially for smaller files.

This is a good point, and it might help where it's most needed: all those small configuration files, etc. that have a large impact on the user's perception of disk performance.

Bad:
* high RAM requirement

Is the high RAM due to using raidz{2-3}? I was thinking of making each disk a separate ZFS volume and then letting MooseFS combine the disks into an MFS volume (i.e., no raidz). I realize that greater performance could be achieved by striping across disks in the chunk servers, but I'm willing to trade off that performance gain for higher redundancy (in the case of using simple striping) and/or greater capacity (in the case of using raidz, raidz2, or raidz3).

ZFS does a lot of caching in RAM. My chunk servers use hardware RAID, not raidz, and still use several hundred MiB of RAM. Personally, I would prefer to use raidz for multiple disks over MooseFS, because managing individual disks and disk failures should be much better. For example, to minimize the amount of re-balancing MooseFS needs to do; not to mention the possible performance benefit.
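[Steve's one-pool-per-disk alternative, with no raidz and redundancy left to the MooseFS goal, might be sketched as below. Device names and mountpoints are hypothetical.]

```shell
# Hypothetical per-disk layout: each drive becomes its own single-disk pool,
# and MooseFS combines the pools into one MFS volume. Device names (sdb ...)
# are assumptions; a failed disk costs only that pool, and MFS re-replicates.
for d in sdb sdc sdd sde; do
  zpool create "chunk_$d" "/dev/$d"
  zfs set compression=gzip-1 "chunk_$d"
done

# /etc/mfshdd.cfg on the chunkserver would then list each pool's mountpoint:
#   /chunk_sdb
#   /chunk_sdc
#   /chunk_sdd
#   /chunk_sde
```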
But I can think of no reason why you couldn't do a combination of both.

That is certainly worth considering. I hope to have enough time with the new chunk servers to try out different configurations before I have to put them into service.

Steve

_______________________________________________
moosefs-users mailing list
moo...@li...
https://lists.sourceforge.net/lists/listinfo/moosefs-users