From: Florent B. <fl...@co...> - 2011-08-31 15:18:58
|
Hi all,

Is data deduplication planned for MooseFS, and if so, roughly when?

I know that MooseFS computes a checksum for every chunk, so would deduplication be possible at that level? If two (or more) files, each with goal=3, have chunks in common, only 3 copies of each shared chunk would be stored, rather than 6 (or more) as today. For files with different goals, the highest goal would apply.

I don't think this changes the architecture of MooseFS. The main obstacle may be that the MFS master does not know the checksums, since they are computed on the chunk servers (CS), but we could find a way around that.

Is this feature difficult to add? What do you think about it?

Thank you guys!

--
Florent Bautista

------------------------------------------------------------------------
This e-mail and any attachments hereto are strictly personal, confidential and intended solely for the addressee. If you are not the intended recipient, be advised that you have received this email in error and that any use, dissemination, forwarding, printing, or copying of this message is strictly prohibited.
------------------------------------------------------------------------ |
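To make the proposal above concrete, here is a minimal sketch of chunk-level deduplication in which identical chunks are stored once and kept at the highest goal requested by any file. It is an illustration under stated assumptions, not MooseFS code: `ChunkStore`, `add_chunk` and `stored_copies` are hypothetical names, and SHA-256 is assumed as the content key because a short integrity checksum alone is not collision-safe enough to prove that two chunks are identical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ChunkStore:
    """Toy content-addressed chunk index (hypothetical, not MooseFS code)."""
    # digest -> highest goal requested by any file referencing the chunk
    goals: dict = field(default_factory=dict)
    # digest -> number of file references, needed to free chunks safely
    refs: dict = field(default_factory=dict)

    def add_chunk(self, data: bytes, goal: int) -> str:
        """Register a chunk; identical content maps to the same entry."""
        digest = hashlib.sha256(data).hexdigest()
        # "For files with different goals, the highest goal would apply."
        self.goals[digest] = max(goal, self.goals.get(digest, 0))
        self.refs[digest] = self.refs.get(digest, 0) + 1
        return digest

    def stored_copies(self) -> int:
        """Total physical copies kept across the cluster."""
        return sum(self.goals.values())


store = ChunkStore()
shared = b"x" * 1024
store.add_chunk(shared, goal=3)        # chunk of file A
store.add_chunk(shared, goal=3)        # identical chunk of file B
store.add_chunk(b"y" * 1024, goal=2)   # unrelated chunk
print(store.stored_copies())           # 5 copies, versus 3 + 3 + 2 = 8 without dedup
```

The reference count is what makes deletion safe in such a scheme: a shared chunk can only be freed once no file points at it, which is part of the bookkeeping the master would have to take on.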
From: Allen, B. S <bs...@la...> - 2011-08-31 16:32:32
|
Florent,

I have no idea how difficult this would be to implement in MFS (I believe it may have been discussed on the mailing list in the past), but today you could use ZFS or LessFS for the chunk server drives. Either would let you enable dedup behind the scenes of MFS, per chunk server. Obviously some efficiency is lost compared with an MFS-integrated solution, as each filesystem knows nothing about the rest of the MFS environment.

One caveat with this approach: MFS pays attention to the actual blocks used on the drive, so it will balance new chunks onto the chunk servers (CS) based on actual available space. If one CS holds more dedupable data than the others, this could lead to an uneven chunk count across the chunk servers. This isn't necessarily a bad thing, but it is something to watch out for.

Ben

_______________________________________________
moosefs-users mailing list
moo...@li...
https://lists.sourceforge.net/lists/listinfo/moosefs-users |
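The balancing caveat above can be illustrated with a toy model. This is only a sketch under loud assumptions: it is not MooseFS's actual placement algorithm, `place_chunks` is a hypothetical helper, and the capacities and dedup ratios are made-up numbers. Each new chunk simply goes to the server with the most free space, and a server with dedup enabled consumes less disk per chunk received.

```python
def place_chunks(n_chunks, capacity, dedup_ratio):
    """Toy model: each new chunk goes to the server with the most free space.

    dedup_ratio[i] is the fraction of a chunk's size that server i actually
    consumes on disk (1.0 means no dedup savings). All values are hypothetical.
    """
    used = [0.0] * len(capacity)
    counts = [0] * len(capacity)
    for _ in range(n_chunks):
        free = [c - u for c, u in zip(capacity, used)]
        target = free.index(max(free))      # ties go to the lowest index
        used[target] += dedup_ratio[target]
        counts[target] += 1
    return counts


# Three equal servers; the first one stores only half of what it receives.
print(place_chunks(300, [1000.0] * 3, [0.5, 1.0, 1.0]))  # -> [150, 75, 75]
```

In this model the disk usage stays even while the chunk count does not: the server with the better dedup ratio ends up holding twice as many chunks as the others.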
From: Michał B. <mic...@ge...> - 2011-09-24 13:05:20
|
Hi!

We discuss this feature among ourselves from time to time, and our decision is not to implement it for the moment. There are much more important things to do in MooseFS (quotas, ACLs, RAM optimizations, optimizing the file deletion process, making snapshots even better, etc.), so file deduplication is not a priority. We know it costs disk space, but nowadays disk space is not that expensive. We may come back to this one day, but unfortunately not in the near future.

Kind regards
Michał Borychowski
MooseFS Support Manager
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01 |
From: Florent B. <fl...@co...> - 2011-09-26 08:13:59
|
Hi,

Thank you for your answer, I understand!

By the way, what about quotas? Will they be implemented in the next version or later?

Thank you.

--
Florent Bautista |
From: Michał B. <mic...@ge...> - 2011-09-26 08:50:49
|
Unfortunately, quotas won't be in the next release.

Regards
Michal |