From: Michał B. <mic...@ge...> - 2010-05-27 09:24:07
Ricardo gave very accurate observations about MooseFS and what you would like to achieve.

There is a setting called "goal" which tells MooseFS how many copies of a file to store. MooseFS won't adjust this parameter for you automatically, but you can prepare a script which examines the popularity of a file and changes its goal to a higher number. When the file is no longer popular, the script can revert the goal to 2 (recommended) or to 1.

But the most important thing is the performance and throughput of the clients' machines.

What should be interesting for you: we plan to introduce a cache mechanism in the near future, so that once a client machine has downloaded a file from the chunk server, it won't download it again unless the file has been modified. This would eliminate the problem of network speed between chunk servers and clients, and it would not be necessary to store the file in more than 2 copies (extra copies won't make the system work more quickly).

If you need any further assistance please let us know.

Kind regards
Michał Borychowski
MooseFS Support Manager
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01

> -----Original Message-----
> From: Ricardo J. Barberis [mailto:ric...@da...]
> Sent: Wednesday, May 26, 2010 5:05 PM
> To: moo...@li...
> Subject: Re: [Moosefs-users] Fwd: how to install a cloud storage network
> :which is best smart automatic file replication solution for cloud storage
> based systems.
>
> On Tuesday 25 May 2010, Metin Akyalı wrote:
> > Hello,
>
> Hi!
>
> > I am looking for a solution for a project I am working on.
> >
> > We are developing a website where people can upload their files and
> > where they can share those files and other people can download them.
> > (similar to rapidshare.com model)
>
> [ massive snippage ]
>
> > So I need a cloud based system, which will push the files onto
> > replicated nodes automatically when demand for those files is high,
> > and when the demand is low, they will be deleted from the other nodes
> > and will stay on only 1 node.
> >
> > I have looked at glusterfs and asked about that problem in their irc
> > channel, and got an answer from the guys that gluster can't do such a
> > thing. It is only able to replicate all the files or none of the
> > files. (I have to define which files are to be replicated) But I need
> > the cluster software to do it automatically.
>
> OK, I don't think MooseFS (mfs for short) will help you either, at least not
> automagically: in mfs you can specify that some files have to have more
> copies than others, but you have to do it by hand.
>
> However, I would really advise you to store at least 2 copies to avoid data
> loss in case of one of the storage nodes going kaput.
>
>
> The only distributed/replicated filesystem that I'm aware of capable of
> automatic load balancing is Ceph (http://ceph.newdream.net/) but it's
> currently in alpha status and not recommended for production sites.
>
> Nonetheless, maybe you can reach your goals with mfs, keep reading :)
>
> > I am sure after some time, I will have some trouble with the client
> > servers, which I will have to load balance later, but that is the next
> > step which I don't mind right now.
>
> Well, I'm going to actually suggest something along those lines.
>
> Consider this: no matter how many storage nodes you will have, the bottleneck
> will then be the frontend server (client to mfs) bandwidth, so you _will_
> have to do frontend load balancing, might as well do it from the start and
> avoid doing it while in production.
>
> With that in mind, and to avoid manually increasing file copies on the
> backend (storage) nodes, you could cache the files on the frontends (assuming
> the files won't change often, once uploaded) to speed up serving them.
>
> In short: I think your best option is a combination of load balancing/proxy
> cache and MooseFS with goal >= 2.
>
>
> Best regards,
> --
> Ricardo J. Barberis
> Senior SysAdmin - I+D
> Dattatec.com :: Soluciones de Web Hosting
> Su Hosting hecho Simple..!
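
To make Ricardo's frontend-caching suggestion a bit more concrete, here is a minimal sketch of the idea, not a recommendation for a particular implementation. It assumes files rarely change once uploaded, so a cached copy is treated as valid as long as its modification time is unchanged. The class name `FrontendFileCache`, the `load` callback and the byte budget are all hypothetical names made up for illustration; nothing here is part of MooseFS.

```python
from collections import OrderedDict


class FrontendFileCache:
    """Tiny LRU cache for file contents, keyed by path and validated by mtime."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes      # hypothetical memory budget for the cache
        self.used_bytes = 0
        self._entries = OrderedDict()   # path -> (mtime, data), oldest first

    def get(self, path, mtime, load):
        """Return the file's bytes; call load(path) only on a miss or a changed mtime."""
        entry = self._entries.get(path)
        if entry is not None and entry[0] == mtime:
            self._entries.move_to_end(path)  # mark as most recently used
            return entry[1]
        if entry is not None:
            # Stale copy: the file changed on the backend, so drop it.
            self.used_bytes -= len(entry[1])
            del self._entries[path]
        data = load(path)  # e.g. a read from the mounted MooseFS directory
        if len(data) <= self.max_bytes:
            self._entries[path] = (mtime, data)
            self.used_bytes += len(data)
            # Evict least recently used entries until the budget is respected.
            while self.used_bytes > self.max_bytes:
                _, (_, old) = self._entries.popitem(last=False)
                self.used_bytes -= len(old)
        return data
```

In a real deployment an off-the-shelf caching proxy in front of the web servers would most likely play this role; the sketch only shows why popular files stop hitting the storage nodes once they are cached.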
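
The script mentioned in the reply at the top of this message (raise the goal of popular files, lower it again later) could look roughly like the sketch below. It is only an illustration under several assumptions: the download counts come from somewhere you already have (for example web server logs), the thresholds `HOT_THRESHOLD` and `HOT_GOAL` are made-up values, and the `mfssetgoal` client tool is on the PATH and is called as `mfssetgoal GOAL file`. The function names are hypothetical.

```python
import subprocess

BASE_GOAL = 2         # recommended minimum from the reply above
HOT_GOAL = 4          # hypothetical goal for popular files
HOT_THRESHOLD = 1000  # hypothetical number of downloads per period


def goal_for(downloads, hot_threshold=HOT_THRESHOLD,
             hot_goal=HOT_GOAL, base_goal=BASE_GOAL):
    """Pick a goal from a download count: hot files get more copies."""
    return hot_goal if downloads >= hot_threshold else base_goal


def plan_goal_changes(download_counts, current_goals):
    """Return (path, new_goal) pairs for files whose goal should change."""
    changes = []
    for path, current in sorted(current_goals.items()):
        wanted = goal_for(download_counts.get(path, 0))
        if wanted != current:
            changes.append((path, wanted))
    return changes


def apply_goal_changes(changes, dry_run=True):
    """Build one `mfssetgoal GOAL PATH` command per change; run them unless dry_run."""
    commands = [["mfssetgoal", str(goal), path] for path, goal in changes]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # assumes mfssetgoal is installed
    return commands
```

Run periodically (for instance from cron) with `dry_run=True` first, so the planned commands can be reviewed before any goal is actually changed.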