From: Bryan T. <br...@sy...> - 2011-11-28 11:49:15
Bigdata journals are expected to live on real file systems. If you use FUSE to map HDFS into the file system namespace, then you could provide that path. However, the last time I looked, HDFS did not provide some important guarantees (such as flush actually flushing through to disk) and is oriented toward large block operations rather than the fine-grained read/write model used by the bigdata journal. You would be much better off using a parallel file system [1]; there are several that would be suitable. In fact AWS, recognizing the difference between a blob store and a true parallel file system, has recently released a parallel file system service.

We are looking at a refactor to support "cloud"-style blob stores, such as HDFS or S3. However, that would be only for the bigdata federation, not an individual Journal file. The federation architecture is very different: each file has a maximum size of ~200M, which makes them a good size for efficient block fetch from a blob store without excessive latency. With the refactor, the authoritative copy of the data will live in the cloud/blob store, while the working copies will be cached on the instance nodes on the compute side of the cluster.

Thanks,
Bryan

[1] http://en.wikipedia.org/wiki/List_of_file_systems#Distributed_parallel_fault-tolerant_file_systems

________________________________
From: tousif [mailto:to...@mo...]
Sent: Monday, November 28, 2011 2:11 AM
To: big...@li...
Subject: [Bigdata-developers] is there a way to give hdfs path of jnl file in com.bigdata.journal.AbstractJournal.file

--
Regards
Tousif
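To make the FUSE suggestion concrete: the property named in the question would simply be pointed at a path under the FUSE mount. A minimal sketch follows; the mount point `/mnt/hdfs` and file name `store.jnl` are hypothetical examples, and whether a FUSE-mounted HDFS path actually satisfies the journal's flush and fine-grained I/O requirements is exactly the concern raised in the reply above.

```java
import java.util.Properties;

public class JournalConfig {
    // Property key quoted in the original question; taken here to name
    // the backing file for a bigdata Journal.
    static final String FILE_PROPERTY = "com.bigdata.journal.AbstractJournal.file";

    static Properties buildProperties() {
        Properties props = new Properties();
        // Hypothetical path under a FUSE mount of HDFS; the real mount
        // point depends on how fuse-dfs (or an equivalent) is configured.
        props.setProperty(FILE_PROPERTY, "/mnt/hdfs/bigdata/store.jnl");
        return props;
    }

    public static void main(String[] args) {
        // These properties would then be handed to the Journal constructor
        // (e.g. new Journal(props)); that call is omitted here so the
        // sketch compiles without the bigdata jars.
        System.out.println(buildProperties().getProperty(FILE_PROPERTY));
    }
}
```

Note that this only configures the path; it does nothing to change HDFS's semantics, so a parallel file system mount would be substituted the same way.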