From: Youssef E. <you...@gm...> - 2012-02-02 08:28:09
At the Bibliotheca Alexandrina, we are in the process of migrating to the open-source Wayback. Our uncompressed CDX is around 13.5 TB. By rough interpolation, it should come down to around 2 TB once compressed.

We have successfully deployed a Hadoop instance. How do we compress the 13.5 TB of CDX in HDFS so that the result is usable by Wayback? Does the open-source Wayback expect the compressed CDX to be in ziplines format?

Any hints or recommendations are much appreciated.

- Youssef Eldakar