From: louis f. <lou...@gm...> - 2007-11-26 21:45:28
The new celsius pipeline run has completed successfully. It was modified to run the web scraper (make load) and normalization (make quantify) once each, with the normalization making one full pass through the entire DB. It processed 58545 DB rows, each corresponding to 1 CEL file. So it would seem that this is the true number of CEL files in our system, not 80-90k.

In the course of processing those rows, the following numbers of normalizations were applied:

  gcrma   1068
  rma      615
  mas5    2111
  ------------
  Total   3794

The pipeline ran on neuron and submitted normalization jobs to the cluster. At most 4 cluster jobs ran concurrently. dChip and PLIER were intentionally excluded since they fail due to insufficient memory; they will have to be done in a similar, separate run on the new cluster.

The total run time for this pipeline run was 3.5 days, so it seems viable to set up a celsius-pipeline daemon to run this pipeline over and over continuously.

-- 
Lou Fridkis
UCLA Geffen School of Medicine
Department of Human Genetics
Nelson Lab
695 Charles E Young Drive S.
Los Angeles, CA 90095 USA
310-825-7920
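The proposed celsius-pipeline daemon could be sketched as a simple shell loop. The `load` and `quantify` make targets are the ones named in the post; the MAKE and SLEEP_SECS overrides are hypothetical knobs added here for testing and tuning, not part of the actual pipeline.

```shell
#!/bin/sh
# Sketch of a celsius-pipeline daemon loop (assumptions: the pipeline is
# driven by `make load` and `make quantify` as described above; MAKE and
# SLEEP_SECS are hypothetical overrides, not real pipeline settings).

run_pipeline_pass() {
    # One full pass: scrape new CEL files, then make one normalization
    # pass over the entire DB.
    ${MAKE:-make} load && ${MAKE:-make} quantify
}

celsius_daemon() {
    # Run passes back to back, continuously, pausing briefly between
    # passes; stop if a pass fails so the failure is noticed.
    while run_pipeline_pass; do
        sleep "${SLEEP_SECS:-60}"
    done
}
```

Since each pass took roughly 3.5 days in this run, the sleep interval is negligible; its only purpose is to avoid a tight restart loop if the make targets ever return immediately.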