From: louis f. <lou...@gm...> - 2007-11-26 21:45:28
The new celsius pipeline run has completed successfully. It was modified to run the web scraper (make load) and normalization (make quantify) once each, with the normalization making one full pass through the entire DB. It processed 58545 DB rows, each corresponding to 1 CEL file. So it would seem that this is the true number of CEL files in our system, not 80-90k.

In the course of processing those rows, the following numbers of normalizations were applied:

  gcrma   1068
  rma      615
  mas5    2111
  ------------
  Total   3794

The pipeline ran on neuron and submitted normalization jobs to the cluster. At most 4 cluster jobs ran concurrently. dChip and PLIER were intentionally excluded since they fail due to insufficient memory; they will have to be done in a similar, separate run on the new cluster.

The total run time for this pipeline run was 3.5 days, so it seems viable to set up a celsius-pipeline daemon to run this pipeline over and over continuously.

-- 
Lou Fridkis
UCLA Geffen School of Medicine
Department of Human Genetics
Nelson Lab
695 Charles E Young Drive S.
Los Angeles, CA 90095 USA
310-825-7920
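The proposed celsius-pipeline daemon could be sketched as a simple shell loop. The `load` and `quantify` make targets are the ones named in the post; the MAKE and SLEEP_SECS overrides are hypothetical knobs added here for testing and tuning, not part of the actual pipeline.

```shell
#!/bin/sh
# Sketch of a celsius-pipeline daemon loop (assumptions: the pipeline is
# driven by `make load` and `make quantify` as described above; MAKE and
# SLEEP_SECS are hypothetical overrides, not real pipeline settings).

run_pipeline_pass() {
    # One full pass: scrape new CEL files, then make one normalization
    # pass over the entire DB.
    ${MAKE:-make} load && ${MAKE:-make} quantify
}

celsius_daemon() {
    # Run passes back to back, continuously, pausing briefly between
    # passes; stop if a pass fails so the failure is noticed.
    while run_pipeline_pass; do
        sleep "${SLEEP_SECS:-60}"
    done
}
```

Since each pass took roughly 3.5 days in this run, the sleep interval is negligible; its only purpose is to avoid a tight restart loop if the make targets ever return immediately.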