From: Scott C. <ca...@cs...> - 2005-12-01 03:40:48
Don,
I'll try out sgd data tomorrow. I got 'out of memory' errors from DBI
after several hours (I accidentally closed the window, so I can't show
them to you).
Scott
On Wed, 2005-11-30 at 16:17 -0500, Don Gilbert wrote:
> Scott,
>
> Thanks much for the quick tryout. The preliminary configurations
> are critical; I've used ENV{GMOD_ROOT} as a base for that, and see
> your system won't allow you to write there. In the top of each
> primary configuration file (e.g. sample conf/bulkfiles/sgdbulk.xml
> or your rice revision), find
>
> <opt
> name="sgdbulk"
> relid="5"
> date="20051129"
> ROOT="${GMOD_ROOT}/"
> TMP="${GMOD_ROOT}/tmp"
> datadir="genomes/Saccharomyces_cerevisiae"
>
> Change ROOT, TMP, and datadir to paths that you can write to.
> If you don't have GMOD_ROOT defined in your environment,
> it will use the GMODTools/ folder from the software, and should work
> with the sample sgdlite data set.
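A minimal sketch of the path check described above, assuming the ${GMOD_ROOT} placeholder is substituted as a plain string and that an unset variable falls back to a GMODTools folder. The function names and the fallback_root default are hypothetical and are not part of GMODTools.

```python
import os
import tempfile


def expand_config_path(value, env=None, fallback_root="GMODTools"):
    """Substitute ${GMOD_ROOT} in a config value.

    fallback_root stands in for the GMODTools/ folder used when
    GMOD_ROOT is not defined (an assumption, not the tool's own logic).
    """
    env = os.environ if env is None else env
    root = env.get("GMOD_ROOT") or fallback_root
    return value.replace("${GMOD_ROOT}", root)


def is_writable_dir(path):
    """Return True if path is an existing directory we can create a file in."""
    if not os.path.isdir(path):
        return False
    try:
        # Creating a throwaway file is a more reliable test than os.access.
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        return False
```

Running is_writable_dir on the expanded ROOT and TMP values before a long run would show early whether the configured directories can be written to.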
>
> One aspect I've not stressed well in the documents: proper configuration
> for a given data release set is essential to get it working, and this
> is an unusual program in that it need only be run once successfully for
> such a data release set, then the generated bulk files can be used by all.
> So expect to spend some time pondering the meaning of all those configuration
> options which are lacking good documentation in order to get it working for
> a new data set.
>
> Once a data release set is configured to work, it should work repeatably (given
> a solution to things like a writable data root directory).
> I'd recommend testing first with the sgdlite data set, and after getting that
> to work, move on to a new data set.
>
> I hope to add some pre-make validation checks before long that will help with
> basic steps like "is your data output directory there?", "does your chado
> genome db have chromosomes/golden_paths that can be found?", "does the
> configured sql actually return data?" Then folks can save time running
> it on big datasets and wondering if they will get usable outputs.
>
> Take a look at $ROOT/$datadir/$releasedir/tmp/featdump/ (from your config values)
> for a 'chromosomes.tsv', an essential first step. If that file doesn't exist
> or doesn't look valid for your organism's genome, the rest won't work.
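A rough sketch of that first check, assuming only that chromosomes.tsv lives under the path given above and that a usable file has at least one non-blank line. The helper names are hypothetical, and the sketch does not inspect the columns, since their layout is not described here.

```python
import os


def chromosomes_file(root, datadir, releasedir):
    """Build $ROOT/$datadir/$releasedir/tmp/featdump/chromosomes.tsv."""
    return os.path.join(root, datadir, releasedir,
                        "tmp", "featdump", "chromosomes.tsv")


def count_data_rows(path):
    """Return the number of non-blank lines, or None if the file is missing."""
    if not os.path.isfile(path):
        return None
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())
```

A result of None or 0 would suggest the chromosome dump step did not produce data; a positive count still needs a look by eye to confirm the rows match the organism's genome.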
>
> - Don
--
------------------------------------------------------------------------
Scott Cain, Ph. D. ca...@cs...
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory