Re: [Gmod-schema] Re: [GMOD-devel] GMODTools package preview

Don,

I'll try out sgd data tomorrow.  I got 'out of memory' errors from DBI
after several hours (I accidentally closed the window, so I can't show
them to you).

Scott

On Wed, 2005-11-30 at 16:17 -0500, Don Gilbert wrote:
> Scott,
> 
> Thanks much for the quick tryout.  The preliminary configurations
> are critical; I've used ENV{GMOD_ROOT} as a base for that, and see
> your system won't allow you to write there. In the top of each
> primary configuration file (e.g. sample conf/bulkfiles/sgdbulk.xml
> or your rice revision), find
> 
> <opt
>   name="sgdbulk"
>   relid="5"
>   date="20051129"
>   ROOT="${GMOD_ROOT}/"
>   TMP="${GMOD_ROOT}/tmp"
>   datadir="genomes/Saccharomyces_cerevisiae"
> 
> Change ROOT, TMP and datadir to paths that you are able to write
> to.  If you don't have GMOD_ROOT defined in your environment,
> it will use the GMODTools/ folder from the software, and should work
> with the sample sgdlite data set.
> 
> One aspect I've not stressed well in the documents: proper configuration
> for a given data release set is essential to get it working, and this
> is an unusual program in that it need only be run once successfully for
> such a data release set, then the generated bulk files can be used by all.
> So expect to spend some time pondering the meaning of all those configuration
> options which are lacking good documentation in order to get it working for
> a new data set.
> 
> Once a data release set is configured to work, it should work repeatably (once
> things like a writable data root directory have been sorted out).
> I'd recommend testing first with the sgdlite data set, and after getting that
> to work, move on to a new data set.
> 
> I hope to add some pre-make validation checks before long that will help with
> basic steps like "is your data output directory there?", "does your chado
> genome db have chromosomes/golden_paths that can be found?", "does the
> configured sql actually return data?"  Then folks won't waste time running
> it on big datasets and wondering whether they will get usable outputs.
> 
> Take a look at $ROOT/$datadir/$releasedir/tmp/featdump/ (from your config values)
> for a 'chromosomes.tsv', an essential first step.  If that file doesn't exist
> or doesn't look valid for your organism's genome, the rest won't work.
> 
> - Don
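
The checks Don describes (a writable ROOT, and a non-empty chromosomes.tsv
under $ROOT/$datadir/$releasedir/tmp/featdump/) could be scripted roughly as
below.  This is only a sketch and is not part of GMODTools: the function names
are made up for illustration, and the release directory name has to be
supplied by the caller, since it is not among the config values shown above.

```python
import os


def featdump_dir(root, datadir, releasedir):
    """Build $ROOT/$datadir/$releasedir/tmp/featdump from the config values."""
    return os.path.join(root, datadir, releasedir, "tmp", "featdump")


def preflight(root, datadir, releasedir):
    """Return a list of problems; an empty list means the basic checks passed.

    Hypothetical helper: it only looks at the filesystem and does not
    query the chado database.
    """
    problems = []
    if not os.path.isdir(root):
        problems.append("ROOT does not exist: " + root)
    elif not os.access(root, os.W_OK):
        problems.append("ROOT is not writable: " + root)

    chrfile = os.path.join(featdump_dir(root, datadir, releasedir),
                           "chromosomes.tsv")
    if not os.path.isfile(chrfile):
        problems.append("missing " + chrfile)
    elif os.path.getsize(chrfile) == 0:
        problems.append("empty " + chrfile)
    return problems
```

Calling preflight() with the ROOT and datadir values from the config, plus the
release directory name, gives a list of messages to print before or after a
run.  Whether chromosomes.tsv "looks valid" for a given genome still needs a
human to check.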
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         ca...@cs...
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory