[Genex-users] Re: Fwd: Re: Genex2

Hi Yong,

I'm moving this discussion to the genex-users list, since it
will be relevant to anyone getting ready to use the upcoming genex
release.

<yo...@ui...> writes:

> Thank you very much for providing such useful information.
> 
> I am glad to hear that the installation is not that hard any
> more. That's good news for us.

Yes, Harry has really streamlined the configuration process by adding
a simple apache-like configuration file. The person who is installing
genex edits the three or four necessary pieces of the file, and then
runs:

  $ make configure
  $ make install

and the process runs to completion. The 'make install' step has a few
interactive steps, simply to explain what it is installing, where, and
why.

> As for data loading, that's probably the part our people here are
> most concerned about. It would be nice if a detailed step-by-step
> guide could be provided for various file formats, such as Affymetrix
> (different versions) and GenePix (from GenePix Pro 3.x to 4.x to
> 5.x). If you don't have such data loading examples, maybe I can
> write some by using our data to go through the steps.

Genex has a single configurable data loader that is intended to be
used for all tab-delimited file formats. The real issue for users is
to decide which columns of the data they want to load into the DB. The
default is all columns. Given some example data files for the needed
software version, we can generate the templates for the data loader,
or teach you how to make the templates. There is already a fledgling
HOWTO for that, but it needs work.
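
To make the column question concrete, here is a rough sketch in Python.
It is not the genex loader or its template format; the function name
select_columns and the sample column names are only assumptions for
illustration. It shows the decision a template has to capture: keep
every column (the default) or only a named subset.

```python
import csv
import io


def select_columns(tsv_text, wanted=None):
    """Parse tab-delimited text and keep only the `wanted` columns.

    wanted=None keeps every column, mirroring the "all columns" default.
    This is a hypothetical helper, not part of genex.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    keep = header if wanted is None else list(wanted)
    missing = [c for c in keep if c not in header]
    if missing:
        raise KeyError(f"columns not found in file: {missing}")
    return [{c: row[c] for c in keep} for row in reader]


# Illustrative data only; real export files carry many more columns.
sample = (
    "Name\tF635 Median\tF532 Median\tFlags\n"
    "spot1\t1200\t800\t0\n"
    "spot2\t300\t450\t-50\n"
)

all_rows = select_columns(sample)                       # default: all columns
subset = select_columns(sample, ["Name", "F635 Median"])  # chosen subset
```

Asking for a column that is not in the header raises an error up front,
which is the kind of check that is useful when the same template is
reused across different software versions.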

> As for the analysis part, it is actually better for GeneX not to
> include analysis tools since there are so many tools out there. The
> idea of using BioConductor is great!

Our focus is *not* to develop new tools, but instead to make it drop
dead simple to integrate existing tools into genex. The genex workflow
model is that *all* data stays in the DB:

1) MeasuredBioAssayData is loaded into genex (these are the
   tab-delimited data files from GenePix or Affy). These tables are
   for archive only - no modification.
2) Data for processing is copied to the Scratch table.
3) Researcher applies processing Protocol - an automated set of
   processing scripts that filter and normalize the data. The outcome
   of the processing is stored back into the Scratch table.
4) Researcher chooses statistical tests or analyses to run on the data
   - the output is stored back into the Scratch table.
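
The four steps above can be sketched with a toy in-memory database.
This is only an illustration under loud assumptions: the table and
column names (measured, scratch, spot, value, step) are made up and are
not the genex schema, and dividing by the mean stands in for whatever
processing Protocol a researcher would really apply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE measured (spot TEXT, value REAL);            -- archive only
    CREATE TABLE scratch  (spot TEXT, value REAL, step TEXT); -- working area
    """
)

# 1) Load the measured data (hypothetical values).
conn.executemany(
    "INSERT INTO measured VALUES (?, ?)",
    [("s1", 100.0), ("s2", 200.0), ("s3", 300.0)],
)

# 2) Copy the data for processing into the scratch table.
conn.execute("INSERT INTO scratch SELECT spot, value, 'raw' FROM measured")

# 3) Apply a processing step and store the outcome back into scratch.
rows = conn.execute(
    "SELECT spot, value FROM scratch WHERE step = 'raw' ORDER BY spot"
).fetchall()
mean = sum(v for _, v in rows) / len(rows)
conn.executemany(
    "INSERT INTO scratch VALUES (?, ?, 'normalized')",
    [(spot, v / mean) for spot, v in rows],
)

# 4) Run a simple analysis on the processed data; output goes to scratch too.
conn.execute(
    "INSERT INTO scratch "
    "SELECT spot, value > 1.0, 'above_mean' FROM scratch WHERE step = 'normalized'"
)
conn.commit()

normalized = dict(
    conn.execute("SELECT spot, value FROM scratch WHERE step = 'normalized'")
)
```

Note that nothing ever writes to the measured table after step 1; every
later result lands in scratch, tagged with the step that produced it.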

So our goal is to enable tools like BioConductor to read data from
Genex and write their output back to genex. This way researchers can
just drop in the new data processing and analysis tools that they want
to use.

This is different from the model that many people use. In that model,
data is not kept in a DB but is instead maintained as flat files on
the computer, and all the analyses are run directly from them. Genex
aims to document and record all steps in the experiment so that
researchers can most easily export their results as MAGE-ML to
repositories like ArrayExpress.

> GeneX sounds like a very good fit for our needs and I would like to
> get the stable release as soon as possible.

Ok. We will continue working, and let you know as soon as the Beta
release is available.

Cheers,
jas.