From: <ja...@op...> - 2004-04-15 07:03:02
Hi Yong,

I'm following up on this discussion on the genex-users list, since it
will be relevant to anyone getting ready to use the upcoming genex
release.

<yo...@ui...> writes:

> Thank you very much for providing such useful information.
>
> I am glad to hear that the installation is not that hard any
> more. That's a good news for us.

Yes, Harry has really streamlined the configuration process by adding
a simple Apache-like configuration file. The person installing genex
edits the three or four necessary pieces of the file, and then runs:

  $ make configure
  $ make install

and the process runs to completion. The 'make install' step is
interactive at a few points, simply to explain what it is installing,
where, and why.

> As for data loading, that's probably the most concerned part of our
> people here. It would be nice if a detailed step by step guide can
> be provided for various file formats, such as affymetrix (different
> versions) and GenePix (from GenePix Pro 3.x to 4.x to 5.x). If you
> don't have such data loading examples, maybe I can write some by
> using our data to go through the steps.

Genex has a single configurable data loader that is intended to be
used for all tab-delimited file formats. The real issue for users is
deciding which columns of the data they want to load into the DB. The
default is all columns.

Given some example data files for the software versions you need, we
can generate the templates for the data loader, or teach you how to
make the templates. There is already a fledgling HOWTO for that, but
it needs work.

> As for the analysis part, it is actually better for GeneX not to
> include analysis tools since there are so many tools out there. The
> idea of using BioConductor is great!

Our focus is *not* to develop new tools, but instead to make it drop
dead simple to integrate existing tools into genex.
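To make the column-selection decision for the data loader a little
more concrete, here is a minimal Python sketch. It is *not* the genex
loader and does not use genex templates; the function name
select_columns and the sample column names and values are made up
purely for illustration. It only shows the idea of keeping a chosen
subset of columns from a tab-delimited file, with "all columns" as the
default.

```python
import csv
import io


def select_columns(tsv_text, wanted=None):
    """Return (header, rows) keeping only the wanted columns.

    Hypothetical helper for illustration only. wanted=None keeps every
    column, mirroring the "default is all columns" behaviour.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    if wanted is None:
        wanted = header
    missing = [name for name in wanted if name not in header]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    # Map each requested column name to its position in the file.
    idx = [header.index(name) for name in wanted]
    rows = [[row[i] for i in idx] for row in reader if row]
    return list(wanted), rows


# Invented sample data, loosely shaped like a scanner output file.
sample = (
    "Name\tF635 Median\tF532 Median\tFlags\n"
    "geneA\t1200\t980\t0\n"
    "geneB\t450\t510\t-50\n"
)

header, rows = select_columns(sample, ["Name", "F635 Median"])
```

Calling select_columns(sample) with no column list returns all four
columns unchanged.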
The genex workflow model is that *all* data stays in the DB:

1) MeasuredBioAssayData is loaded into genex (these are the
   tab-delimited data files from GenePix or Affy). These tables are
   for archive only - no modification.

2) Data for processing is copied to the Scratch table.

3) Researcher applies a processing Protocol - an automated set of
   processing scripts that filter and normalize the data. The outcome
   of the processing is stored back into the Scratch table.

4) Researcher chooses statistical tests or analyses to run on the
   data - the output is stored back into the Scratch table.

So our goal is to enable tools like BioConductor to read data from
genex and write their output back to genex. This way researchers can
just drop in the new data processing and analysis tools that they want
to use.

This is different than the model that many people use today, in which
data is not kept in a DB but is maintained as flat files on the
computer, and all the analyses are run directly from those flat files.

Genex aims to document and record all steps in the experiment so that
researchers can most easily export their results as MAGE-ML to
repositories like ArrayExpress.

> The GeneX sounds very fit for our needs and I would like to
> get the stable release as soon as possible.

Ok. We will continue working, and let you know as soon as the Beta
release is available.

Cheers,
jas.