From: Eleanor H. <ele...@ji...> - 2009-06-18 21:34:40
|
Hi John,

Thanks for the fix! I'll propagate it to our bugfix branch and it will go out with v4.4.1 at the end of the month.

Eleanor

Braisted, John C. wrote:
> Hi Eleanor et al.,
>
> I’ve committed this class with a small but important patch. This class uses a Sterling approximation for large factorials when computing the Fisher Exact probability for a 2x2 contingency matrix. NIAID and their group provided this code to support EASE. The code is originally from Bell Labs.
>
> The Nonpar Fisher exact provides a one and a two tailed probability. For the two tailed part, the code iterates over matrices that are equal or less extreme than the observed and sums probabilities that are less than or the ‘same’ as the original matrix. All computations are done as double precision numbers. The problem comes when you encounter/consider the transpose of the originally observed matrix when considering the other tail of the probability. The approximation for p-value should be exactly the same but it varies out at the very end. I’m now just casting it to (float) to make the probability comparison so that this discrepancy (last digit of a double precision value) doesn’t affect the result. There may be other ways to handle this like recognizing the transposed matrix corresponding to the original matrix and adding its probability regardless of the fact that it’s just a bit larger due to the approximation error at the limits of machine probability.
>
> Note that in many cases the transpose of the original matrix behaves and is the same but I have found instances where this fails.
>
> Here’s the code that has been committed:
>
> http://mev-tm4.svn.sourceforge.net/viewvc/mev-tm4/trunk/source/org/tigr/microarray/mev/cluster/algorithm/impl/nonpar/NonparHypergeometricProbability.java?view=log
>
> Note that EASE doesn’t care about/compute the two tailed test since we only care about over representation in the cluster (one sided). Therefore, the FE in EASE doesn’t have the two-tailed test and doesn’t need this patch.
>
> John
>
> John Braisted
> Senior Software Engineer
> Pathogen Functional Genomics Resource Center (PFGRC)
> J. Craig Venter Institute
> 9704 Medical Center Drive
> Rockville, MD 20850 |
From: Braisted, J. C. <bra...@jc...> - 2009-06-18 15:28:56
|
Sterling approximation should have been Stirling's approximation.... but you get the point....

John Braisted
Senior Software Engineer
Pathogen Functional Genomics Resource Center (PFGRC)
J. Craig Venter Institute
9704 Medical Center Drive
Rockville, MD 20850

From: Braisted, John C.
Sent: Thursday, June 18, 2009 11:23 AM
To: 'mev...@li...'; mev; Mev Harvard
Subject: Committed NonparHypergeometricProbability.java to the trunk area of SF SVN

Hi Eleanor et al.,

I've committed this class with a small but important patch. This class uses a Sterling approximation for large factorials when computing the Fisher Exact probability for a 2x2 contingency matrix. NIAID and their group provided this code to support EASE. The code is originally from Bell Labs.

The Nonpar Fisher exact provides a one and a two tailed probability. For the two tailed part, the code iterates over matrices that are equal or less extreme than the observed and sums probabilities that are less than or the 'same' as the original matrix. All computations are done as double precision numbers. The problem comes when you encounter/consider the transpose of the originally observed matrix when considering the other tail of the probability. The approximation for p-value should be exactly the same but it varies out at the very end. I'm now just casting it to (float) to make the probability comparison so that this discrepancy (last digit of a double precision value) doesn't affect the result. There may be other ways to handle this like recognizing the transposed matrix corresponding to the original matrix and adding its probability regardless of the fact that it's just a bit larger due to the approximation error at the limits of machine probability.

Note that in many cases the transpose of the original matrix behaves and is the same but I have found instances where this fails.

Here's the code that has been committed:

http://mev-tm4.svn.sourceforge.net/viewvc/mev-tm4/trunk/source/org/tigr/microarray/mev/cluster/algorithm/impl/nonpar/NonparHypergeometricProbability.java?view=log

Note that EASE doesn't care about/compute the two tailed test since we only care about over representation in the cluster (one sided). Therefore, the FE in EASE doesn't have the two-tailed test and doesn't need this patch.

John

John Braisted
Senior Software Engineer
Pathogen Functional Genomics Resource Center (PFGRC)
J. Craig Venter Institute
9704 Medical Center Drive
Rockville, MD 20850 |
From: Braisted, J. C. <bra...@jc...> - 2009-06-18 15:22:58
|
Hi Eleanor et al.,

I've committed this class with a small but important patch. This class uses a Sterling approximation for large factorials when computing the Fisher Exact probability for a 2x2 contingency matrix. NIAID and their group provided this code to support EASE. The code is originally from Bell Labs.

The Nonpar Fisher exact provides a one and a two tailed probability. For the two tailed part, the code iterates over matrices that are equal or less extreme than the observed and sums probabilities that are less than or the 'same' as the original matrix. All computations are done as double precision numbers. The problem comes when you encounter/consider the transpose of the originally observed matrix when considering the other tail of the probability. The approximation for p-value should be exactly the same but it varies out at the very end. I'm now just casting it to (float) to make the probability comparison so that this discrepancy (last digit of a double precision value) doesn't affect the result. There may be other ways to handle this like recognizing the transposed matrix corresponding to the original matrix and adding its probability regardless of the fact that it's just a bit larger due to the approximation error at the limits of machine probability.

Note that in many cases the transpose of the original matrix behaves and is the same but I have found instances where this fails.

Here's the code that has been committed:

http://mev-tm4.svn.sourceforge.net/viewvc/mev-tm4/trunk/source/org/tigr/microarray/mev/cluster/algorithm/impl/nonpar/NonparHypergeometricProbability.java?view=log

Note that EASE doesn't care about/compute the two tailed test since we only care about over representation in the cluster (one sided). Therefore, the FE in EASE doesn't have the two-tailed test and doesn't need this patch.

John

John Braisted
Senior Software Engineer
Pathogen Functional Genomics Resource Center (PFGRC)
J. Craig Venter Institute
9704 Medical Center Drive
Rockville, MD 20850 |
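The two-tailed comparison described in the message above can be illustrated with a minimal sketch. This is not the committed NonparHypergeometricProbability source: the class and method names are hypothetical, and ln(n!) is computed by direct summation rather than by Stirling's approximation, which is enough to show why the probability comparison is made at float precision.

```java
// Illustrative sketch only; names are hypothetical and this is not the MeV source.
final class FisherExactSketch {

    /** ln(n!) by direct summation (the original code approximates large factorials instead). */
    static double logFactorial(int n) {
        double sum = 0.0;
        for (int i = 2; i <= n; i++) {
            sum += Math.log(i);
        }
        return sum;
    }

    /** Hypergeometric probability of the 2x2 table [[a, b], [c, d]] with fixed margins. */
    static double tableProbability(int a, int b, int c, int d) {
        double logP = logFactorial(a + b) + logFactorial(c + d)
                + logFactorial(a + c) + logFactorial(b + d)
                - logFactorial(a + b + c + d)
                - logFactorial(a) - logFactorial(b)
                - logFactorial(c) - logFactorial(d);
        return Math.exp(logP);
    }

    /** Sums the probabilities of all tables that are no more likely than the observed one. */
    static double twoTailed(int a, int b, int c, int d) {
        double observed = tableProbability(a, b, c, d);
        int row1 = a + b;
        int col1 = a + c;
        int n = a + b + c + d;
        int lo = Math.max(0, row1 + col1 - n);
        int hi = Math.min(row1, col1);
        double sum = 0.0;
        for (int x = lo; x <= hi; x++) {
            double p = tableProbability(x, row1 - x, col1 - x, n - row1 - col1 + x);
            // Compare at float precision so a last-digit difference between two
            // mathematically equal double values does not drop a table from the sum.
            if ((float) p <= (float) observed) {
                sum += p;
            }
        }
        return Math.min(1.0, sum);
    }

    public static void main(String[] args) {
        // The table with a = 9 is the mirror image of the observed table (a = 1),
        // so both have the same exact probability and both belong in the sum.
        System.out.println(twoTailed(1, 9, 11, 3));
    }
}
```

A relative tolerance on the double values (for example `p <= observed * (1 + 1e-7)`) would be another way to make the same comparison; the float cast is used here only because it is what the message describes.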
From: Braisted, J. C. <bra...@jc...> - 2008-05-15 18:37:37
|
A patched version of .../mev/cluster/algorithm/nonpar/Nonpar.java has been committed to the Trunk and the 4.1.01 branch in the SF SVN. The Fisher Exact test in Nonpar wasn't correctly handling evaluation of group assignments in cases where some of the loaded samples were 'Excluded' by the user. The subset of data was pulled from the matrix but downstream, during taking the tally for the contingency matrix, the original grouping information was used rather than the sample subset reflecting excluded samples. You can diff to the previous version to view the change. This problem was found this morning and is confined to the Fisher's Exact Test. John John Braisted Software Engineer II Pathogen Functional Genomics Resource Center (PFGRC) J. Craig Venter Institute 9704 Medical Center Drive Rockville, MD 20850 |
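As a rough illustration of the kind of mismatch described above, the sketch below tallies a 2x2 contingency matrix from a subset of samples and re-indexes the group assignments to match that subset. All names, the group codes, and the cutoff-based binning are assumptions made for the example; they are not taken from Nonpar.java.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Illustrative sketch only; group codes and method names are hypothetical.
final class ContingencyTallySketch {
    static final int EXCLUDED = 0;
    static final int GROUP_A = 1;
    static final int GROUP_B = 2;

    /** Indices of the samples that the user did not exclude. */
    static int[] includedSamples(int[] groups) {
        return IntStream.range(0, groups.length)
                .filter(i -> groups[i] != EXCLUDED)
                .toArray();
    }

    /** Group assignments re-indexed so that entry i describes column i of the subset. */
    static int[] subsetGroups(int[] groups, int[] kept) {
        int[] out = new int[kept.length];
        for (int i = 0; i < kept.length; i++) {
            out[i] = groups[kept[i]];
        }
        return out;
    }

    /** Values of one row of the data matrix restricted to the kept samples. */
    static double[] subsetValues(double[] values, int[] kept) {
        double[] out = new double[kept.length];
        for (int i = 0; i < kept.length; i++) {
            out[i] = values[kept[i]];
        }
        return out;
    }

    /** Rows are groups (A, B); columns are bins (value >= cutoff, value < cutoff). */
    static int[][] tally(double[] values, int[] groups, double cutoff) {
        if (values.length != groups.length) {
            // Passing the original grouping together with subset data would land here.
            throw new IllegalArgumentException("values and groups must describe the same samples");
        }
        int[][] table = new int[2][2];
        for (int i = 0; i < values.length; i++) {
            int row = groups[i] == GROUP_A ? 0 : 1;
            int col = values[i] >= cutoff ? 0 : 1;
            table[row][col]++;
        }
        return table;
    }

    public static void main(String[] args) {
        int[] groups = {GROUP_A, EXCLUDED, GROUP_B, GROUP_A, GROUP_B};
        double[] values = {5, 9, 1, 7, 2};
        int[] kept = includedSamples(groups);
        int[][] table = tally(subsetValues(values, kept), subsetGroups(groups, kept), 3.0);
        System.out.println(Arrays.deepToString(table));
    }
}
```

The length check in `tally` is a design choice for the sketch: it turns a silent mis-tally into an immediate error when the grouping array and the data subset disagree.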
From: Braisted, J. C. <bra...@jc...> - 2008-04-08 15:04:59
|
Hi Rupert,

Thanks for the report and sorry for the problem. Mac OSX renders Java file choosers a little differently than Windows.

A Java file chooser can be explicitly set to 'Open' mode, where the user selects a particular *existing* file to open. The other mode is 'Save', where one navigates the file system and then types a file name into a text field. On Windows, Java happens to provide a text field for the output file name in both modes. Java for Linux, and at least current versions of Java for Mac, display the two modes differently: when the file chooser is not specified as a 'Save' type, it is assumed to be an 'Open' type and lacks a text field for the output file name.

The analysis save dialog in MeV is modified in that it has the text at the top. The dialog type (mode) of its file chooser portion is not set to 'Save' explicitly, so it renders in 'Open' mode and, on Linux and Mac, lacks the text field for specifying the file.

I have the Harvard/DFCI MeV team in the Cc field and this email can serve to notify them of this issue. We'll be in touch when a patch has been made.

In the meantime, you can create an empty file such as 'analysis.anl' in the destination folder of your choice and select it from the existing file chooser. The contents (which should be empty at first) will be overwritten when you save the analysis. This is sort of a work-around but it should work for now.

Eleanor, and the DFCI team,

I think we need the following line at about line number 510 in the MultipleArrayViewer class to remedy this issue.

chooser.setDialogType(JFileChooser.SAVE_DIALOG);

Let me know if you would like me to commit this change or if you would like to drop this into the code. My change would be to the version in the 4.1.01 branch in SourceForge. If you have critical changes to this class you can either add this line or commit your current version of this class to SourceForge so that I can add this patch. Please let me know if you'll add the modification or if you need to update SVN's branch.

Thanks,
John

John Braisted
Software Engineer II
Pathogen Functional Genomics Resource Center (PFGRC)
J. Craig Venter Institute
9704 Medical Center Drive
Rockville, MD 20850

________________________________
From: Rupert Overall [mailto:rup...@cr...]
Sent: Tuesday, April 08, 2008 10:31 AM
To: mev
Subject: Error in Save Dialog

Having just installed the new MeV v4.1.01, I have already noticed a lot of improvements in the GUI, especially the use of native file browser dialogs. The Save Dialog for the 'Save Analysis As...' feature, however, has no provision for filename entry!

I am using v4.1.01 on MacOSX 10.5.2 running Java version 1.5.0_13

Thanks,
Rupert Overall |
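The one-line fix proposed in the reply above can be tried in isolation with a small standalone sketch. The class and helper names are hypothetical and the code is not taken from MultipleArrayViewer; only `setDialogType` and `JFileChooser.SAVE_DIALOG` come from the message.

```java
import java.io.File;
import javax.swing.JFileChooser;

// Illustrative sketch only; SaveChooserSketch and newSaveChooser are hypothetical names.
final class SaveChooserSketch {

    /** Builds a chooser that is explicitly marked as a 'Save' dialog. */
    static JFileChooser newSaveChooser(File startDirectory) {
        JFileChooser chooser = new JFileChooser(startDirectory);
        // Without this call the chooser keeps its default OPEN_DIALOG type, which
        // some look-and-feels render without a file name field.
        chooser.setDialogType(JFileChooser.SAVE_DIALOG);
        return chooser;
    }

    public static void main(String[] args) {
        JFileChooser chooser = newSaveChooser(new File(System.getProperty("user.home")));
        chooser.setSelectedFile(new File("analysis.anl"));
        // showDialog uses the dialog type that was set above.
        if (chooser.showDialog(null, "Save") == JFileChooser.APPROVE_OPTION) {
            System.out.println("Save to: " + chooser.getSelectedFile());
        }
    }
}
```

Setting the type explicitly matters most when the chooser is embedded in a custom dialog, as the message describes for MeV; `showSaveDialog` would set the type itself, but an embedded chooser is never shown through that method.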
From: Braisted, J. C. <bra...@jc...> - 2008-03-27 15:16:02
|
I've committed a new Info.plist to the branch revision Mac support folder. This one only references 91 jar files + the few jars that MeV actually builds. If you have a 4.1.1 candidate build, we will need to test this again in case the jars have been altered. The trunk Mac support area was reorganized. I've added the 'Contents' folder and the updated Info.plist. John John Braisted Software Engineer II Pathogen Functional Genomics Resource Center (PFGRC) J. Craig Venter Institute 9704 Medical Center Drive Rockville, MD 20850 |
From: Braisted, J. C. <bra...@jc...> - 2008-03-21 21:04:18
|
I've committed a new tmev.sh to the 4.1.1 branch revision and the trunk. (had to commit twice to remove one unnecessary command that Mac didn't like much) The new tmev.sh will launch on Mac and Linux and eliminates the lib/* classpath specification so it can run on the JRE 1.5 for Mac and Linux users that don't have a JRE 1.6. This sh loops through the lib to build the CLASSPATH variable. John |
From: Braisted, J. C. <bra...@jc...> - 2008-03-03 21:33:41
|
Hi Everyone, Dan found a bug in the SOTAExperimentViewer that seems to go back at least to 4.0. When making a modification in the Display menu it would throw an exception. The patch has been committed to SVN. John John Braisted Software Engineer II Pathogen Functional Genomics Resource Center (PFGRC) J. Craig Venter Institute 9704 Medical Center Drive Rockville, MD 20850 |
From: Braisted, J. C. <bra...@jc...> - 2008-02-21 22:05:33
|
Just to keep people in the loop....

* I committed the LinearExpressionMapViewer.java, in ..gui/impl/lem. A modification was needed to handle mapping to the full IData index to account for an Experiment object that was the result of a data filter. URL linking from locus information windows would have improperly formed urls (wrong key value) if a filter was imposed.

* annotation_URLs.txt was committed to update the URL entry for CMR gene pages.

John

John Braisted
Software Engineer II
Pathogen Functional Genomics Resource Center (PFGRC)
J. Craig Venter Institute
9704 Medical Center Drive
Rockville, MD 20850 |
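The index issue described in the first item above can be pictured with a small sketch: a filtered view keeps only some rows, so a row number in the view has to be mapped back to the full data index before it is used as a lookup key. Everything here is hypothetical (class, fields, the example.org URL); it does not reproduce the MeV IData or Experiment API.

```java
// Illustrative sketch only; all names and the URL are hypothetical.
final class FilteredIndexSketch {
    private final String[] locusIds;       // indexed by position in the full data set
    private final int[] rowToDataIndex;    // row i of the filtered view -> full data index

    FilteredIndexSketch(String[] locusIds, int[] rowToDataIndex) {
        this.locusIds = locusIds.clone();
        this.rowToDataIndex = rowToDataIndex.clone();
    }

    /** Looks up the locus through the mapping rather than using the view row directly. */
    String locusForRow(int row) {
        return locusIds[rowToDataIndex[row]];
    }

    /** Builds a link by appending the locus key to a URL prefix. */
    String urlForRow(String urlPrefix, int row) {
        return urlPrefix + locusForRow(row);
    }

    public static void main(String[] args) {
        String[] loci = {"L0", "L1", "L2", "L3"};
        // A filter kept only the second and fourth entries of the full data set.
        FilteredIndexSketch view = new FilteredIndexSketch(loci, new int[] {1, 3});
        System.out.println(view.urlForRow("http://example.org/gene?locus=", 0));
    }
}
```

Using the view row directly (`locusIds[row]`) would return the wrong key whenever a filter is active, which is the kind of malformed link the message mentions.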
From: Braisted, J. C. <bra...@jc...> - 2008-02-21 13:37:39
|
I've committed NonparGUI.java to SVN to enable local metric selection in HCL dialogs. The original implementation of this class called the basic HCLDialog constructor. This version presents the distance metric dropdown list. John |
From: Braisted, J. C. <bra...@jc...> - 2007-11-02 15:19:16
|
Hi Raktim,

Thanks for pointing out the help dialog misbehavior in NonpaR. I was setting the wrong parent for the help dialog. I've made the following commits to SourceForge which just replace the parent for the help dialogs with the appropriate next level parent.

5 classes in ...cluster/gui/impl/nonpar
- NonparFisherPanel
- NonparGUI
- NonparInitWizard
- NonparModePanel
- NonparWilcoxonPanel

1 class in ...cluster/gui/impl/dialogs
- GroupSelectionColorPanel

If you could drop these into the build for the next release NonpaR will present the help dialogs properly.

Thanks again for catching that.

Best,
John

John Braisted
Software Engineer II
Pathogen Functional Genomics Resource Center (PFGRC)
J. Craig Venter Institute
9704 Medical Center Drive
Rockville, MD 20850 |
From: Matthias B. <be...@de...> - 2007-10-16 17:18:03
|
Hello everyone,

thank you for these positive reactions. Regarding Gaggle as a communications API for TMeV, I think this could be the way to go. For our immediate needs we'll probably stay with the hack that we have now.

I'll send a list of our changes to Eleanor by separate email then. There are some changes that are specific to our needs (e.g. removing some menu items, changing the look of buttons). Many others are good for the code (or so I'd like to believe), mostly they are refactorings and small improvements.

I haven't found a nice tool that would let me dump a list of all our patches in HTML format so they are easy to see. However, checking in our patches on a branch in sourceforge should allow everyone to see and compare them. And then one could merge this into the main branch if / when it makes sense. We would definitely separate the generally useful stuff from our product-specific changes before checking in. Another similar route might be that we send patches to the mailing list or the issue tracker.

I'd like to argue for earlier rather than later merging of changes, but I don't want to delay your release schedule. We're ready to give any help we can in the merging.

Cheers

Matthias

PS: I suggest that we keep the discussion on the mev-tm4-devel list from now on.

--
Dr. Matthias Berth
DECODON GmbH
W.-Rathenau-Str. 49a
17489 Greifswald, Germany
email: be...@de...
phone: +49 (3834) 515234
fax: +49 (3834) 515239
web: www.decodon.com |
From: John Q. <joh...@gm...> - 2007-10-13 16:03:47
|
Matthias,

I want to echo Eleanor's enthusiasm. We would like to include the modifications you and others have made into the code base. As soon as the current release is completed, we will turn our attention to bringing in your changes and will look at some of the others you suggest.

If you have ideas about developing an API beyond the gaggle approach, we would appreciate hearing them since the best API is one that people outside our development group can use easily.

JQ |
From: Eleanor H. <ele...@ji...> - 2007-10-13 14:50:31
|
Hi Matthias,

Thanks for your interest in the MeV project. Right now there isn't a formalized way to integrate the community's contributions to the source base. I'd love to take a closer look at the work you've done and see how we could add it into our source. We are always happy when people decide they like MeV enough to try adding something to it. Is there a place where I can find more details about the patch you have made?

We're in the home stretch of putting together a new release right now, so we're about to make a whole bunch of commits to the repository. Because of that I'd like to wait a little while before adding a bunch of new code. But I'd be more than happy to look into doing so after MeV v4.1 comes out.

Thank you for pointing out these other projects that support new data sources for MeV! I'll have a look at those and see if I can identify common elements that we could integrate into our base code. As far as our own interoperability efforts, we are currently working on implementing the Gaggle API (http://gaggle.systemsbiology.net/docs/) in MeV. That implementation may be included in the MeV 4.1 release.

Putting a link to the Sourceforge site onto tm4.org is an excellent idea, and we should have thought of it earlier. I'll make sure that gets done.

Again, thanks so much for your interest and your work on MeV.

Eleanor Howe
Dana-Farber Cancer Institute |
From: Matthias B. <mat...@go...> - 2007-10-13 11:38:38
|
Hello,

we have made several changes to TMeV in order to adapt it to the needs of Delta2D [1], our software for 2D gel image analysis. We have, for example:

- used expression profiles from our spot data as input
- made it possible to propagate the selection of expression profiles back to our program
- patched the analysis procedure such that newly generated analysis branches are opened immediately
- removed some code duplications

We'd like to contribute our changes to the main code base. What would be the process for this?

I have found other projects that maintain patches to use other data sources:

- The Generation Challenge Program (GCP) of the International Rice Research Institute [2] [3]
- BASE - BioArray Software Environment [4] [5]
- Sigenae (uses BASE) [6]

So using external data sources would be an interesting avenue for further development, e.g. by defining an API.

We'd be glad to contribute to this and to the TMeV package in general. Please let me know how we can get involved.

best regards

Matthias Berth
CTO, DECODON GmbH
www.decodon.com

PS: It was somewhat hard for me to find the sourceforge project page, http://sourceforge.net/projects/mev-tm4/ how about putting a link to it on http://www.tm4.org/mev.html ?

[1] Delta2D - http://www.decodon.com/Solutions/Delta2D.html
[2] GCP software - https://cropforge.org/projects/pantheon/
[3] TMeV patches from the GCP - https://cropforge.org/plugins/scmsvn/viewcvs.php/Osiris/projects/TMEV/?rev=9239&root=pantheon&sortby=rev#dirlist
[4] BASE - http://base.thep.lu.se/
[5] BASE patches to TMeV - http://trac.thep.lu.se/trac/basehacks/wiki/MeV
[6] Sigenae TMeV - http://www.sigenae.org/index.php?id=88 |
From: Sinha, R. D. <Rak...@df...> - 2007-09-14 13:17:10
|
Thanks JohnB,

The "lib" folder on SF for jars has just been created and is not updated with all the dependent libraries yet. We will do that soon.

Raktim

________________________________
From: Braisted, John C. [mailto:bra...@jc...]
Sent: Thursday, September 13, 2007 5:30 PM
To: mev...@li...; me...@ji...; mev
Subject: SourceForge Commits for next MeV

Hi Folks,

I've committed code for the next MeV release.

Attached you will find a list of CVS commits of new classes and CVS commits that correspond to updates to existing classes. I'm providing this list in case you want to update certain sections of your local code base rather than a full checkout. The first part of the list includes commits from today while the lower portion of the list includes some commits from the last year that should also go into the new build.

Note that I checked out the full code set from SF and found that I couldn't build it due to a 'bn' package that seems to possibly have some dependency on Weka. Perhaps we need a jar. Not sure but here's some output from ant.

**It might just be compilation errors in the current commit of this class.

[javac] C:\Temp\MeV_5_CO\source\org\tigr\microarray\mev\cluster\gui\impl\bn\FromWekaToSif.java:82: ';' expected
[javac] Hashtable<String, String> AccGeneMap = new Hashtable<String, String>();
[javac] ^
[javac] 3 errors

Rather than trying to update the old build.xml at SF, since it doesn't reflect new packages, I've attached the targets for NonpaR in a text document. These can be inserted into your build.xml. Please commit a working build.xml when you feel it is ready.

<<JCB_SourceForge_Commits_for_MeV.doc>> <<nonpar_build_targets.txt>>

Thanks for putting the jars on SF. Let me know if there are questions. The manual entry is ready to go for Nonpar. Let me know if you want me to send it or if I should integrate it with a working copy of the document.

Thanks,
John

John Braisted
Software Engineer II
Pathogen Functional Genomics Resource Center (PFGRC)
J. Craig Venter Institute
9704 Medical Center Drive
Rockville, MD 20850

The information transmitted in this electronic communication is intended only for the person or entity to whom it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of or taking of any action in reliance upon this information by persons or entities other than the intended recipient is prohibited. If you received this information in error, please contact the Compliance HelpLine at 800-856-1983 and properly dispose of this information. |
From: Braisted, J. C. <bra...@jc...> - 2007-09-13 21:33:10
|
[Attachment nonpar_build_targets.txt, decoded from base64]

    <target name="NONPAR" depends="NONPAR-GUI" if="NONPAR">
        <javac sourcepath="" srcdir="${alg.impl.dir}" destdir="${dest.dir}">
            <include name="Nonpar.java"/>
            <classpath>
                <pathelement location="${lib.dir}/JSciCore.jar"/>

                <!-- jars to support module compilation -->
                <pathelement location="${lib.dir}/mev-util.jar"/>
                <pathelement location="${lib.dir}/mev-gui-impl.jar"/>
                <pathelement location="${lib.dir}/mev-gui-support.jar"/>
                <pathelement location="${lib.dir}/mev-algorithm-impl.jar"/>
                <pathelement location="${lib.dir}/mev-algorithm-support.jar"/>
                <pathelement location="${lib.dir}/mev-base.jar"/>
            </classpath>
        </javac>
        <propertyfile file="${alg.properties.file}">
            <entry key="NONPAR" value="org.tigr.microarray.mev.cluster.algorithm.impl.nonpar.Nonpar"/>
        </propertyfile>
    </target>

    <target name="NONPAR-GUI">
        <javac srcdir="${gui.impl.dir}/nonpar" destdir="${dest.dir}">
            <classpath refid="module.build.class.path"/>
        </javac>
        <propertyfile file="${gui.properties.file}">
            <entry key="gui.names" value="NONPAR:" operation="+"/>
            <entry key="NONPAR.name" value="NONPAR"/>
            <entry key="NONPAR.class" value="org.tigr.microarray.mev.cluster.gui.impl.nonpar.NonparGUI"/>
            <entry key="NONPAR.category" value="${STATISTICS}"/>
            <entry key="NONPAR.smallIcon" value="analysis16.gif"/>
            <entry key="NONPAR.largeIcon" value="nonpar_button.gif"/>
            <entry key="NONPAR.tooltip" value="Nonparametric Tests"/>
        </propertyfile>
    </target>
|
From: John Q. <jo...@ji...> - 2007-09-13 16:36:06
|
Raktim and Sarita,

Would you work on updates to the relevant sections as well?

JQ |
From: Braisted, J. C. <bra...@jc...> - 2007-09-13 16:29:30
|
Would it be possible to commit an updated Word version of the MeV manual to the user_docs area of the SourceForge repository? That would allow me to integrate my nonparametric module manual section into the manual. Either that or put it on your ftp site so that I can grab it. I can send my section in a Word doc but I thought it might be easier on you guys if I were to integrate the text.

====================
Regarding commits to CVS...

I have all changes ready to go into the SourceForge CVS area. Before I commit I need to make a list of the new package and the classes that support it. I should have the code in SourceForge tomorrow and have a build.xml that has targets for the new module.

I'll also include a list of the little add-ons that were previously committed since the last release.

John

John Braisted
Software Engineer II
Pathogen Functional Genomics Resource Center (PFGRC)
J. Craig Venter Institute
9704 Medical Center Drive
Rockville, MD 20850 |
From: Braisted, J. C. <bra...@jc...> - 2007-09-12 13:31:22
|
FYI:

Quite frequently we need to tell people to increase the maximum heap size using the -Xmx<somenumber>m argument in the bat file for MeV. People often ask (and I've wondered) exactly why the limit varies between machines that seem to have the same OS (same maximum addressable limits) and RAM, and why a machine with 2GB of RAM won't let us get closer to the 2GB addressable memory limit.

This little section of the HotSpot JRE FAQ at least comes out and says that the limit falls between 1.4 and 1.6GB, which is just what we've found empirically on 32-bit Windows machines. The page lists a few contributing factors.

http://java.sun.com/docs/hotspot/HotSpotFAQ.html#gc_heap_32bit

John |
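To see what a given -Xmx setting actually yields on a particular machine, a small check like the following can be run with the same JVM and flags that the MeV bat file uses. This is a minimal sketch; the class name HeapLimitCheck is an assumption for illustration and is not part of MeV.

```java
// Hypothetical helper, not part of MeV: reports the heap ceiling the JVM accepted.
class HeapLimitCheck {

    // Converts a byte count to whole megabytes (1 MB = 1024 * 1024 bytes).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): the most the JVM will attempt to use for the heap;
        // usually close to, though not always exactly, the -Xmx value.
        System.out.println("Max heap (MB):            " + toMegabytes(rt.maxMemory()));
        // totalMemory(): heap currently reserved; freeMemory(): unused part of that reserve.
        System.out.println("Currently reserved (MB):  " + toMegabytes(rt.totalMemory()));
        System.out.println("Free within reserve (MB): " + toMegabytes(rt.freeMemory()));
    }
}
```

Running it as, for example, java -Xmx1400m HeapLimitCheck and then raising the number step by step shows where a given 32-bit machine stops: past its limit the JVM typically refuses to start instead of printing anything.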
From: John Q. <jo...@ji...> - 2007-08-29 13:35:18
|
This sounds great John. Thanks for the update. Maybe we should have a conference call in a few weeks to touch base a bit and see when we want to have a next release.

JQ |
From: Sinha, R. D. <Rak...@df...> - 2007-08-28 15:11:48
|
Hello All,

Along with the Bayesian Network analysis tool that JohnQ already mentioned, the pipeline also has the following:

1. New CGH module called ChARM with new viewers.
2. A new Data Annotation model for MeV.
3. The File Loaders are being re-worked to accommodate the new Model and make them more user-friendly.

Raktim |
From: Braisted, J. C. <bra...@jc...> - 2007-08-28 14:35:01
|
Hi John and others,

I've developed a nonparametric statistical module that includes:

- Wilcoxon Rank Sum
- Kruskal-Wallis Test
- Mack-Skillings (a generalization of the Friedman 2-way test that handles replication in cells). The MS test is probably the most important novelty since it's not available in R (I think) and it handles a common task.
- Fisher Exact (for special cases where data values are more like A/P calls as in CGH)

These tests fall under one module. Originally the plan was to use R but that really wasn't needed and it seemed like it would require extra effort to deploy with dependencies on R and RServe. I think it was your advice at our last meeting to just implement the tests without R, and while R worked fine with MeV in my tests, the deployment complication outweighed the cost of developing without R.

In addition to the module, I've created a wizard dialog system that can control the process of parameter collection for stat methods. This eliminates the need for tabbed panes and keeps the size of the dialogs constrained. This new stat wizard system is only used in my NonpaR module but it can be used by other methods in the future.

I have a manual entry finished. The last step is to build some help pages. The NonpaR module results are verified against R and also against examples from the Hollander and Wolfe book on nonparametric methods. The module is currently being used by a couple of groups internal to JCVI that are part of the PFGRC.

The manual entry pdf that I've attached will need a bit of formatting but it gives an overview of the module and also the look of the dialog wizard system (at least as rendered by Java 1.4.2).

John

________________________________
From: John Quackenbush [mailto:jo...@ji...]
Sent: Tuesday, August 28, 2007 9:31 AM
To: Braisted, John C.
Subject: Re: loading GEO format files

John,

It would be good to know about which tests you are working on since we are doing some parallel work. Do you have a list you could send?

JQ

Braisted, John C. wrote:

Hi Steve,

I just saw JQ's response as I was drafting mine... MeV is still being actively developed. The Dana Farber Cancer Institute (DFCI), affiliated with Harvard, has a very large and capable MeV/TM4 development team with diverse skills. The vast majority of new work is coming from the DFCI group headed by John Quackenbush. There are developers at the University of Washington in Seattle who have added many important features and modules over several years. The group at UW is headed by Roger Bumgarner and while I haven't been in touch with them for several months I'm pretty sure they are going strong.

The JCVI team (formerly we were TIGR) has development goals that are a bit more aligned with supporting specific research needs and less focused on general software development (like features and under-the-hood details). One module close to release is a collection of nonparametric statistical tests.

John

-----Original Message-----
From: Steve Taylor [mailto:st...@mo...]
Sent: Tuesday, August 28, 2007 4:18 AM
To: Braisted, John C.
Cc: mev; me...@ji...
Subject: Re: loading GEO format files

Hi John,

Thanks for looking at this. I look forward to your reply. Is MeV still actively being developed BTW?

Steve

I took a look at the file and came across the same error. The code doesn't seem to handle the format of your file because it looks for certain features in the file that can indicate where a data matrix starts and stops, but your file seems to deviate from the sample GEO file enough to break the loading process.

The GEO file loader was developed at Harvard so I'm not an expert on the file format or the loader. I think the loader can be made more robust by testing with files like yours but the update of the loader isn't in my current domain of work. I've cc'ed the group at Harvard in case they can try to tackle the problem but I don't know their priorities and so I can't say whether it will be addressed. It's not very complex but it's a bit of work to make the changes. The group up there at Harvard may also have advice about either modification of the file or whether another format can be used from GEO.

============

I downloaded the matrix file which is supposed to be a matrix of values where rows are spots or features and columns are separate hybridization results. This normally can be easily modified to load as a TDMS file in MeV. It looks like there may have been a formatting issue with the matrix file submission to GEO where array results were sort of concatenated or stacked. There are about 400,000 rows and I realize it's a SNP study but I'm not sure if it's a huge tiling array or if the row count is correct.

Need to use WordPad (not Excel) to modify the file into TDMS format: You would need to remove the header section of the file except for the last header row with column ids that is just above the 'matrix' of values. It's pretty sparse. Then go to the last row and remove the last line labeled '!....'.

The file is huge and will load into MeV but I suspect it's sort of a stacking of array results rather than a properly formed data matrix. The row ids are not unique, meaning that either there is a lot of replication or that the results are stacked. The number of columns in the matrix is consistent but most rows only contain data for one hyb, though that's not a strict rule as some rows have several values.

You can try to load the matrix file but unless the strange format (sparse matrix with 400K rows) seems to make sense given the experiment it might be better to either see if a GEO loader can be re-worked or if you can contact GEO or the authors for a well formatted matrix of values.

John Braisted

John Braisted
Software Engineer II
Pathogen Functional Genomics Resource Center (PFGRC)
J. Craig Venter Institute
9704 Medical Center Drive
Rockville, MD 20850

-----Original Message-----
From: Steve Taylor [mailto:st...@mo...]
Sent: Friday, August 24, 2007 5:05 AM
To: mev
Subject: loading GEO format files

Hi,

I was trying to load a GEO SOFT format file in TMEV4 on Windows XP SP2 (for example the one in ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SOFT/by_series/GSE5291/). I uncompressed it and renamed the extension at the end to .txt but when I tried to load it didn't show any columns had loaded (see attached screenshot).

Has the SOFT format changed or does this feature just not work?

Thanks for any help,

Steve

------------------------------------------------------------------
Head of Computational Biology Research Group
Medical Sciences Division
Weatherall Institute of Molecular Medicine/Sir William Dunn School
Oxford University
Tel: +44 (0)1865 (2)22640 (WIMM - Monday to Wednesday)
Tel: +44 (0)1865 (2)85732 (Dunn - Thursday to Friday)
Web: http://www.compbio.ox.ac.uk |
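The manual WordPad edit described in the message above (drop the header lines, keep the column-id row just above the values, drop the trailing '!' line) can be sketched as a small filter. This assumes that every non-data line in the matrix file starts with '!', as the message suggests; the class name, method name and sample rows are illustrative and are not part of MeV.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of MeV.
class GeoMatrixCleaner {

    // Keeps the column-id row and the data rows; skips '!' metadata lines and blank lines.
    static List<String> keepMatrixRows(List<String> lines) {
        List<String> rows = new ArrayList<String>();
        for (String line : lines) {
            if (line.trim().length() == 0 || line.startsWith("!")) {
                continue;
            }
            rows.add(line);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample in the shape of a sparse series matrix file.
        List<String> sample = Arrays.asList(
                "!Series_title\t\"example\"",
                "!series_matrix_table_begin",
                "\"ID_REF\"\t\"GSM1\"\t\"GSM2\"",
                "probe_1\t0.5\t",
                "probe_2\t\t1.2",
                "!series_matrix_table_end");
        for (String row : keepMatrixRows(sample)) {
            System.out.println(row);
        }
    }
}
```

For a file with around 400,000 rows it would be more practical to stream lines from a reader to a writer than to collect them in a list, and the filter does nothing about the non-unique row ids noted above.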
From: Braisted, J. C. <bra...@jc...> - 2007-08-20 21:05:39
|
The annotation file parser has a method to remove quotes on annotation fields. The MeV file loader uses this when the user selects the option to strip out quotes. The method removed an extra character before the trailing quote. It's not a commonly used feature but a user just reported the problem.

A patched AnnFileParser.java has been committed to the MeV CVS on SourceForge.

John

John Braisted
Software Engineer II
Pathogen Functional Genomics Resource Center (PFGRC)
J. Craig Venter Institute
9704 Medical Center Drive
Rockville, MD 20850 |
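The off-by-one described above is easy to reintroduce, so here is a minimal sketch of quote stripping that removes only the enclosing quote characters. The class and the method name stripQuotes are assumptions for illustration; this is not the actual AnnFileParser code.

```java
// Hypothetical illustration, not the AnnFileParser source.
class QuoteStripper {

    // Removes one pair of enclosing double quotes if present; otherwise returns the field unchanged.
    static String stripQuotes(String field) {
        if (field == null || field.length() < 2) {
            return field;
        }
        int last = field.length() - 1;
        if (field.charAt(0) == '"' && field.charAt(last) == '"') {
            // substring's end index is exclusive, so 'last' drops only the trailing quote.
            // Passing last - 1 instead would also drop the character before it.
            return field.substring(1, last);
        }
        return field;
    }

    public static void main(String[] args) {
        System.out.println(stripQuotes("\"TC12345\"")); // TC12345
        System.out.println(stripQuotes("no quotes"));   // no quotes
    }
}
```

Fields with only a leading or only a trailing quote are returned untouched here; whether the real loader should treat them differently is a separate design choice.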
From: John Q. <jo...@ji...> - 2007-05-21 21:20:56
|
Thanks John!

Braisted, John C. wrote:
> I've committed:
>
> /org/tigr/microarray/mev/cluster/gui/impl/hcl/HCLNodeHeightGraph.java
> http://mev-tm4.cvs.sourceforge.net/mev-tm4/org/tigr/microarray/mev/cluster/gui/impl/hcl/
>
> This version has one additional menu option to output the graph data.
>
> John
>
> John Braisted
> Software Engineer II
> Pathogen Functional Genomics Resource Center (PFGRC)
> J. Craig Venter Institute
> 9704 Medical Center Drive
> Rockville, MD 20850
|