From: Nathan O. S. <Nat...@bm...> - 2003-08-16 16:30:41
Scott:

The tar.gz file is uploaded to the sourceforge site:

-rw-r--r-- 1 ftp nogroup 14685681 Aug 16 16:28 Refseq_Genome_TBLASTX.tar.gz

A brief README describes the methods and contents.

Nathan

Scott Cain wrote:

> Nathan,
>
> I think that is quite reasonable in size--one thing I would suggest
> would be to tar all of them up in one directory with a README to explain
> what everything is. You can go ahead and upload whenever you are ready;
> just let me know what the final file name is.
>
> As for blast2gff.pl--I understand; there is a fair amount of work in
> getting code ready for 'public consumption,' so I can see why you might
> want to do that if nobody else wants it. I guess we'll see.
>
> Thanks again--I know I am interested to see what this data looks like.
>
> Scott
>
>
> On Tue, 2003-08-12 at 10:09, Nathan Siemers wrote:
>
>>Hello Scott:
>>
>>Here are the sizes of the files once gzipped. Is this an appropriate
>>size range to upload?
>>The SGD and worm genome came from the GMOD sourceforge site, the Fly
>>assembly is more recent
>>I'll give details in a readme file.
>>
>>Give me the go and I'll upload.
>>
>>I'm also happy to deliver the perl source 'blast2gff.pl' to anyone who
>>is interested for testing, but I'd rather not
>>put it on the site for general consumption until people feel it is of
>>general value and I can document it
>>in a reasonable way. (that being said, it *does* work)
>>
>>If there are better/other genome assemblies available for yeast or worm
>>that folks wish to use as the basis for these annotations,
>>point me to the source and I can begin the annotation process. It takes
>>about a week per genome on a 32 CPU SGI Origin
>>class machine... We have in the past also annotated ensembl
>>transcripts, this is of course more expensive.
>>
>>nathan
>>
>>-rw-rw-r-- 1 nathan bioinfo 2236225 May 20 22:19 refseq_fly_nuc.tblastx.flynew.fasta.shredded.40000.20000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  678823 May 13 13:44 refseq_fly_nuc.tblastx.worm.fasta.shredded.20000.10000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  619831 May 22 16:02 refseq_fly_nuc.tblastx.yeast.fasta.shredded.10000.5000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo 1231670 May 17 21:34 refseq_human_nuc.tblastx.flynew.fasta.shredded.40000.20000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  978217 May 11 18:45 refseq_human_nuc.tblastx.worm.fasta.shredded.20000.10000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  896240 May 22 00:53 refseq_human_nuc.tblastx.yeast.fasta.shredded.10000.5000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  865829 May 18 23:41 refseq_mouse_nuc.tblastx.flynew.fasta.shredded.40000.20000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  695655 May 12 12:12 refseq_mouse_nuc.tblastx.worm.fasta.shredded.20000.10000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  622679 May 22 07:42 refseq_mouse_nuc.tblastx.yeast.fasta.shredded.10000.5000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  345329 May 19 06:13 refseq_rat_nuc.tblastx.flynew.fasta.shredded.40000.20000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  289964 May 12 16:53 refseq_rat_nuc.tblastx.worm.fasta.shredded.20000.10000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  235056 May 22 08:33 refseq_rat_nuc.tblastx.yeast.fasta.shredded.10000.5000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  607232 May 21 17:04 refseq_worm_nuc.tblastx.flynew.fasta.shredded.40000.20000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo 3450339 May 14 13:46 refseq_worm_nuc.tblastx.worm.fasta.shredded.20000.10000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  487261 May 22 18:18 refseq_worm_nuc.tblastx.yeast.fasta.shredded.10000.5000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo   60939 May 16 06:52 refseq_zebrafish_nuc.tblastx.flynew.fasta.shredded.40000.20000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo   50235 May  9 17:44 refseq_zebrafish_nuc.tblastx.worm.fasta.shredded.20000.10000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo   35922 May 21 21:01 refseq_zebrafish_nuc.tblastx.yeast.fasta.shredded.10000.5000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  161185 May 19 11:32 sgd_yeast_orfs.tblastx.flynew.fasta.shredded.40000.20000.gff.gz
>>-rw-rw-r-- 1 nathan bioinfo  138983 May 12 20:55 sgd_yeast_orfs.tblastx.worm.fasta.shredded.20000.10000.gff.gz
>>
>>
>>Scott Cain wrote:
>>
>>>Hello Nathan and Donald,
>>>
>>>Thanks very much for your offer. I am sure there will be interest in
>>>your data as well as your code. The obvious thing to do with the data
>>>is to release it under the "Sample Data Files." The easiest way to do
>>>that is for me to get the files and move them to the prerelease area and
>>>jump through a few hoops. How big is the (compressed) file? Or, what
>>>might be better, you can upload the file via anonymous ftp to
>>>upload.sourceforge.net/incoming and then let me know that you did it and
>>>what the file name is, and I will add it to the download area. Be sure
>>>to use binary transfer for the upload.
>>>
>>>As for the code, would you like to release it under some open source
>>>license and get it under CVS control with GMOD? More eyes and all that.
>>>
>>>Thanks again,
>>>Scott
>>>
>>>
>>>On Sun, 2003-08-10 at 12:25, Nathan O. Siemers wrote:
>>>
>>>> Hello All,
>>>>
>>>> We've spent a bit of time cross-annotating the model organism databases
>>>>against Refseq and Yeast transcripts. This has been accomplished by
>>>>running tblastx searches of the transcripts against the genomes of the
>>>>model organisms (broken into chunks), followed by an attempt to perform
>>>>intelligent harmonization of the HSPs into a minimal, consistent set.
>>>>Only the single best hit of any query transcript against a genome is
>>>>saved, and we find this useful for some orthology explorations.
>>>>
>>>> We have received blessing from BMS to release all these annotations to
>>>>the GMOD group. If there is interest we will need a site to upload them
>>>>to - we don't have one easily available to us.
>>>>
>>>> I've enclosed a snapshot of what these annotations look like from the
>>>>perspective of C. elegans. I would be happy to share the blast2gff code
>>>>that does the HSP analysis, but it is clearly early work in progress...
>>>>
>>>> Let me know if there is any interest.
>>>>
>>>> Nathan
>>>>
>>
>>_______________________________________________
>>Gmod-devel mailing list
>>Gmo...@li...
>>https://lists.sourceforge.net/lists/listinfo/gmod-devel

--
Nathan Siemers|Associate Director|Applied Genomics|Bristol-Myers Squibb Pharmaceutical Research Institute|HW3-0.07|P.O. Box 5400|Princeton, NJ 08543-5400|(609)818-6568|nat...@bm...
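
Editorial sketch, not Nathan's blast2gff.pl: the original message in this thread describes breaking each genome into chunks, running tblastx, and keeping only the single best hit per query transcript. The Python below illustrates just those two bookkeeping steps, under assumptions the thread does not confirm: that a suffix such as `shredded.40000.20000` in the file names means a 40000-base window advanced in 20000-base steps, and that "best" means highest score. The names `shred`, `Hit`, `to_genome_coords`, and `best_hit_per_query` are hypothetical, and the HSP harmonization step is not reproduced.

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Tuple


def shred(seq: str, window: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (0-based offset, chunk) pairs covering seq with overlapping windows.

    Assumption: window/step correspond to the two numbers in the
    'shredded.<window>.<step>' file name suffix.
    """
    if not 0 < step <= window:
        raise ValueError("need 0 < step <= window so that no bases are skipped")
    start = 0
    while start < len(seq):
        yield start, seq[start:start + window]
        if start + window >= len(seq):
            break  # the last chunk already reaches the end of the sequence
        start += step


class Hit(NamedTuple):
    """One (already harmonized) hit of a query transcript on a genome chunk."""
    query: str         # transcript identifier
    chunk_offset: int  # 0-based offset of the chunk within the genome sequence
    start: int         # 1-based start of the hit within the chunk
    end: int           # 1-based end of the hit within the chunk
    score: float       # assumption: higher is better


def to_genome_coords(hit: Hit) -> Tuple[int, int]:
    """Map chunk-relative 1-based coordinates back to 1-based genome coordinates."""
    return hit.chunk_offset + hit.start, hit.chunk_offset + hit.end


def best_hit_per_query(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep only the highest-scoring hit for each query transcript."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit.score > current.score:
            best[hit.query] = hit
    return best
```

Because adjacent chunks overlap, the same region can be hit twice from neighbouring chunks; reducing to one hit per query after mapping back to genome coordinates is one simple way to avoid reporting both.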