From: Steve F. <sfi...@pc...> - 2005-02-15 17:28:54
|
sucheta-

i *think* the answer to your question may be that features have parent_id. this is used to build feature trees.

steve

Sucheta Tripathy wrote:

> Thank you all for the response. My original question was a bit premature
> and based more on our convenience than on the design of GUS per se.
>
> But these are some issues I think the group may need to consider:
>
> Keeping different prediction outputs as different feature_ids and
> finally merging them through gene and gene_instance seems fine. But we
> need to distinguish between features within a gene and features of a
> gene. What I mean by that is that a gene may have different features, such as
> a UTR feature, a promoter element feature, a CpG island feature
> and so on, which need to be linked to a particular prediction output.
>
> For example, I have gene 1 with gene_id 1, which has several
> gene_instances represented by different prediction algorithms, each
> having a different na_feature_id, say: 1,2,3,4.
>
> For the prediction algorithm with na_feature_id=1, I have a set of UTR
> locations, and say I have feature_id 5 for UTR_5 and 6 for UTR_3 and a
> promoter feature with na_feature_id 7. Then how would I link
> na_feature_ids 1,5,6 and 7?
>
> So, for this I was thinking that one na_feature_id could represent a
> gene, and its locations could each have a description with reviewer's
> status etc. The gene instances would then be saved for any kind of splice
> variants or any other types of instances.
>
> Many thanks
>
> Sucheta
>
> At 09:40 AM 2/14/2005 -0500, you wrote:
>
>> This question is a bit more complex than it seems.
>>
>> All three of these may be necessary for every level of
>> analysis. First, was the overall gene prediction/feature
>> prediction reviewed, and how was it algorithmically arrived at?
>> Then you can ask the same question about the locations.
>> Early locations may be provided by the same algorithm as the
>> feature, but these may be further defined later and require
>> their own review annotations.
>>
>> One thing that I think needs to be addressed, however, is how
>> these columns appear throughout the schema like mushrooms. I
>> have been told that GUS was hyper-normalized when it was first
>> written to 4N or 5N form, but that is definitely not the case
>> now. Why do you want review_status, reviewer and algorithm in
>> the feature table? Since these usually will appear in
>> clusters (i.e. running an algorithm once gives you 500 cases
>> of the same entry for all three on the same date), shouldn't
>> we have a table to collect these into an annotation_status
>> table and just have an FK to an entry in that table for every
>> table that uses these? The same goes for other SRes entries
>> such as taxon, project, etc.
>>
>> -Ed
>>
>> ---- Original message ----
>> >Date: Sat, 12 Feb 2005 22:34:03 -0500 (EST)
>> >From: "Sucheta Tripathy" <su...@vb...>
>> >Subject: [Gusdev-gusdev] dots.nalocation table
>> >To: Gus...@li...
>> >
>> >Hi Group,
>> >
>> >From a community annotation point of view, I was wondering if it is a good
>> >idea to have is_reviewed, algorithm_id and reviewer_id in the dots.nalocation
>> >table.
>> >
>> >Since one na_feature_id (a transcript or a gene) may have multiple
>> >sets of nalocations, one can easily capture them in nalocation with
>> >different algorithms and with a reviewed option.
>> >
>> >In our application we need several gene calling programs to have locations
>> >as well as related information registered.
>> >
>> >Sucheta
>> >
>> >--
>> >Sucheta Tripathy
>> >Virginia Bioinformatics Institute Phase-I
>> >Washington street.
>> >Virginia Tech.
>> >Blacksburg,VA 24061-0447
>> >phone:(540)231-8138
>> >Fax: (540) 231-2606
>> >
>> >-------------------------------------------------------
>> >SF email is sponsored by - The IT Product Guide
>> >Read honest & candid reviews on hundreds of IT Products from real users.
>> >Discover which products truly live up to the hype. Start reading now.
>> >http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
>> >_______________________________________________
>> >Gusdev-gusdev mailing list
>> >Gus...@li...
>> >https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
>>
>> -----------------
>> Ed Robinson
>> Center for Tropical and Emerging Global Diseases
>> University of Georgia, Athens, GA 30602
>> ero...@ug.../(706)542.1447/254.8883
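The parent_id suggestion in the reply above can be illustrated with a small sketch. It is not GUS code: it assumes feature rows can be read as (na_feature_id, parent_id, name) tuples, reuses the ids 1, 5, 6 and 7 from the quoted example, and the names and helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows of (na_feature_id, parent_id, name). The ids follow the
# example in the quoted message; the names are illustrative only.
ROWS = [
    (1, None, "gene prediction"),
    (5, 1, "UTR_5"),
    (6, 1, "UTR_3"),
    (7, 1, "promoter"),
]


def build_children_index(rows):
    """Map each parent_id to the list of its child feature ids."""
    children = defaultdict(list)
    for feature_id, parent_id, _name in rows:
        if parent_id is not None:
            children[parent_id].append(feature_id)
    return children


def subtree(root_id, children):
    """Return root_id followed by all of its descendants, depth first."""
    ordered = [root_id]
    for child_id in children.get(root_id, []):
        ordered.extend(subtree(child_id, children))
    return ordered


children = build_children_index(ROWS)
linked = subtree(1, children)
print(linked)
```

Under these assumptions, features 5, 6 and 7 are linked to prediction 1 simply by carrying its id as their parent, so no extra linking table is needed for this case.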
From: <fed...@bi...> - 2005-02-15 17:24:41
|
Hi guys!

I'm registering the plugin class names with ga. I had some problems because ga can't find some Perl files: AASequenceGOFunction.pm, ArrayControlLoader.pm, AssayAnalysis.pm.

Also, in the MakeGoPredictions plugin I had an error:

Can't locate object method "new" via package "GUS::GOPredict::Plugin::MakeGoPredictions".

Do you know why? I think the build process completed correctly...

Thanks
Federica
From: Steve F. <sfi...@pc...> - 2005-02-15 17:24:25
|
Folks-

we have been discussing ReviewStatus. here, as far as i can tell, are the use cases, in order of commonness:

(a) loading data into a table that expects a review status (eg, features), but where none of the data has been reviewed
(b) updating data in a table that uses review status, and the logic states that rows that have been reviewed should not be updated
(c) loading data that is tagged with a review status, but using a vocabulary determined by the data provider, not by the database
(d) curating data locally, which would use the review status vocabulary stored in the db.

my goals are:

(1) to remove the requirement that gus installations use ReviewStatus
(2) to allow sites that want to use it to be responsible for developing and loading their own ReviewStatus CV.

Here is my proposal:

- make ReviewStatus nullable in all tables that use it
- null is synonymous with "not reviewed"
- plugins that load data for use case (a) above can just leave ReviewStatus null
- plugins that need to know what review status means "don't update" offer a --reviewedCode argument that solicits from the user the name of the review status that means "don't update"
- plugins that need to load ReviewStatus values from a data provider's input offer a --reviewStatusMap argument that takes a file specifying the mapping from input values to database values
- curation applications read the review status CV stored in the database and offer it in a pulldown to the curator

The immediate action for GUS 3.5 would be:

- make ReviewStatus nullable
- upgrade supported plugins to conform to this proposal

comments?

steve
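The proposal above could be sketched roughly as follows. This is an illustration only, not plugin code: the argument names --reviewedCode and --reviewStatusMap come from the message, but the two-column tab-delimited file format, the function names and the example values are all assumptions.

```python
import csv
import io


def load_review_status_map(handle):
    """Read 'input value<TAB>database value' pairs into a dict.

    The two-column, tab-delimited layout is an assumed format for the
    file passed to --reviewStatusMap.
    """
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, got {len(row)}: {row!r}")
        mapping[row[0].strip()] = row[1].strip()
    return mapping


def translate_status(input_value, mapping):
    """Return the database status name, or None (null = not reviewed)."""
    if input_value is None or input_value == "":
        return None
    if input_value not in mapping:
        raise KeyError(f"no mapping for review status {input_value!r}")
    return mapping[input_value]


def may_update(row_status, reviewed_code):
    """A row may be updated unless its status is the "don't update" code."""
    return row_status is None or row_status != reviewed_code


# Hypothetical map file content, held in memory for the example.
example_file = io.StringIO(
    "# provider value\tdatabase value\n"
    "curated\treviewed\n"
    "auto\tunreviewed\n"
)
status_map = load_review_status_map(example_file)
print(status_map)
```

Here a missing input status stays null, which matches the "null is synonymous with not reviewed" rule, while an unmapped non-empty value is treated as an error rather than silently dropped. That second choice is a design assumption, not part of the proposal.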
From: Ed R. <ero...@ug...> - 2005-02-15 16:06:16
|
The long and short of it is at the end of the message. A table that stores the values in the short output is really all I need. (Despite my dislike for storing string-coded arrays in row attributes, normalizing the long output to two tables is a bit excessive.) I assume this is all in the table Mike mentioned:

AASequenceID, Length, NumberAAsPredictedTM, NumberPredictedTMFirst60, TotalNumberTMRegions, Topology.

-ed

Long Output:

    # COX2_BACSU Length: 278
    # COX2_BACSU Number of predicted TMHs: 3
    # COX2_BACSU Exp number of AAs in TMHs: 68.6888999999999
    # COX2_BACSU Exp number, first 60 AAs: 39.8875
    # COX2_BACSU Total prob of N-in: 0.99950
    # COX2_BACSU POSSIBLE N-term signal sequence
    COX2_BACSU TMHMM2.0 inside 1 6
    COX2_BACSU TMHMM2.0 TMhelix 7 29
    COX2_BACSU TMHMM2.0 outside 30 43
    COX2_BACSU TMHMM2.0 TMhelix 44 66
    COX2_BACSU TMHMM2.0 inside 67 86
    COX2_BACSU TMHMM2.0 TMhelix 87 109
    COX2_BACSU TMHMM2.0 outside 110 278

Short Output:

    COX2_BACSU len=278 ExpAA=68.69 First60=39.89 PredHel=3 Topology=i7-29o44-66i87-109o

---- Original message ----
>Date: Tue, 15 Feb 2005 10:37:16 -0500
>From: Steve Fischer <sfi...@pc...>
>Subject: Re: [Gusdev-gusdev] New Table View? TransMembrane AASeqFeature
>To: Ed Robinson <ero...@ug...>
>Cc: gus...@li...
>
>ed-
>
>can you provide a more detailed proposal?
>
>steve
>
>Ed Robinson wrote:
>
>>I need to store more information for TransMembrane domains
>>than PredictedAAFeature presently allows. Can we create a new
>>view on AAFeatureImp or can we expand the number of attributes
>>in PredictedAAFeature?
>>
>>-ed
>>
>>-----------------
>>Ed Robinson
>>Center for Tropical and Emerging Global Diseases
>>University of Georgia, Athens, GA 30602
>>ero...@ug.../(706)542.1447/254.8883

-----------------
Ed Robinson
Center for Tropical and Emerging Global Diseases
University of Georgia, Athens, GA 30602
ero...@ug.../(706)542.1447/254.8883
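As a rough illustration of what a loader for the short output would have to do, the sketch below parses the sample line from the message into typed values. The dictionary keys and the function name are my own, not the column names of any GUS table, and the helix coordinates are recovered from the Topology string with a simple regular expression.

```python
import re


def parse_tmhmm_short(line):
    """Parse one line of TMHMM short output into a dict of typed values.

    Expects 'ID key=value key=value ...' as in the sample above.
    """
    fields = line.split()
    raw = {"id": fields[0]}
    for field in fields[1:]:
        key, _, value = field.partition("=")
        raw[key] = value

    topology = raw["Topology"]
    # Each 'start-end' pair in the topology string is one predicted helix.
    helices = [(int(s), int(e)) for s, e in re.findall(r"(\d+)-(\d+)", topology)]

    return {
        "id": raw["id"],
        "length": int(raw["len"]),
        "exp_aa": float(raw["ExpAA"]),
        "first60": float(raw["First60"]),
        "pred_hel": int(raw["PredHel"]),
        "topology": topology,
        "helices": helices,
    }


SHORT = "COX2_BACSU len=278 ExpAA=68.69 First60=39.89 PredHel=3 Topology=i7-29o44-66i87-109o"
parsed = parse_tmhmm_short(SHORT)
print(parsed)
```

Because the helix spans can be recovered from the Topology string this way, storing only the short-output values in one row, as suggested above, loses none of the per-helix coordinates shown in the long output.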
From: Alberto D. <da...@io...> - 2005-02-15 15:46:08
|
Thanks Bindu,

We will have a look on it... also, just found a Bioperl module for GlimmerM:

http://doc.bioperl.org/bioperl-live/Bio/Tools/Glimmer.html

Cheers, Alberto

On Mon, 2005-02-14 at 14:45 -0500, Bindu Gajria wrote:
> hi Alberto -
> PlasmoDB project uses a plugin to load the GlimmerM results; it is
> GUS::Common::Plugin::ImportPlasmoDBPrediction plugin in the Sanger cvs
> repository. however, please note that this plugin is not generalized,
> and has been used here only for the PlasmoDB project so far.
> It would be useful to generalize this plugin some day, so that all can
> benefit.
>
> Bindu
>
> On Feb 11, 2005, at 12:44 PM, Alberto Davila wrote:
>
> > Hey Steve, Thomas,
> >
> > Thanks a lot for the tips, really helpful.. now, few more questions:
> >
> >> ok. NR = NRDB
> >>
> >> the way we have used gus with similarities is that both the query and
> >> subject are loaded into gus. As thomas explained, the similarity table
> >> captures similarity between sequences that are in gus.
> >>
> >> our approach has always been to just load (warehouse) the entire subject
> >> database (NR, EST) that we are blasting against.
> >>
> >> the current plugins and blastSimilarity are set up for this.
> >>
> >> obviously, this takes a lot of disk space. two major efficiencies that
> >> we don't currently have plugins for would be:
> >> 1. to only store in gus a *reference* to the external sequence (ie,
> >> don't store the actgs).
> >> 2. only store in gus the sequences that actually have similarities
> >
> > Option 2 sound better for us, since we will be blasting against several
> > databases (> 10GB databases)
> >
> > What about the plugins to load Interpro and "gene finder" (glimmer, etc)
> > results ? Is there any at all ?
> >
> > Cheers, Alberto
> >
> >> steve
> >>
> >> Alberto Davila wrote:
> >>
> >>> All the blastable databases I mentioned are standard databases from NCBI
> >>> (ftp://ftp.ncbi.nlm.nih.gov/blast/db/blastdb.txt):
> >>>
> >>> NT = nucleotides
> >>>
> >>> ~30000 entries from genbank (genbank format) are loaded into GUS now.
> >>>
> >>> Not sure about your "NRDB", I know NR from NCBI that is a collection of
> >>> aminoacid entries, could it be the same ?
> >>>
> >>> Alberto
> >>>
> >>> On Fri, 2005-02-11 at 10:43 -0500, Steve Fischer wrote:
> >>>
> >>>> (what is NT?)
> >>>>
> >>>> which of these (genbank, your fasta, NRDB, NT, EST) have you loaded into
> >>>> gus?
> >>>>
> >>>> steve
> >>>>
> >>>> Alberto Davila wrote:
> >>>>
> >>>>> Query:
> >>>>>
> >>>>> Either sequences from genbank (genbank format) or sequences generated in
> >>>>> the lab (fasta format)
> >>>>>
> >>>>> Blastable databases (all are formatted databases from NCBI):
> >>>>>
> >>>>> NR
> >>>>> NT
> >>>>> EST
> >>>>>
> >>>>> Alberto
> >>>>>
> >>>>> On Fri, 2005-02-11 at 10:34 -0500, Steve Fischer wrote:
> >>>>>
> >>>>>> for the blast, what are the query sequences and what are the blastable
> >>>>>> databases?
> >>>>>>
> >>>>>> steve
> >>>>>>
> >>>>>> Alberto Davila wrote:
> >>>>>>
> >>>>>>> Basically we will use sequences (loaded into GUS with the GBParser) for
> >>>>>>> NCBI Blast (Blastx, Blastp and TBlastX), the same sequences will be also
> >>>>>>> used for Interpro analyses. Results of both (Blast and Interpro) will be
> >>>>>>> loaded into GUS. We will parse specific things from the Blast results, I
> >>>>>>> would say:
> >>>>>>>
> >>>>>>> `Gi`
> >>>>>>> `Accession`
> >>>>>>> `Description`
> >>>>>>> `E_value`
> >>>>>>> `Score`
> >>>>>>> `Length`
> >>>>>>> `Frame_Query`
> >>>>>>> `Frame_Hit`
> >>>>>>> `Identical`
> >>>>>>> `Hsp_Frac_Identical`
> >>>>>>> `Conserved`
> >>>>>>> `Hsp_Frac_Conserved`
> >>>>>>> `Query_Start`
> >>>>>>> `Query_End`
> >>>>>>> `Hit_Start`
> >>>>>>> `Hit_End`
> >>>>>>> `Hsp_Align`
> >>>>>>> `database_letters`
> >>>>>>> `database_entries`
> >>>>>>>
> >>>>>>> We already have a Bioperl parser for that (specific for another system:
> >>>>>>> GARSA) that could be adapted to GUS, problem being we are not sure what
> >>>>>>> tables should be used to store those data in GUS.
> >>>>>>>
> >>>>>>> Cheers, Alberto
> >>>>>>>
> >>>>>>> On Fri, 2005-02-11 at 10:06 -0500, Steve Fischer wrote:
> >>>>>>>
> >>>>>>>> what are you planning on blasting?
> >>>>>>>>
> >>>>>>>> steve
> >>>>>>>>
> >>>>>>>> Alberto Davila wrote:
> >>>>>>>>
> >>>>>>>>> Hi Steve,
> >>>>>>>>>
> >>>>>>>>> On Fri, 2005-02-11 at 08:56 -0500, Steve Fischer wrote:
> >>>>>>>>>
> >>>>>>>>>> poliana-
> >>>>>>>>>>
> >>>>>>>>>> oops, the usage statement for LoadBlastSimFast is out of date. it
> >>>>>>>>>> should instruct you to use the blastSimilarity command.
> >>>>>>>>>>
> >>>>>>>>>> LoadBlastSimFast makes a big assumption, that the subject and query
> >>>>>>>>>> sequences are in GUS, and their def. lines have GUS primary keys.
> >>>>>>>>>>
> >>>>>>>>>> Are your sequences already loaded into GUS?
> >>>>>>>>>>
> >>>>>>>>> They are not, there would be any howto/tips for that plugin ? We will
> >>>>>>>>> certainly need a plugin to load "Interpro" and "ORF finding" results
> >>>>>>>>> into GUS... If they are not available, then maybe we will have to write
> >>>>>>>>> them ...
> >>>>>>>>>
> >>>>>>>>> Cheers, Alberto
> >>>>>>>>>
> >>>>>>>>>> steve
> >>>>>>>>>>
> >>>>>>>>>> Poliana Mateus wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hello all,
> >>>>>>>>>>>
> >>>>>>>>>>> Where can find the script parseBlastFilesForSimilarity.pl??
> >>>>>>>>>>> I'm trying to run LoadBlastSimFast...
> >>>>>>>>>>>
> >>>>>>>>>>> Poliana
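"Option 2" in the quoted thread (store only the subject sequences that actually have similarities) could look roughly like this if the BLAST results are parsed before anything is loaded. This is a sketch under that assumption: the hit tuples, sequence ids and function names are hypothetical, and no GUS plugin or table is involved.

```python
def subjects_with_hits(hits):
    """Collect the ids of all subject sequences that appear in any hit.

    `hits` is an iterable of (query_id, subject_id) pairs, e.g. produced
    by a BLAST report parser.
    """
    return {subject_id for _query_id, subject_id in hits}


def select_sequences(fasta_records, keep_ids):
    """Keep only the (id, sequence) records whose id is in keep_ids."""
    return [(seq_id, seq) for seq_id, seq in fasta_records if seq_id in keep_ids]


# Toy data standing in for parsed BLAST hits and a subject database.
hits = [("q1", "s2"), ("q1", "s3"), ("q2", "s2")]
subject_db = [("s1", "ACGT"), ("s2", "GGCC"), ("s3", "TTAA"), ("s4", "CCGG")]

to_load = select_sequences(subject_db, subjects_with_hits(hits))
print(to_load)
```

The trade-off is that the subject ids in the report must be resolved to sequences outside the database, which is exactly the step the current plugins avoid by warehousing the whole subject database first.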
From: Michael S. <msa...@pc...> - 2005-02-15 15:42:43
|
I'm not sure if it will fully address Ed's needs, but GUS 3.5 is planning a new TransmembraneAAFeature table/view. The final definition has not yet been decided. See:

http://www.gusdb.org/wiki/index.php/Gus3.5RoadMap

--Mike

Steve Fischer wrote:

> ed-
>
> can you provide a more detailed proposal?
>
> steve
>
> Ed Robinson wrote:
>
>> I need to store more information for TransMembrane domains
>> than PredictedAAFeature presently allows. Can we create a new
>> view on AAFeatureImp or can we expand the number of attributes
>> in PredictedAAFeature?
>>
>> -ed
>>
>> -----------------
>> Ed Robinson
>> Center for Tropical and Emerging Global Diseases
>> University of Georgia, Athens, GA 30602
>> ero...@ug.../(706)542.1447/254.8883
From: Steve F. <sfi...@pc...> - 2005-02-15 15:37:23
|
ed-

can you provide a more detailed proposal?

steve

Ed Robinson wrote:

>I need to store more information for TransMembrane domains
>than PredictedAAFeature presently allows. Can we create a new
>view on AAFeatureImp or can we expand the number of attributes
>in PredictedAAFeature?
>
>-ed
>
>-----------------
>Ed Robinson
>Center for Tropical and Emerging Global Diseases
>University of Georgia, Athens, GA 30602
>ero...@ug.../(706)542.1447/254.8883
From: Ed R. <ero...@ug...> - 2005-02-15 15:17:34
|
I need to store more information for TransMembrane domains than PredictedAAFeature presently allows. Can we create a new view on AAFeatureImp or can we expand the number of attributes in PredictedAAFeature?

-ed

-----------------
Ed Robinson
Center for Tropical and Emerging Global Diseases
University of Georgia, Athens, GA 30602
ero...@ug.../(706)542.1447/254.8883
From: Ed R. <ero...@ug...> - 2005-02-15 15:15:07
|
I have two issues with some plugins I am writing.

The first is that when I load data sets produced by various predictive algorithms, I need to include a review status. This is a non-null field in most of the tables. The easy solution, and the one many older plugins use, is simply to hardcode it. Of course, hardcoding means everyone else who uses the plugin will have to use the same values in their database. It also creates the possibility of different plugins having different values.

What I would like to do, and this is also a suggestion of Steve's, is to assume that all GUS instances have two named values in SRes.ReviewStatus: "reviewed" and "unreviewed". All of my plugins will begin by loading the values for these into a cache and using them. If they are not present in the table, the plugins will die with an error message. This will be the default operation of all of my plugins.

I would like to propose that this become a standard for all plugins. In fact, it probably should be part of the plugin superclass. If we want to have a few more assumed review status types, then we can include those as standard names also. Ultimately, these names should all be included somewhere in the installation scripts.

Comments, disagreements?

thanks
-Ed

-----------------
Ed Robinson
Center for Tropical and Emerging Global Diseases
University of Georgia, Athens, GA 30602
ero...@ug.../(706)542.1447/254.8883
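The start-up check described above might be sketched like this. It is illustrative only: the rows are assumed to arrive as (review_status_id, name) pairs from a query against SRes.ReviewStatus, and the function name and error text are hypothetical rather than part of any plugin superclass.

```python
REQUIRED_NAMES = ("reviewed", "unreviewed")


def build_review_status_cache(rows, required=REQUIRED_NAMES):
    """Build a name -> id cache and fail early if a required name is absent.

    `rows` is an iterable of (review_status_id, name) pairs; how they are
    fetched from the database is outside this sketch.
    """
    cache = {name: status_id for status_id, name in rows}
    missing = [name for name in required if name not in cache]
    if missing:
        raise RuntimeError(
            "SRes.ReviewStatus is missing required names: " + ", ".join(missing)
        )
    return cache


# Hypothetical table content for the example.
cache = build_review_status_cache([(1, "reviewed"), (2, "unreviewed"), (3, "needs work")])
print(cache["unreviewed"])
```

A plugin would then look up ids by name instead of hardcoding them, so two installations with different primary keys for "reviewed" behave the same way.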
From: Aaron J. M. <am...@pc...> - 2005-02-15 02:33:30
|
FYI, if you wish to store details of a pairwise sequence alignment, it's arguably preferable not to store three separate alignment strings (query-with-gaps, target-with-gaps and similarity string), since these are not quite independent attributes (you cannot truly change one without changing the others, in some sense), but rather to save only the location of gaps (requiring one to reconstruct the various alignment strings if/when necessary). EnsEMBL chose to use the "CIGAR" string representation of an alignment; this format has made its way into GFF3 as well:

http://www.ensembl.org/Docs/wiki/html/EnsemblDocs/CigarFormat.html

The FASTA programs have a similar (but more expressive) alignment "encoding" that includes possibilities for forward and backward frameshifts (i.e. for protein-to-DNA alignments). Either of these encodings is also (somewhat) more "computable" than the raw string alignment representation.

-Aaron

On Feb 14, 2005, at 8:27 PM, davila wrote:

> Hi Steve,
>
> I would like to know if you think it would be interesting to expand the
> "Similarity and SimilaritySpan" tables? Some blast results,
> eg: query_string, hit_string, homology_string and alignment don't
> appear to be represented in those tables (of course, I might be
> wrong)...
>
> Ideally, those tables should be able to store most data parsed from
> Blast results; an example of the most important data is listed in the
> Bio::SearchIO system of Bioperl:
> http://bioperl.org/HOWTOs/SearchIO/use.html
>
> Cheers, Alberto
>
> -----Original message-----
> From: Steve Fischer [mailto:sfi...@pc...]
> Sent: Mon 14/2/2005 17:53
> To: Poliana Mateus
> Cc: davila; gus...@li...
> Subject: Re: [Gusdev-gusdev] parseBlastFilesForSimilarity.pl
>
> Poliana-
>
> the only blast plugins we have are LoadBlastSimFast and
> LoadBlastSimilarityPK.
>
> the only tables are Similarity and SimilaritySpan
>
> steve
>
> Poliana Mateus wrote:
>
> >Hi Steve
> >
> >I need to insert data into GUS (BLAST results) such as:
> >
> >----------------------------------------------------
> >data extracted by our script
> >----------------------------------------------------
> >query_name
> >name
> >accession
> >description
> >significance
> >raw_score
> >length
> >num_identical
> >frac_identical
> >num_conserved
> >frac_conserved
> >start('query')
> >end('query')
> >start('hit')
> >end('hit')
> >----------------------------------------------------
> >
> >Analyzing the LoadBlastSimFast plugin, I verified that it inserts into
> >tables DoTs.Similarity and DoTs.SimilaritySpan; both only accept
> >numeric data.
> >Are there other tables in GUS that store BLAST results?
> >
> >Poliana
> >
> >On Fri, 11 Feb 2005 13:50:32 -0500, Steve Fischer
> ><sfi...@pc...> wrote:
> >
> >>see below
> >>
> >>Alberto Davila wrote:
> >>
> >>>We are doing this for Garsa (another system) .. basically we have a
> >>>bioperl parser (Bio::Search::IO) that reads the Blast results file and
> >>>extract all the needed info (to the "Blast_Hit" table)... and also load
> >>>into a given table (eg: External_DB) all the sequences (in fasta format)
> >>>presenting similarity with the queries... at the end we have "Blast_Hit"
> >>>and "External_DB" populated with the same script.
> >>>
> >>wow, great. could you make a gus plugin from that?
> >>
> >>>Regarding Interpro and Glimmer, the main problem is to know in which
> >>>tables we should load the parsed results ?
> >>>
> >>describe the info you want to store.
> >>
> >>steve
> >>
> >>>Alberto
> >>>
> >>>On Fri, 2005-02-11 at 13:21 -0500, Y. Thomas Gan wrote:
> >>>
> >>>>I was going to give the same answer steve gave for interpro and gene
> >>>>finding results.
> >>>>
> >>>>For loading sequences into GUS, the dilemma with option 2 is: how do you
> >>>>know which sequence to load when you load (which is before you actually
> >>>>have the similarity result)? One solution would be to initially load
> >>>>complete dataset(s) but delete those without similarity after loading
> >>>>similarity results.
> >>>>
> >>>>-Thomas
> >>>>
> >>>>On Fri, 11 Feb 2005, Steve Fischer wrote:
> >>>>
> >>>>>alberto-
> >>>>>
> >>>>>we've never loaded interpro, so there isn't a plugin.
> >>>>>i believe plasmodb has loaded glimmer results, though i'm not sure. i have
> >>>>>asked a plasmodb developer to answer that question.
> >>>>>
> >>>>>steve
> >>>>>
> >>>>>Alberto Davila wrote:
> >>>>>
> >>>>>>Hey Steve, Thomas,
> >>>>>>
> >>>>>>Thanks a lot for the tips, really helpful.. now, few more questions:
> >>>>>>
> >>>>>>>ok. NR = NRDB
> >>>>>>>
> >>>>>>>the way we have used gus with similarities is that both the query and
> >>>>>>>subject are loaded into gus. As thomas explained, the similarity table
> >>>>>>>captures similarity between sequences that are in gus.
> >>>>>>>our approach has always been to just load (warehouse) the entire subject
> >>>>>>>database (NR, EST) that we are blasting against.
> >>>>>>>
> >>>>>>>the current plugins and blastSimilarity are set up for this.
> >>>>>>>
> >>>>>>>obviously, this takes a lot of disk space. two major efficiencies that we
> >>>>>>>don't currently have plugins for would be:
> >>>>>>>1. to only store in gus a *reference* to the external sequence (ie, don't
> >>>>>>>store the actgs).
> >>>>>>>2. only store in gus the sequences that actually have similarities
> >>>>>>>
> >>>>>>Option 2 sound better for us, since we will be blasting against several
> >>>>>>databases (> 10GB databases)
> >>>>>>
> >>>>>>What about the plugins to load Interpro and "gene finder" (glimmer, etc)
> >>>>>>results ? Is there any at all ?
> >>>>>>
> >>>>>>Cheers, Alberto
> >>>>>>
> >>>>>>>steve
> >>>>>>>
> >>>>>>>Alberto Davila wrote:
> >>>>>>>
> >>>>>>>>All the blastable databases I mentioned are standard databases from NCBI
> >>>>>>>>(ftp://ftp.ncbi.nlm.nih.gov/blast/db/blastdb.txt):
> >>>>>>>>
> >>>>>>>>NT = nucleotides
> >>>>>>>>
> >>>>>>>>~30000 entries from genbank (genbank format) are loaded into GUS now.
> >>>>>>>>
> >>>>>>>>Not sure about your "NRDB", I know NR from NCBI that is a collection of
> >>>>>>>>aminoacid entries, could it be the same ?
> >>>>>>>>
> >>>>>>>>Alberto
> >>>>>>>>
> >>>>>>>>On Fri, 2005-02-11 at 10:43 -0500, Steve Fischer wrote:
> >>>>>>>>
> >>>>>>>>>(what is NT?)
> >>>>>>>>>
> >>>>>>>>>which of these (genbank, your fasta, NRDB, NT, EST) have you loaded into
> >>>>>>>>>gus?
> >>>>>>>>>
> >>>>>>>>>steve
> >>>>>>>>>
> >>>>>>>>>Alberto Davila wrote:
> >>>>>>>>>
> >>>>>>>>>>Query:
> >>>>>>>>>>
> >>>>>>>>>>Either sequences from genbank (genbank format) or sequences generated
> >>>>>>>>>>in the lab (fasta format)
> >>>>>>>>>>
> >>>>>>>>>>Blastable databases (all are formatted databases from NCBI):
> >>>>>>>>>>
> >>>>>>>>>>NR
> >>>>>>>>>>NT
> >>>>>>>>>>EST
> >>>>>>>>>>
> >>>>>>>>>>Alberto
> >>>>>>>>>>
> >>>>>>>>>>On Fri, 2005-02-11 at 10:34 -0500, Steve Fischer wrote:
> >>>>>>>>>>
> >>>>>>>>>>>for the blast, what are the query sequences and what are the blastable
> >>>>>>>>>>>databases?
> >>>>>>>>>>>
> >>>>>>>>>>>steve
> >>>>>>>>>>>
> >>>>>>>>>>>Alberto Davila wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>>Basically we will use sequences (loaded into GUS with the GBParser)
> >>>>>>>>>>>>for NCBI Blast (Blastx, Blastp and TBlastX), the same sequences will be
> >>>>>>>>>>>>also used for Interpro analyses. Results of both (Blast and Interpro) will
> >>>>>>>>>>>>be loaded into GUS. We will parse specific things from the Blast
> >>>>>>>>>>>>results, I would say:
> >>>>>>>>>>>>
> >>>>>>>>>>>>`Gi` `Accession` `Description` `E_value` `Score` `Length`
> >>>>>>>>>>>>`Frame_Query` `Frame_Hit` `Identical` `Hsp_Frac_Identical`
> >>>>>>>>>>>>`Conserved` `Hsp_Frac_Conserved`
> >>>>>>>>>>>>`Query_Start`
> >>>>>>>>>>>>`Query_End` `Hit_Start` `Hit_End` `Hsp_Align` `database_letters`
> >>>>>>>>>>>>`database_entries`
> >>>>>>>>>>>>We already have a Bioperl parser for that (specific for another
> >>>>>>>>>>>>system: GARSA) that could be adapted to GUS, problem being we are not sure
> >>>>>>>>>>>>what tables should be used to store those data in GUS.
> >>>>>>>>>>>>
> >>>>>>>>>>>>Cheers, Alberto
> >>>>>>>>>>>>
> >>>>>>>>>>>>On Fri, 2005-02-11 at 10:06 -0500, Steve Fischer wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>>what are you planning on blasting?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>steve
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>Alberto Davila wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>Hi Steve,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>On Fri, 2005-02-11 at 08:56 -0500, Steve Fischer wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>poliana-
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>oops, the usage statement for LoadBlastSimFast is out of date.
> >>>>>>>>>>>>>>>it should instruct you to use the blastSimilarity command.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>LoadBlastSimFast makes a big assumption, that the subject and
> >>>>>>>>>>>>>>>query sequences are in GUS, and their def. lines have GUS primary
> >>>>>>>>>>>>>>>keys.
> >>>>>>>>>>>>>>>Are your sequences already loaded into GUS?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>They are not, there would be any howto/tips for that plugin ? We
> >>>>>>>>>>>>>>will certainly need a plugin to load "Interpro" and "ORF finding"
> >>>>>>>>>>>>>>results into GUS... If they are not available, then maybe we will have to
> >>>>>>>>>>>>>>write them ...
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>Cheers, Alberto
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>steve
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>Poliana Mateus wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>Hello all,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>Where can find the script parseBlastFilesForSimilarity.pl??
> >>>>>>>>>>>>>>>>I'm trying to run LoadBlastSimFast...
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>Poliana

--
Aaron J. Mackey, Ph.D.
Dept. of Biology, Goddard 212
University of Pennsylvania       email: am...@pc...
415 S. University Avenue         office: 215-898-1205
Philadelphia, PA 19104-6017      fax: 215-746-6697
From: Steve F. <sfi...@pc...> - 2005-02-15 02:17:32
|
alberto-

you're right. the similarity tables in gus capture the essence of the
similarity, not the details.

am i correct in thinking that the information you are describing is in a
1-1 relationship with a SimilaritySpan?

If so, you could prototype your idea by adding a table called
SimilaritySpanDetails to your gus. It would have a link to SimilaritySpan.

steve

davila wrote:

>Hi Steve,
>
>I wonder to know if you think it would be interesting to expand the "Similarity and SimilaritySpan" tables ? Some blast results,
>eg: query_string, hit_string, homology_string and alignment don't appear to be represented in those tables (of course, I might be wrong)...
>
>Ideally, those tables should be able to store most data parsed from Blast results, an example of most important data is listed in the Bio::SearchIO system of Bioperl: http://bioperl.org/HOWTOs/SearchIO/use.html
>
>Cheers, Alberto
>
> -----Original Message-----
> From: Steve Fischer [mailto:sfi...@pc...]
> Sent: Mon 14/2/2005 17:53
> To: Poliana Mateus
> Cc: davila; gus...@li...
> Subject: Re: [Gusdev-gusdev] parseBlastFilesForSimilarity.pl
>
> Poliana-
>
> the only blast plugins we have are LoadBlastSimFast and
> LoadBlastSimilarityPK.
>
> the only tables are Similarity and SimilaritySpan
>
> steve
>
> Poliana Mateus wrote:
>
> >Hi Steve
> >
> >I need to insert given in the GUS (resulted blast) as:
> >
> >----------------------------------------------------
> >extracted data of ours script
> >----------------------------------------------------
> >query_name
> >name
> >accession
> >description
> >significance
> >raw_score
> >length
> >num_identical
> >frac_identical
> >num_conserved
> >frac_conserved
> >start('query')
> >end('query')
> >start('hit')
> >end('hit')
> >----------------------------------------------------
> >
> >Analyzing the LoadBlastSimFast Plugin I verified that it inserts in
> >tables DoTs.Similarity and DoTs.SymilaritySpan, both only accept given
> >numerics.
> >Exists into GUS other tables that store resulted of Blast?
> >
> >Poliana
> >
> >On Fri, 11 Feb 2005 13:50:32 -0500, Steve Fischer
> ><sfi...@pc...> wrote:
> >
> >>see below
> >>
> >>Alberto Davila wrote:
> >>
> >>>We are doing this for Garsa (another system) .. basically we have a
> >>>bioperl parser (Bio::Search::IO) that reads the Blast results file and
> >>>extract all the needed info (to the "Blast_Hit" table)... and also load
> >>>into a given table (eg: External_DB) all the sequences (in fasta format)
> >>>presenting similarity with the queries... at the end we have "Blast_Hit"
> >>>and "External_DB" populated with the same script.
> >>>
> >>wow, great. could you make a gus plugin from that?
> >>
> >>>Regarding Interpro and Glimmer, the main problem is to know in which
> >>>tables we should load the parsed results ?
> >>>
> >>describe the info you want to store.
> >>
> >>steve
> >>
> >>>Alberto
> >>>
> >>>On Fri, 2005-02-11 at 13:21 -0500, Y. Thomas Gan wrote:
> >>>
> >>>>I was going to give the same answer steve gave for interpro and gene
> >>>>finding results.
> >>>>
> >>>>For loading sequences into GUS, the dillema with option 2 is: how do you
> >>>>know which sequence to load when you load (which is before you actually
> >>>>have the similarity result)? One solution would be to initially load
> >>>>complete dataset(s) but delete those without similarity after loading
> >>>>similarity results.
> >>>>
> >>>>-Thomas
> >>>>
> >>>>On Fri, 11 Feb 2005, Steve Fischer wrote:
> >>>>
> >>>>>alberto-
> >>>>>
> >>>>>we've never loaded interpro, so there isn't a plugin.
> >>>>>i believe plasmodb has loaded glimmer results, though i'm not sure. i have
> >>>>>asked a plasmodb developer to answer that question.
> >>>>>
> >>>>>steve
> >>>>>
> >>>>>Alberto Davila wrote:
> >>>>>
> >>>>>>Hey Steve, Thomas,
> >>>>>>
> >>>>>>Thanks a lot for the tips, really helpful.. now, few more questions:
> >>>>>>
> >>>>>>>ok. NR = NRDB
> >>>>>>>
> >>>>>>>the way we have used gus with similarities is that both the query and
> >>>>>>>subject are loaded into gus. As thomas explained, the similarity table
> >>>>>>>captures similarity between sequences that are in gus.
> >>>>>>>our approach has always been to just load (warehouse) the entire subject
> >>>>>>>database (NR, EST) that we are blasting against.
> >>>>>>>
> >>>>>>>the current plugins and blastSimilarity are set up for this.
> >>>>>>>
> >>>>>>>obviously, this takes a lot of disk space. two major efficiencies that we
> >>>>>>>don't currently have plugins for would be:
> >>>>>>>1. to only store in gus a *reference* to the external sequence (ie, don't
> >>>>>>>store the actgs).
> >>>>>>>2. only store in gus the sequences that actually have similarities
> >>>>>>>
> >>>>>>Option 2 sound better for us, since we will be blasting against several
> >>>>>>databases (> 10GB databases)
> >>>>>>
> >>>>>>What about the plugins to load Interpro and "gene finder" (glimmer, etc)
> >>>>>>results ? Is there any at all ?
> >>>>>>
> >>>>>>Cheers, Alberto
> >>>>>>
> >>>>>>>steve
> >>>>>>>
> >>>>>>>Alberto Davila wrote:
> >>>>>>>
> >>>>>>>>All the blastable databases I mentioned are standard databases from NCBI
> >>>>>>>>(ftp://ftp.ncbi.nlm.nih.gov/blast/db/blastdb.txt):
> >>>>>>>>
> >>>>>>>>NT = nucleotides
> >>>>>>>>
> >>>>>>>>~30000 entries from genbank (genbank format) are loaded into GUS now.
> >>>>>>>>
> >>>>>>>>Not sure about your "NRDB", I know NR from NCBI that is a collection of
> >>>>>>>>aminoacid entries, could it be the same ?
> >>>>>>>>
> >>>>>>>>Alberto
> >>>>>>>>
> >>>>>>>>On Fri, 2005-02-11 at 10:43 -0500, Steve Fischer wrote:
> >>>>>>>>
> >>>>>>>>>(what is NT?)
> >>>>>>>>>
> >>>>>>>>>which of these (genbank, your fasta, NRDB, NT, EST) have you loaded into
> >>>>>>>>>gus?
> >>>>>>>>>
> >>>>>>>>>steve
> >>>>>>>>>
> >>>>>>>>>Alberto Davila wrote:
> >>>>>>>>>
> >>>>>>>>>>Query:
> >>>>>>>>>>
> >>>>>>>>>>Either sequences from genbank (genbank format) or sequences generated in
> >>>>>>>>>>the lab (fasta format)
> >>>>>>>>>>
> >>>>>>>>>>Blastable databases (all are formatted databases from NCBI):
> >>>>>>>>>>
> >>>>>>>>>>NR
> >>>>>>>>>>NT
> >>>>>>>>>>EST
> >>>>>>>>>>
> >>>>>>>>>>Alberto
> >>>>>>>>>>
> >>>>>>>>>>On Fri, 2005-02-11 at 10:34 -0500, Steve Fischer wrote:
> >>>>>>>>>>
> >>>>>>>>>>>for the blast, what are the query sequences and what are the blastable
> >>>>>>>>>>>databases?
> >>>>>>>>>>>
> >>>>>>>>>>>steve
> >>>>>>>>>>>
> >>>>>>>>>>>Alberto Davila wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>>Basically we will use sequences (loaded into GUS with the GBParser) for
> >>>>>>>>>>>>NCBI Blast (Blastx, Blastp and TBlastX), the same sequences will be also
> >>>>>>>>>>>>used for Interpro analyses. Results of both (Blast and Interpro) will be
> >>>>>>>>>>>>loaded into GUS. We will parse specific things from the Blast results, I
> >>>>>>>>>>>>would say:
> >>>>>>>>>>>>
> >>>>>>>>>>>>`Gi` `Accession` `Description` `E_value` `Score` `Length`
> >>>>>>>>>>>>`Frame_Query` `Frame_Hit` `Identical` `Hsp_Frac_Identical`
> >>>>>>>>>>>>`Conserved` `Hsp_Frac_Conserved` `Query_Start`
> >>>>>>>>>>>>`Query_End` `Hit_Start` `Hit_End` `Hsp_Align` `database_letters`
> >>>>>>>>>>>>`database_entries`
> >>>>>>>>>>>>We already have a Bioperl parser for that (specific for another system:
> >>>>>>>>>>>>GARSA) that could be adapted to GUS, problem being we are not sure what
> >>>>>>>>>>>>tables should be used to store those data in GUS.
> >>>>>>>>>>>>
> >>>>>>>>>>>>Cheers, Alberto
> >>>>>>>>>>>>
> >>>>>>>>>>>>On Fri, 2005-02-11 at 10:06 -0500, Steve Fischer wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>>what are you planning on blasting?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>steve
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>Alberto Davila wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>Hi Steve,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>On Fri, 2005-02-11 at 08:56 -0500, Steve Fischer wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>poliana-
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>oops, the usage statement for LoadBlastSimFast is out of date.
> >>>>>>>>>>>>>>>it should instruct you to use the blastSimilarity command.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>LoadBlastSimFast makes a big assumption, that the subject and
> >>>>>>>>>>>>>>>query sequences are in GUS, and their def. lines have GUS primary
> >>>>>>>>>>>>>>>keys.
> >>>>>>>>>>>>>>>Are your sequences already loaded into GUS?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>They are not, there would be any howto/tips for that plugin ? We will
> >>>>>>>>>>>>>>certainly need a plugin to load "Interpro" and "ORF finding" results
> >>>>>>>>>>>>>>into GUS... If they are not available, then maybe we will have to write
> >>>>>>>>>>>>>>them ...
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>Cheers, Alberto
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>steve
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>Poliana Mateus wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>Hello all,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>Where can find the script parseBlastFilesForSimilarity.pl??
> >>>>>>>>>>>>>>>>I'm trying to run LoadBlastSimFast...
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>Poliana
> >>>>>>>>>>>>>>>>
|
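Steve's suggestion above, a SimilaritySpanDetails table in a 1-1 relationship with SimilaritySpan, could be prototyped roughly as follows. This is only a sketch using SQLite from Python, not the GUS schema: the column names (similarity_span_id, similarity_id, query_string, hit_string, homology_string and the coordinate columns) and the helper functions are assumptions made for illustration. Reusing the span's key as the primary key of the details table is one way to allow at most one details row per span.

```python
import sqlite3

# Hypothetical prototype schema; column names are illustrative assumptions,
# not the real DoTs.SimilaritySpan definition.
SCHEMA = """
CREATE TABLE SimilaritySpan (
    similarity_span_id INTEGER PRIMARY KEY,
    similarity_id      INTEGER NOT NULL,
    query_start        INTEGER NOT NULL,
    query_end          INTEGER NOT NULL,
    subject_start      INTEGER NOT NULL,
    subject_end        INTEGER NOT NULL
);
CREATE TABLE SimilaritySpanDetails (
    -- Primary key doubles as the link to SimilaritySpan, so each span
    -- can have at most one details row (the 1-1 relationship).
    similarity_span_id INTEGER PRIMARY KEY
        REFERENCES SimilaritySpan (similarity_span_id),
    query_string    TEXT NOT NULL,
    hit_string      TEXT NOT NULL,
    homology_string TEXT NOT NULL
);
"""


def open_prototype_db(path=":memory:"):
    """Create the two prototype tables and return the connection."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_span_details(conn, span_id, query_string, hit_string, homology_string):
    """Attach alignment strings to an existing span; commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO SimilaritySpanDetails VALUES (?, ?, ?, ?)",
            (span_id, query_string, hit_string, homology_string),
        )
```

With this layout the numeric summary stays in SimilaritySpan, and the bulky alignment strings are only read when a query joins the details table on similarity_span_id.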
From: davila <da...@io...> - 2005-02-15 01:32:50
|
Hi Steve,

I wonder to know if you think it would be interesting to expand the "Similarity and SimilaritySpan" tables ? Some blast results,
eg: query_string, hit_string, homology_string and alignment don't appear to be represented in those tables (of course, I might be wrong)...

Ideally, those tables should be able to store most data parsed from Blast results, an example of most important data is listed in the Bio::SearchIO system of Bioperl: http://bioperl.org/HOWTOs/SearchIO/use.html

Cheers, Alberto

    -----Original Message-----
    From: Steve Fischer [mailto:sfi...@pc...]
    Sent: Mon 14/2/2005 17:53
    To: Poliana Mateus
    Cc: davila; gus...@li...
    Subject: Re: [Gusdev-gusdev] parseBlastFilesForSimilarity.pl

    Poliana-

    the only blast plugins we have are LoadBlastSimFast and
    LoadBlastSimilarityPK.

    the only tables are Similarity and SimilaritySpan

    steve

    Poliana Mateus wrote:

    >Hi Steve
    >
    >I need to insert given in the GUS (resulted blast) as:
    >
    >----------------------------------------------------
    >extracted data of ours script
    >----------------------------------------------------
    >query_name
    >name
    >accession
    >description
    >significance
    >raw_score
    >length
    >num_identical
    >frac_identical
    >num_conserved
    >frac_conserved
    >start('query')
    >end('query')
    >start('hit')
    >end('hit')
    >----------------------------------------------------
    >
    >Analyzing the LoadBlastSimFast Plugin I verified that it inserts in
    >tables DoTs.Similarity and DoTs.SymilaritySpan, both only accept given
    >numerics.
    >Exists into GUS other tables that store resulted of Blast?
    >
    >Poliana
    >
    >On Fri, 11 Feb 2005 13:50:32 -0500, Steve Fischer
    ><sfi...@pc...> wrote:
    >
    >>see below
    >>
    >>Alberto Davila wrote:
    >>
    >>>We are doing this for Garsa (another system) .. basically we have a
    >>>bioperl parser (Bio::Search::IO) that reads the Blast results file and
    >>>extract all the needed info (to the "Blast_Hit" table)... and also load
    >>>into a given table (eg: External_DB) all the sequences (in fasta format)
    >>>presenting similarity with the queries... at the end we have "Blast_Hit"
    >>>and "External_DB" populated with the same script.
    >>>
    >>wow, great. could you make a gus plugin from that?
    >>
    >>>Regarding Interpro and Glimmer, the main problem is to know in which
    >>>tables we should load the parsed results ?
    >>>
    >>describe the info you want to store.
    >>
    >>steve
    >>
    >>>Alberto
    >>>
    >>>On Fri, 2005-02-11 at 13:21 -0500, Y. Thomas Gan wrote:
    >>>
    >>>>I was going to give the same answer steve gave for interpro and gene
    >>>>finding results.
    >>>>
    >>>>For loading sequences into GUS, the dillema with option 2 is: how do you
    >>>>know which sequence to load when you load (which is before you actually
    >>>>have the similarity result)? One solution would be to initially load
    >>>>complete dataset(s) but delete those without similarity after loading
    >>>>similarity results.
    >>>>
    >>>>-Thomas
    >>>>
    >>>>On Fri, 11 Feb 2005, Steve Fischer wrote:
    >>>>
    >>>>>alberto-
    >>>>>
    >>>>>we've never loaded interpro, so there isn't a plugin.
    >>>>>i believe plasmodb has loaded glimmer results, though i'm not sure. i have
    >>>>>asked a plasmodb developer to answer that question.
    >>>>>
    >>>>>steve
    >>>>>
    >>>>>Alberto Davila wrote:
    >>>>>
    >>>>>>Hey Steve, Thomas,
    >>>>>>
    >>>>>>Thanks a lot for the tips, really helpful.. now, few more questions:
    >>>>>>
    >>>>>>>ok. NR = NRDB
    >>>>>>>
    >>>>>>>the way we have used gus with similarities is that both the query and
    >>>>>>>subject are loaded into gus. As thomas explained, the similarity table
    >>>>>>>captures similarity between sequences that are in gus.
    >>>>>>>our approach has always been to just load (warehouse) the entire subject
    >>>>>>>database (NR, EST) that we are blasting against.
    >>>>>>>
    >>>>>>>the current plugins and blastSimilarity are set up for this.
    >>>>>>>
    >>>>>>>obviously, this takes a lot of disk space. two major efficiencies that we
    >>>>>>>don't currently have plugins for would be:
    >>>>>>>1. to only store in gus a *reference* to the external sequence (ie, don't
    >>>>>>>store the actgs).
    >>>>>>>2. only store in gus the sequences that actually have similarities
    >>>>>>>
    >>>>>>Option 2 sound better for us, since we will be blasting against several
    >>>>>>databases (> 10GB databases)
    >>>>>>
    >>>>>>What about the plugins to load Interpro and "gene finder" (glimmer, etc)
    >>>>>>results ? Is there any at all ?
    >>>>>>
    >>>>>>Cheers, Alberto
    >>>>>>
    >>>>>>>steve
    >>>>>>>
    >>>>>>>Alberto Davila wrote:
    >>>>>>>
    >>>>>>>>All the blastable databases I mentioned are standard databases from NCBI
    >>>>>>>>(ftp://ftp.ncbi.nlm.nih.gov/blast/db/blastdb.txt):
    >>>>>>>>
    >>>>>>>>NT = nucleotides
    >>>>>>>>
    >>>>>>>>~30000 entries from genbank (genbank format) are loaded into GUS now.
    >>>>>>>>
    >>>>>>>>Not sure about your "NRDB", I know NR from NCBI that is a collection of
    >>>>>>>>aminoacid entries, could it be the same ?
    >>>>>>>>
    >>>>>>>>Alberto
    >>>>>>>>
    >>>>>>>>On Fri, 2005-02-11 at 10:43 -0500, Steve Fischer wrote:
    >>>>>>>>
    >>>>>>>>>(what is NT?)
    >>>>>>>>>
    >>>>>>>>>which of these (genbank, your fasta, NRDB, NT, EST) have you loaded into
    >>>>>>>>>gus?
    >>>>>>>>>
    >>>>>>>>>steve
    >>>>>>>>>
    >>>>>>>>>Alberto Davila wrote:
    >>>>>>>>>
    >>>>>>>>>>Query:
    >>>>>>>>>>
    >>>>>>>>>>Either sequences from genbank (genbank format) or sequences generated in
    >>>>>>>>>>the lab (fasta format)
    >>>>>>>>>>
    >>>>>>>>>>Blastable databases (all are formatted databases from NCBI):
    >>>>>>>>>>
    >>>>>>>>>>NR
    >>>>>>>>>>NT
    >>>>>>>>>>EST
    >>>>>>>>>>
    >>>>>>>>>>Alberto
    >>>>>>>>>>
    >>>>>>>>>>On Fri, 2005-02-11 at 10:34 -0500, Steve Fischer wrote:
    >>>>>>>>>>
    >>>>>>>>>>>for the blast, what are the query sequences and what are the blastable
    >>>>>>>>>>>databases?
    >>>>>>>>>>>
    >>>>>>>>>>>steve
    >>>>>>>>>>>
    >>>>>>>>>>>Alberto Davila wrote:
    >>>>>>>>>>>
    >>>>>>>>>>>>Basically we will use sequences (loaded into GUS with the GBParser) for
    >>>>>>>>>>>>NCBI Blast (Blastx, Blastp and TBlastX), the same sequences will be also
    >>>>>>>>>>>>used for Interpro analyses. Results of both (Blast and Interpro) will be
    >>>>>>>>>>>>loaded into GUS. We will parse specific things from the Blast results, I
    >>>>>>>>>>>>would say:
    >>>>>>>>>>>>
    >>>>>>>>>>>>`Gi` `Accession` `Description` `E_value` `Score` `Length`
    >>>>>>>>>>>>`Frame_Query` `Frame_Hit` `Identical` `Hsp_Frac_Identical`
    >>>>>>>>>>>>`Conserved` `Hsp_Frac_Conserved` `Query_Start`
    >>>>>>>>>>>>`Query_End` `Hit_Start` `Hit_End` `Hsp_Align` `database_letters`
    >>>>>>>>>>>>`database_entries`
    >>>>>>>>>>>>We already have a Bioperl parser for that (specific for another system:
    >>>>>>>>>>>>GARSA) that could be adapted to GUS, problem being we are not sure what
    >>>>>>>>>>>>tables should be used to store those data in GUS.
    >>>>>>>>>>>>
    >>>>>>>>>>>>Cheers, Alberto
    >>>>>>>>>>>>
    >>>>>>>>>>>>On Fri, 2005-02-11 at 10:06 -0500, Steve Fischer wrote:
    >>>>>>>>>>>>
    >>>>>>>>>>>>>what are you planning on blasting?
    >>>>>>>>>>>>>
    >>>>>>>>>>>>>steve
    >>>>>>>>>>>>>
    >>>>>>>>>>>>>Alberto Davila wrote:
    >>>>>>>>>>>>>
    >>>>>>>>>>>>>>Hi Steve,
    >>>>>>>>>>>>>>
    >>>>>>>>>>>>>>On Fri, 2005-02-11 at 08:56 -0500, Steve Fischer wrote:
    >>>>>>>>>>>>>>
    >>>>>>>>>>>>>>>poliana-
    >>>>>>>>>>>>>>>
    >>>>>>>>>>>>>>>oops, the usage statement for LoadBlastSimFast is out of date.
    >>>>>>>>>>>>>>>it should instruct you to use the blastSimilarity command.
    >>>>>>>>>>>>>>>
    >>>>>>>>>>>>>>>LoadBlastSimFast makes a big assumption, that the subject and
    >>>>>>>>>>>>>>>query sequences are in GUS, and their def. lines have GUS primary
    >>>>>>>>>>>>>>>keys.
    >>>>>>>>>>>>>>>Are your sequences already loaded into GUS?
    >>>>>>>>>>>>>>>
    >>>>>>>>>>>>>>They are not, there would be any howto/tips for that plugin ? We will
    >>>>>>>>>>>>>>certainly need a plugin to load "Interpro" and "ORF finding" results
    >>>>>>>>>>>>>>into GUS... If they are not available, then maybe we will have to write
    >>>>>>>>>>>>>>them ...
    >>>>>>>>>>>>>>
    >>>>>>>>>>>>>>Cheers, Alberto
    >>>>>>>>>>>>>>
    >>>>>>>>>>>>>>>steve
    >>>>>>>>>>>>>>>
    >>>>>>>>>>>>>>>Poliana Mateus wrote:
    >>>>>>>>>>>>>>>
    >>>>>>>>>>>>>>>>Hello all,
    >>>>>>>>>>>>>>>>
    >>>>>>>>>>>>>>>>Where can find the script parseBlastFilesForSimilarity.pl??
    >>>>>>>>>>>>>>>>I'm trying to run LoadBlastSimFast...
    >>>>>>>>>>>>>>>>
    >>>>>>>>>>>>>>>>Poliana
    >>>>>>>>>>>>>>>>
|
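To make the alignment fields named in Alberto's message a little more concrete, here is a small Python sketch of how a homology (midline) string relates to the aligned query and hit strings, and how counts like the num_identical and frac_identical values from Poliana's list fall out of it. It is a simplification under stated assumptions: the function names are made up for illustration, it is not Bioperl's or BLAST's own code, and it marks identical residues only, whereas real BLAST midlines also flag conservative substitutions (protein) or use a bar character (nucleotide).

```python
def homology_string(query_string, hit_string):
    """Return a midline: the residue where both aligned strings agree, else a space.

    Gap characters ('-') never count as a match.
    """
    if len(query_string) != len(hit_string):
        raise ValueError("aligned strings must have equal length")
    return "".join(
        q if q == h and q != "-" else " "
        for q, h in zip(query_string, hit_string)
    )


def identity_stats(query_string, hit_string):
    """Return (num_identical, frac_identical) over the alignment length."""
    midline = homology_string(query_string, hit_string)
    num_identical = sum(1 for c in midline if c != " ")
    if not midline:
        return 0, 0.0
    return num_identical, num_identical / len(midline)
```

For example, homology_string("MKT-LV", "MRTALV") gives "M T LV", four identical positions out of six. The numeric summary is the kind of value the existing similarity tables hold, while the three strings themselves are the details Alberto would like to be able to store.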
From: Steve F. <sfi...@pc...> - 2005-02-14 20:53:37
|
Poliana-

the only blast plugins we have are LoadBlastSimFast and
LoadBlastSimilarityPK.

the only tables are Similarity and SimilaritySpan

steve

Poliana Mateus wrote:

>Hi Steve
>
>I need to insert given in the GUS (resulted blast) as:
>
>----------------------------------------------------
>extracted data of ours script
>----------------------------------------------------
>query_name
>name
>accession
>description
>significance
>raw_score
>length
>num_identical
>frac_identical
>num_conserved
>frac_conserved
>start('query')
>end('query')
>start('hit')
>end('hit')
>----------------------------------------------------
>
>Analyzing the LoadBlastSimFast Plugin I verified that it inserts in
>tables DoTs.Similarity and DoTs.SymilaritySpan, both only accept given
>numerics.
>Exists into GUS other tables that store resulted of Blast?
>
>Poliana
>
>On Fri, 11 Feb 2005 13:50:32 -0500, Steve Fischer
><sfi...@pc...> wrote:
>
>>see below
>>
>>Alberto Davila wrote:
>>
>>>We are doing this for Garsa (another system) .. basically we have a
>>>bioperl parser (Bio::Search::IO) that reads the Blast results file and
>>>extract all the needed info (to the "Blast_Hit" table)... and also load
>>>into a given table (eg: External_DB) all the sequences (in fasta format)
>>>presenting similarity with the queries... at the end we have "Blast_Hit"
>>>and "External_DB" populated with the same script.
>>>
>>wow, great. could you make a gus plugin from that?
>>
>>>Regarding Interpro and Glimmer, the main problem is to know in which
>>>tables we should load the parsed results ?
>>>
>>describe the info you want to store.
>>
>>steve
>>
>>>Alberto
>>>
>>>On Fri, 2005-02-11 at 13:21 -0500, Y. Thomas Gan wrote:
>>>
>>>>I was going to give the same answer steve gave for interpro and gene
>>>>finding results.
>>>>
>>>>For loading sequences into GUS, the dillema with option 2 is: how do you
>>>>know which sequence to load when you load (which is before you actually
>>>>have the similarity result)? One solution would be to initially load
>>>>complete dataset(s) but delete those without similarity after loading
>>>>similarity results.
>>>>
>>>>-Thomas
>>>>
>>>>On Fri, 11 Feb 2005, Steve Fischer wrote:
>>>>
>>>>>alberto-
>>>>>
>>>>>we've never loaded interpro, so there isn't a plugin.
>>>>>i believe plasmodb has loaded glimmer results, though i'm not sure. i have
>>>>>asked a plasmodb developer to answer that question.
>>>>>
>>>>>steve
>>>>>
>>>>>Alberto Davila wrote:
>>>>>
>>>>>>Hey Steve, Thomas,
>>>>>>
>>>>>>Thanks a lot for the tips, really helpful.. now, few more questions:
>>>>>>
>>>>>>>ok. NR = NRDB
>>>>>>>
>>>>>>>the way we have used gus with similarities is that both the query and
>>>>>>>subject are loaded into gus. As thomas explained, the similarity table
>>>>>>>captures similarity between sequences that are in gus.
>>>>>>>our approach has always been to just load (warehouse) the entire subject
>>>>>>>database (NR, EST) that we are blasting against.
>>>>>>>
>>>>>>>the current plugins and blastSimilarity are set up for this.
>>>>>>>
>>>>>>>obviously, this takes a lot of disk space. two major efficiencies that we
>>>>>>>don't currently have plugins for would be:
>>>>>>>1. to only store in gus a *reference* to the external sequence (ie, don't
>>>>>>>store the actgs).
>>>>>>>2. only store in gus the sequences that actually have similarities
>>>>>>>
>>>>>>Option 2 sound better for us, since we will be blasting against several
>>>>>>databases (> 10GB databases)
>>>>>>
>>>>>>What about the plugins to load Interpro and "gene finder" (glimmer, etc)
>>>>>>results ? Is there any at all ?
>>>>>>
>>>>>>Cheers, Alberto
>>>>>>
>>>>>>>steve
>>>>>>>
>>>>>>>Alberto Davila wrote:
>>>>>>>
>>>>>>>>All the blastable databases I mentioned are standard databases from NCBI
>>>>>>>>(ftp://ftp.ncbi.nlm.nih.gov/blast/db/blastdb.txt):
>>>>>>>>
>>>>>>>>NT = nucleotides
>>>>>>>>
>>>>>>>>~30000 entries from genbank (genbank format) are loaded into GUS now.
>>>>>>>>
>>>>>>>>Not sure about your "NRDB", I know NR from NCBI that is a collection of
>>>>>>>>aminoacid entries, could it be the same ?
>>>>>>>>
>>>>>>>>Alberto
>>>>>>>>
>>>>>>>>On Fri, 2005-02-11 at 10:43 -0500, Steve Fischer wrote:
>>>>>>>>
>>>>>>>>>(what is NT?)
>>>>>>>>>
>>>>>>>>>which of these (genbank, your fasta, NRDB, NT, EST) have you loaded into
>>>>>>>>>gus?
>>>>>>>>>
>>>>>>>>>steve
>>>>>>>>>
>>>>>>>>>Alberto Davila wrote:
>>>>>>>>>
>>>>>>>>>>Query:
>>>>>>>>>>
>>>>>>>>>>Either sequences from genbank (genbank format) or sequences generated in
>>>>>>>>>>the lab (fasta format)
>>>>>>>>>>
>>>>>>>>>>Blastable databases (all are formatted databases from NCBI):
>>>>>>>>>>
>>>>>>>>>>NR
>>>>>>>>>>NT
>>>>>>>>>>EST
>>>>>>>>>>
>>>>>>>>>>Alberto
>>>>>>>>>>
>>>>>>>>>>On Fri, 2005-02-11 at 10:34 -0500, Steve Fischer wrote:
>>>>>>>>>>
>>>>>>>>>>>for the blast, what are the query sequences and what are the blastable
>>>>>>>>>>>databases?
>>>>>>>>>>>
>>>>>>>>>>>steve
>>>>>>>>>>>
>>>>>>>>>>>Alberto Davila wrote:
>>>>>>>>>>>
>>>>>>>>>>>>Basically we will use sequences (loaded into GUS with the GBParser) for
>>>>>>>>>>>>NCBI Blast (Blastx, Blastp and TBlastX), the same sequences will be also
>>>>>>>>>>>>used for Interpro analyses. Results of both (Blast and Interpro) will be
>>>>>>>>>>>>loaded into GUS. We will parse specific things from the Blast results, I
>>>>>>>>>>>>would say:
>>>>>>>>>>>>
>>>>>>>>>>>>`Gi` `Accession` `Description` `E_value` `Score` `Length`
>>>>>>>>>>>>`Frame_Query` `Frame_Hit` `Identical` `Hsp_Frac_Identical`
>>>>>>>>>>>>`Conserved` `Hsp_Frac_Conserved` `Query_Start`
>>>>>>>>>>>>`Query_End` `Hit_Start` `Hit_End` `Hsp_Align` `database_letters`
>>>>>>>>>>>>`database_entries`
>>>>>>>>>>>>We already have a Bioperl parser for that (specific for another system:
>>>>>>>>>>>>GARSA) that could be adapted to GUS, problem being we are not sure what
>>>>>>>>>>>>tables should be used to store those data in GUS.
>>>>>>>>>>>>
>>>>>>>>>>>>Cheers, Alberto
>>>>>>>>>>>>
>>>>>>>>>>>>On Fri, 2005-02-11 at 10:06 -0500, Steve Fischer wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>what are you planning on blasting?
>>>>>>>>>>>>>
>>>>>>>>>>>>>steve
>>>>>>>>>>>>>
>>>>>>>>>>>>>Alberto Davila wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>Hi Steve,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>On Fri, 2005-02-11 at 08:56 -0500, Steve Fischer wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>poliana-
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>oops, the usage statement for LoadBlastSimFast is out of date.
>>>>>>>>>>>>>>>it should instruct you to use the blastSimilarity command.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>LoadBlastSimFast makes a big assumption, that the subject and
>>>>>>>>>>>>>>>query sequences are in GUS, and their def. lines have GUS primary
>>>>>>>>>>>>>>>keys.
>>>>>>>>>>>>>>>Are your sequences already loaded into GUS?
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>They are not, there would be any howto/tips for that plugin ? We >>>>>>>>>>>>>>will >>>>>>>>>>>>>>certainly need a plugin to load "Interpro" and "ORF finding" >>>>>>>>>>>>>>results >>>>>>>>>>>>>>into GUS... If they are not available, then maybe we will have to >>>>>>>>>>>>>>write >>>>>>>>>>>>>>them ... >>>>>>>>>>>>>> >>>>>>>>>>>>>>Cheers, Alberto >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>steve >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>Poliana Mateus wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>Hello all, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>Where can find the script parseBlastFilesForSimilarity.pl?? >>>>>>>>>>>>>>>>I'm trying to run LoadBlastSimFast... >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>Poliana >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> |
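One way to approach the second efficiency discussed in this thread (storing only the subject sequences that actually have similarities) would be to run BLAST against the flat-file database first and then cut the FASTA file down to the hit identifiers before loading anything. The sketch below is a minimal Python illustration of that filtering step only. The function names and the rule that the identifier is the first token of the definition line are assumptions made for illustration, and the sketch does not address the separate requirement that LoadBlastSimFast expects GUS primary keys in the definition lines.

```python
from typing import Iterable, Iterator, List, Set, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (defline, sequence) pairs from FASTA-formatted lines."""
    defline, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if defline is not None:
                yield defline, "".join(chunks)
            defline, chunks = line[1:], []
        elif defline is not None:
            chunks.append(line)
    if defline is not None:
        yield defline, "".join(chunks)


def seq_id(defline: str) -> str:
    """Assumed convention: the identifier is the first token of the defline."""
    tokens = defline.split()
    return tokens[0] if tokens else ""


def subjects_with_hits(fasta_lines: Iterable[str],
                       hit_ids: Set[str]) -> List[Tuple[str, str]]:
    """Keep only the subject sequences whose identifier is among the hits."""
    return [(d, s) for d, s in read_fasta(fasta_lines) if seq_id(d) in hit_ids]


# Hypothetical three-record subject database and a set of hit identifiers.
subject_db = """>sub1 hypothetical protein
MKT
AAL
>sub2 kinase
MSS
>sub3 transporter
MLL
""".splitlines()

kept = subjects_with_hits(subject_db, {"sub1", "sub3"})
print([d for d, _ in kept])  # ['sub1 hypothetical protein', 'sub3 transporter']
```

Filtering before the load avoids the load-then-delete round trip, at the cost of needing the BLAST results in hand before any subject sequence goes into the database.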
From: Bindu G. <bi...@pc...> - 2005-02-14 19:45:55
|
hi Alberto -

PlasmoDB project uses a plugin to load the GlimmerM results; it is GUS::Common::Plugin::ImportPlasmoDBPrediction plugin in the Sanger cvs repository. however, please note that this plugin is not generalized, and has been used here only for the PlasmoDB project so far. It would be useful to generalize this plugin some day, so that all can benefit.

Bindu

On Feb 11, 2005, at 12:44 PM, Alberto Davila wrote:

> Hey Steve, Thomas,
>
> Thanks a lot for the tips, really helpful.. now, few more questions:
>
>> ok. NR = NRDB
>>
>> the way we have used gus with similarities is that both the query and
>> subject are loaded into gus. As thomas explained, the similarity table
>> captures similarity between sequences that are in gus.
>>
>> our approach has always been to just load (warehouse) the entire subject
>> database (NR, EST) that we are blasting against.
>>
>> the current plugins and blastSimilarity are set up for this.
>>
>> obviously, this takes a lot of disk space. two major efficiencies that
>> we don't currently have plugins for would be:
>> 1. to only store in gus a *reference* to the external sequence (ie,
>> don't store the actgs).
>> 2. only store in gus the sequences that actually have similarities
>
> Option 2 sound better for us, since we will be blasting against several
> databases (> 10GB databases)
>
> What about the plugins to load Interpro and "gene finder" (glimmer, etc)
> results ? Is there any at all ?
>
> Cheers, Alberto
|
From: Poliana M. <pol...@gm...> - 2005-02-14 19:15:48
|
Hi Steve

I need to insert data (BLAST results) into GUS, such as:

----------------------------------------------------
data extracted by our script
----------------------------------------------------
query_name
name
accession
description
significance
raw_score
length
num_identical
frac_identical
num_conserved
frac_conserved
start('query')
end('query')
start('hit')
end('hit')
----------------------------------------------------

Analyzing the LoadBlastSimFast plugin, I verified that it inserts into the tables DoTs.Similarity and DoTs.SimilaritySpan, and both accept only numeric data.
Are there other tables in GUS that store BLAST results?

Poliana

On Fri, 11 Feb 2005 13:50:32 -0500, Steve Fischer <sfi...@pc...> wrote:

> see below
>
> Alberto Davila wrote:
>
> >We are doing this for Garsa (another system) .. basically we have a
> >bioperl parser (Bio::SearchIO) that reads the Blast results file and
> >extracts all the needed info (to the "Blast_Hit" table)... and also loads
> >into a given table (eg: External_DB) all the sequences (in fasta format)
> >presenting similarity with the queries... at the end we have "Blast_Hit"
> >and "External_DB" populated with the same script.
>
> wow, great. could you make a gus plugin from that?
>
> >Regarding Interpro and Glimmer, the main problem is to know in which
> >tables we should load the parsed results ?
>
> describe the info you want to store.
>
> steve
|
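As noted elsewhere in this thread, the only BLAST tables are Similarity and SimilaritySpan, and LoadBlastSimFast assumes that both query and subject sequences are already in GUS and are identified by primary key. Under that design, descriptive fields such as the accession and description would stay with the sequence rows, while a similarity row carries only keys and numbers. The sketch below illustrates the idea with Python's built-in `sqlite3`; the table and column names are simplified stand-ins invented for this example, not the actual DoTs schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequence (              -- hypothetical stand-in for a sequence table
    seq_id      INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL,
    description TEXT
);
CREATE TABLE similarity (            -- hypothetical, simplified: keys and numbers only
    similarity_id INTEGER PRIMARY KEY,
    query_id      INTEGER NOT NULL REFERENCES sequence(seq_id),
    subject_id    INTEGER NOT NULL REFERENCES sequence(seq_id),
    raw_score     INTEGER,
    num_identical INTEGER,
    query_start   INTEGER,
    query_end     INTEGER,
    hit_start     INTEGER,
    hit_end       INTEGER
);
""")

# Made-up example rows: one query, one subject, one similarity between them.
conn.executemany("INSERT INTO sequence VALUES (?, ?, ?)", [
    (1, "LAB0001", "lab-generated query"),
    (2, "P12345", "example subject protein"),
])
conn.execute("INSERT INTO similarity VALUES (1, 1, 2, 250, 80, 10, 110, 5, 105)")


def describe_hits(connection):
    """Recover the text fields for each hit by joining back to the sequences."""
    return connection.execute("""
        SELECT q.accession, s.accession, s.description, sim.raw_score
        FROM similarity AS sim
        JOIN sequence AS q ON q.seq_id = sim.query_id
        JOIN sequence AS s ON s.seq_id = sim.subject_id
        ORDER BY sim.similarity_id
    """).fetchall()


print(describe_hits(conn))
# [('LAB0001', 'P12345', 'example subject protein', 250)]
```

With this layout the name, accession and description from a parsed BLAST report are never copied into the similarity row; they are looked up through the query and subject keys when needed.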
From: davila <da...@io...> - 2005-02-14 15:58:26
|
Hey Ed,

Great, I will look forward to it... Poliana just started to look at the code since we are in a rush to meet some deadlines, anyway, she will contact you by Friday to check on your progress with the document ;-)

We are learning little by little about genomics databases (not too bad), then "hope" to motivate my colleagues (the real DB experts, not beginners like me) at the Federal University of Rio de Janeiro and IME to offer a course on "Genomic Databases" as part of the graduate programme for the second half of 2005. GUS and Chado schemas should (hopefully) be a topic.

Alberto

-----Original Message-----
From: Ed Robinson [mailto:ero...@ug...]
Sent: Mon 2/14/2005 11:46 AM
To: davila; Steve Fischer
Cc: Y. Thomas Gan; Poliana Mateus; gus...@li...
Subject: Re: RES: [Gusdev-gusdev] parseBlastFilesForSimilarity.pl

Alberto,

Poliana may be interested in a GUS developers guide I am trying to write this week for the course. I just went through the nightmare of learning how to correctly write GUS plugins for a completely undocumented API and little help or pointers to where that API can be found in the source code. There is a plugin description on the WIKI, but absolutely NO API for the GUS Model. I should have a document written for this with an API for the Plugin Class and a general API written for the GUS Model by Friday. It will also include other points for debugging GUS and some best practices I have collected in my notes.

-Ed

---- Original message ----
>Date: Sun, 13 Feb 2005 18:01:22 -0300
>From: "davila" <da...@io...>
>Subject: RES: [Gusdev-gusdev] parseBlastFilesForSimilarity.pl
>To: "Steve Fischer" <sfi...@pc...>
>Cc: "Y. Thomas Gan" <yon...@pc...>, "Poliana Mateus" <pol...@gm...>, <gus...@li...>
>
>Steve,
>* see below
>
>Alberto Davila wrote:
>
>>We are doing this for Garsa (another system) .. basically we have a
>>bioperl parser (Bio::SearchIO) that reads the Blast results file and
>>extracts all the needed info (to the "Blast_Hit" table)... and also loads
>>into a given table (eg: External_DB) all the sequences (in fasta format)
>>presenting similarity with the queries... at the end we have "Blast_Hit"
>>and "External_DB" populated with the same script.
>
>wow, great. could you make a gus plugin from that?
>
>Should not be a big problem, I will ask Poliana to do that... she can occasionally contact you asking for some details... at the end we will put things being debugged/developed by us at : www.biowebdb.org and also provide them to any interested people. In an ideal world, nobody should suffer twice with the same "bug" ;-)
>
>>Regarding Interpro and Glimmer, the main problem is to know in which
>>tables we should load the parsed results ?
>
>* describe the info you want to store.
>
>Basically this:
>
>Frame_Hit, Method, Method_Accession, Accession, Hit_Status, Query_Start, Query_End, Description, E_value
>
>Again, I am asking Poliana to take care of that.
>
>* steve
>
>Cheers, Alberto
>
>Alberto M. R. Dávila, PhD
>Kinetoplastid Biology and Disease (Biomed Central)
>http://www.kinetoplastids.com
>http://www.darwin.fiocruz.br
>DBBM / Instituto Oswaldo Cruz / FIOCRUZ
>Av. Brasil 4365
>Rio de Janeiro, RJ, Brasil
>CEP 21045-900
>Email: da...@fi...
> amr...@ya...
>Phone: 55-21-3865-8229 / 3865-8206
>Fax: 55-21-2590-3495
>-------------------------------------------------
>The BiowebDB consortium: http://www.biowebdb.org
|
From: Sucheta T. <su...@vb...> - 2005-02-14 15:42:14
|
Thank you all for the responses. My original question was a bit premature and based more on our convenience than on the design of GUS per se.

But here are some issues that I think the group may need to consider:

Keeping different prediction outputs as different feature_ids and finally merging them through gene and gene_instance seems fine. But we need to distinguish between features within a gene and features of a gene. What I mean is that a gene may have different features, such as a UTR feature, a promoter element feature, a CpG island feature and so on, each of which needs to be linked to a particular prediction output.

For example, I have gene 1 with gene_id 1, which has several gene_instances produced by different prediction algorithms, each having a different na_feature_id, say 1, 2, 3, 4.

For the prediction algorithm with na_feature_id=1, I have a set of UTR locations, say feature_id 5 for UTR_5 and 6 for UTR_3, and a promoter feature with na_feature_id 7. How would I then link na_feature_ids 1, 5, 6 and 7?

So I was wondering whether one na_feature_id could represent a gene, with each of its locations carrying its own description, reviewer's status, etc. Gene instances would then be reserved for splice variants or any other types of instances.

Many thanks

Sucheta

At 09:40 AM 2/14/2005 -0500, you wrote:
>This question is a bit more complex than it seems.
>
>All three of these may be necessary for every level of analysis. first, was the overall gene prediction/feature prediction reviewed and how was it algorithmically arrived at? Then you can ask the same question about the locations. Early locations may be provided by the same algorithm as the feature, but these may be further defined later and require their own review annotations.
>
>One thing that I think needs to be addressed, however, is how these columns appear throughout the schema like mushrooms. I have been told that GUS was hyper-normalized when it was first written to 4N or 5N form, but that is definitely not the case now. Why do you want review_status, reviewer and algorithm in the feature table? Since these usually will appear in clusters (i.e. running an algorithm once gives you 500 cases of the same entry for all three on the same date), shouldn't we have a table to collect these into an annotation_status table and just have an FK to an entry in that table for every table that uses these? The same goes for other SRes entries such as taxon, project, etc.
>
>-Ed
>
>-----------------
>Ed Robinson
>Center for Tropical and Emerging Global Diseases
>University of Georgia, Athens, GA 30602
>ero...@ug.../(706)542.1447/254.8883
>
>-------------------------------------------------------
>SF email is sponsored by - The IT Product Guide
>Read honest & candid reviews on hundreds of IT Products from real users.
>Discover which products truly live up to the hype. Start reading now.
>http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
>_______________________________________________
>Gusdev-gusdev mailing list
>Gus...@li...
>https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev |
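The linking question above (na_feature_ids 1, 5, 6 and 7) can be illustrated with the parent_id approach Steve mentions in his 2005-02-15 reply earlier on this page. The following is only a rough sketch in plain Python: the tuples, labels and helper names are made up for the example and do not come from the GUS code base.

```python
from collections import defaultdict

# Hypothetical rows: (na_feature_id, parent_id, label).
# Feature 1 is one algorithm's gene prediction; 5, 6 and 7 hang off it.
FEATURES = [
    (1, None, "gene prediction, algorithm A"),
    (5, 1, "UTR_5"),
    (6, 1, "UTR_3"),
    (7, 1, "promoter"),
    (2, None, "gene prediction, algorithm B"),
]

def children_by_parent(features):
    """Index child feature ids by the parent_id they point to."""
    tree = defaultdict(list)
    for feature_id, parent_id, _label in features:
        if parent_id is not None:
            tree[parent_id].append(feature_id)
    return tree

def subtree(tree, root):
    """Return the root id followed by all of its descendants, depth-first."""
    ids = [root]
    for child in tree.get(root, []):
        ids.extend(subtree(tree, child))
    return ids

tree = children_by_parent(FEATURES)
linked = subtree(tree, 1)
print(linked)
```

With this layout the UTR and promoter features stay attached to the one prediction they belong to, while Gene and GeneInstance can still group the competing predictions.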
From: Ed R. <ero...@ug...> - 2005-02-14 14:46:50
|
Alberto,

Poliana may be interested in a GUS developers guide I am trying to write this week for the course. I just went through the nightmare of learning how to correctly write GUS plugins for a completely undocumented API and little help or pointers to where that API can be found in the source code. There is a plugin description on the WIKI, but absolutely NO API for the GUS Model. I should have a document written for this with an API for the Plugin Class and a general API written for the GUS Model by Friday. It will also include other points for debugging GUS and some best practices I have collected in my notes.

-Ed

---- Original message ----
>Date: Sun, 13 Feb 2005 18:01:22 -0300
>From: "davila" <da...@io...>
>Subject: RES: [Gusdev-gusdev] parseBlastFilesForSimilarity.pl
>To: "Steve Fischer" <sfi...@pc...>
>Cc: "Y. Thomas Gan" <yon...@pc...>, "Poliana Mateus" <pol...@gm...>, <Gus...@li...>
>
>Steve,
>* see below
>
>Alberto Davila wrote:
>
>>We are doing this for Garsa (another system) .. basically we have a
>>bioperl parser (Bio::Search::IO) that reads the Blast results file and
>>extract all the needed info (to the "Blast_Hit" table)... and also load
>>into a given table (eg: External_DB) all the sequences (in fasta format)
>>presenting similarity with the queries... at the end we have "Blast_Hit"
>>and "External_DB" populated with the same script.
>>
>wow, great.  could you make a gus plugin from that?
>
>Should not be a big problem, I will ask Poliana to do that... she can ocassionally contact you asking for some details... at the end we will put things being debugged/developed by us at : www.biowebdb.org and also provide them to any interested people. In an ideal world, nobody should suffer twice with the same "bug" ;-)
>
>>Regarding Interpro and Glimmer, the main problem is to know in which
>>tables we should load the parsed results ?
>>
>* describe the info you want to store.
>
>Basically this:
>
>Frame_Hit, Method , Method_Accession, Accession, Hit_Status, Query_Start, Query_End, Description, E_value
>
>Again, I am asking Poliana to take care of that.
>
>* steve
>
>Cheers, Alberto

-----------------
Ed Robinson
Center for Tropical and Emerging Global Diseases
University of Georgia, Athens, GA 30602
ero...@ug.../(706)542.1447/254.8883 |
From: Ed R. <ero...@ug...> - 2005-02-14 14:41:20
|
This question is a bit more complex than it seems.

All three of these may be necessary at every level of analysis. First, was the overall gene prediction/feature prediction reviewed, and how was it algorithmically arrived at? Then you can ask the same question about the locations. Early locations may be provided by the same algorithm as the feature, but these may be further defined later and require their own review annotations.

One thing that I think needs to be addressed, however, is how these columns appear throughout the schema like mushrooms. I have been told that GUS was hyper-normalized to 4NF or 5NF when it was first written, but that is definitely not the case now. Why do you want review_status, reviewer and algorithm in the feature table? Since these will usually appear in clusters (i.e. running an algorithm once gives you 500 cases of the same entry for all three on the same date), shouldn't we collect these into an annotation_status table and just have an FK to an entry in that table from every table that uses them? The same goes for other SRes entries such as taxon, project, etc.

-Ed

---- Original message ----
>Date: Sat, 12 Feb 2005 22:34:03 -0500 (EST)
>From: "Sucheta Tripathy" <su...@vb...>
>Subject: [Gusdev-gusdev] dots.nalocation table
>To: Gus...@li...
>
>Hi Group,
>
>From community annotation point of view, I was wondering if it is a good
>idea to have is_reviewed, algorithm_id and reviewer_id in dots.nalocation
>table.
>
>Since one na_feature_id( a transcript or a gene) may be having multiple
>sets of nalocations, so one can easily capture them in nalocation with
>different algorithms and with a reviewed option.
>
>In our application we need several gene calling programs to have locations
>as well as related information registered.
>
>Sucheta
>
>--
>Sucheta Tripathy
>Virginia Bioinformatics Institute Phase-I
>Washington street.
>Virginia Tech.
>Blacksburg,VA 24061-0447
>phone:(540)231-8138
>Fax: (540) 231-2606

-----------------
Ed Robinson
Center for Tropical and Emerging Global Diseases
University of Georgia, Athens, GA 30602
ero...@ug.../(706)542.1447/254.8883 |
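Ed's suggestion of a shared annotation_status table might look roughly like the sketch below, written in Python against an in-memory SQLite database. Everything here is an assumption made for illustration: the toy tables, the column names (annotation_status_id, run_date and so on) and the algorithm id 42 are invented and are not the actual GUS schema.

```python
import sqlite3

# Hypothetical toy schema, not the real GUS DDL: review status, reviewer and
# algorithm live in one shared table, and other tables point at it by FK.
SCHEMA = """
CREATE TABLE annotation_status (
    annotation_status_id INTEGER PRIMARY KEY,
    algorithm_id         INTEGER,
    reviewer_id          INTEGER,
    is_reviewed          INTEGER NOT NULL DEFAULT 0,
    run_date             TEXT
);
CREATE TABLE na_feature (
    na_feature_id        INTEGER PRIMARY KEY,
    name                 TEXT,
    annotation_status_id INTEGER REFERENCES annotation_status
);
CREATE TABLE na_location (
    na_location_id       INTEGER PRIMARY KEY,
    na_feature_id        INTEGER REFERENCES na_feature,
    start_min            INTEGER,
    end_max              INTEGER,
    annotation_status_id INTEGER REFERENCES annotation_status
);
"""

def build_demo():
    """Create the toy schema and load one algorithm run with 500 features."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # One row describes a whole algorithm run ...
    cur = conn.execute(
        "INSERT INTO annotation_status"
        " (algorithm_id, reviewer_id, is_reviewed, run_date)"
        " VALUES (?, ?, ?, ?)",
        (42, None, 0, "2005-02-14"),
    )
    status_id = cur.lastrowid
    # ... and the 500 features it produced all reference that single row.
    conn.executemany(
        "INSERT INTO na_feature (name, annotation_status_id) VALUES (?, ?)",
        [(f"gene_{i}", status_id) for i in range(500)],
    )
    conn.commit()
    return conn

conn = build_demo()
n_status = conn.execute("SELECT COUNT(*) FROM annotation_status").fetchone()[0]
n_features = conn.execute(
    "SELECT COUNT(*) FROM na_feature f"
    " JOIN annotation_status s USING (annotation_status_id)"
    " WHERE s.algorithm_id = 42"
).fetchone()[0]
print(n_status, n_features)
```

The point of the design is that the three repeated columns are stored once per run instead of once per row; marking the run as reviewed then means updating a single annotation_status row.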
From: davila <da...@io...> - 2005-02-13 21:06:30
|
Steve,
* see below

Alberto Davila wrote:

>We are doing this for Garsa (another system) .. basically we have a
>bioperl parser (Bio::Search::IO) that reads the Blast results file and
>extract all the needed info (to the "Blast_Hit" table)... and also load
>into a given table (eg: External_DB) all the sequences (in fasta format)
>presenting similarity with the queries... at the end we have "Blast_Hit"
>and "External_DB" populated with the same script.
>
wow, great.  could you make a gus plugin from that?

Should not be a big problem, I will ask Poliana to do that... she can ocassionally contact you asking for some details... at the end we will put things being debugged/developed by us at : www.biowebdb.org and also provide them to any interested people. In an ideal world, nobody should suffer twice with the same "bug" ;-)

>Regarding Interpro and Glimmer, the main problem is to know in which
>tables we should load the parsed results ?
>
* describe the info you want to store.

Basically this:

Frame_Hit, Method , Method_Accession, Accession, Hit_Status, Query_Start, Query_End, Description, E_value

Again, I am asking Poliana to take care of that.

* steve

Cheers, Alberto

Alberto M. R. Dávila, PhD
Kinetoplastid Biology and Disease (Biomed Central)
http://www.kinetoplastids.com
http://www.darwin.fiocruz.br
DBBM / Instituto Oswaldo Cruz / FIOCRUZ
Av. Brasil 4365
Rio de Janeiro, RJ, Brasil
CEP 21045-900
Email: da...@fi...
          amr...@ya...
Phone: 55-21-3865-8229 / 3865-8206
Fax: 55-21-2590-3495
-------------------------------------------
The BiowebDB consortium: http://www.biowebdb.org |
From: Sucheta T. <su...@vb...> - 2005-02-13 16:50:50
|
What about a UTR? Also, I have difficulty finding the link between a transcript and its genefeature. We have a particular gene described in the genefeature view and its corresponding transcript in dots.transcript.

Sucheta

> in the case of gene, rna and protein, you would use Gene and
> GeneInstance (and similar for RNA, Protein)
>
> Gene and GeneInstance are designed for that. Gene is the "concept of
> the gene." GeneInstance links Gene and GeneFeature. Each "instance"
> represents a "proposal" for what the gene looks like, and what sequence
> it might be found on.
>
> steve

--
Sucheta Tripathy
Virginia Bioinformatics Institute Phase-I
Washington street.
Virginia Tech.
Blacksburg,VA 24061-0447
phone:(540)231-8138
Fax: (540) 231-2606 |
From: Steve F. <sfi...@pc...> - 2005-02-13 15:18:03
|
in the case of gene, rna and protein, you would use Gene and GeneInstance (and similar for RNA, Protein)

Gene and GeneInstance are designed for that. Gene is the "concept of the gene." GeneInstance links Gene and GeneFeature. Each "instance" represents a "proposal" for what the gene looks like, and what sequence it might be found on.

steve

Sucheta Tripathy wrote:

> Steve,
>
> I know the way we have been using GUS is slightly different from the way
> it was meant to be used. In our case, a scaffold gets an na_sequence_id,
> which has several features. Now, if we are to create different features
> for a particular gene for the different algorithms used, how would we
> link them together?
>
> Sucheta
>
>> sucheta-
>>
>> i think the way we have generally done this in the past is to add a new
>> feature, not location, for each algorithm or person that predicts or
>> edits it. that is why we have the review and algorithm info on the
>> feature.
>>
>> steve
>>
>> Sucheta Tripathy wrote:
>>
>>> Hi Group,
>>>
>>> From a community annotation point of view, I was wondering whether it
>>> would be a good idea to have is_reviewed, algorithm_id and reviewer_id
>>> in the dots.nalocation table.
>>>
>>> Since one na_feature_id (a transcript or a gene) may have multiple
>>> sets of nalocations, one could easily capture them in nalocation with
>>> different algorithms and with a reviewed option.
>>>
>>> In our application we need several gene-calling programs to have
>>> their locations, as well as related information, registered.
>>>
>>> Sucheta
|
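The Gene / GeneInstance / GeneFeature relationship described in the message above can be sketched roughly as follows. This is only an illustration using an in-memory SQLite database: the table layouts, the `algorithm_name` column, and the helper names `build_demo_db` and `proposals_for_gene` are simplified assumptions made for this example, not the actual GUS schema or API.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical, simplified tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- The "concept of the gene".
        CREATE TABLE gene (
            gene_id INTEGER PRIMARY KEY,
            name    TEXT
        );
        -- One row per proposed gene model (assumed columns).
        CREATE TABLE gene_feature (
            na_feature_id  INTEGER PRIMARY KEY,
            na_sequence_id INTEGER,
            algorithm_name TEXT
        );
        -- Links a gene to each feature that proposes what it looks like.
        CREATE TABLE gene_instance (
            gene_instance_id INTEGER PRIMARY KEY,
            gene_id          INTEGER REFERENCES gene(gene_id),
            na_feature_id    INTEGER REFERENCES gene_feature(na_feature_id)
        );
        """
    )
    conn.execute("INSERT INTO gene VALUES (1, 'gene 1')")
    # Three predictions of the same gene on the same scaffold (sequence 100).
    conn.executemany(
        "INSERT INTO gene_feature VALUES (?, ?, ?)",
        [(1, 100, "predictor A"), (2, 100, "predictor B"), (3, 100, "predictor C")],
    )
    conn.executemany(
        "INSERT INTO gene_instance (gene_id, na_feature_id) VALUES (?, ?)",
        [(1, 1), (1, 2), (1, 3)],
    )
    return conn


def proposals_for_gene(conn, gene_id):
    """Return (na_feature_id, algorithm_name) for every instance of a gene."""
    return conn.execute(
        """
        SELECT gf.na_feature_id, gf.algorithm_name
        FROM gene_instance gi
        JOIN gene_feature gf ON gf.na_feature_id = gi.na_feature_id
        WHERE gi.gene_id = ?
        ORDER BY gf.na_feature_id
        """,
        (gene_id,),
    ).fetchall()


if __name__ == "__main__":
    for row in proposals_for_gene(build_demo_db(), 1):
        print(row)
```

The point of the sketch is that the predictions stay separate features, and only the instance rows tie them to one gene, so adding another algorithm's call means adding one feature and one instance row.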
From: Sucheta T. <su...@vb...> - 2005-02-13 14:43:19
|
Steve,

I know the way we have been using GUS is slightly different from the way it was meant to be used. In our case, a scaffold gets an na_sequence_id, which has several features. Now, if we are to create different features for a particular gene for the different algorithms used, how would we link them together?

Sucheta

> sucheta-
>
> i think the way we have generally done this in the past is to add a new
> feature, not location, for each algorithm or person that predicts or
> edits it. that is why we have the review and algorithm info on the
> feature.
>
> steve
>
> Sucheta Tripathy wrote:
>
>> Hi Group,
>>
>> From a community annotation point of view, I was wondering whether it
>> would be a good idea to have is_reviewed, algorithm_id and reviewer_id
>> in the dots.nalocation table.
>>
>> Since one na_feature_id (a transcript or a gene) may have multiple
>> sets of nalocations, one could easily capture them in nalocation with
>> different algorithms and with a reviewed option.
>>
>> In our application we need several gene-calling programs to have
>> their locations, as well as related information, registered.
>>
>> Sucheta

--
Sucheta Tripathy
Virginia Bioinformatics Institute Phase-I
Washington street.
Virginia Tech.
Blacksburg, VA 24061-0447
phone:(540)231-8138
Fax: (540) 231-2606
|
From: Steve F. <sfi...@pc...> - 2005-02-13 14:02:11
|
sucheta-

i think the way we have generally done this in the past is to add a new feature, not location, for each algorithm or person that predicts or edits it. that is why we have the review and algorithm info on the feature.

steve

Sucheta Tripathy wrote:

> Hi Group,
>
> From a community annotation point of view, I was wondering whether it
> would be a good idea to have is_reviewed, algorithm_id and reviewer_id
> in the dots.nalocation table.
>
> Since one na_feature_id (a transcript or a gene) may have multiple
> sets of nalocations, one could easily capture them in nalocation with
> different algorithms and with a reviewed option.
>
> In our application we need several gene-calling programs to have
> their locations, as well as related information, registered.
>
> Sucheta
|
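The "new feature, not location, per algorithm or person" convention from the message above might look like this in outline. It is a sketch only: the `Feature` class, the `add_prediction` helper, and the rule that a feature counts as reviewed when a reviewer is given are all assumptions made for illustration, not GUS object-layer code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Feature:
    """Hypothetical stand-in for one feature row and its locations."""
    na_feature_id: int
    algorithm_id: Optional[int]        # set when a program made the call
    is_reviewed: bool = False
    reviewer_id: Optional[int] = None  # set when a person made or edited it
    locations: List[Tuple[int, int]] = field(default_factory=list)


def add_prediction(features, spans, algorithm_id=None, reviewer_id=None):
    """Record a prediction or edit as a NEW feature with its own locations.

    Existing features are never modified, so the algorithm and review
    information stays on the feature rather than on each location.
    """
    next_id = max((f.na_feature_id for f in features), default=0) + 1
    feature = Feature(
        na_feature_id=next_id,
        algorithm_id=algorithm_id,
        # Assumption for this sketch: a human edit counts as reviewed.
        is_reviewed=reviewer_id is not None,
        reviewer_id=reviewer_id,
        locations=list(spans),
    )
    features.append(feature)
    return feature


if __name__ == "__main__":
    features = []
    add_prediction(features, [(100, 200), (300, 400)], algorithm_id=1)
    add_prediction(features, [(100, 210), (300, 400)], algorithm_id=2)
    add_prediction(features, [(100, 205), (300, 400)], reviewer_id=7)
    for f in features:
        print(f)
```

Each gene caller and each curator edit ends up as a separate feature carrying its own set of locations, which is why the location records themselves need no algorithm or review columns under this convention.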
From: Sucheta T. <su...@vb...> - 2005-02-13 03:34:20
|
Hi Group,

From a community annotation point of view, I was wondering whether it would be a good idea to have is_reviewed, algorithm_id and reviewer_id in the dots.nalocation table.

Since one na_feature_id (a transcript or a gene) may have multiple sets of nalocations, one could easily capture them in nalocation with different algorithms and with a reviewed option.

In our application we need several gene-calling programs to have their locations, as well as related information, registered.

Sucheta

--
Sucheta Tripathy
Virginia Bioinformatics Institute Phase-I
Washington street.
Virginia Tech.
Blacksburg, VA 24061-0447
phone:(540)231-8138
Fax: (540) 231-2606
|