From: Brian B. <br...@pc...> - 2007-01-16 14:11:38
Dhivya,

Your gus.config file ($GUS_HOME/config/gus.config) specifies the database login information. The GUS application uses this file to determine which database to connect to, so to switch between databases you will need to edit it.

-Brian

On Jan 15, 2007, at 2:52 PM, Dhivya Aras wrote:

> Hi Chris,
>
> Thanks for that information. I was able to get all the input parameters, and you were right about gi_taxid_prot.dmp: it is a download from NCBI, though not part of the nrdb bundle.
>
> I didn't know who would be the right person for this question, so Chris, Brian, or anyone in the group, if you have any ideas, let me know.
>
> I have two gus dbs on my machine. I actually wanted to run the loadNRDB script on the backup version, but I couldn't find any input parameter where I can specify which database the NRDB will be loaded into. How would I specify this? Does it have to do with the location/path where I'm running the script from?
>
> thanks
> dhivya
>
> Chris Stoeckert <sto...@pc...> wrote:
> Hi Dhivya,
> I'm putting this back onto the gusdev so that answers may help others (or others can correct my answers). I should also warn you that as PI of the project, I don't actually run any of the code so this is a test of how well I understand what's going on. ;)
>
>> sres.externaldatabaserelease.version for this instance of NRDB
>> sres.externaldatabase.name for NRDB
>
> To load nrdb or any other external "database" (really data source), that source needs to be entered into externaldatabase and the version (can simply be the date when you downloaded it) entered into externaldatabaserelease. These can be entered manually into those tables. If this is a new version, then NRDB should already be in externaldatabase. You simply need to enter a new row in externaldatabaserelease for NRDB and give the plugin whatever you put in the version field.
>
>> pathname for the gi_taxid_prot.dmp file
>
> I'm guessing that this is a pointer to the dump file that comes with NRDB providing the taxon_id for each protein sequence, but that's just a guess.
>
> Chris
>
> On Jan 12, 2007, at 5:01 PM, Dhivya Aras wrote:
>
>> Hi Chris,
>>
>> I'm trying to load a new NRDB version into gus using the loadNRDB plugin. It requires several compulsory input parameters. I don't understand what three of them are. The ga help describes them as:
>>
>> externalDatabaseVersion *string* (Required)
>>     sres.externaldatabaserelease.version for this instance of NRDB
>>
>> gitax *file* (Required)
>>     pathname for the gi_taxid_prot.dmp file
>>
>> and
>>
>> externalDatabaseName *string* (Required)
>>     sres.externaldatabase.name for NRDB
>>
>> Could you point me to some docs or information about these arguments?
>> thanks
>> dhivya
>>
>> Chris Stoeckert <sto...@pc...> wrote:
>> Hi Brian,
>> Dhivya found the Djob plugin but did not find any documentation on how to run it. Can you point him at the appropriate place or person? Can this be added to the GUS svn somewhere?
>> Thanks,
>> Chris
>>
>> Chris Stoeckert, Ph.D.
>> Research Professor, Dept. of Genetics
>> 1415 Blockley Hall, Center for Bioinformatics
>> 423 Guardian Dr., University of Pennsylvania
>> Philadelphia, PA 19104
>> Ph: 215-573-4409 FAX: 215-573-3111
>> http://www.cbil.upenn.edu
>>
>> On Jan 11, 2007, at 8:15 AM, Brian Brunk wrote:
>>
>>> blastSimilarity does not appear to be in the GUS distribution, nor is it in CBIL/Bio. In my project_home it is in DJob. Seems to me like blastSimilarity should be in the GUS distribution that one can download from the gusdb.org site (or check out of the repository). I also have a script called parseBlastFilesForSimilarity.pl that takes in BLAST file names on stdin; it is very useful for parsing blast files into the format to be loaded into the db, and it could be included.
>>>
>>> -Brian
>>>
>>> On Jan 10, 2007, at 4:52 PM, Chris Stoeckert wrote:
>>>
>>>> Hi,
>>>> Can anyone help with this question about the input file to InsertBlastSimilarities?
>>>> Thanks,
>>>> Chris
>>>>
>>>> Begin forwarded message:
>>>>
>>>>> From: Dhivya Aras <dhi...@ya...>
>>>>> Date: January 10, 2007 4:25:25 PM EST
>>>>> To: Chris Stoeckert <sto...@pc...>
>>>>> Subject: Re: [GUSDEV] loading COG and blast annotation results into GUS
>>>>>
>>>>> Hi Chris,
>>>>>
>>>>> I understand that to load the blast data into the two tables, similarity and similarityspan, I need to use the GUS-supported plugin InsertBlastSimilarities. But this plugin asks for an input file 'generated by the blastSimilarity command (distributed with GUS in the CBIL/Bio component)'. Any idea where I can find this blastSimilarity utility?
>>>>>
>>>>> Thanks
>>>>> dhivya
>>>>>
>>>>> Chris Stoeckert <sto...@pc...> wrote:
>>>>> Again, answers in-line.
>>>>> Chris
>>>>>
>>>>> On Jan 9, 2007, at 2:27 PM, Dhivya Aras wrote:
>>>>>
>>>>>> Hi Chris,
>>>>>>
>>>>>> I did look at the GUS schema browser. Unfortunately most tables don't have any documentation; I could just see the attributes in each table and maybe a small description of the attribute.
>>>>>>
>>>>>> But thanks to your reply, I now understand the need for the two tables, similarity and similarityspan. The way I understand it, query_table_id points to the table that holds the query sequence data and query_id indicates the row in that table. So, for example, if query_table_id points to externalNASequence, I'm assuming query_id points to the primary key of that table, Na_Sequence_ID. Am I right in this assumption?
>>>>>
>>>>> yes that's right.
>>>>>
>>>>>> Basically, I have an existing gus db with data and I have some new blastp results of AA sequences against NRDB.
>>>>>> Here's what I think needs to be done to put these new blast results into the GUS db. Please fill in gaps as I'm vague on some areas.
>>>>>>
>>>>>> 1. Store each hsp in the similarityspan table. I've mapped all the blast fields to the table's fields; that's not a problem.
>>>>>
>>>>> yes
>>>>>
>>>>>> 2. Since the query is an AA sequence, which table should the query_table_id point to? TranslatedAASequence with the query_id pointing to AA_Sequence_id?
>>>>>
>>>>> yes (assuming you are doing a blastp with a sequence from TranslatedAASequence - note that the AA sequence could also come from other views of AASequence).
>>>>>
>>>>>> 3. Since the subject is from the NRDB, I'm guessing that the subject_table_id should point to externalAASequence with the subject_id pointing to AA_Sequence_ID.
>>>>>
>>>>> yes (assuming you loaded nrdb into ExternalAASequence).
>>>>>
>>>>>> 4. I think these are the only tables I would be affecting for adding these new blastP results. Am I right?
>>>>>
>>>>> yes (mostly). Using ga you'll also get audit tables populated like algorithm invocation.
>>>>>
>>>>>> I know I've asked quite a few questions, but I'm really not able to find too much information on what the tables and fields mean and what they contain. So I'm hoping you can help me out.
>>>>>
>>>>> No problem - we need to improve the docs.
>>>>>
>>>>>> thanks
>>>>>> dhivya
>>>>>>
>>>>>> Chris Stoeckert <sto...@pc...> wrote:
>>>>>>
>>>>>> See answers in-line. Also, did you look at the documentation in the GUS schema browser? Some tables (I know many don't) actually have table and attribute descriptions. Were they too vague (i.e. do we need to improve them)?
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>> Thanks for replying. I have currently been working on putting my blast results in similarity and similarityspan tables. But, I have two questions about these tables.
>>>>>>> Maybe you could help me out here.
>>>>>>>
>>>>>>> 1. Similarity and SimilaritySpan have pretty much the same fields, except that SimilaritySpan is a child table of Similarity. So why do I even need the SimilaritySpan table?
>>>>>>
>>>>>> These tables have different purposes (and semantics). Think of Similarity as global (what's the overall similarity between two proteins) and SimilaritySpan as local (what are the individual HSPs).
>>>>>>
>>>>>>> 2. I couldn't find any fields in the Similarity table for storing the actual query and subject annotation. Most probably this can be done by referring to some other table with the annotation. But I find that the only two fields referring to other tables are query_table_id and subject_table_id, which just refer to core.TableInfo. I'm confused about these two fields and exactly how they can be used to refer to the query and subject annotation.
>>>>>>
>>>>>> The query and subject sequences are identified (as you may have guessed) with the soft links query_table_id and subject_table_id, although these attributes can point to anything relevant. Our semantics are that they point to the entities (e.g., nucleic acid sequence, amino acid sequence, possibly dbref) and annotation is associated with those entities.
>>>>>>
>>>>>>> Any help or suggestions would be helpful.
>>>>>>>
>>>>>>> Thanks
>>>>>>> dhivya
>>>>>>>
>>>>>>> Chris Stoeckert <sto...@pc...> wrote:
>>>>>>> Dear Dhivya,
>>>>>>> Sorry for the long delay in replying.
>>>>>>> You guessed correctly about Similarity and SimilaritySpan. These were designed to hold BLAST results (as well as results from other analyses).
>>>>>>>
>>>>>>> For ortholog tables you might check the GUS schema browser (http://www.gusdb.org/SchemaBrowser/) and scroll down to the categories: Paralogs and Family; Sequence Ortholog, Paralog, Family AA Ortholog.
>>>>>>>
>>>>>>> Looking over old notes for OrthoMCL, it looks like DoTS.BestSimilarityPair is the table in which we store summarized ortholog data for queries.
>>>>>>>
>>>>>>> Hope this helps,
>>>>>>> Chris
>>>>>>>
>>>>>>> On Dec 16, 2006, at 3:38 PM, Dhivya Aras wrote:
>>>>>>>
>>>>>>> > Hi everyone,
>>>>>>> >
>>>>>>> > I would like to store COG annotation and blast results in GUS. I did find two tables named similarity and similarityspan in the dots schema. It looks like these can hold blast results, but I need to investigate more.
>>>>>>> >
>>>>>>> > As far as COG is concerned, I couldn't find any table supporting this data. I was told that orthoMcl data is stored in dots.SequenceGroup and dots.SequenceSequenceGroup, but I'm not sure if that would best suit my needs. So, if anyone has used GUS for these purposes before or just has an idea, please let me know. I would really appreciate it.
>>>>>>> >
>>>>>>> > thanks
>>>>>>> > dhivya arasappan
Brian P. Brunk, Ph.D.
ApiDB Senior Manager
1424 Blockley Hall
Penn Center For Bioinformatics
University of Pennsylvania
Philadelphia PA 19104-6021
Tel: 215-573-3118
Fax: 215-573-3111
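
The soft-link idea discussed in the thread (query_table_id / subject_table_id name a table registered in core.TableInfo, and the matching id names a row in it) may be easier to follow with a small sketch. The example below is only an illustration under stated assumptions: it uses an in-memory SQLite database rather than a real GUS instance, and the schema is a heavily simplified stand-in. The names query_table_id, query_id, subject_table_id, TableInfo, Similarity, SimilaritySpan, TranslatedAASequence and ExternalAASequence come from the thread; subject_id, primary_key_column, description, score, the sample rows, and the helper functions are assumptions made for the demo.

```python
import sqlite3

# Simplified stand-in schema: NOT the real GUS DDL. Columns beyond those
# named in the thread (e.g. description, score) are assumptions for the demo.
SCHEMA = """
CREATE TABLE TableInfo (
    table_id INTEGER PRIMARY KEY, name TEXT, primary_key_column TEXT
);
CREATE TABLE TranslatedAASequence (aa_sequence_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE ExternalAASequence (aa_sequence_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE Similarity (
    similarity_id INTEGER PRIMARY KEY,
    query_table_id INTEGER, query_id INTEGER,
    subject_table_id INTEGER, subject_id INTEGER,
    score REAL
);
CREATE TABLE SimilaritySpan (
    similarity_span_id INTEGER PRIMARY KEY,
    similarity_id INTEGER REFERENCES Similarity(similarity_id),
    score REAL
);
INSERT INTO TableInfo VALUES (1, 'TranslatedAASequence', 'aa_sequence_id');
INSERT INTO TableInfo VALUES (2, 'ExternalAASequence', 'aa_sequence_id');
INSERT INTO TranslatedAASequence VALUES (101, 'predicted protein (query)');
INSERT INTO ExternalAASequence VALUES (501, 'NRDB entry (subject)');
-- One overall (global) similarity with two individual HSPs (local).
INSERT INTO Similarity VALUES (1, 1, 101, 2, 501, 250.0);
INSERT INTO SimilaritySpan VALUES (1, 1, 180.0);
INSERT INTO SimilaritySpan VALUES (2, 1, 70.0);
"""


def build_demo_db():
    """Create the in-memory demo database with one similarity and two spans."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def resolve_soft_link(conn, table_id, row_id):
    """Follow a (table_id, row_id) soft link to the row it identifies."""
    table, pk = conn.execute(
        "SELECT name, primary_key_column FROM TableInfo WHERE table_id = ?",
        (table_id,),
    ).fetchone()
    # Identifiers cannot be bound as parameters; here they come from the
    # demo's own TableInfo rows. The row id is still bound as a parameter.
    row = conn.execute(
        f"SELECT description FROM {table} WHERE {pk} = ?", (row_id,)
    ).fetchone()
    return table, row[0]


def describe_similarity(conn, similarity_id):
    """Resolve both ends of a Similarity row and count its child spans."""
    q_tab, q_id, s_tab, s_id = conn.execute(
        "SELECT query_table_id, query_id, subject_table_id, subject_id "
        "FROM Similarity WHERE similarity_id = ?",
        (similarity_id,),
    ).fetchone()
    (hsp_count,) = conn.execute(
        "SELECT COUNT(*) FROM SimilaritySpan WHERE similarity_id = ?",
        (similarity_id,),
    ).fetchone()
    return {
        "query": resolve_soft_link(conn, q_tab, q_id),
        "subject": resolve_soft_link(conn, s_tab, s_id),
        "hsp_count": hsp_count,
    }


if __name__ == "__main__":
    print(describe_similarity(build_demo_db(), 1))
```

The two-step lookup (first TableInfo, then the named table) is what makes the link "soft": Similarity carries no foreign key to any particular sequence table, so the same columns can point at TranslatedAASequence, ExternalAASequence, or another relevant table. In a real GUS instance the loading itself would go through the InsertBlastSimilarities plugin rather than hand-written inserts like these.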