From: Deborah F. P. <pi...@sn...> - 2002-07-25 12:22:08
Hi,

startRunGenerateBlastSimilarity.pl is a Perl script that orchestrates BLAST runs on a cluster. In this case the query sequences are the assemblies and the subject databases are sequences from prodom, nrdb, or cdd. We are currently running the script via a controller that Steve Fischer wrote.

Here is an example of an old command line that might give you some information:

  startRunGenerateBlastSimilarity.pl --regex '(\S+)' --blastProgram blastp \
    --database prodom2001 --seqFile ~/BlastRuns/plas_v3.1VsProdom.cds \
    --setsize 500 --nodes '72, 73' --blastParams 'wordmask=seg+xnu W=3 T=1000'

And here is a sample output:

  >40142473 (0 subjects)
  >40142515 (0 subjects)
  >40142529 (0 subjects)
  >40142583 (0 subjects)
  >40142613 (1 subjects)
  Sum: 8022685:114:1.2e-09:7:84:2:227:2:66:33:43:1:-1
  HSP1: 8022685:23:32:49:114:1.2e-09:36:84:2:148:1:-1
  HSP2: 8022685:10:11:17:44:1.2e-09:7:23:177:227:1:-3
  >40142617 (0 subjects)
  >97378349 (1 subjects)

We extract the assembly, prodom, and nrdb sequences from the Assembly view of NASequenceImp and the MotifAASequence and ExternalAASequence views of AASequenceImp. We do this using another Perl script, dumpSequencesFromTable.pl, so that they have GUS identifiers. CDD sequences are handled using RPS-BLAST, so GUS identifiers have to be attached afterwards using substitutePrimKeysInSimilarity.pl.

Finally, the resulting similarity files are loaded into the Similarity table (from which previous similarities for that taxon have been deleted) using LoadBlastSimilaritiesPK.pm.

There are more details, but I may have already told you more than you want to know right now. We are hard at work on a script that automates the entire DoTS build and annotation process. It is in the development stage at the moment, although we are using it for a large part of the process.

I hope this helps.

Deborah Pinney

On 25 Jul 2002, Keith James wrote:
>
> Hi,
>
> Sorry if you get this twice - the first time I posted nobody at Sanger
> seemed to get my message.
>
> In plugins I found LoadBlastSimilaritiesPK.pm which appears to use
> output from a script called generateBlastSimilarity.pl (which we do
> not have). What does generateBlastSimilarity.pl do and what file
> format does it produce, please?
>
> thanks,
>
> Keith
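
For anyone who wants to inspect a similarity file before loading it, the sample output in the reply can be read with a few lines of Python. This is only a rough sketch, not the format definition used by LoadBlastSimilaritiesPK.pm: it assumes each `>` header, `Sum:` record, and `HSPn:` record sits on its own line, and because the message does not say what the colon-separated fields mean, they are kept as raw strings. The names `Query` and `parse_similarity` are made up for this illustration.

```python
import re
from dataclasses import dataclass, field

# Header line, e.g. ">40142613 (1 subjects)"
QUERY_RE = re.compile(r"^>(\S+)\s+\((\d+) subjects?\)$")
# Record line, e.g. "Sum: 8022685:114:..." or "HSP1: 8022685:23:..."
RECORD_RE = re.compile(r"^(Sum|HSP\d+):\s*(\S+)$")


@dataclass
class Query:
    """One query sequence and the records listed under it (hypothetical type)."""
    query_id: str
    subject_count: int
    # Each record is (label, fields); fields stay as uninterpreted strings.
    records: list = field(default_factory=list)


def parse_similarity(text):
    """Group Sum/HSP records under the most recent '>' header line."""
    queries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = QUERY_RE.match(line)
        if header:
            queries.append(Query(header.group(1), int(header.group(2))))
            continue
        record = RECORD_RE.match(line)
        if record and queries:
            queries[-1].records.append((record.group(1), record.group(2).split(":")))
    return queries


# Excerpt of the sample output quoted in the message.
SAMPLE = """\
>40142583 (0 subjects)
>40142613 (1 subjects)
Sum: 8022685:114:1.2e-09:7:84:2:227:2:66:33:43:1:-1
HSP1: 8022685:23:32:49:114:1.2e-09:36:84:2:148:1:-1
HSP2: 8022685:10:11:17:44:1.2e-09:7:23:177:227:1:-3
>40142617 (0 subjects)
"""

if __name__ == "__main__":
    for q in parse_similarity(SAMPLE):
        print(q.query_id, q.subject_count, [label for label, _ in q.records])
```

Leaving the fields as strings is deliberate: the column meanings should come from the GUS scripts themselves rather than being guessed from one example.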