From: Amr A. <am...@br...> - 2009-03-25 14:36:47
|
Apparently running this process has broken amosvalidate. I have 2 versions of a wrapper script that includes amosvalidate. One version essentially runs through the process outline below, while the other does not use the options to give a seq and qual file. Amosvalidate works fine in this version, but when running it on a bank file generated with the seq and qual files included, it breaks on the analyzeSNPs step as follows: /Doing step 400: Analyzing SNPs Command: /seq/finishing/specialprojects/HawkEye2/amos-2.0.8/bin/analyzeSNPs -i -b L36066orchid.ace.0.hawkeye.bnk -S -cumqv 40 -minsnps 2 -r -H > L36066orchid.ace.0.hawkeye.snps exited with status: 134/ Log file contents, offending error in bold text: !!! 2009-03-25 10:34:03 Started by amr@node170 on Wed Mar 25 10:34:03 2009 !!! 2009-03-25 10:34:03 Doing step 100: Creating FastA !!! 2009-03-25 10:34:03 Running: /seq/finishing/specialprojects/HawkEye2/amos-2.0.8/bin/bank2fasta -b L36066orchid.ace.0.hawkeye.bn k > L36066orchid.ace.0.hawkeye.fasta !!! 2009-03-25 10:34:03 Done! Elapsed time:0d 0h 0m 0s !!! 2009-03-25 10:34:03 Doing step 400: Analyzing SNPs !!! 2009-03-25 10:34:03 Running: /seq/finishing/specialprojects/HawkEye2/amos-2.0.8/bin/analyzeSNPs -i -b L36066orchid.ace.0.hawkey e.bnk -S -cumqv 40 -minsnps 2 -r -H > L36066orchid.ace.0.hawkeye.snps *Searching 1 contigs..................................................terminate called after throwing an instance of 'AMOS::ArgumentE xception_t'* * what(): IID '3177' does not exist in bank* /bin/sh: line 1: 25073 Aborted /seq/finishing/specialprojects/HawkEye2/amos-2.0.8/bin/analyzeSNPs -i -b L36066orchid .ace.0.hawkeye.bnk -S -cumqv 40 -minsnps 2 -r -H >L36066orchid.ace.0.hawkeye.snps !!! 2009-03-25 10:34:06 Command: /seq/finishing/specialprojects/HawkEye2/amos-2.0.8/bin/analyzeSNPs -i -b L36066orchid.ace.0.hawkey e.bnk -S -cumqv 40 -minsnps 2 -r -H > L36066orchid.ace.0.hawkeye.snps exited with status: 134 !!! END - Elapsed time: 0d 0h 0m 3s Amr Abouelleil wrote: > Excellent! I was able to replicate your results, thanks for your help > Michael and Dan for working it out. I agree, this process should be > documented in the manual. > > Dan Bolser wrote: > >> 2009/3/24 Amr Abouelleil <am...@br...>: >> >> >>> Good to hear, I am working on replicating your results with one of our >>> assemblies. >>> Is the seq file just a multifasta containing every read ? Can you outline >>> everything you did so I can try to replicate it? >>> >>> >> Yeah... I was wondering if you would say that ;-) >> >> From the output of Phred I have a .seq and a .qual, they are both >> multifasta files containing the called sequence and the call >> qualities, respectively. >> >> Here is an example: >> >> $ head *.seq *.qual >> ==> read_seq_BAC04.screen.screen.real.seq <== >> >> >>> KN654-BAC04-01_A01.b.abi CHROMAT_FILE: KN654-BAC04-01_A01.b.abi PHD_FILE: KN654-BAC04-01_A01.b.abi.phd.1 CHEM: term DYE: big TIME: Wed Feb 25 11:03:10 2009 >>> >>> >> NNNAAAGCGCGAGATCAGCTTGATTCGTTTGAAATTTCTGAAAATGTTTT >> TTTATTTTTTATTTTTAAGATTCGGTATCTGACTCAATGACCCAATAATA >> AATGGATTAATTATTGATGAATTGAACAGGTGTCATGGAATATGCAAAGT >> TTTAAGCATACTGTGAAACGGTTAGAAGAAGTTTCAGTTTCATGTAGAGG >> GATAGAGAGAGTTCAGTTTCAGTTTCATGTGTATCAACTAGTTGATAATA >> TAACATAACTTGTAAAGAAATAATTTCAAATTGCATTTGTATCATTACTT >> TTTCAATAAGCATGTACTAGCTATATGTAATTCCAAAAAGGTCAAGCCAA >> TTATACTCACCGATTATATTATTTTGCCATAAAAATATCATCACTTCAAT >> TTGGTATATATAAATATAAACTAAATTGCTCAAAAAAAGTAAAATAACCA >> >> ==> read_seq_BAC04.screen.screen.real.qual <== >> >> >>> KN654-BAC04-01_A01.b.abi PHD_FILE: KN654-BAC04-01_A01.b.abi.phd.1 >>> >>> >> 0 0 0 3 6 6 6 6 8 10 11 13 8 8 8 12 13 22 24 15 18 >> 18 20 19 20 26 26 28 25 23 23 19 21 28 32 37 37 37 >> 37 37 32 35 37 37 37 47 47 50 59 59 31 36 36 36 36 >> 36 59 50 50 50 50 50 50 40 40 40 42 42 42 57 57 57 >> 68 68 68 51 51 48 48 48 53 57 54 48 47 41 41 48 50 >> 50 50 51 68 68 68 68 68 68 68 68 68 68 68 68 68 68 >> 68 68 68 57 57 57 57 57 57 57 68 68 68 68 68 68 68 >> 68 68 68 68 68 68 68 68 68 68 68 68 68 68 68 68 68 >> 68 68 68 68 68 68 68 68 57 68 50 50 57 57 57 68 68 >> >> $ grep -c "^>" *.seq *.qual >> read_seq_BAC04.screen.screen.real.seq:960 >> read_seq_BAC04.screen.screen.real.qual:960 >> >> >> From the output of Phrap, I have a .ace file describing the assembly >> (which I sent you previously). >> >> >> I run the following recipe: >> >> $ toAmos \ >> -ace read_seq_BAC04.screen.screen.ace \ >> -q read_seq_BAC04.screen.screen.real.qual \ >> -s read_seq_BAC04.screen.screen.real.seq \ >> -o read_seq_BAC04.screen.screen.afg >> >> $ amos2sq read_seq_BAC04.screen.screen.afg >> >> $ bank-transact -c -f \ >> -m read_seq_BAC04.screen.screen.afg \ >> -b read_seq_BAC04.screen.screen.bnk >> >> $ analyzeSNPs -b read_seq_BAC04.screen.screen.bnk | wc -l >> >> etc. >> >> >> >> Example output (after amos2sq): >> >> $ head read_seq_BAC04.screen.screen.seq read_seq_BAC04.screen.screen.qual >> ==> read_seq_BAC04.screen.screen.seq <== >> >> >>> KN654-BAC04-01_A01.b.abi 0 0 0 32 1142 >>> >>> >> NNNAAAGCGCGAGATCAGCTTGATTCGTTTGAAATTTCTGAAAATGTTTTTTTATTTTTT >> ATTTTTAAGATTCGGTATCTGACTCAATGACCCAATAATAAATGGATTAATTATTGATGA >> ATTGAACAGGTGTCATGGAATATGCAAAGTTTTAAGCATACTGTGAAACGGTTAGAAGAA >> GTTTCAGTTTCATGTAGAGGGATAGAGAGAGTTCAGTTTCAGTTTCATGTGTATCAACTA >> GTTGATAATATAACATAACTTGTAAAGAAATAATTTCAAATTGCATTTGTATCATTACTT >> TTTCAATAAGCATGTACTAGCTATATGTAATTCCAAAAAGGTCAAGCCAATTATACTCAC >> CGATTATATTATTTTGCCATAAAAATATCATCACTTCAATTTGGTATATATAAATATAAA >> CTAAATTGCTCAAAAAAAGTAAAATAACCAATAAGACTGATGTTCACGTCAAATAGGTCC >> ATCAATTCCCAACATCTCATAGGTTCATCAATTTTTAATATTGAATCCCGCTCAAATTAT >> >> ==> read_seq_BAC04.screen.screen.qual <== >> >> >>> KN654-BAC04-01_A01.b.abi >>> >>> >> 01 01 01 03 06 06 06 06 08 10 11 13 08 08 08 12 13 >> 22 24 15 18 18 20 19 20 26 26 28 25 23 23 19 21 28 >> 32 37 37 37 37 37 32 35 37 37 37 47 47 50 59 59 31 >> 36 36 36 36 36 59 50 50 50 50 50 50 40 40 40 42 42 >> 42 57 57 57 60 60 60 51 51 48 48 48 53 57 54 48 47 >> 41 41 48 50 50 50 51 60 60 60 60 60 60 60 60 60 60 >> 60 60 60 60 60 60 60 57 57 57 57 57 57 57 60 60 60 >> 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 >> 60 60 60 60 60 60 60 60 60 60 60 60 57 60 50 50 57 >> >> >> >> Thanks again to Michael for getting this working. >> >> HTH, >> Dan. >> >> >> >> >>> Dan Bolser wrote: >>> >>> >>>> 2009/3/23 Michael Schatz <mic...@gm...>: >>>> >>>> >>>> >>>>> Hi Dan, >>>>> >>>>> I looked at toAmos more carefully. Try specifying both the seq and qual >>>>> files along with the ACE file: >>>>> >>>>> $ toAmos -s prefix.seq -q prefix.qual -ace prefix.ace -o prefix.afg >>>>> >>>>> >>>>> >>>> WORKED! >>>> >>>> analyzeSNPs now shows the correct quality values... >>>> >>>> >>>> However, the first four columns of analyzeSNPs output are strictly >>>> unaffected by including the correct quality values in the input, which >>>> leads me to conclude that their absence isn't affecting the algorithm. >>>> >>>> Thanks for your help Mike, It would be great if the above info could >>>> get into the docs somewhere (assuming having the correct quality >>>> values makes a difference at some stage). >>>> >>>> >>>> Thanks again, >>>> Dan. >>>> >>>> >>>> >>>> >>>> >>>>> I think if you do this, you have to be sure that you (and the assembler) >>>>> didn't make any edits to the reads. I know some assemblers will edit the >>>>> reads to correct sequencing errors. >>>>> >>>>> Let me know if this works for you, >>>>> >>>>> Mike >>>>> >>>>> >>>>> >>>>> >>>>> On Thu, Mar 19, 2009 at 1:23 PM, Dan Bolser <dan...@gm...> wrote: >>>>> >>>>> >>>>> >>>>>> 2009/3/19 Dan Bolser <dan...@gm...>: >>>>>> >>>>>> >>>>>> >>>>>>> 2009/3/19 Michael Schatz <mic...@gm...>: >>>>>>> >>>>>>> >>>>>>> >>>>>>>> Sorry, the option is -q not -qual. (Too much late night hacking). A >>>>>>>> qual >>>>>>>> file is like a multifasta file, but with quality values. The scores >>>>>>>> should >>>>>>>> be 5' to 3' just like a multifasta file of reads. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> identifier >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> 23 20 30 33 34 ... >>>>>>>> >>>>>>>> Try that and let us know. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> Ahh... sorry, I didn't follow my own advice and check "toAmos -h"... >>>>>>> >>>>>>> >>>>>>> So I just tried this: >>>>>>> >>>>>>> $ toAmos -ace read_seq_BAC04.screen.screen.ace -q test.qual -o >>>>>>> read_seq_BAC04.screen.screen.afg >>>>>>> >>>>>>> $ amos2sq read_seq_BAC04.screen.screen.afg >>>>>>> >>>>>>> >>>>>>> >>>>>> Sorry for confusion, I should have said: >>>>>> >>>>>> $ toAmos -ace read_seq_BAC04.screen.screen.ace -q >>>>>> read_seq_BAC04.screen.screen.real.qual -o >>>>>> read_seq_BAC04.screen.screen.afg >>>>>> >>>>>> $ amos2sq read_seq_BAC04.screen.screen.afg >>>>>> >>>>>> >>>>>> i.e. ignore test.qual >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> No joy... >>>>>>> >>>>>>> head -n 3 read_seq_BAC04.screen.screen.real.qual >>>>>>> read_seq_BAC04.screen.screen.qual >>>>>>> ==> read_seq_BAC04.screen.screen.real.qual <== >>>>>>> >>>>>>> >>>>>>> >>>>>>>> KN654-BAC04-01_A01.b.abi PHD_FILE: KN654-BAC04-01_A01.b.abi.phd.1 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> 0 0 0 3 6 6 6 6 8 10 11 13 8 8 8 12 13 22 24 15 18 >>>>>>> 18 20 19 20 26 26 28 25 23 23 19 21 28 32 37 37 37 >>>>>>> >>>>>>> ==> read_seq_BAC04.screen.screen.qual <== >>>>>>> >>>>>>> >>>>>>> >>>>>>>> KN654-BAC04-01_A05.b.abi >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> 10 10 10 30 30 30 30 30 30 30 30 30 30 30 30 30 30 >>>>>>> 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 >>>>>>> >>>>>>> >>>>>>> Let me know and I'll supply the raw files. >>>>>>> >>>>>>> Dan. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> -Mike >>>>>>>> >>>>>>>> On Thu, Mar 19, 2009 at 9:42 AM, Dan Bolser <dan...@gm...> >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> 2009/3/19 Amr Abouelleil <am...@br...>: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> What format should the qual file take? I imagine I could script >>>>>>>>>> something to >>>>>>>>>> take the quality info for my reads from somewhere and put them in a >>>>>>>>>> format >>>>>>>>>> the -qual option will expect. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> I tried passing fasta format quality scores (the format produced by >>>>>>>>> amos2sq, bank2fasta, etc). But it doesn't work ... The error is >>>>>>>>> "Unknown option: qual" i.e. whatever the format, there is no such >>>>>>>>> option for "toAmos". >>>>>>>>> >>>>>>>>> I tried 'bank-combine' after passing the quality file to >>>>>>>>> tarchive2amos >>>>>>>>> (which does read qualities) but that didn't work either ... >>>>>>>>> bank-combine failed. I tried to manually merge the banks, but >>>>>>>>> analyzeSNPs failed on the resulting bank. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> Dan Bolser wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> 2009/3/19 Michael Schatz <mic...@gm...>: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Yes, that is right, the 10s and 30s are arbitrary and set with the >>>>>>>>>>>> -bq >>>>>>>>>>>> and >>>>>>>>>>>> -gq. I think if you specify a qual file in addition to an ace file >>>>>>>>>>>> (toAmos >>>>>>>>>>>> -ace prefix.ace -qual prefix.qual), it will load the real quality >>>>>>>>>>>> values. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> Gives: >>>>>>>>>>> Unknown option: qual >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Re my previous email: >>>>>>>>>>> >>>>>>>>>>> *) Does this make a difference to algorithms like analyzeSNPs ? >>>>>>>>>>> >>>>>>>>>>> *) How are the values of 10 / 30 linked to the bases (it doesn't >>>>>>>>>>> seem >>>>>>>>>>> to match up with the case of the sequence) ? >>>>>>>>>>> >>>>>>>>>>> *) Anyone know how to get the real quality values into the AFG / >>>>>>>>>>> BNK ? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks very much for help on any of the above. >>>>>>>>>>> >>>>>>>>>>> Dan. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Good luck! >>>>>>>>>>>> >>>>>>>>>>>> Michael Schatz >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Mar 18, 2009 at 9:15 AM, Amr Abouelleil >>>>>>>>>>>> <am...@br...> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Well I think I have solved this mystery. The reason we get >>>>>>>>>>>>> defaults >>>>>>>>>>>>> of >>>>>>>>>>>>> 30 >>>>>>>>>>>>> & 10 for quality values is because the ace file format does not >>>>>>>>>>>>> encode >>>>>>>>>>>>> quality values for individual reads, it simply encodes "low >>>>>>>>>>>>> quality" >>>>>>>>>>>>> vs >>>>>>>>>>>>> "high quality" using lower case or uppercase. I suspect that amos >>>>>>>>>>>>> is >>>>>>>>>>>>> seeing >>>>>>>>>>>>> the lowercase and giving quality scores of 10 and uppercase and >>>>>>>>>>>>> giving >>>>>>>>>>>>> quality scores of 30. Does this sound about right Michael? If so, >>>>>>>>>>>>> is >>>>>>>>>>>>> there >>>>>>>>>>>>> another way to get read quality scores into our bank files? >>>>>>>>>>>>> >>>>>>>>>>>>> See this link and find the ACE FILE FORMAT section: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> http://bozeman.mbt.washington.edu/consed/distributions/README.14.0.txt >>>>>>>>>>>>> >>>>>>>>>>>>> Dan Bolser wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> 2009/3/17 Michael Schatz <mic...@gm...>: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Dan, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> amos2sq reports the sequences and qualities from the *reads*, >>>>>>>>>>>>>>> but >>>>>>>>>>>>>>> bank2fasta >>>>>>>>>>>>>>> reports the sequences and qualities from the *contigs* (the >>>>>>>>>>>>>>> quality >>>>>>>>>>>>>>> values >>>>>>>>>>>>>>> are the consensus quality values). Try running dumpreads -r -q >>>>>>>>>>>>>>> prefix.bnk to >>>>>>>>>>>>>>> see if the *read* quality values look okay. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Michael, >>>>>>>>>>>>>> >>>>>>>>>>>>>> The difference between the sequence of the *reads* (amos2sq) and >>>>>>>>>>>>>> the >>>>>>>>>>>>>> sequence from the *contigs* (bank2fasta) is clear. The problem >>>>>>>>>>>>>> is >>>>>>>>>>>>>> that >>>>>>>>>>>>>> the reads seem to be associated with incorrect qualities in >>>>>>>>>>>>>> Amos... >>>>>>>>>>>>>> (amos2sq, dumpreads and analyzeSNPs all showing a read base >>>>>>>>>>>>>> quality >>>>>>>>>>>>>> of >>>>>>>>>>>>>> either 10 or 30). >>>>>>>>>>>>>> >>>>>>>>>>>>>> Actually it seems that the ace file from which the sequences are >>>>>>>>>>>>>> derived does not contain the qualities of the individual reads >>>>>>>>>>>>>> (only >>>>>>>>>>>>>> the qualities of the contig). Now that I see this its not >>>>>>>>>>>>>> surprising >>>>>>>>>>>>>> that Amos doesn't know the read base qualities. I wonder where >>>>>>>>>>>>>> the >>>>>>>>>>>>>> 10's and the 30's come from? I had guessed that it was based on >>>>>>>>>>>>>> the >>>>>>>>>>>>>> 'UPPER cAsE' crude encoding of qualities in the ace file, but >>>>>>>>>>>>>> that >>>>>>>>>>>>>> doesn't look correct... >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >From "amos2sq read_seq_BAC04.screen.screen.ace": >>>>>>>>>>>>>> >>>>>>>>>>>>>> head -n 3 read_seq_BAC04.screen.screen.seq >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> KN654-BAC04-01_A05.b.abi 0 0 0 4 1118 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> NNNTTCGCGCGATGACAGCTTGATTCGTCCACCCGACTACCCTCGACCAGCGTGAGAGAA >>>>>>>>>>>>>> ACATCGTGATCGACGAGCCCGAAATTGCCGATGGCGAGATCGGCCTCGCCAGAGCGAAGG >>>>>>>>>>>>>> >>>>>>>>>>>>>> head -n 3 read_seq_BAC04.screen.screen.qual >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> KN654-BAC04-01_A05.b.abi >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> 10 10 10 30 30 30 30 30 30 30 30 30 30 30 30 30 30 >>>>>>>>>>>>>> 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >From the original ACE: >>>>>>>>>>>>>> >>>>>>>>>>>>>> RD KN654-BAC04-01_A05.b.abi 1253 0 0 >>>>>>>>>>>>>> nnnttcgcgcgatgacagcttgattcgtccacccgactaccctcgaccag >>>>>>>>>>>>>> cgtgagagaaacatcgtgatcgacgagcccGAAATTGCCGATGGCGAGAT >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> i.e. the 10's don't map onto the lower case bases in the ACE >>>>>>>>>>>>>> file. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Two questions come to mind: >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1) Does the lack of (or error in) the read base qualities affect >>>>>>>>>>>>>> algorithms such as analyzeSNPs? >>>>>>>>>>>>>> 2) given the read base qualities in fasta format, how can I add >>>>>>>>>>>>>> them >>>>>>>>>>>>>> to the bank? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> To answer 2, I tried the following... >>>>>>>>>>>>>> >>>>>>>>>>>>>> $ tarchive2amos test.seq -o test.afg >>>>>>>>>>>>>> >>>>>>>>>>>>>> (where test.seq and test.qual are the fasta format read >>>>>>>>>>>>>> sequences >>>>>>>>>>>>>> and >>>>>>>>>>>>>> qualities taken from the phd's that were used to build the ace >>>>>>>>>>>>>> file >>>>>>>>>>>>>> above) >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> $ bank-transact -c -f -m test.afg -b test.bnk >>>>>>>>>>>>>> >>>>>>>>>>>>>> $ bank-combine merge.bnk read_seq_BAC04.screen.screen.bnk >>>>>>>>>>>>>> test.bnk >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> But I get the error... >>>>>>>>>>>>>> >>>>>>>>>>>>>> FATAL: Could not open bank file, test.bnk//CTG.ifo, No such file >>>>>>>>>>>>>> or >>>>>>>>>>>>>> directory >>>>>>>>>>>>>> there has been a fatal error, abort >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I tried to manually merge the banks by copying round files, but >>>>>>>>>>>>>> I >>>>>>>>>>>>>> don't know what I'm doing. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Any hints? Perhaps its not important in any case... >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>> Dan. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sorry for the confusion! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Michael Schatz >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, Mar 17, 2009 at 5:29 PM, Dan Bolser >>>>>>>>>>>>>>> <dan...@gm...> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Myself and Amr Abouelleil have been trying to track down a >>>>>>>>>>>>>>>> problem >>>>>>>>>>>>>>>> with analyzeSNPs, and it seems to be related to the difference >>>>>>>>>>>>>>>> between >>>>>>>>>>>>>>>> amos2sq and bank2fasta (full details below). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> amos2sq gives quality values of all 30 suggesting that the AFG >>>>>>>>>>>>>>>> has >>>>>>>>>>>>>>>> no >>>>>>>>>>>>>>>> quality information BUT, if I run bank2fasta on a bank created >>>>>>>>>>>>>>>> from >>>>>>>>>>>>>>>> the AFG, I DO see the correct quality values. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Note that amos2seq gives the (~800) read sequences (without >>>>>>>>>>>>>>>> quality >>>>>>>>>>>>>>>> information) and bank2fasta gives the (~10) contig sequences >>>>>>>>>>>>>>>> (with >>>>>>>>>>>>>>>> quality information). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Now the problem is that analyzeSNPs seems to be reporting the >>>>>>>>>>>>>>>> default >>>>>>>>>>>>>>>> quality values (like amos2seq). This is worrying, because the >>>>>>>>>>>>>>>> quality >>>>>>>>>>>>>>>> of the SNP is quite important for verifying its existence. Is >>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>> behaviour expected, or is it a bug related to the way that >>>>>>>>>>>>>>>> read >>>>>>>>>>>>>>>> sequence base qualities are encoded? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Amr, I'm not sure if you have anything to add? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks very much for any help, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Dan. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> P.S. I'll send the ACE file to anyone who needs it off list. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> P.P.S. I just looked a bit closer, and it seems that the >>>>>>>>>>>>>>>> output of >>>>>>>>>>>>>>>> amos2seq always has a quality of 10 or 30, while bank2fasta >>>>>>>>>>>>>>>> seems >>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>> have the 'proper range' of quality values. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> DETAILS: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> amos2sq -i read_seq_BAC04.screen.screen.afg -o test >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> bank2fasta -b read_seq_BAC04.screen.screen.bnk > test.2.seq -q >>>>>>>>>>>>>>>> test.2.qual >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Note that read_seq_BAC04.screen.screen.afg and >>>>>>>>>>>>>>>> read_seq_BAC04.screen.screen.bnk are created from the .ace >>>>>>>>>>>>>>>> output >>>>>>>>>>>>>>>> of >>>>>>>>>>>>>>>> Phred/Phrap like this: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> toAmos -ace read_seq_BAC04.screen.screen.ace \ >>>>>>>>>>>>>>>> -o read_seq_BAC04.screen.screen.afg >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> bank-transact -c -f -m read_seq_BAC04.screen.screen.afg \ >>>>>>>>>>>>>>>> -b read_seq_BAC04.screen.screen.bnk >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> $ grep -c "^>" test*.seq >>>>>>>>>>>>>>>> test.2.seq:9 >>>>>>>>>>>>>>>> test.seq:839 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> $ head -n 2 test*.qual >>>>>>>>>>>>>>>> ==> test.2.qual <== >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 1 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ==> test.qual <== >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> KN654-BAC04-01_A05.b.abi >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 10 10 10 30 30 30 30 30 30 30 30 30 30 30 30 30 30 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>>>>>>>>> Apps built with the Adobe(R) Flex(R) framework and Flex >>>>>>>>>>>>>>>> Builder(TM) >>>>>>>>>>>>>>>> are >>>>>>>>>>>>>>>> powering Web 2.0 with engaging, cross-platform capabilities. >>>>>>>>>>>>>>>> Quickly >>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>> easily build your RIAs with Flex Builder, the Eclipse(TM)based >>>>>>>>>>>>>>>> development >>>>>>>>>>>>>>>> software that enables intelligent coding and step-through >>>>>>>>>>>>>>>> debugging. >>>>>>>>>>>>>>>> Download the free 60 day trial. >>>>>>>>>>>>>>>> http://p.sf.net/sfu/www-adobe-com >>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>> AMOS-help mailing list >>>>>>>>>>>>>>>> AMO...@li... >>>>>>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/amos-help >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>>>>>>> Apps built with the Adobe(R) Flex(R) framework and Flex >>>>>>>>>>>>>> Builder(TM) >>>>>>>>>>>>>> are >>>>>>>>>>>>>> powering Web 2.0 with engaging, cross-platform capabilities. >>>>>>>>>>>>>> Quickly >>>>>>>>>>>>>> and >>>>>>>>>>>>>> easily build your RIAs with Flex Builder, the Eclipse(TM)based >>>>>>>>>>>>>> development >>>>>>>>>>>>>> software that enables intelligent coding and step-through >>>>>>>>>>>>>> debugging. >>>>>>>>>>>>>> Download the free 60 day trial. >>>>>>>>>>>>>> http://p.sf.net/sfu/www-adobe-com >>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>> AMOS-help mailing list >>>>>>>>>>>>>> AMO...@li... >>>>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/amos-help >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> _____________________________________ >>>>>>>>>>>>> Amr Abouelleil >>>>>>>>>>>>> Computer Finishing Coordinator >>>>>>>>>>>>> The Broad Institute of MIT & Harvard >>>>>>>>>>>>> Phone: 617-324-2396 >>>>>>>>>>>>> Fax: 617-324-1866 >>>>>>>>>>>>> _____________________________________ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> _____________________________________ >>>>>>>>>> Amr Abouelleil >>>>>>>>>> Computer Finishing Coordinator >>>>>>>>>> The Broad Institute of MIT & Harvard >>>>>>>>>> Phone: 617-324-2396 >>>>>>>>>> Fax: 617-324-1866 >>>>>>>>>> _____________________________________ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> >>>>> >>>>> >>> -- >>> _____________________________________ >>> Amr Abouelleil >>> Computer Finishing Coordinator >>> The Broad Institute of MIT & Harvard >>> Phone: 617-324-2396 >>> Fax: 617-324-1866 >>> _____________________________________ >>> >>> >>> >>> > > > ------------------------------------------------------------------------------ > Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are > powering Web 2.0 with engaging, cross-platform capabilities. Quickly and > easily build your RIAs with Flex Builder, the Eclipse(TM)based development > software that enables intelligent coding and step-through debugging. > Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com > _______________________________________________ > AMOS-help mailing list > AMO...@li... > https://lists.sourceforge.net/lists/listinfo/amos-help > |