Re: [AMOS-help] Minimus on the Staphylococcus Aureus

Dear Dan,

I will experiment with the minimum overlap size in the hash-overlap step.
Let me know how the re-assembly goes.

Thanks a lot,
--Giuseppe

On Tue, Aug 4, 2009 at 1:16 PM, Dan Sommer<ds...@um...> wrote:
> The Clear ranges look right.
>  There have been some minor changes to the code since the paper, so I guess there is a chance something is wrong.
>  If you want to try changing the overlap size, you can modify the minimus
> pipeline script.  The step to change is the hash-overlap step.  Run 'overlap
> -h' to see the options.  I believe the -o option sets the min. overlap.  If
> you have AMOS installed and in your path, you should be able to run hash-overlap
> at the command line.
>  I will try re-assembling the benchmark here once I get a chance.
> -Dan
>
> On Thu, Jul 30, 2009 at 6:28 AM, Giuseppe Narzisi <gna...@gm...>
> wrote:
>>
>> Just for your information, with Brucella, Wolbachia and Staphylococcus
>> epidermidis I do not get a fragmented assembly.
>> It only happens with the Staphylococcus aureus and Shewanella
>> oneidensis datasets.
>>
>> --Giuseppe.
>>
>>
>>
>> On Thu, Jul 30, 2009 at 6:16 AM, Giuseppe Narzisi<gna...@gm...>
>> wrote:
>> > Dear Dan,
>> >
>> > I used the tarchive2amos utility to convert the fasta/qual/xml files
>> > into the amos format.
>> > Here is an example RED record from the afg file; it looks like the clear
>> > range is properly handled.
>> > I wanted to send you the full afg file, but it is too big to send as an
>> > attachment.
>> >
>> > Thanks,
>> > --Giuseppe
>> >
>> >
>> > {RED
>> > iid:28498
>> > eid:GSALE31TF
>> > seq:
>> > NGCCAAGCTTGCATGCCTGCAGGTCGACACTAGAGGATCCCCTCGAATGTTTAATCATTT
>> > AGAAGCGCCTACATCAGGTGAAGTTATTATAGATGGAGACCATATAGGTCAATTGTCCAA
>> > AAATGGATTAAGAGCAAAAAGACAAAAAGTAAGTATGATCTTCCAACATTTTAATTTGTT
>> > ATGGTCAAGGACTGTGTTAAAAAATATTATGTTTCCGCTTGAAATTGCAGGTGTCCCTAG
>> > AAGGAGAGCTAAGCAAAAAGCATTAGAACTTGTCGAACTCGTCGGTTTAAAAGGTAGAGA
>> > AAAGGCTTATCCATCAGAGTTATCAGGTGGACAAAAGCAACGTGTTGGGATTGCACGAGC
>> > GTTAGCTAATGATCCAACGGTCTTGCTTTGTGATGAGGCAACAAGTGCACTTGATCCGCA
>> > AACAACAGATGAAATTTTAGATCTACTACTAAAAATTAGAGAACAACAAAATTTAACAAT
>> > TGTACTAATTACGCATGAAATGCATGTCATTCGTCGTATTTGTGATGAANTTGCAGTTAT
>> > GGAAAGTGGTAAAGTGATAGAACAAGGACCGGTGACACAGGTTTTTGAAAATCCGCAACA
>> > CACTGTGACAAAAACATTTGTGAAAGACGATTTAGATGAATATTTCGAAACATCTTTTAC
>> > AGAATTAGAGCCATTAGAAAAAGATGCATATATCGGTTAGATTAGTTTCCGCTGGGTCAC
>> > CAANCAACGGAGCCTATTGGTATCGAGTCTAC
>> > .
>> > qlt:
>> > 0000FC00000ABB0000CLKRTTVLI000DGPRRR]]]XXPPLUUXX]X]RRRNNQQQQ
>> > RQQQQQQV]]]]]]]UUU]]XXXXUUXUUUUX]]]XXXXUUUUUXUUUUUXX]UUUYYYY
>> > ]]]Y[YYYRRRUUU]]]]]a]cccc]ZZUUUU]UUUUUUUUYYYYYYNNNUQQQQZUUNN
>> > NNQRUUYUYUYUUQQQQQQUZ]]]]URRUUUR]YYRRRRRRUUYYYYYPPUVVUUXPPLL
>> > UV]UURUUUOORUMUU]]]UUUGI00PPPUPFFKLU]]RRMMMMRUURRPUPP0B0GGRU
>> > ]]SSEEELIPLOOOOIIPNMJHBD00?GGOOTTTRQNIHHLNQOOOOORQQQQPPKKPPO
>> > QRROGLGIMOGG?00AAEHFHGAAJJOOOMFLLKJ@0000AIHHMHJRID?000000CIH
>> > HTMMIJJJJLLLPGHEF000C000IGOOTTTQQQM?000GGORTH0000000GDAAHDDE
>> > AGEIHHA@B0B000??KKNNNKLLLKGDC00000000@00@?0@00000000AG000000
>> > 00000B0000000BBE000000000000000000000000000@@000@00000000000
>> > 000000000000000000000000D00000000000000000000000000ABC0A0000
>> > 0000000000000000000000BBBB?000000000000000000000000000000000
>> > 00000000000000000000000000000000
>> > .
>> > frg:1
>> > clr:42,529
>> > }
>> >
>> >
>> >
>> >
>> > On Wed, Jul 29, 2009 at 11:02 AM, Dan Sommer<ds...@um...>
>> > wrote:
>> >> The clear ranges (CLEARL & CLEARR) are given in the fasta header line for
>> >> each read in the benchmark, but I am not sure they are being put into the
>> >> amos message file (.afg).  How did you convert the fasta file?  Can you look
>> >> to see if the .afg file has the clear ranges in it?
>> >> Dan
>> >>
>> >> On Tue, Jul 28, 2009 at 6:47 AM, Giuseppe Narzisi <gna...@gm...>
>> >> wrote:
>> >>>
>> >>> Dear Dan,
>> >>>
>> >>> the README of the assembly benchmark says that all the sequences have
>> >>> already been trimmed to remove vector and low-quality basecalls.
>> >>> For each read a clear range is specified using CLEARL and CLEARR, and I
>> >>> assume that minimus uses this information during the assembly.
>> >>> Should I trim the reads again using LUCY?
>> >>> Could it be that I have to change the default parameters of
>> >>> hash-overlap?
>> >>>
>> >>> Thanks,
>> >>> --Giuseppe
>> >>>
>> >>>
>> >>>
>> >>> On Mon, Jul 27, 2009 at 1:06 PM, Dan Sommer<ds...@um...>
>> >>> wrote:
>> >>> > Did you trim the reads first before assembling them?  If you don't trim
>> >>> > first using the LUCY trimming software, it will fragment the assembly.
>> >>> >  http://lucy.sourceforge.net/
>> >>> > Dan
>> >>> >
>> >>> > On Sat, Jul 25, 2009 at 9:43 AM, Giuseppe Narzisi
>> >>> > <gna...@gm...>
>> >>> > wrote:
>> >>> >>
>> >>> >> Hi everyone,
>> >>> >>
>> >>> >> I have been testing Minimus on the Staphylococcus aureus genome from
>> >>> >> the benchmark data available at:
>> >>> >> http://www.cbcb.umd.edu/research/benchmark.shtml
>> >>> >> In particular, to simulate the assembly of the original shotgun
>> >>> >> project, I have concatenated the data in random.seq and
>> >>> >> random_nonmatching.seq.
>> >>> >> According to the results reported in the BMC Bioinformatics paper,
>> >>> >> Minimus creates only 85 contigs; however, I get 5445 contigs.
>> >>> >> So I was wondering what I am doing wrong.
>> >>> >> I am using the standard minimus pipeline of the amos package.
>> >>> >>
>> >>> >>
>> >>> >> Thanks,
>> >>> >> --Giuseppe
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> ------------------------------------------------------------------------------
>> >>> >> _______________________________________________
>> >>> >> AMOS-help mailing list
>> >>> >> AMO...@li...
>> >>> >> https://lists.sourceforge.net/lists/listinfo/amos-help
>> >>> >
>> >>> >
>> >>
>> >>
>> >
>
>
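
The question raised in the thread, whether the .afg file actually carries a clear range for each read, can be checked with a small script rather than by eye. The following is only a rough sketch, not part of AMOS: the function names (iter_red_records, summarize) and the SAMPLE record are made up for illustration. It assumes flat {RED ... } messages like the one quoted above, with multi-line fields (seq, qlt) terminated by a line containing only ".", and it assumes clr is a 0-based, end-exclusive begin,end pair.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical, shortened record in the same layout as the RED message
# quoted in the thread (not real benchmark data).
SAMPLE = """{RED
iid:1
eid:example_read
seq:
NGCCAAGCTTGC
ATGCCTGC
.
qlt:
0000FC00000A
BB0000CL
.
frg:1
clr:4,18
}
"""


def iter_red_records(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict of fields per {RED ...} message in AMOS message text.

    Assumption: RED messages are not nested, and a field with nothing after
    the colon starts a multi-line value that ends at a line holding only ".".
    """
    record = None          # fields of the message being read, None outside one
    multiline_key = None   # name of the multi-line field being collected
    chunks: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if record is None:
            if line.strip() == "{RED":
                record = {}
            continue
        if multiline_key is not None:
            if line == ".":
                record[multiline_key] = "".join(chunks)
                multiline_key, chunks = None, []
            else:
                chunks.append(line)
            continue
        if line == "}":
            yield record
            record = None
            continue
        key, _, value = line.partition(":")
        if value == "":
            multiline_key = key
        else:
            record[key] = value


def summarize(record: Dict[str, str]) -> Tuple[str, int, int, bool]:
    """Return (eid, clr_begin, clr_end, in_bounds) for one RED record."""
    begin, end = (int(x) for x in record["clr"].split(","))
    # Assumed convention: 0 <= begin <= end <= len(seq).
    in_bounds = 0 <= begin <= end <= len(record["seq"])
    return record.get("eid", ""), begin, end, in_bounds


if __name__ == "__main__":
    for rec in iter_red_records(SAMPLE.splitlines()):
        print(summarize(rec))
```

To run it on a real file, pass an open file handle instead of SAMPLE.splitlines(); the generator reads line by line, so a large .afg file does not have to fit in memory. Any record without a clr field raises a KeyError, which is itself a quick way to find reads whose clear range was lost in conversion.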
