ngopt / Tickets / #6 Long reads cause IDBA to crash

Anonymous - 2013-08-30

Originally posted by: MAlab...@gmail.com (code.google.com)

Hi,
I am getting the same problem with my 250 bp reads. I did what you suggested and the IDBA step worked fine. My question is: what is the next step in the pipeline, and how do you run it independently?

Thanks for your time

Magdy

Anonymous - 2013-09-04

Originally posted by: matt.wor...@gmail.com (code.google.com)

I think the easiest way to use long reads with the pipeline is to modify the Perl script that runs the pipeline so that IDBA accepts the long reads. However, the authors of A5 may have a better suggestion.

Anonymous - 2013-10-22

Originally posted by: prakhar....@gmail.com (code.google.com)

IDBA usage:
IDBA: Iterative De Bruijn graph short read Assembler
Version 0.19

Usage: idba --read read-file [--output out] [options]

Allowed Options:
  -h, --help                   produce help message
  -r, --read arg               read file
  -l, --long arg               long read file
  -o, --output arg (=out)      prefix of output
      --scaffold               use pair end information to merge contigs
      --mink arg (=25)         minimum k value
      --maxk arg (=50)         maximum k value
      --minCount arg (=2)      filtering threshold for each k-mer
      --cover arg (=0)         the cutting coverage for contigs
      --minPairs arg (=5)      minimum number of pair-end connections to join two contigs
      --prefixLength arg (=3)  length of the prefix of k-mer used to split k-mer table
------------------------------------------------------------------------------
I am having the same problem as described above, though I am using idba as part of the A5 pipeline.

The -r and -l (--long) options seem to be mutually exclusive, and supplying a blank file for -r means no input.
How then will idba assemble the reads?

Or am I missing something?

Input data: MiSeq 250 bp reads, both PE and MP libs

Regards,
--
prakhar gaur

Anonymous - 2013-10-25

Originally posted by: prakhar....@gmail.com (code.google.com)

Hello,

Back with more details on the error.
It seems that the mate-pair library (MiSeq, 250 bp read length)
is the one causing the trouble.

On attempting an a5 run with only the mate-pair reads as input, the idba step fails
with the above-mentioned error message.

Regards,
--
prakhar gaur
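
One way to confirm which library carries the long reads is to print the longest read in each FASTQ file. This is a minimal sketch, assuming standard four-line FASTQ records with no wrapped sequence lines. The function name max_read_len is made up for illustration and is not part of A5 or IDBA.

```shell
#!/bin/sh
# max_read_len FILE: print the longest sequence length found in a FASTQ file.
# Assumes four-line records, so sequence lines are lines 2, 6, 10, ...
# (hypothetical helper, not part of the A5 pipeline)
max_read_len() {
    awk 'NR % 4 == 2 { if (length($0) > max) max = length($0) }
         END { print max + 0 }' "$1"
}
```

Calling max_read_len on each p1 and p2 file of a library shows whether that library contains reads at the 250 bp length discussed here.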

Anonymous - 2013-12-16

Originally posted by: aaron.darling (code.google.com)

There is a new branch of the code that works with 250 nt and 300 nt MiSeq reads. It's available from the Subversion repository here:
http://ngopt.googlecode.com/svn/branches/20130712_miseq_longread/

It can be checked out with Subversion and then built with the build_pipeline.sh script.

After some further refinement and testing we will eventually release this version as a separate download. In principle it should work with reads up to 400 nt long, but memory requirements for longer reads are very high. Expect to use at least 30 GB for a typical bacterial genome.
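
The checkout and build described above might look roughly like this. It is a sketch, not the authors' documented procedure: the local directory name is arbitrary, and the location of build_pipeline.sh at the top of the checkout and its invocation without arguments are assumptions.

```shell
#!/bin/sh
# Branch URL quoted in the comment above.
BRANCH_URL="http://ngopt.googlecode.com/svn/branches/20130712_miseq_longread/"
# Local directory name is an arbitrary choice (assumption).
WORKDIR="${WORKDIR:-ngopt_miseq_longread}"

fetch_and_build() {
    # Check out the long-read branch with Subversion.
    svn checkout "$BRANCH_URL" "$WORKDIR" || return 1
    # Assumes build_pipeline.sh sits at the top of the checkout and
    # takes no arguments; adjust if the branch is laid out differently.
    ( cd "$WORKDIR" && sh ./build_pipeline.sh )
}

# Only fetch and build when explicitly asked, e.g. `sh fetch.sh run`.
if [ "${1:-}" = "run" ]; then
    fetch_and_build
fi
```

Given the memory figure quoted above, it is worth running the resulting pipeline on a machine with at least 30 GB of RAM.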

Anonymous - 2013-12-17

Originally posted by: prakhar....@gmail.com (code.google.com)

Hello Aaron,

I used the above-mentioned code for a bacterial assembly
with two libs:
PE - 250 bp read length
MP - 250 bp read length

I created a lib file:
$ cat AKS7.lib
[LIB]
p1=AKS7_S1_L001_R1_001.fastq
p2=AKS7_S1_L001_R2_001.fastq
[LIB]
p1=AKS7-MP_S2_L001_R1_001.fastq
p2=AKS7-MP_S2_L001_R2_001.fastq

The a5 pipeline was run with this command:
$ perl /home/prakhar/local_Bin/ngopt_a5pipeline_linux-x64_20131217/bin/a5_pipeline.pl AKS7.lib a5-20131217_AKS7_MP_PE

The assembly completed and the final scaffolds file was generated.

But on examining the a5 log, I found:

"[a5] /home/prakhar/local_Bin/ngopt_a5pipeline_linux-x64_20131217/bin/idba_ud250 -r a5-20131217_AKS7_MP_PE.s2/a5-20131217_AKS7_MP_PE.ec.fasta -o a5-20131217_AKS7_MP_PE.s2/a5-20131217_AKS7_MP_PE --mink 35 --maxk 250 --min_pairs 2 --min_count 1
Segmentation fault"

Is this significant?

Please find the full log file attached herewith.

Regards,
--
prakhar gaur


Attachment: a5-20131217_AKS7_MP_PE-log.txt

Anonymous - 2013-12-19

Originally posted by: aaron.darling (code.google.com)

Hmmm, not sure why that is happening, perhaps idba_ud has a memory corruption bug on exit. In any case, it produced an assembly so you should be able to use it.
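
A quick way to act on this advice is to confirm that the scaffolds file is non-empty despite the crash line in the log. This is a minimal sketch: the function name check_a5_run is made up here, and both file paths are supplied by the caller rather than taken from A5 itself.

```shell
#!/bin/sh
# check_a5_run LOG SCAFFOLDS
# Reports how many "Segmentation fault" lines the log contains and whether
# the scaffolds FASTA exists and is non-empty. (Hypothetical helper.)
check_a5_run() {
    log="$1"
    scaffolds="$2"
    # grep -c prints 0 when nothing matches.
    crashes=$(grep -c "Segmentation fault" "$log")
    echo "segfault lines in log: $crashes"
    if [ -s "$scaffolds" ]; then
        # Each FASTA record starts with a '>' header line.
        n=$(grep -c '^>' "$scaffolds")
        echo "scaffolds in output: $n"
        return 0
    fi
    echo "scaffolds file missing or empty"
    return 1
}
```

For the run above, the log would be the attached a5-20131217_AKS7_MP_PE-log.txt. The scaffolds file name depends on the A5 version, so pass whichever final scaffolds file the run produced.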

