Error running cloudburst example
Brought to you by: mcschatz
I get the error "java.io.IOException: ERROR: seqlen=65535 > MAX_READ_LEN=36 in hdfs://localhost/bye/cloudburst/s_suis.br ref:hdfs:///bye/cloudburst/s_suis.br" when running the CloudBurst command with MAX_READ_LEN=36. If I change it to MAX_READ_LEN=65535, it asks me to reconvert the fasta file, but there is no documentation on how to reconvert a fasta file with CHUNK_OVERLAP=65535.
See attached .err file for details.
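For context, the "reconvert" step the error message refers to uses the ConvertFastaForCloud.jar tool that ships with CloudBurst. A minimal sketch of the reconversion and upload, assuming the common "input fasta, output .br" argument order (the exact arguments are an assumption, not confirmed in this thread):

```shell
# Hypothetical sketch: regenerate the .br files from the original fasta files.
# The argument order (input.fa output.br) is an assumption about
# ConvertFastaForCloud.jar, not something verified in this thread.
java -jar ConvertFastaForCloud.jar s_suis.fa s_suis.br
java -jar ConvertFastaForCloud.jar 100k.fa 100k.br

# Copy the converted files into HDFS where CloudBurst expects them.
hadoop fs -put s_suis.br /bye/cloudburst/s_suis.br
hadoop fs -put 100k.br /bye/cloudburst/100k.br
```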
commands and error
The Sample Results wiki page had not been updated for the new version of CloudBurst. Please try this again using:
$ hadoop jar CloudBurst.jar /data/cloudburst/s_suis.br \
/data/cloudburst/100k.br /data/results \
36 36 3 0 1 240 48 24 24 128 16 >& cloudburst.err
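For reference, most of the positional arguments can be lined up against the parameter dump CloudBurst prints at startup (shown later in this thread); the labels below are inferred from that dump, not from documentation, and the two "24" values have no matching printed parameter:

```shell
# CloudBurst positional arguments, labeled from the parameter dump it prints:
#   refpath qrypath outpath \
#   MIN_READ_LEN MAX_READ_LEN K ALLOW_DIFFERENCES FILTER_ALIGNMENTS \
#   NUM_MAP_TASKS NUM_REDUCE_TASKS ? ? BLOCK_SIZE REDUNDANCY
# (the two "?" values, 24 24 here, do not appear in the printed dump,
#  so their meaning is left unlabeled)
hadoop jar CloudBurst.jar /data/cloudburst/s_suis.br /data/cloudburst/100k.br \
    /data/results 36 36 3 0 1 240 48 24 24 128 16
```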
Good luck!
Mike
Thank you, Mike, for your quick response! However, I still get the same error, "seqlen=65535 > MAX_READ_LEN=36", with the command you sent. Any other suggestions? Thanks a lot!
Hmm...works for me. Can you print out the first few lines after you run it:
$ hadoop jar CloudBurst.jar /user/mschatz/cloudburst/in/s_suis.br /user/mschatz/cloudburst/in/100k.br /user/mschatz/cloudburst/out 36 36 3 0 1 100 100 100 100 128 16
refath: /user/mschatz/cloudburst/in/s_suis.br
qrypath: /user/mschatz/cloudburst/in/100k.br
outpath: /user/mschatz/cloudburst/out-alignments
MIN_READ_LEN: 36
MAX_READ_LEN: 36
K: 3
SEED_LEN: 9
FLANK_LEN: 30
ALLOW_DIFFERENCES: 0
FILTER_ALIGNMENTS: true
NUM_MAP_TASKS: 100
NUM_REDUCE_TASKS: 100
BLOCK_SIZE: 128
REDUNDANCY: 16
Removing old results
11/04/22 20:37:41 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/04/22 20:37:41 INFO mapred.FileInputFormat: Total input paths to process : 2
11/04/22 20:37:42 INFO mapred.JobClient: Running job: job_201103291504_0244
11/04/22 20:37:43 INFO mapred.JobClient: map 0% reduce 0%
11/04/22 20:37:52 INFO mapred.JobClient: map 93% reduce 0%
<..>
Thanks!
Mike
Hi Mike,
Please see below: Thank you so much!
[bye@zd1 CloudBurst-1.1.0]$ hadoop jar ./CloudBurst.jar hdfs:///bye/cloudburst/s_suis.br hdfs:///bye/cloudburst/100k.br hdfs:///bye/cloudburst/results 36 36 3 0 1 240 48 24 24 128 16
refath: hdfs:///bye/cloudburst/s_suis.br
qrypath: hdfs:///bye/cloudburst/100k.br
outpath: hdfs:///bye/cloudburst/results-alignments
MIN_READ_LEN: 36
MAX_READ_LEN: 36
K: 3
SEED_LEN: 9
FLANK_LEN: 30
ALLOW_DIFFERENCES: 0
FILTER_ALIGNMENTS: true
NUM_MAP_TASKS: 240
NUM_REDUCE_TASKS: 48
BLOCK_SIZE: 128
REDUNDANCY: 16
Removing old results
11/04/22 16:27:24 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/04/22 16:27:24 INFO mapred.FileInputFormat: Total input paths to process : 2
11/04/22 16:27:24 INFO mapred.JobClient: Running job: job_201102250632_0055
11/04/22 16:27:25 INFO mapred.JobClient: map 0% reduce 0%
11/04/22 16:27:36 INFO mapred.JobClient: Task Id : attempt_201102250632_0055_m_000000_0, Status : FAILED
java.io.IOException: ERROR: seqlen=65535 > MAX_READ_LEN=36 in hdfs://localhost/bye/cloudburst/s_suis.br ref:hdfs:///bye/cloudburst/s_suis.br
Did you use the s_suis.br and 100k.br from the sample data or did you regenerate those using ConvertFastaForCloud.jar?
Thanks,
Mike
I tried both, got the same error. Any other suggestions? Thank you so much!
I finally got it to work by specifying the file paths as in the manual, "/data/ref.br" and "/data/qry.br". Previously, I used the paths "hdfs:///data/ref.br" and "hdfs:///data/qry.br" and got the error "seqlen=65535 > MAX_READ_LEN=36". Somehow the program was treating ref.br as qry.br: even though the startup messages showed that it read ref.br and qry.br correctly, it mixed them up somewhere during the run. This looks to me like a Hadoop configuration issue(?). I'm new to Hadoop as well; I just thought it might be worth letting you know how it was solved. Thank you!
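For anyone hitting the same error, the only difference between the failing and working invocations was the path style. A minimal before/after sketch (the paths are the ones from this thread; your HDFS layout may differ):

```shell
# Failing form in this thread: fully qualified hdfs:/// URIs
hadoop jar CloudBurst.jar hdfs:///data/ref.br hdfs:///data/qry.br \
    hdfs:///data/results 36 36 3 0 1 240 48 24 24 128 16

# Working form: plain absolute paths, resolved against the cluster's
# default filesystem (fs.default.name in the Hadoop 0.20.x configuration)
hadoop jar CloudBurst.jar /data/ref.br /data/qry.br \
    /data/results 36 36 3 0 1 240 48 24 24 128 16
```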
Great, glad it is working for you. The path processing is very unpredictable. That's what you get for using a 0.20.1 release!
Good luck!
Mike