
#10 Error running cloudburst example

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2011-04-22
Created: 2011-04-22
Creator: robin
Private: No

Running the CloudBurst command with option "MAX_READ_LEN=36" fails with the error "java.io.IOException: ERROR: seqlen=65535 > MAX_READ_LEN=36 in hdfs://localhost/bye/cloudburst/s_suis.br ref:hdfs:///bye/cloudburst/s_suis.br". If I change it to "MAX_READ_LEN=65535", it asks me to reconvert the FASTA file, but there is no documentation on how to reconvert the FASTA file with CHUNK_OVERLAP=65535.

See attached .err file for details.
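For context, the failure mode can be illustrated with a minimal sketch of the kind of per-sequence length check that raises this error. This is not CloudBurst's actual source; the function name and arguments are assumptions, with only the error-message format taken from the report:

```python
MAX_READ_LEN = 36  # maximum read length passed on the command line


def check_seq_len(seq, path):
    """Raise IOError if a sequence exceeds MAX_READ_LEN, mimicking the
    reported error message (illustrative only, not CloudBurst code)."""
    if len(seq) > MAX_READ_LEN:
        raise IOError(
            "ERROR: seqlen=%d > MAX_READ_LEN=%d in %s"
            % (len(seq), MAX_READ_LEN, path)
        )


# A 36 bp query read passes the check:
check_seq_len("A" * 36, "hdfs:///bye/cloudburst/100k.br")

# A 65535-character sequence (a reference chunk rather than a short read)
# trips the check exactly as in the reported error:
# check_seq_len("A" * 65535, "hdfs:///bye/cloudburst/s_suis.br")
```

The point is that seqlen=65535 means something far longer than a 36 bp read, i.e. a reference chunk, was being run through the read-length check.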

Discussion

  • robin

    robin - 2011-04-22

    commands and error

     
  • Michael Schatz

    Michael Schatz - 2011-04-22

    The Sample Results wiki page had not been updated for the new version of CloudBurst. Please try this again using:

    $ hadoop jar CloudBurst.jar /data/cloudburst/s_suis.br \
    /data/cloudburst/100k.br /data/results \
    36 36 3 0 1 240 48 24 24 128 16 >& cloudburst.err

    Good luck!

    Mike

     
  • robin

    robin - 2011-04-22

    Thank you, Mike, for your quick response! However, I still get the same error saying "seqlen=65535 > MAX_READ_LEN=36" with the command you sent. Any other suggestions? Thanks a lot!

     
  • Michael Schatz

    Michael Schatz - 2011-04-22

    Hmm... it works for me. Can you print out the first few lines after you run it:

    $ hadoop jar CloudBurst.jar /user/mschatz/cloudburst/in/s_suis.br /user/mschatz/cloudburst/in/100k.br /user/mschatz/cloudburst/out 36 36 3 0 1 100 100 100 100 128 16
    refath: /user/mschatz/cloudburst/in/s_suis.br
    qrypath: /user/mschatz/cloudburst/in/100k.br
    outpath: /user/mschatz/cloudburst/out-alignments
    MIN_READ_LEN: 36
    MAX_READ_LEN: 36
    K: 3
    SEED_LEN: 9
    FLANK_LEN: 30
    ALLOW_DIFFERENCES: 0
    FILTER_ALIGNMENTS: true
    NUM_MAP_TASKS: 100
    NUM_REDUCE_TASKS: 100
    BLOCK_SIZE: 128
    REDUNDANCY: 16
    Removing old results
    11/04/22 20:37:41 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    11/04/22 20:37:41 INFO mapred.FileInputFormat: Total input paths to process : 2
    11/04/22 20:37:42 INFO mapred.JobClient: Running job: job_201103291504_0244
    11/04/22 20:37:43 INFO mapred.JobClient: map 0% reduce 0%
    11/04/22 20:37:52 INFO mapred.JobClient: map 93% reduce 0%
    <..>

    Thanks!

    Mike

     
  • robin

    robin - 2011-04-25

    Hi Mike,

    Please see below. Thank you so much!

    [bye@zd1 CloudBurst-1.1.0]$ hadoop jar ./CloudBurst.jar hdfs:///bye/cloudburst/s_suis.br hdfs:///bye/cloudburst/100k.br hdfs:///bye/cloudburst/results 36 36 3 0 1 240 48 24 24 128 16
    refath: hdfs:///bye/cloudburst/s_suis.br
    qrypath: hdfs:///bye/cloudburst/100k.br
    outpath: hdfs:///bye/cloudburst/results-alignments
    MIN_READ_LEN: 36
    MAX_READ_LEN: 36
    K: 3
    SEED_LEN: 9
    FLANK_LEN: 30
    ALLOW_DIFFERENCES: 0
    FILTER_ALIGNMENTS: true
    NUM_MAP_TASKS: 240
    NUM_REDUCE_TASKS: 48
    BLOCK_SIZE: 128
    REDUNDANCY: 16
    Removing old results
    11/04/22 16:27:24 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    11/04/22 16:27:24 INFO mapred.FileInputFormat: Total input paths to process : 2
    11/04/22 16:27:24 INFO mapred.JobClient: Running job: job_201102250632_0055
    11/04/22 16:27:25 INFO mapred.JobClient: map 0% reduce 0%
    11/04/22 16:27:36 INFO mapred.JobClient: Task Id : attempt_201102250632_0055_m_000000_0, Status : FAILED
    java.io.IOException: ERROR: seqlen=65535 > MAX_READ_LEN=36 in hdfs://localhost/bye/cloudburst/s_suis.br ref:hdfs:///bye/cloudburst/s_suis.br

     
  • Michael Schatz

    Michael Schatz - 2011-04-25

    Did you use the s_suis.br and 100k.br from the sample data or did you regenerate those using ConvertFastaForCloud.jar?

    Thanks,

    Mike

     
  • robin

    robin - 2011-04-26

    I tried both, got the same error. Any other suggestions? Thank you so much!

     
  • Nobody/Anonymous

    I finally got it working by using paths to the files as defined in the manual, "/data/ref.br" and "/data/qry.br". Previously I used the paths "hdfs:///data/ref.br" and "hdfs:///data/qry.br" and got the error message "seqlen=65535 > MAX_READ_LEN=36". Somehow the program was treating ref.br as qry.br: even though the output messages showed that it read ref.br and qry.br correctly, it got mixed up somewhere during the run. This looks to me like a Hadoop configuration issue(?). I'm new to Hadoop as well. I just thought it might be worth letting you know how it was solved. Thank you!

     
  • Michael Schatz

    Michael Schatz - 2011-05-04

    Great, glad it is working for you. The path processing is very unpredictable. That's what you get for using a 0.20.1 release!

    Good luck!

    Mike

     
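A closing note on the fix found in this thread: the difference between "/data/ref.br" and "hdfs:///data/ref.br" comes down to URI resolution. A bare path has neither a scheme nor an authority, so Hadoop fills in both from the configured default filesystem, whereas "hdfs:///data/ref.br" pins the scheme but leaves the authority empty, which 0.20-era releases could resolve inconsistently. A small sketch of the parsing difference, using Python's standard urllib rather than Hadoop's Java Path class, purely to show which components each spelling supplies:

```python
from urllib.parse import urlparse

# Three spellings of the same file, as they appeared in this thread:
for p in ("/data/ref.br",               # bare path: scheme and authority from config
          "hdfs:///data/ref.br",        # scheme given, authority empty
          "hdfs://localhost/data/ref.br"):  # fully qualified
    u = urlparse(p)
    print(f"{p!r:35} scheme={u.scheme!r:8} authority={u.netloc!r:12} path={u.path!r}")
```

The bare form leaves everything to the cluster configuration; the "hdfs:///" form is the one that triggered the ref/qry mix-up reported above.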
