No gaps to be assembled were found

Forum: PBJelly Tickets

Creator: zxl124

Created: 2014-07-03

Updated: 2018-07-12

zxl124 - 2014-07-03

Hi,

First, thank you for providing this software. I am a new user of PBJelly. I installed it as instructed, and the run on the dummy data passed. When I ran it on real data, everything went smoothly until the "assembly" step, which failed with the error message "No gaps to be assembled were found". I went back and checked the .err files of every previous step.

In the mapping step, there was an error message like this.

2014-07-02 17:07:36,018 [INFO] Running /usr/local/bioinf/PBSuite_14.6.24//bin//m4pie.py /home/jay/temp/JSC1/mapping/m140608_213146_42146_c100642192550000001823128610151427_s1_p0.1.subreads.fastq.m4 /home/jay/temp/JSC1/reads/m140608_213146_42146_c100642192550000001823128610151427_s1_p0.1.subreads.fastq /home/jay/temp/JSC1/reference/JSC1.fasta --nproc 8 -i
2014-07-02 17:07:38,877 [INFO] Extracting tails
Traceback (most recent call last):
  File "/usr/local/bioinf/PBSuite_14.6.24//bin//m4pie.py", line 206, in <module>
    run(sys.argv[1:])
  File "/usr/local/bioinf/PBSuite_14.6.24//bin//m4pie.py", line 185, in run
    r, t, m = extractTails(aligns, reads, outFq=tailfastq, minLength=args.minTail)
  File "/usr/local/bioinf/PBSuite_14.6.24//bin//m4pie.py", line 50, in extractTails
    seq = reads[read.qname][:pTail]
KeyError: 'm140608_213146_42146_c100642192550000001823128610151427_s1_p0/219/4863_9930'

The 'm140608_213146_42146_c100642192550000001823128610151427_s1_p0/219/4863_9930' was the name of a read. The mapping step seemed to produce a normal m4 file.

The output of the "support" step seemed normal. In the "extraction" step, the log looks like this:

2014-07-02 19:44:44,080 [INFO] Parsing /home/jay/temp/JSC1/reads/m140611_101939_42146_c100642462550000001823129210151477_s1_p0.3.subreads.fastq
2014-07-02 19:44:58,811 [INFO] Loaded 61375 Reads
2014-07-02 19:44:59,317 [INFO] Parsed 0 Reads
2014-07-02 19:44:59,726 [INFO] Finished

masterSupport.bml in extraction/ also looks normal; it has a bunch of lines starting with 'extend' or 'evidence'.

By the way, my networkx version is 1.1, blasr version is 1.3.1.

Could you help me figure this out, please? Thank you very much.

Adam English - 2014-07-03

Hello,

This is likely due to your subreads.fastq file having spaces in the read names. Blasr drops everything from the first space onward when reporting which read maps where, whereas PBJelly reads the entire header entry as the read name. So where Blasr reports "m130611_...p0/218/0_1000", the fastq likely has something like "m130611_...p0/218/0_1000 RQ=0.862", and looking the read up by the name Blasr reported fails.
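To make the mismatch concrete, here is a minimal sketch of the lookup problem described above. It is an illustration only, not PBJelly's actual code; the function name `unmatched_names` and the read names in the usage note are hypothetical.

```python
def unmatched_names(aligned_names, fastq_headers, trim=False):
    """Return the aligned read names that have no exact match among the FASTQ headers.

    aligned_names: read names as reported by the aligner (no trailing comment).
    fastq_headers: FASTQ header text without the leading '@'.
    trim: if True, compare against each header truncated at its first whitespace.
    """
    if trim:
        # Keep only the part of each header before the first whitespace.
        keys = {h.split(None, 1)[0] for h in fastq_headers if h.strip()}
    else:
        # Use the full header text, including any trailing comment.
        keys = set(fastq_headers)
    return [name for name in aligned_names if name not in keys]
```

With hypothetical headers such as "movie/218/0_1000 RQ=0.862", every aligned name is reported as unmatched until `trim=True` is passed, which mirrors the failed lookup by read name.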

If you remove the " RQ=*" from the fastq read names, you can resume processing at the 'extraction' stage without needing to remap/resupport. I'd recommend making a backup of the subreads.fastq just in case.
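One possible way to do the removal is sketched below. It assumes a plain four-lines-per-record FASTQ (no wrapped sequence lines) and writes a cleaned copy rather than editing in place, so the original file doubles as the backup. The function name `strip_header_comments` and the file names in the usage note are hypothetical.

```python
def strip_header_comments(in_path, out_path):
    """Copy a 4-line-per-record FASTQ, truncating each header at the first whitespace.

    Returns the number of header lines that were changed.
    """
    changed = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for i, line in enumerate(src):
            # Header lines are every 4th line, starting with the first, and begin with '@'.
            if i % 4 == 0 and line.startswith("@"):
                name = line.split(None, 1)[0]  # e.g. "@name RQ=0.862" -> "@name"
                if name != line.rstrip("\r\n"):
                    changed += 1
                line = name + "\n"
            # Sequence, '+', and quality lines are copied unchanged.
            dst.write(line)
    return changed
```

For example, `strip_header_comments("subreads.fastq", "subreads.clean.fastq")` would write the cleaned copy; the cleaned file would then need to take the place of the original under the same path that the earlier stages used before resuming at 'extraction'.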

zxl124 - 2014-07-06

Thank you. This solved my problem.

Amit Rai - 2018-07-12

Hi, could you please explain how to remove the RQ field? I have encountered exactly the same problem and would really appreciate your help or suggestions.

Thank you so much
