#44 MIRA 4 cannot recognise paired reads

4.0.1
open
nobody
None
2021-05-18
2021-05-18
Maeva perez
No

Hi Bastien,
I have successfully run MIRA in the past, but with these reads downloaded from NCBI I have been unable to. My guess is that something is wrong with the format of my FASTQ files, so I have tried modifying the headers in different ways, but without success.

Your help would be greatly appreciated.

Below is the log; attached are the read files (small subset) and manifest.conf.
This is MIRA 4.0.2_0+g29f87d4 .

Please cite: Chevreux, B., Wetter, T. and Suhai, S. (1999), Genome Sequence
Assembly Using Trace Signals and Additional Sequence Information.
Computer Science and Biology: Proceedings of the German Conference on
Bioinformatics (GCB) 99, pp. 45-56.

To (un-)subscribe the MIRA mailing lists, see:
http://www.chevreux.org/mira_mailinglists.html

After subscribing, mail general questions to the MIRA talk mailing list:
mira_talk@freelists.org

To report bugs or ask for features, please use the SourceForge ticketing
system at:
http://sourceforge.net/p/mira-assembler/tickets/
This ensures that requests do not get lost.

Compiled by: bach
Fri Apr 18 14:57:56 CEST 2014
On: Darwin airfau2.fritz.box 13.1.0 Darwin Kernel Version 13.1.0: Thu Jan 16 19:40:37 PST 2014; root:xnu-2422.90.20~2/RELEASE_X86_64 x86_64
Compiled in boundtracking mode.
Compiled in bugtracking mode.
Compiled with ENABLE64 activated.
Runtime settings (sorry, for debug):
Size of size_t : 8
Size of uint32 : 4
Size of uint32_t: 4
Size of uint64 : 8
Size of uint64_t: 8
Current system: Darwin Maevas-MacBook-Air.local 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64

For mapping assembly: readgroup 2 named 'reads' has no 'infoonly' or 'exclusion_criterion' set for 'segment_placement',
assuming 'infoonly'.
Looking for files named in data ...Pushing back filename: "P08H-3-Mito_circ_reoriented.fa"
Pushing back filename: "subset_reads1.fastq"
Pushing back filename: "subset_reads2.fastq"
Manifest:
projectname: initial_mapping_testpool-to_ref
job: genome,mapping,accurate
parameters: -NW:mrnl=0 -AS:nop=1 SOLEXA_SETTINGS -CO:msr=no
Manifest load entries: 2
MLE 1:
RGID: 1
RGN: SN: AlvCau
SP: SPio: 0 SPC: 0 IF: -1 IT: -1 TSio: 0
ST: 5 (Text) namschem: 6 SID: 0
DQ: 30
BB: 1 Rail: 0 CER: 0

P08H-3-Mito_circ_reoriented.fa MLE 2:
RGID: 2
RGN: reads SN: testpool
SP: ---> <--- SPio: 1 SPC: -1 IF: -1 IT: -1 TSio: 0
ST: 6 (Solexa) namschem: 3 SID: 0
DQ: 30
BB: 0 Rail: 0 CER: 0

subset_reads1.fastq subset_reads2.fastq

Parameters parsed without error, perfect.
Overriding number of threads via '-t' with 2

-CL:pec and -CO:emeas1clpec are set, setting -CO:emea values to 1.

Parameter settings seen for:
Sanger data

Used parameter settings:
General (-GE):
Project name : initial_mapping_testpool-to_ref
Number of threads (not) : 2
Automatic memory management (amm) : yes
Keep percent memory free (kpmf) : 15
Max. process size (mps) : 0
EST SNP pipeline step (esps) : 0
Colour reads by hash frequency (crhf) : yes

Load reads options (-LR):
Wants quality file (wqf) : [sxa] yes

Filecheck only (fo)                         : no

Assembly options (-AS):
Number of passes (nop) : 1
Skim each pass (sep) : yes
Maximum number of RMB break loops (rbl) : 1
Maximum contigs per pass (mcpp) : 0

Minimum read length (mrl)                   :  [sxa]  20
Minimum reads per contig (mrpc)             :  [sxa]  10
Enforce presence of qualities (epoq)        :  [sxa]  yes

Automatic repeat detection (ard)            : yes
    Coverage threshold (ardct)              :  [sxa]  2
    Minimum length (ardml)                  :  [sxa]  200
    Grace length (ardgl)                    :  [sxa]  20
    Use uniform read distribution (urd)     : no
      Start in pass (urdsip)                : 3
      Cutoff multiplier (urdcm)             :  [sxa]  1.5

Spoiler detection (sd)                      : yes
    Last pass only (sdlpo)                  : yes

Use genomic pathfinder (ugpf)               : yes

Use emergency search stop (uess)            : yes
    ESS partner depth (esspd)               : 500
Use emergency blacklist (uebl)              : yes
Use max. contig build time (umcbt)          : no
    Build time in seconds (bts)             : 10000

Strain and backbone options (-SB):
Bootstrap new backbone (bnb) : yes
Start backbone usage in pass (sbuip) : 0
Backbone rail from strain (brfs) :
Backbone rail length (brl) : 0
Backbone rail overlap (bro) : 0
Trim overhanging reads (tor) : yes

(Also build new contigs (abnc))             : no

Dataprocessing options (-DP):
Use read extensions (ure) : [sxa] no
Read extension window length (rewl) : [sxa] 30
Read extension w. maxerrors (rewme) : [sxa] 2
First extension in pass (feip) : [sxa] 0
Last extension in pass (leip) : [sxa] 0

Clipping options (-CL):
SSAHA2 or SMALT clipping:
Gap size (msvsgs) : [sxa] 1
Max front gap (msvsmfg) : [sxa] 2
Max end gap (msvsmeg) : [sxa] 2
Strict front clip (msvssfc) : [sxa] 0
Strict end clip (msvssec) : [sxa] 0
Possible vector leftover clip (pvlc) : [sxa] no
maximum len allowed (pvcmla) : [sxa] 18
Min qual. threshold for entire read (mqtfer): [sxa] 0
Number of bases (mqtfernob) : [sxa] 15
Quality clip (qc) : [sxa] no
Minimum quality (qcmq) : [sxa] 20
Window length (qcwl) : [sxa] 30
Bad stretch quality clip (bsqc) : [sxa] no
Minimum quality (bsqcmq) : [sxa] 5
Window length (bsqcwl) : [sxa] 20
Masked bases clip (mbc) : [sxa] no
Gap size (mbcgs) : [sxa] 5
Max front gap (mbcmfg) : [sxa] 12
Max end gap (mbcmeg) : [sxa] 12
Lower case clip front (lccf) : [sxa] no
Lower case clip back (lccb) : [sxa] no
Clip poly A/T at ends (cpat) : [sxa] no
Keep poly-a signal (cpkps) : [sxa] no
Minimum signal length (cpmsl) : [sxa] 12
Max errors allowed (cpmea) : [sxa] 1
Max gap from ends (cpmgfe) : [sxa] 9
Clip 3 prime polybase (c3pp) : [sxa] yes
Minimum signal length (c3ppmsl) : [sxa] 15
Max errors allowed (c3ppmea) : [sxa] 3
Max gap from ends (c3ppmgfe) : [sxa] 9
Clip known adaptors right (ckar) : [sxa] yes
Ensure minimum left clip (emlc) : [sxa] no
Minimum left clip req. (mlcr) : [sxa] 0
Set minimum left clip to (smlc) : [sxa] 0
Ensure minimum right clip (emrc) : [sxa] no
Minimum right clip req. (mrcr) : [sxa] 10
Set minimum right clip to (smrc) : [sxa] 20

Apply SKIM chimera detection clip (ascdc)   : no
Apply SKIM junk detection clip (asjdc)      : no

Propose end clips (pec)                     :  [sxa]  yes
    Bases per hash (pecbph)                 : 31
    Handle Solexa GGCxG problem (pechsgp)   : yes
    Front freq (pffreq)                     :  [sxa]  0
    Back freq (pbfreq)                      :  [sxa]  0
    Minimum kmer for forward-rev (pmkfr)    : 1
    Front forward-rev (pffore)              :  [sxa]  yes
    Back forward-rev (pbfore)               :  [sxa]  yes
    Front conf. multi-seq type (pfcmst)     :  [sxa]  yes
    Back conf. multi-seq type (pbcmst)      :  [sxa]  yes
    Front seen at low pos (pfsalp)          :  [sxa]  no
    Back seen at low pos (pbsalp)           :  [sxa]  no

Clip bad solexa ends (cbse)                 :  [sxa]  yes
Search PhiX174 (spx174)                     :  [sxa]  yes
    Filter PhiX174 (fpx174)                 :  [sxa]  no

Rare kmer mask (rkm)                        :  [sxa]  0

Parameters for SKIM algorithm (-SK):
Number of threads (not) : 2

Also compute reverse complements (acrc)     : yes
Bases per hash (bph)                        : 10
    Automatic increase per pass (bphaipp)   : 1
    Automatic incr. cov. threshold (bphaict): 20
Hash save stepping (hss)                    : 1
Percent required (pr)                       :  [sxa]  60

Max hits per read (mhpr)                    : 2000
Max megahub ratio (mmhr)                    : 0

SW check on backbones (swcob)               : yes

Max hashes in memory (mhim)                 : 15000000
MemCap: hit reduction (mchr)                : 4096

Parameters for Hash Statistics (-HS):
Freq. cov. estim. min (fcem) : 0
Freq. estim. min normal (fenn) : 0.4
Freq. estim. max normal (fexn) : 1.6
Freq. estim. repeat (fer) : 1.9
Freq. estim. heavy repeat (fehr) : 8
Freq. estim. crazy (fecr) : 20
Mask nasty repeats (mnr) : no
Nasty repeat ratio (nrr) : 100
Nasty repeat coverage (nrc) : 0
Lossless digital normalisation (ldn) : no

Repeat level in info file (rliif)           : 6

Million hashes per buffer (mhpb)            : 16
Rare kmer early kill (rkek)                 : no

Pathfinder options (-PF):
Use quick rule (uqr) : [sxa] yes
Quick rule min len 1 (qrml1) : [sxa] -90
Quick rule min sim 1 (qrms1) : [sxa] 100
Quick rule min len 2 (qrml2) : [sxa] -80
Quick rule min sim 2 (qrms2) : [sxa] 100
Backbone quick overlap min len (bqoml) : [sxa] 20
Max. start cache fill time (mscft) : 5

Align parameters for Smith-Waterman align (-AL):
Bandwidth in percent (bip) : [sxa] 20
Bandwidth max (bmax) : [sxa] 80
Bandwidth min (bmin) : [sxa] 20
Minimum score (ms) : [sxa] 15
Minimum overlap (mo) : [sxa] 20
Minimum relative score in % (mrs) : [sxa] 60
Solexa_hack_max_errors (shme) : [sxa] -1
Extra gap penalty (egp) : [sxa] no
extra gap penalty level (egpl) : [sxa] reject_codongaps
Max. egp in percent (megpp) : [sxa] 100

Contig parameters (-CO):
Name prefix (np) : initial_mapping_testpool-to_ref
Reject on drop in relative alignment score in % (rodirs) : [sxa] 30
Mark repeats (mr) : yes
Only in result (mroir) : no
Assume SNP instead of repeats (asir) : no
Minimum reads per group needed for tagging (mrpg) : [sxa] 3
Minimum neighbour quality needed for tagging (mnq) : [sxa] 20
Minimum Group Quality needed for RMB Tagging (mgqrt) : [sxa] 30
End-read Marking Exclusion Area in bases (emea) : [sxa] 1
Set to 1 on clipping PEC (emeas1clpec) : yes
Also mark gap bases (amgb) : [sxa] yes
Also mark gap bases - even multicolumn (amgbemc) : [sxa] yes
Also mark gap bases - need both strands (amgbnbs): [sxa] yes
Force non-IUPAC consensus per sequencing type (fnicpst) : [sxa] no
Merge short reads (msr) : [sxa] no
Max errors (msrme) : [sxa] 0
Keep ends unmerged (msrkeu) : [sxa] -1
Gap override ratio (gor) : [sxa] 66

Edit options (-ED):
Mira automatic contig editing (mace) : yes
Edit kmer singlets (eks) : yes
Edit homopolymer overcalls (ehpo) : [sxa] no

Misc (-MI):
Large contig size (lcs) : 500
Large contig size for stats (lcs4s) : 5000

I know what I do (ikwid)                    : no

Extra flag 1 / sanity track check (ef1)     : no
Extra flag 2 / dnredreadsatpeaks (ef2)      : yes
Extra flag 3 / pelibdisassemble (ef3)       : yes
Extended log (el)                           : no

Nag and Warn (-NW):
Check NFS (cnfs) : stop
Check multi pass mapping (cmpm) : stop
Check template problems (ctp) : stop
Check duplicate read names (cdrn) : stop
Check max read name length (cmrnl) : stop
Max read name length (mrnl) : 0
Check average coverage (cac) : stop
Average coverage value (acv) : 160

Directories (-DI):
Top directory for writing files : initial_mapping_testpool-to_ref_assembly
For writing result files : initial_mapping_testpool-to_ref_assembly/initial_mapping_testpool-to_ref_d_results
For writing result info files : initial_mapping_testpool-to_ref_assembly/initial_mapping_testpool-to_ref_d_info
For writing tmp files : initial_mapping_testpool-to_ref_assembly/initial_mapping_testpool-to_ref_d_tmp
Tmp redirected to (trt) :
For writing checkpoint files : initial_mapping_testpool-to_ref_assembly/initial_mapping_testpool-to_ref_d_chkpt

Output files (-OUTPUT/-OUT):
Save simple singlets in project (sssip) : [sxa] no
Save tagged singlets in project (stsip) : [sxa] yes

Remove rollover tmps (rrot)                  : yes
Remove tmp directory (rtd)                   : no

Result files:
Saved as CAF                       (orc)     : yes
Saved as MAF                       (orm)     : yes
Saved as FASTA                     (orf)     : yes
Saved as GAP4 (directed assembly)  (org)     : no
Saved as phrap ACE                 (ora)     : no
Saved as GFF3                     (org3)     : no
Saved as HTML                      (orh)     : no
Saved as Transposed Contig Summary (ors)     : yes
Saved as simple text format        (ort)     : no
Saved as wiggle                    (orw)     : yes

Temporary result files:
Saved as CAF                       (otc)     : yes
Saved as MAF                       (otm)     : no
Saved as FASTA                     (otf)     : no
Saved as GAP4 (directed assembly)  (otg)     : no
Saved as phrap ACE                 (ota)     : no
Saved as HTML                      (oth)     : no
Saved as Transposed Contig Summary (ots)     : no
Saved as simple text format        (ott)     : no

Extended temporary result files:
Saved as CAF                      (oetc)     : no
Saved as FASTA                    (oetf)     : no
Saved as GAP4 (directed assembly) (oetg)     : no
Saved as phrap ACE                (oeta)     : no
Saved as HTML                     (oeth)     : no
Save also singlets               (oetas)     : no

Alignment output customisation:
TEXT characters per line (tcpl)              : 60
HTML characters per line (hcpl)              : 60
TEXT end gap fill character (tegfc)          :  
HTML end gap fill character (hegfc)          :

File / directory output names:
CAF             : initial_mapping_testpool-to_ref_out.caf
MAF             : initial_mapping_testpool-to_ref_out.maf
FASTA           : initial_mapping_testpool-to_ref_out.unpadded.fasta
FASTA quality   : initial_mapping_testpool-to_ref_out.unpadded.fasta.qual
FASTA (padded)  : initial_mapping_testpool-to_ref_out.padded.fasta
FASTA qual.(pad): initial_mapping_testpool-to_ref_out.padded.fasta.qual
GAP4 (directory): initial_mapping_testpool-to_ref_out.gap4da
ACE             : initial_mapping_testpool-to_ref_out.ace
HTML            : initial_mapping_testpool-to_ref_out.html
Simple text     : initial_mapping_testpool-to_ref_out.txt
TCS overview    : initial_mapping_testpool-to_ref_out.tcs
Wiggle          : initial_mapping_testpool-to_ref_out.wig

Deleting old directory initial_mapping_testpool-to_ref_assembly ... done.
Creating directory initial_mapping_testpool-to_ref_assembly ... done.
Creating directory initial_mapping_testpool-to_ref_assembly/initial_mapping_testpool-to_ref_d_results ... done.
Creating directory initial_mapping_testpool-to_ref_assembly/initial_mapping_testpool-to_ref_d_info ... done.
Creating directory initial_mapping_testpool-to_ref_assembly/initial_mapping_testpool-to_ref_d_chkpt ... done.
Creating directory initial_mapping_testpool-to_ref_assembly/initial_mapping_testpool-to_ref_d_tmp ... done.

Tmp directory is not on a NFS mount, good.

Localtime: Tue May 18 15:11:49 2021

Loading reference backbone from P08H-3-Mito_circ_reoriented.fa type fa
Localtime: Tue May 18 15:11:49 2021
Loading data from FASTA file:
[0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%]
Localtime: Tue May 18 15:11:49 2021
rnm size: 0
No FASTA quality file given, using default qualities for all reads just loaded.
Localtime: Tue May 18 15:11:49 2021

Done.
Loaded 1 reads with 0 reads having quality accounted for.
Loading reads from subset_reads1.fastq type fastq
Localtime: Tue May 18 15:11:49 2021
Loading data from FASTQ file: subset_reads1.fastq
(sorry, no progress indicator for that, possible only with zlib >=1.34)

Done.
Loaded 5 reads, Localtime: Tue May 18 15:11:49 2021
Looking at FASTQ type ... guessing FASTQ-33 (Sanger)
Running quality values adaptation ... done.
Loading reads from subset_reads2.fastq type fastq
Localtime: Tue May 18 15:11:49 2021
Loading data from FASTQ file: subset_reads2.fastq
(sorry, no progress indicator for that, possible only with zlib >=1.34)

Done.
Loaded 5 reads, Localtime: Tue May 18 15:11:49 2021
Looking at FASTQ type ... guessing FASTQ-33 (Sanger)
Running quality values adaptation ... done.
Deleting gap columns in backbones ... Postprocessing backbone(s) ... this may take a while.
1 to process
P08H-3-Mito_bb 16386
Contig P08H-3-Mito_bb has strain AlvCau
TODO: Like Readpool: strain x has y reads
Checking reads for trace data (loading qualities if needed):
[0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%]
No SCF data present in any read, EdIt automatic contig editing for Sanger data is now switched off.
11 reads with valid data for assembly.
Localtime: Tue May 18 15:11:49 2021

Generated 11 unique DNA template ids for 11 valid reads.
No useful template information found.
TODO: Like Readpool: strain x has y reads
Have read pool with 11 reads.

===========================================================================
Pool statistics:
Backbones: 1 Backbone rails: 0

                Sanger  454     IonTor  PcBioHQ PcBioLQ Text    Solexa  SOLiD
                ------------------------------------------------------------
Total reads     0       0       0       0       0       0       10      0
Reads wo qual   0       0       0       0       0       0       0       0
Used reads      0       0       0       0       0       0       10      0
Avg tot rlen    0       0       0       0       0       0       86      0
Avg rlen used   0       0       0       0       0       0       86      0
W/o clips       0       0       0       0       0       0       10      0

Solexa total bases: 864 used bases in used reads: 864

Checking pairs of readgroup 1 (named: ''): found 0
Checking pairs of readgroup 2 (named: 'reads'): found 0
WARNING: in the above readgroup, no read is paired although the manifest says there should be pairs. This is fishy!

In func: void Assembly::basicReadGroupChecks()
Throw message: MIRA found readgroups where pairs are expected but no read has a partner. See log above and then check your input please (either manifest file or data files loaded or segment_naming scheme).

========================== Memory self assessment ==============================
Running in 64 bit mode.

Could not read file /proc/meminfo

Could not read file /proc/self/status

Information on current assembly object:

AS_readpool: 11 reads.
AS_contigs: 0 contigs.
AS_bbcontigs: 1 contigs.
Mem used for reads: 192 (192 B)

Memory used in assembly structures:
Eff. Size Free cap. LostByAlign
AS_writtenskimhitsperid: 0 24 B 0 B 0 B
AS_skim_edges: 0 24 B 0 B 0 B
AS_adsfacts: 0 24 B 0 B 0 B
AS_confirmed_edges: 0 24 B 0 B 0 B
AS_permanent_overlap_bans: 1 24 B 0 B 0 B
AS_readhitmiss: 0 24 B 0 B 0 B
AS_readhmcovered: 0 24 B 0 B 0 B
AS_count_rhm: 0 24 B 0 B 0 B
AS_clipleft: 0 24 B 0 B 0 B
AS_clipright: 0 24 B 0 B 0 B
AS_used_ids: 0 24 B 0 B 0 B
AS_multicopies: 0 24 B 0 B 0 B
AS_hasmcoverlaps: 0 24 B 0 B 0 B
AS_maxcoveragereached: 0 24 B 0 B 0 B
AS_coverageperseqtype: 0 24 B 0 B 0 B
AS_istroublemaker: 0 24 B 0 B 0 B
AS_isdebris: 0 24 B 0 B 0 B
AS_needalloverlaps: 0 24 B 0 B 0 B
AS_readsforrepeatresolve: 0 40 B 0 B 0 B
AS_allrmbsok: 0 24 B 0 B 0 B
AS_probablermbsnotok: 0 24 B 0 B 0 B
AS_weakrmbsnotok: 0 24 B 0 B 0 B
AS_readmaytakeskim: 0 40 B 0 B 0 B
AS_skimstaken: 0 40 B 0 B 0 B
AS_numskimoverlaps: 0 24 B 0 B 0 B
AS_numleftextendskims: 0 24 B 0 B 0 B
AS_rightextendskims: 0 24 B 0 B 0 B
AS_skimleftextendratio: 0 24 B 0 B 0 B
AS_skimrightextendratio: 0 24 B 0 B 0 B
AS_usedtmpfiles: 1 48 B 0 B 0 B
Total: 984 (984 B)

================================================================================
Dynamic s allocs: 0
Dynamic m allocs: 0
Align allocs: 0

Fatal error (may be due to problems of the input data or parameters):


* MIRA found readgroups where pairs are expected but no read has a partner. *
* See log above and then check your input please (either manifest file or data *
* files loaded or segment_naming scheme). *

->Thrown: void Assembly::basicReadGroupChecks()
->Caught: main

Aborting process, probably due to error in the input data or parametrisation.
Please check the output log for more information.
For help, please write a mail to the mira talk mailing list.
Subscribing / unsubscribing to mira talk, see: http://www.freelists.org/list/mira_talk

CWD: /Users/maeperez/Desktop/Bioinf_softwares/MITObim/AlvCau
Thank you for noticing that this is NOT a crash, but a
controlled program stop.
Failure, wrapped MIRA process aborted.

1 Attachments

Discussion

  • Maeva perez

    Maeva perez - 2021-05-18

    Oh, never mind. The issue was in the manifest.conf file. I replaced the line

    segment_naming = FR

    by

    segment_naming = solexa

    and it worked. Thanks for a great program.
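
For anyone hitting the same "no read has a partner" stop: a rough way to spot a naming mismatch before starting MIRA is to count how many read names in the two FASTQ files pair up under a given suffix convention. The sketch below is Python and not part of MIRA. `read_names` and `count_pairs` are hypothetical helpers, the code assumes plain 4-line FASTQ records, and the default `/1` and `/2` suffixes are an assumption based on the usual Solexa-style read names. MIRA's own matching rules are described in its manual and may differ in detail.

```python
from typing import Iterable, List, Tuple


def read_names(fastq_lines: Iterable[str]) -> List[str]:
    """Collect read names from FASTQ header lines.

    Assumes plain 4-line records (header, sequence, '+', qualities).
    The name is the first whitespace-delimited token after '@'.
    """
    names = []
    for index, line in enumerate(fastq_lines):
        if index % 4 == 0 and line.startswith("@"):
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names


def count_pairs(
    names1: Iterable[str],
    names2: Iterable[str],
    suffixes: Tuple[str, str] = ("/1", "/2"),  # assumed Solexa-style suffixes
) -> int:
    """Count templates that have one read in each file.

    A read counts only if its name ends with the expected suffix;
    the template name is the read name with that suffix removed.
    """
    first, second = suffixes
    templates1 = {n[: -len(first)] for n in names1 if n.endswith(first)}
    templates2 = {n[: -len(second)] for n in names2 if n.endswith(second)}
    return len(templates1 & templates2)
```

To use it, pass the open file handles for `subset_reads1.fastq` and `subset_reads2.fastq` to `read_names` and feed both results to `count_pairs`. A count of 0 under the suffixes you expect suggests that either the headers or the `segment_naming` setting in the manifest needs changing.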

     
    • Bastien Chevreux

      \o/

      :-)

      [tickets:#44] https://sourceforge.net/p/mira-assembler/tickets/44/ MIRA 4 cannot recognise paired reads

      Status: open
      Version: 4.0.1
      Created: Tue May 18, 2021 07:16 AM UTC by Maeva perez
      Last Updated: Tue May 18, 2021 07:18 AM UTC
      Owner: nobody
      Attachments:

      manifest-1.conf https://sourceforge.net/p/mira-assembler/tickets/44/attachment/manifest-1.conf (454 Bytes; application/octet-stream)
      Saved as Transposed Contig Summary (ots) : no
      Saved as simple text format (ott) : no

      Extended temporary result files:
      Saved as CAF (oetc) : no
      Saved as FASTA (oetf) : no
      Saved as GAP4 (directed assembly) (oetg) : no
      Saved as phrap ACE (oeta) : no
      Saved as HTML (oeth) : no
      Save also singlets (oetas) : no

      Alignment output customisation:
      TEXT characters per line (tcpl) : 60
      HTML characters per line (hcpl) : 60
      TEXT end gap fill character (tegfc) :
      HTML end gap fill character (hegfc) :

      File / directory output names:
      CAF : initial_mapping_testpool-to_ref_out.caf
      MAF : initial_mapping_testpool-to_ref_out.maf
      FASTA : initial_mapping_testpool-to_ref_out.unpadded.fasta
      FASTA quality : initial_mapping_testpool-to_ref_out.unpadded.fasta.qual
      FASTA (padded) : initial_mapping_testpool-to_ref_out.padded.fasta
      FASTA qual.(pad): initial_mapping_testpool-to_ref_out.padded.fasta.qual
      GAP4 (directory): initial_mapping_testpool-to_ref_out.gap4da
      ACE : initial_mapping_testpool-to_ref_out.ace
      HTML : initial_mapping_testpool-to_ref_out.html
      Simple text : initial_mapping_testpool-to_ref_out.txt
      TCS overview : initial_mapping_testpool-to_ref_out.tcs
      Wiggle : initial_mapping_testpool-to_ref_out.wig
      Deleting old directory initial_mapping_testpool-to_ref_assembly ... done.
      Creating directory initial_mapping_testpool-to_ref_assembly ... done.
      Creating directory initial_mapping_testpool-to_ref_assembly/initial_mapping_testpool-to_ref_d_results ... done.
      Creating directory initial_mapping_testpool-to_ref_assembly/initial_mapping_testpool-to_ref_d_info ... done.
      Creating directory initial_mapping_testpool-to_ref_assembly/initial_mapping_testpool-to_ref_d_chkpt ... done.
      Creating directory initial_mapping_testpool-to_ref_assembly/initial_mapping_testpool-to_ref_d_tmp ... done.

      Tmp directory is not on a NFS mount, good.

      Localtime: Tue May 18 15:11:49 2021

      Loading reference backbone from P08H-3-Mito_circ_reoriented.fa type fa
      Localtime: Tue May 18 15:11:49 2021
      Loading data from FASTA file:
      [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%]
      Localtime: Tue May 18 15:11:49 2021
      rnm size: 0
      No FASTA quality file given, using default qualities for all reads just loaded.
      Localtime: Tue May 18 15:11:49 2021

      Done.
      Loaded 1 reads with 0 reads having quality accounted for.
      Loading reads from subset_reads1.fastq type fastq
      Localtime: Tue May 18 15:11:49 2021
      Loading data from FASTQ file: subset_reads1.fastq
      (sorry, no progress indicator for that, possible only with zlib >=1.34)

      Done.
      Loaded 5 reads, Localtime: Tue May 18 15:11:49 2021
      Looking at FASTQ type ... guessing FASTQ-33 (Sanger)
      Running quality values adaptation ... done.
      Loading reads from subset_reads2.fastq type fastq
      Localtime: Tue May 18 15:11:49 2021
      Loading data from FASTQ file: subset_reads2.fastq
      (sorry, no progress indicator for that, possible only with zlib >=1.34)

      Done.
      Loaded 5 reads, Localtime: Tue May 18 15:11:49 2021
      Looking at FASTQ type ... guessing FASTQ-33 (Sanger)
      Running quality values adaptation ... done.
      Deleting gap columns in backbones ... Postprocessing backbone(s) ... this may take a while.
      1 to process
      P08H-3-Mito_bb 16386
      Contig P08H-3-Mito_bb has strain AlvCau
      TODO: Like Readpool: strain x has y reads
      Checking reads for trace data (loading qualities if needed):
      [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%]
      No SCF data present in any read, EdIt automatic contig editing for Sanger data is now switched off.
      11 reads with valid data for assembly.
      Localtime: Tue May 18 15:11:49 2021

      Generated 11 unique DNA template ids for 11 valid reads.
      No useful template information found.
      TODO: Like Readpool: strain x has y reads
      Have read pool with 11 reads.

      ===========================================================================
      Pool statistics:
      Backbones: 1 Backbone rails: 0

                     Sanger  454     IonTor  PcBioHQ PcBioLQ Text    Solexa  SOLiD
      ----------------------------------------------------------------------------
      Total reads    0       0       0       0       0       0       10      0
      Reads wo qual  0       0       0       0       0       0       0       0
      Used reads     0       0       0       0       0       0       10      0
      Avg tot rlen   0       0       0       0       0       0       86      0
      Avg rlen used  0       0       0       0       0       0       86      0
      W/o clips      0       0       0       0       0       0       10      0
      Solexa total bases: 864 used bases in used reads: 864

      Checking pairs of readgroup 1 (named: ''): found 0
      Checking pairs of readgroup 2 (named: 'reads'): found 0
      WARNING: in the above readgroup, no read is paired although the manifest says there should be pairs. This is fishy!

      In func: void Assembly::basicReadGroupChecks()
      Throw message: MIRA found readgroups where pairs are expected but no read has a partner. See log above and then check your input please (either manifest file or data files loaded or segment_naming scheme).

      ========================== Memory self assessment ==============================
      Running in 64 bit mode.

      Could not read file /proc/meminfo

      Could not read file /proc/self/status

      Information on current assembly object:

      AS_readpool: 11 reads.
      AS_contigs: 0 contigs.
      AS_bbcontigs: 1 contigs.
      Mem used for reads: 192 (192 B)

      Memory used in assembly structures:
      Eff. Size Free cap. LostByAlign
      AS_writtenskimhitsperid: 0 24 B 0 B 0 B
      AS_skim_edges: 0 24 B 0 B 0 B
      AS_adsfacts: 0 24 B 0 B 0 B
      AS_confirmed_edges: 0 24 B 0 B 0 B
      AS_permanent_overlap_bans: 1 24 B 0 B 0 B
      AS_readhitmiss: 0 24 B 0 B 0 B
      AS_readhmcovered: 0 24 B 0 B 0 B
      AS_count_rhm: 0 24 B 0 B 0 B
      AS_clipleft: 0 24 B 0 B 0 B
      AS_clipright: 0 24 B 0 B 0 B
      AS_used_ids: 0 24 B 0 B 0 B
      AS_multicopies: 0 24 B 0 B 0 B
      AS_hasmcoverlaps: 0 24 B 0 B 0 B
      AS_maxcoveragereached: 0 24 B 0 B 0 B
      AS_coverageperseqtype: 0 24 B 0 B 0 B
      AS_istroublemaker: 0 24 B 0 B 0 B
      AS_isdebris: 0 24 B 0 B 0 B
      AS_needalloverlaps: 0 24 B 0 B 0 B
      AS_readsforrepeatresolve: 0 40 B 0 B 0 B
      AS_allrmbsok: 0 24 B 0 B 0 B
      AS_probablermbsnotok: 0 24 B 0 B 0 B
      AS_weakrmbsnotok: 0 24 B 0 B 0 B
      AS_readmaytakeskim: 0 40 B 0 B 0 B
      AS_skimstaken: 0 40 B 0 B 0 B
      AS_numskimoverlaps: 0 24 B 0 B 0 B
      AS_numleftextendskims: 0 24 B 0 B 0 B
      AS_rightextendskims: 0 24 B 0 B 0 B
      AS_skimleftextendratio: 0 24 B 0 B 0 B
      AS_skimrightextendratio: 0 24 B 0 B 0 B
      AS_usedtmpfiles: 1 48 B 0 B 0 B
      Total: 984 (984 B)

      ================================================================================
      Dynamic s allocs: 0
      Dynamic m allocs: 0
      Align allocs: 0

      Fatal error (may be due to problems of the input data or parameters):

      MIRA found readgroups where pairs are expected but no read has a partner.
      See log above and then check your input please (either manifest file or data
      files loaded or segment_naming scheme).
      ->Thrown: void Assembly::basicReadGroupChecks()
      ->Caught: main

      Aborting process, probably due to error in the input data or parametrisation.
      Please check the output log for more information.
      For help, please write a mail to the mira talk mailing list.
      Subscribing / unsubscribing to mira talk, see: http://www.freelists.org/list/mira_talk
      CWD: /Users/maeperez/Desktop/Bioinf_softwares/MITObim/AlvCau
      Thank you for noticing that this is NOT a crash, but a
      controlled program stop.
      Failure, wrapped MIRA process aborted.
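
The abort above follows the log lines "No useful template information found." and "Checking pairs of readgroup 2 (named: 'reads'): found 0", which suggests that no pair information could be derived from the read names. One thing that can be checked outside MIRA is whether the names in the two FASTQ files pair up at all. Below is a minimal sketch, assuming a Solexa-style convention in which the read name is the first whitespace-delimited token of the header and ends in /1 or /2. That convention, the function names, and the four-line record layout are assumptions made for illustration; this is not MIRA's own parsing code, and other segment_naming schemes would need a different pattern.

```python
import re
import sys
from typing import Dict, Iterable, List, Optional, Tuple

# Assumed convention (Solexa/Illumina style): the read name is the first
# whitespace-delimited token of the header and ends in "/1" or "/2".
_SUFFIX = re.compile(r"^(?P<template>.+)/(?P<segment>[12])$")


def split_read_name(header: str) -> Tuple[str, Optional[str]]:
    """Return (template, segment); segment is None without a /1 or /2 suffix."""
    tokens = header.lstrip("@").split()
    name = tokens[0] if tokens else ""
    match = _SUFFIX.match(name)
    if match:
        return match.group("template"), match.group("segment")
    return name, None


def fastq_headers(lines: Iterable[str]) -> List[str]:
    """Collect header lines, assuming plain four-line FASTQ records."""
    return [line.rstrip("\n") for i, line in enumerate(lines) if i % 4 == 0]


def pairing_report(headers1: Iterable[str], headers2: Iterable[str]) -> Dict[str, int]:
    """Count templates seen as /1 in file 1 and /2 in file 2, plus unsuffixed names."""
    first = [split_read_name(h) for h in headers1]
    second = [split_read_name(h) for h in headers2]
    templates1 = {t for t, s in first if s == "1"}
    templates2 = {t for t, s in second if s == "2"}
    unsuffixed = sum(1 for _, s in first + second if s is None)
    return {"paired": len(templates1 & templates2), "unsuffixed": unsuffixed}


# Only runs when two file paths are given on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        print(pairing_report(fastq_headers(f1), fastq_headers(f2)))
```

Saved under a hypothetical name such as check_pairs.py, it could be run as `python check_pairs.py subset_reads1.fastq subset_reads2.fastq`. A report with "paired" at 0 and a non-zero "unsuffixed" count would point at the FASTQ headers rather than the manifest, in which case either the read names or the segment_naming setting would need to change.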
