From: Walenz, B. <bw...@jc...> - 2013-10-25 21:39:25
|
Hi, Geoff-

(yes, behind on email)

It’s the distance from the end of the first scaffold to the start of the second one. The “BA_AB” label means the first scaffold is reversed and the second one is forward. However, I’m not exactly sure where the distance is measured from. I would guess it is the lowest coord of each scaffold (X & Y below) but it could also be from 3’ to 3’.

1810018bp from X to Y

X<---------- Y------------------------------------------------------------->

b

On 10/25/13 5:03 PM, "Waldbieser, Geoff" <Geo...@AR...> wrote:

> Is the number that follows “gap” the starting position of the insertion?
>
> geoff@RAMbo:/mnt/data2/CocoAssembly14/CA_22/7-0-CGW> grep -B 7 "succeeded with contig" cgw.out | grep "Merge scaffolds" | tail -25
> isQualityScaffoldMergingEdge()-- Merge scaffolds 27265 (2959.0bp) and 275496 (19181214.0bp): gap -1810018.9bp +- 6704.2bp weight 3 BA_AB edge
> [...]
>
> Thanks.
> Geoff
|
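For anyone who wants to tabulate these "Merge scaffolds" lines rather than read them by eye, here is a minimal Python sketch that pulls the fields out of one log line. The field names (`gap_spread`, `orientation`, and so on) are my own labels, not names used by cgw, and the meaning of the value after `+-` is not stated in this thread, so treat it as an unspecified spread. The interpretation of the gap follows the guess above (a negative gap means the scaffolds overlap).

```python
import re
from typing import NamedTuple, Optional

# Pattern written against the log lines quoted in this thread; other
# versions of cgw may print a different layout (assumption).
MERGE_RE = re.compile(
    r"Merge scaffolds (\d+) \(([\d.]+)bp\) and (\d+) \(([\d.]+)bp\): "
    r"gap (-?[\d.]+)bp \+- ([\d.]+)bp weight (\d+) (\w+) edge"
)


class MergeEdge(NamedTuple):
    first_id: int       # ID of the first scaffold
    first_len: float    # length of the first scaffold, in bp
    second_id: int      # ID of the second scaffold
    second_len: float   # length of the second scaffold, in bp
    gap: float          # signed gap in bp; negative means overlap
    gap_spread: float   # value printed after "+-" (meaning not confirmed)
    weight: int         # edge weight as printed
    orientation: str    # e.g. "BA_AB": first reversed, second forward


def parse_merge_line(line: str) -> Optional[MergeEdge]:
    """Return the parsed edge, or None if the line is not a merge line."""
    m = MERGE_RE.search(line)
    if m is None:
        return None
    a, a_len, b, b_len, gap, spread, weight, orient = m.groups()
    return MergeEdge(int(a), float(a_len), int(b), float(b_len),
                     float(gap), float(spread), int(weight), orient)
```

Feeding it the output of the grep pipeline line by line and skipping `None` results gives a list that can be sorted by gap size or grouped by orientation.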
From: Waldbieser, G. <Geo...@AR...> - 2013-10-25 21:19:04
|
Is the number that follows "gap" the starting position of the insertion?

geoff@RAMbo:/mnt/data2/CocoAssembly14/CA_22/7-0-CGW> grep -B 7 "succeeded with contig" cgw.out | grep "Merge scaffolds" | tail -25
isQualityScaffoldMergingEdge()-- Merge scaffolds 27265 (2959.0bp) and 275496 (19181214.0bp): gap -1810018.9bp +- 6704.2bp weight 3 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 255057 (3728.1bp) and 275505 (19534418.0bp): gap -240550.0bp +- 1503.4bp weight 10 BA_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 196645 (1065.0bp) and 275509 (19566468.0bp): gap -387971.0bp +- 2792.1bp weight 6 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 102999 (8904.9bp) and 275523 (19753393.0bp): gap -3177567.2bp +- 14899.6bp weight 4 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 3002 (14059.2bp) and 275530 (19941091.0bp): gap -36176.4bp +- 1436.3bp weight 10 AB_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 130728 (2557.0bp) and 275532 (19963021.0bp): gap -24261.5bp +- 556.5bp weight 5 AB_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 158301 (1467.0bp) and 275543 (20459003.0bp): gap -341483.1bp +- 6817.8bp weight 4 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 2578 (5620.0bp) and 275544 (20459880.0bp): gap -230824.4bp +- 7767.5bp weight 3 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 256757 (6270.1bp) and 275545 (20459880.0bp): gap -441446.8bp +- 7892.5bp weight 3 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 256706 (7152.3bp) and 275546 (20462065.0bp): gap -597985.2bp +- 9782.8bp weight 3 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 169859 (3338.0bp) and 275554 (20926019.0bp): gap -7353996.3bp +- 87666.8bp weight 7 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 89363 (2520.0bp) and 275565 (21528144.0bp): gap -530220.7bp +- 2990.5bp weight 4 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 137605 (3648.0bp) and 275571 (21574644.0bp): gap -19179.3bp +- 598.9bp weight 5 AB_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 97209 (2991.0bp) and 275574 (21595403.0bp): gap -1329484.3bp +- 6888.5bp weight 5 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 108832 (3143.0bp) and 275575 (21595403.0bp): gap -1364466.4bp +- 8103.7bp weight 4 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 200330 (3653.0bp) and 275590 (22015373.0bp): gap -8198259.3bp +- 26024.1bp weight 4 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 118137 (1070.0bp) and 275598 (22146361.0bp): gap -34650.2bp +- 747.7bp weight 3 AB_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 120336 (3509.0bp) and 275606 (22278336.0bp): gap -1465814.6bp +- 9960.3bp weight 3 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 44863 (1703.0bp) and 275607 (22280305.0bp): gap -1532787.1bp +- 10803.6bp weight 3 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 164622 (3629.0bp) and 275617 (22369299.0bp): gap -4778854.5bp +- 22817.0bp weight 3 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 171746 (2227.0bp) and 275618 (22369299.0bp): gap -4970596.8bp +- 23553.5bp weight 3 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 28139 (7112.0bp) and 275627 (22457141.0bp): gap -5279352.4bp +- 119327.0bp weight 3 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 115015 (2847.0bp) and 275629 (22595843.0bp): gap -161127.7bp +- 960.9bp weight 3 BA_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 46804 (2414.0bp) and 275631 (22596305.0bp): gap -5717371.5bp +- 21776.5bp weight 4 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 140594 (1492.0bp) and 275642 (23025717.0bp): gap -338964.9bp +- 3568.5bp weight 5 BA_AB edge

geoff@RAMbo:/mnt/data2/CocoAssembly14/CA_22/7-0-CGW> grep -B 7 "succeeded without contig" cgw.out | grep "Merge scaffolds" | tail -25
isQualityScaffoldMergingEdge()-- Merge scaffolds 43484 (1256.0bp) and 275616 (22369299.0bp): gap -3811285.9bp +- 17646.8bp weight 3 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 176207 (3169.0bp) and 275619 (22370140.0bp): gap -4992237.9bp +- 24006.4bp weight 3 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 76046 (1455.0bp) and 275620 (22372478.0bp): gap -5025563.5bp +- 117961.1bp weight 3 AB_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 212452 (22404.6bp) and 275621 (22373963.0bp): gap -5198434.2bp +- 24639.0bp weight 3 AB_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 256082 (13669.5bp) and 275622 (22395222.0bp): gap -5186521.9bp +- 19084.6bp weight 5 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 82929 (1082.0bp) and 275623 (22408941.0bp): gap -20478.6bp +- 600.3bp weight 4 BA_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 234338 (27822.9bp) and 275624 (22408941.0bp): gap -956218.7bp +- 6173.5bp weight 3 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 102168 (2679.0bp) and 275625 (22436045.0bp): gap -943548.1bp +- 6082.3bp weight 4 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 210059 (18375.0bp) and 275626 (22438695.0bp): gap -2391622.3bp +- 16237.1bp weight 3 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 242702 (133674.2bp) and 275628 (22461991.0bp): gap -5413283.2bp +- 119246.5bp weight 3 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 137172 (1025.0bp) and 275630 (22595843.0bp): gap -5720434.4bp +- 25165.8bp weight 3 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 194041 (7341.0bp) and 275632 (22596442.0bp): gap -5734829.0bp +- 25228.1bp weight 3 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 95031 (5580.0bp) and 275633 (22603472.0bp): gap -5739511.9bp +- 16512.3bp weight 7 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 265235 (309343.0bp) and 275634 (22608821.0bp): gap -5756824.1bp +- 125387.5bp weight 3 BA_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 265845 (44614.9bp) and 275635 (22918214.0bp): gap -11455756.5bp +- 67995.9bp weight 18 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 222439 (7760.0bp) and 275636 (22964158.0bp): gap -447388.3bp +- 1642.9bp weight 17 AB_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 219987 (30740.0bp) and 275637 (22971968.0bp): gap -313590.7bp +- 2132.8bp weight 10 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 259391 (20621.3bp) and 275638 (22983007.0bp): gap -336934.6bp +- 2791.4bp weight 9 BA_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 182966 (11936.0bp) and 275639 (23003678.0bp): gap -342229.5bp +- 1436.1bp weight 14 AB_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 153932 (6252.0bp) and 275640 (23015419.0bp): gap -366394.0bp +- 2033.3bp weight 7 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 91518 (5186.0bp) and 275641 (23020693.0bp): gap -484056.8bp +- 3462.7bp weight 6 AB_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 198374 (4460.0bp) and 275643 (23025717.0bp): gap -75374.2bp +- 1180.4bp weight 3 AB_BA edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 106339 (4893.0bp) and 275644 (23029023.0bp): gap -271853.9bp +- 2949.4bp weight 3 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 27128 (2895.0bp) and 275645 (23033445.0bp): gap -1152943.7bp +- 7915.6bp weight 3 AB_AB edge
isQualityScaffoldMergingEdge()-- Merge scaffolds 182153 (5442.0bp) and 275646 (23035173.0bp): gap -1688141.4bp +- 9892.4bp weight 3 BA_AB edge

Thanks.
Geoff

This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately.
|
From: Mayank M. <may...@ic...> - 2013-10-25 17:20:55
|
Hej Brian,

I have preassembled Sanger data from the repeated regions in my assembly. The repeats are too numerous and too messy to handle manually. I know that I can provide them to the assembler as normal fragments using fastaToCA.

Q1. Is there some way to give really high priority to these contig/unitig-size fragments, since each of these reads is of much higher quality than the Illumina reads, which I use at 40X coverage?

Q2. Is it possible to do a backbone assembly?

Regards,
Mayank
|
From: <J....@lu...> - 2013-10-11 16:10:28
|
Hello everyone,

I used pacBioToCA to correct PacBio reads using an unpaired + paired Illumina dataset. During correction a number of reads were split or trimmed; this is of course expected. However, when I compared the uncorrected reads with the corrected reads, I observed that 6.5% of the reads actually increased in length. When a read is extended, its size increases by an average of 32 bp. One particular read seems to have been extended by more than 750 bp!

I was wondering what mechanism actually causes the extension seen in some reads. Is this normal behavior, have you seen it before, or might there be something off?

I hope someone can enlighten me on this.

Thanks (and have a nice weekend),
Jeroen Frank
|
From: Sacha L. <sac...@un...> - 2013-10-11 09:02:20
|
Hello everybody,

For a sequencing project of a genome of around 1 Gb, I have 3 Illumina lanes and 8 SMRT cells of sequence. I am trying to use the pacBioToCA correction tool of wgs-assembler, using the 3 lanes of Illumina.

Up to now, I have managed to run the pacBioToCA pipeline successfully up to the counting of k-mers with meryl. I thus have a bunch (a lot, actually) of files "asm-C-ms13.cm0.batch*.[mcdat|mcidx]". The next step is to estimate the mer threshold, but it invariably fails with:

merylStreamReader()-- ERROR: 0-mercounts/asm-C-ms14-cm0.mcidx is an INCOMPLETE merylStream index file!
merylStreamReader()-- ERROR: 0-mercounts/asm-C-ms14-cm0.mcdat is an INCOMPLETE merylStream data file!

From what I understand, the pipeline is expecting only one mcdat file and one mcidx file, but I have a whole batch of them. Shouldn't there be a step where the batch files are merged, or shouldn't the meryl reader understand that it's not a single file? Do you have any idea how to overcome this?

From a larger perspective, what kind of software would you recommend for my data? I can't use ALLPATHS-LG because I don't have overlapping reads, I can't use scaffolding with AHA because my genome is way too large, and I'm failing at pacBioToCA. Do you think it's feasible/advisable to perform a classic assembly with wgs-assembler without read correction, which might be more likely to succeed? Should I try MIRA? (I think it's the last alternative I have to get a whole-genome assembly with the data that I have...)

Thank you so much for your help!

Sacha
|
From: Walenz, B. <bw...@jc...> - 2013-10-07 20:31:50
|
The 'fragmentDepth' program, with option -scaffold, in the assembler binaries might do it. It takes as input a SORTED posmap.frgscf (or frgctg or frgdeg or whatever - they're all the same format) and outputs one of three reports. The -scaffold option claims to output mode/mean/median for each scaffold.

b

On 10/7/13 10:11 AM, "Cristell Navarro" <cri...@om...> wrote:

> Hi all
>
> I was wondering how could I get the depth coverage for each scaffold in
> the assembly, and more important for me, for each degenerate contigs.
>
> Thanks in advance.
>
> Cristell.
>
> _______________________________________________
> wgs-assembler-users mailing list
> wgs...@li...
> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users
|
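If fragmentDepth is not at hand, a rough mean depth per scaffold can also be estimated straight from a posmap file. The Python sketch below is not a replacement for fragmentDepth and makes two assumptions that are not confirmed in this thread: that each posmap line holds whitespace-separated columns in the order read UID, scaffold (or contig) UID, begin, end, orientation; and that the largest read end coordinate is a usable stand-in for the scaffold length. The function name is made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable


def mean_depth_per_scaffold(lines: Iterable[str]) -> Dict[str, float]:
    """Estimate mean read depth per scaffold from posmap-style lines.

    Assumed columns (hypothetical layout): read UID, scaffold UID,
    begin, end, orientation. Depth is total placed read bases divided
    by the largest end coordinate seen on that scaffold.
    """
    bases: Dict[str, int] = defaultdict(int)  # placed bases per scaffold
    span: Dict[str, int] = defaultdict(int)   # largest coordinate seen
    for line in lines:
        fields = line.split()
        # Skip blank, short, or non-numeric (e.g. header) lines.
        if len(fields) < 4 or not (fields[2].isdigit() and fields[3].isdigit()):
            continue
        scaffold = fields[1]
        begin, end = sorted((int(fields[2]), int(fields[3])))
        bases[scaffold] += end - begin
        span[scaffold] = max(span[scaffold], end)
    return {s: bases[s] / span[s] for s in bases if span[s] > 0}
```

Since the frgctg and frgdeg posmaps share the same format, the same function could be pointed at the degenerate-contig file, for example with `mean_depth_per_scaffold(open(path))` where `path` is your own posmap file.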
From: Cristell N. <cri...@om...> - 2013-10-07 14:11:37
|
Hi all,

I was wondering how I could get the depth of coverage for each scaffold in the assembly and, more importantly for me, for each degenerate contig.

Thanks in advance.

Cristell.
|
From: Serge K. <se...@um...> - 2013-10-05 20:39:36
|
Hi,

I have not seen this error before. It looks like an rm command is failing on your system. I just committed a fix that should fix the issue. Alternatively, you could either comment out the assert on line 445 of AS_PBR/CorrectPacBio.cc or provide your Illumina data as unpaired frg files rather than paired ones. Both of these should get around your error.

Sergey

On Oct 4, 2013, at 2:42 AM, Francois Sabot <fra...@ir...> wrote:

> Dear all
>
> I have followed the instructions (re-install from SVN from fresh
> download for CA, kmer and even samtools), and just ran the
> runCorrection.sh...
>
> But it failed again..
>
> Have an idea ? Do I have to re-run everything from scratch ?
>
> here is the command
>
> qsub -A Glabcorr -cwd -V -S /bin/sh -q highmem.q -e $HOME/jobs/ -o
> $HOME/jobs -v
> LD_LIBRARY_PATH=$HOME/lib:$LD_LIBRARY_PATH,PATH=$HOME/scripts:$HOME/bin:/home/sabotf/bin:/usr/local/amos-3.1.0/bin:$PATH
> -pe ompi 12 -l mem_free=128g -cwd -N "pBcR_correct_asm_Glabcorr" -j y -o
> /dev/null
> /data/projects/assembling-glab//tempTog5681_corrected/runCorrection.sh
>
> I joined the runCorrection.sh script as well as the asm.layout.err files
> here
>
> On 30/09/2013 23:33, Serge Koren wrote:
>> From your output, it looks like the script is stuck trying to run the correction step. There should be only one correction step. However, your step is failing and as a result, the script sees that correction did not run and tries to run it again, in an infinite loop. This bug should be fixed in the latest source code in the repository.
>>
>> As far as why the actual correction step is failing, it looks like the CA version that is included with smrtportal is not CA7.0 but a development version which had some bugs calculating which partition a sequence belongs to. Again, this should be fixed in the latest source code in the repository. As a workaround, you can try lowering the number of partitions (the -p parameter in runCorrection.sh). I would recommend setting it to #threads +1. You can then run the runCorrection.sh script by hand and once there is an asm.layout.success file, re-launch the full correction which will then pick up after this step.
>>
>> Sergey
>>
>> On Sep 30, 2013, at 4:46 AM, Francois Sabot <fra...@ir...> wrote:
>>
>>> Hi
>>>
>>> Still have a weird bug, it cannot find (again) bank-transact, even if
>>> AMOS is in the path (I checked)
>>>
>>> Here is the end of my asm.layout error for this step
>>>
>>> Using partition 10 to re-estimate insert sizes, total partitions so far 1/200
>>> Using partition 58 to re-estimate insert sizes, total partitions so far 2/200
>>> Using partition 66 to re-estimate insert sizes, total partitions so far 3/200
>>> Using partition 83 to re-estimate insert sizes, total partitions so far 4/200
>>> Using partition 118 to re-estimate insert sizes, total partitions so far 5/200
>>> Using partition 122 to re-estimate insert sizes, total partitions so far 6/200
>>> Using partition 136 to re-estimate insert sizes, total partitions so far 7/200
>>> Using partition 163 to re-estimate insert sizes, total partitions so far 8/200
>>> Using partition 175 to re-estimate insert sizes, total partitions so far 9/200
>>> Using partition 200 to re-estimate insert sizes, total partitions so far 10/200
>>> openLayFile()-- Failed to open 'asm.200.olaps' for reading: No such file or directory
>>> Couldn't open 'asm.200.olaps' for read: No such file or directory from 0-0
>>>
>>> IlluminaTog5681
>>> 400.00 +- 100.00 -> 315.38 +- 76.35 I 37774891/37919939 samples external happy 34167579 sad 112016534
>>> N/A +- N/A -> 845.33 +- 1030.84 O 198321/218601 samples
>>> N/A +- N/A -> 2521.31 +- 1937.73 N 65640/69560 samples
>>> N/A +- N/A -> 2592.02 +- 2017.69 A 55226/58297 samples
>>>
>>> Tog5681_corrected
>>> Filtering mates
>>> openLayFile()-- Failed to open 'asm.200.olaps' for reading: No such file or directory
>>> Couldn't open 'asm.200.olaps' for write: No such file or directory from 0-0
>>> CorrectPacBio.cc:445: int main(int, char**): Assertion `system(command) == 0' failed.
>>>
>>> Failed with 'Aborted'
>>>
>>> Backtrace (mangled):
>>>
>>> /home/sabotf/sources/wgs/Linux-amd64/bin//(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x412797]
>>> /lib64/libpthread.so.0[0x3bdd20f500]
>>> /lib64/libc.so.6(gsignal+0x35)[0x3bdce328a5]
>>> /lib64/libc.so.6(abort+0x175)[0x3bdce34085]
>>> /lib64/libc.so.6[0x3bdce2ba1e]
>>> /lib64/libc.so.6(__assert_perror_fail+0x0)[0x3bdce2bae0]
>>> /home/sabotf/sources/wgs/Linux-amd64/bin//(main+0x2cf2)[0x40e742]
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x3bdce1ecdd]
>>> /home/sabotf/sources/wgs/Linux-amd64/bin//[0x40b989]
>>>
>>> Backtrace (demangled):
>>>
>>> [0] /home/sabotf/sources/wgs/Linux-amd64/bin//::AS_UTL_catchCrash(int, siginfo*, void*) + 0x27 [0x412797]
>>> [1] /lib64/libpthread.so.0() [0x3bdd20f500]
>>> [2] /lib64/libc.so.6::(null) + 0x35 [0x3bdce328a5]
>>> [3] /lib64/libc.so.6::(null) + 0x175 [0x3bdce34085]
>>> [4] /lib64/libc.so.6() [0x3bdce2ba1e]
>>> [5] /lib64/libc.so.6::(null) + 0 [0x3bdce2bae0]
>>> [6] /home/sabotf/sources/wgs/Linux-amd64/bin//::(null) + 0x2cf2 [0x40e742]
>>> [7] /lib64/libc.so.6::(null) + 0xfd [0x3bdce1ecdd]
>>> [8] /home/sabotf/sources/wgs/Linux-amd64/bin//() [0x40b989]
>>>
>>> GDB:
>>>
>>>
>>> Any idea ??
>>>
>>> Francois
>>>
>>> On 27/09/2013 13:12, Francois Sabot wrote:
>>>> The script still continues to generate correction steps...
>>>>
>>>> Here are the asm.layout.err for run 25 and the output of run24
>>>>
>>>> Francois
>>>>
>>>> PS: I changed the -t value
>>>>
>>>> On 25/09/2013 18:17, Serge Koren wrote:
>>>>> Hi,
>>>>>
>>>>> There is only one step in runCorrection.sh. If you are using CA 7.0,
>>>>> there is a known bug in the code that causes it to fail. The wiki
>>>>> documents a workaround:
>>>>> http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=PacBioToCA#Error_in_runCorrection.sh_Step
>>>>>
>>>>> Re-launching the script will restart the step from the beginning. The
>>>>> output of the step is in asm.layout.err. If you can share this, it might
>>>>> give more information on how far the step is and why it may be failing.
>>>>>
>>>>> Sergey
>>>>>
>>>>> On Sep 23, 2013, at 5:22 AM, Francois Sabot <fra...@ir...> wrote:
>>>>>
>>>>>> Hi folks
>>>>>>
>>>>>> I am running the PacBioToCa on my entire set of data (coverage 12x
>>>>>> PacBio, 55x Illumina, genome size ~ 400-450 Mb).
>>>>>>
>>>>>> I have few weird errors of missed linked, and i re-launch regularly the
>>>>>> script to ensure its completion.
>>>>>>
>>>>>> My question s how can you calculate the numbers of steps in the
>>>>>> runCorrection.sh part ? It is already on the 19th one, and it ran for
>>>>>> days....
>>>>>>
>>>>>> Can someone help me ?
>>>>>>
>>>>>> Francois
>>>>>>
>>>>>> --
>>>>>> --------------------------------------------------------
>>>>>> Francois Sabot, PhD
>>>>>>
>>>>>> Be realistic. Demand the Impossible.
>>>>>> http://bioinfo.mpl.ird.fr/
>>>>>> http://www.mpl.ird.fr/rice
>>>>>> -----------------------------------------
>>>>>> UMR DIversity, Adaptation & DEvelopment
>>>>>> Centre IRD
>>>>>> 911, Av Agropolis BP 64501
>>>>>> 34394 Montpellier Cedex 5
>>>>>> France
>>>>>> Phone: +33 4 67 41 64 18
>>>>>> -----------------------------------------
>
> <runCorrection.sh><asm.layout.err>
|
From: Walenz, B. <bw...@jc...> - 2013-10-04 18:47:50
|
Interesting question. It can be done, but will take a bit of work.

9-terminator/*.posmap.frgscf lists read positions in scaffolds. Unfortunately, the read is labeled with its CA-supplied UID. *.gkpStore.fastqUIDmap contains a map of the CA-supplied UID back to the original name. From there, it's up to you to read your input files and write the reads.

I'll look into modifying one of the assembler programs to generate this. If I can do it easily, I'll do it.

On 10/4/13 12:02 PM, "Cristell Navarro" <cri...@om...> wrote:

> Hi, I'm new using celera assembler.
>
> I was wondering if could I get a fastq file of the reads that were used
> for make the assembly...only thats reads...the effectly assembled.
>
> Thanks in advance.
>
> Cristell.
|
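The three steps described above (collect UIDs from the posmap, map them to original names, filter the input reads) can be sketched in Python roughly as follows. The file layouts are assumptions on my part and should be checked against your own files: the posmap is taken to have the read UID in its first column, and the fastqUIDmap is taken to hold whitespace-separated triples of UID, internal ID, and original read name (one triple per line for unmated reads, two for a mated pair). All function names are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Set


def placed_uids(posmap_lines: Iterable[str]) -> Set[str]:
    """Collect read UIDs from column 1 of a posmap file (assumed layout)."""
    return {line.split()[0] for line in posmap_lines if line.strip()}


def uid_to_name(uidmap_lines: Iterable[str]) -> Dict[str, str]:
    """Map CA UID -> original read name.

    Assumes each line holds one or two (UID, IID, name) triples.
    """
    mapping: Dict[str, str] = {}
    for line in uidmap_lines:
        fields = line.split()
        for i in range(0, len(fields) - 2, 3):
            mapping[fields[i]] = fields[i + 2]
    return mapping


def filter_fastq(fastq_lines: Iterable[str],
                 keep_names: Set[str]) -> Iterator[List[str]]:
    """Yield 4-line FASTQ records whose name is in keep_names."""
    record: List[str] = []
    for line in fastq_lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            parts = record[0][1:].split()  # drop '@', keep first token
            if parts and parts[0] in keep_names:
                yield record
            record = []
```

A typical use would be to build `names = {m[u] for u in placed_uids(posmap) if u in m}` with `m = uid_to_name(uidmap)`, then write out every record that `filter_fastq` yields for each input FASTQ file.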
From: Cristell N. <cri...@om...> - 2013-10-04 16:25:02
|
Hi, I'm new to using Celera Assembler.

I was wondering if I could get a FASTQ file containing only the reads that were actually used in the assembly.

Thanks in advance.

Cristell.
|
From: Walenz, B. <bw...@jc...> - 2013-10-04 14:57:50
|
Hi, again- (I guess I should have replied to this one instead) We changed the way the bogart unitigger detects and handles repeats. Which versions of the code are you comparing? It might be an obscure bug, but more likely, it's the unitig changes. We're preparing a release, but in general, using the latest svn is recommended. If anything, that will make the bugs easier to find and fix, and easier for you to restart after the fix. b On 10/3/13 4:45 AM, "Mayank Mahajan" <may...@ic...> wrote: > Hej Brian, > wow! I was stuck for three weeks with this consensus problem. Thats called > extreme bad luck. > > But have you check the GC content with the latest version. because when I > calculate it from the scf.fasta its 2% higher than what the caqc.pl shows me. > > Also, usually celera gives very stable assemblies when you do no change the > data and parameters. But the new version is giving me a bit different > assemblies. It seems that these builds are not completely reliable. Can you > guide me which build I must choose and how to download it. > > WarmRegards, > Mayank > > On 30 sep 2013, at 15:04, "Walenz, Brian" <bw...@jc...> wrote: > >> Dang, I think you might have hit a stupid bug I introduced and then fixed >> much later than I'm happy to admit. >> >> Broken in this: >> >> r4393 | brianwalenz | 2013-08-24 03:46:56 -0400 (Sat, 24 Aug 2013) | 2 lines >> Add runCA option cgwPreserveConsensus. >> >> Fixed in this: >> >> r4406 | brianwalenz | 2013-09-06 18:42:08 -0400 (Fri, 06 Sep 2013) | 3 lines >> The cgwPreserveConsensus default value was incorrect resulting in cgw always >> retaining the consensus sequence, and 8-consensus never recomputing it. >> >> Does 'svn info' show a revision between those two? >> >> >> >> On 9/30/13 8:12 AM, "Mayank Mahajan" <may...@ic...> wrote: >> >>> Hello, >>> I have been using the unstable release of WGS assembler. I have some >>> really good assemblies as per my post assembly checks. 
>>> The problem is that the {CCO tags in the asm file are all loosing the >>> information whenever they have more than one unitig in the contig. >>> Whenever there is more than one unitig in the contig the quality >>> values of the whole consensus just become zero. The contigs loose all >>> the gap information which takes into consideration the indels in the >>> reads. And respectively, all the reads also loose the indel >>> information in the contigs. >>> >>> Regards, >>> Mayank Mahajan >> > |
From: Walenz, B. <bw...@jc...> - 2013-10-04 14:54:24
|
Hi- I can't reproduce your GC content claim. Looking in the code, the GC computation will return an incorrect value if there are lowercase letters in the scaffold sequence. Can you check that your scaffold sequences have no lowercase letters? Can you elaborate on what 'quality decreased' means? Just shorter assemblies, or regions found to be incorrect when validated against a reference? What parameters are you using? What type of reads? b On 10/4/13 7:26 AM, "Mayank Mahajan" <may...@ic...> wrote: > Hello, > I just downloaded the new wgs-assembler 4 days ago. The GC > content reported by caqc.pl in the end is wrong. When I calculate GC > using the XXX.scf.fasta file I get 3% higher GC content which is > correct as verified using assemblers. Also the quality of assembly has > decreased now as compared to the release from the last month. I have > tried different coverage and parameters but there are always these > extremely sceptic regions with almost no coverage or too much coverage. > > Warm Regards, > Mayank Mahajan |
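To check the lowercase hypothesis independently of caqc.pl, something like the following sketch could be run on the scaffold FASTA. It is not caqc.pl's algorithm: it simply counts G and C case-insensitively over A/C/G/T bases (ignoring N and gap characters, which is an assumption about what the denominator should be) and reports whether any lowercase letters are present. The function names are made up for this example.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def has_lowercase(sequence):
    """True if the sequence contains any lowercase letter."""
    return any(c.islower() for c in sequence)


def gc_fraction(sequence):
    """GC fraction over A/C/G/T only, counted case-insensitively.
    ASSUMPTION: N and gap characters are excluded from the denominator."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def summarize(fasta_path):
    """Return (overall GC fraction, names of scaffolds with lowercase)."""
    with open(fasta_path) as fh:
        records = list(read_fasta(fh))
    all_bases = "".join(seq for _, seq in records)
    flagged = [name for name, seq in records if has_lowercase(seq)]
    return gc_fraction(all_bases), flagged
```

If summarize() on the scf.fasta file flags scaffolds with lowercase bases, that would be consistent with the explanation above for the difference from the caqc.pl number.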
From: Mayank M. <may...@ic...> - 2013-10-04 11:26:35
|
Hello, I downloaded the new wgs-assembler 4 days ago. The GC content that caqc.pl reports at the end is wrong. When I calculate GC from the XXX.scf.fasta file, I get a GC content 3% higher, which is correct as verified using assemblers. Also, the assembly quality has decreased compared to last month's release. I have tried different coverage and parameters, but there are always these extremely suspicious regions with almost no coverage or too much coverage. Warm Regards, Mayank Mahajan |
From: Francois S. <fra...@ir...> - 2013-10-04 06:45:03
|
Dear all, I have followed the instructions (re-installed CA, kmer and even samtools from a fresh SVN download) and just ran runCorrection.sh, but it failed again. Any idea? Do I have to re-run everything from scratch? Here is the command: qsub -A Glabcorr -cwd -V -S /bin/sh -q highmem.q -e $HOME/jobs/ -o $HOME/jobs -v LD_LIBRARY_PATH=$HOME/lib:$LD_LIBRARY_PATH,PATH=$HOME/scripts:$HOME/bin:/home/sabotf/bin:/usr/local/amos-3.1.0/bin:$PATH -pe ompi 12 -l mem_free=128g -cwd -N "pBcR_correct_asm_Glabcorr" -j y -o /dev/null /data/projects/assembling-glab//tempTog5681_corrected/runCorrection.sh I have attached the runCorrection.sh script as well as the asm.layout.err files. On 30/09/2013 23:33, Serge Koren wrote: > From your output, it looks like the script is stuck trying to run the correction step. There should be only one correction step. However, your step is failing and as a result, the script sees that correction did not run and tries to run it again, in an infinite loop. This bug should be fixed in the latest source code in the repository. > > As far as why the actual correction step is failing, it looks like the CA version that is included with smrtportal is not CA7.0 but a development version which had some bugs calculating which partition a sequence belongs to. Again, this should be fixed in the latest source code in the repository. As a workaround, you can try lowering the number of partitions (the -p parameter in runCorrection.sh). I would recommend setting it to #threads +1. You can then run the runCorrection.sh script by hand and once there is an asm.layout.success file, re-launch the full correction which will then pick up after this step. 
> > Sergey > > On Sep 30, 2013, at 4:46 AM, Francois Sabot <fra...@ir...> wrote: > >> Hi >> >> Still have a weird bug, it cannot find (again) bank-transact, even if >> AMOS is in the path (I checked) >> >> Here is the end of my asm.layout error for this step >> >> Using partition 10 to re-estimate insert sizes, total partitions so far >> 1/200 >> Using partition 58 to re-estimate insert sizes, total partitions so far >> 2/200 >> Using partition 66 to re-estimate insert sizes, total partitions so far >> 3/200 >> Using partition 83 to re-estimate insert sizes, total partitions so far >> 4/200 >> Using partition 118 to re-estimate insert sizes, total partitions so far >> 5/200 >> Using partition 122 to re-estimate insert sizes, total partitions so far >> 6/200 >> Using partition 136 to re-estimate insert sizes, total partitions so far >> 7/200 >> Using partition 163 to re-estimate insert sizes, total partitions so far >> 8/200 >> Using partition 175 to re-estimate insert sizes, total partitions so far >> 9/200 >> Using partition 200 to re-estimate insert sizes, total partitions so far >> 10/200 >> openLayFile()-- Failed to open 'asm.200.olaps' for reading: No such file >> or directory >> Couldn't open 'asm.200.olaps' for read: No such file or directory from 0-0 >> >> IlluminaTog5681 >> 400.00 +- 100.00 -> 315.38 +- 76.35 I 37774891/37919939 >> samples external happy 34167579 sad 112016534 >> N/A +- N/A -> 845.33 +- 1030.84 O 198321/218601 samples >> N/A +- N/A -> 2521.31 +- 1937.73 N 65640/69560 samples >> N/A +- N/A -> 2592.02 +- 2017.69 A 55226/58297 samples >> >> Tog5681_corrected >> Filtering mates >> openLayFile()-- Failed to open 'asm.200.olaps' for reading: No such file >> or directory >> Couldn't open 'asm.200.olaps' for write: No such file or directory from 0-0 >> CorrectPacBio.cc:445: int main(int, char**): Assertion `system(command) >> == 0' failed. 
>> >> Failed with 'Aborted' >> >> Backtrace (mangled): >> >> /home/sabotf/sources/wgs/Linux-amd64/bin//(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x412797] >> /lib64/libpthread.so.0[0x3bdd20f500] >> /lib64/libc.so.6(gsignal+0x35)[0x3bdce328a5] >> /lib64/libc.so.6(abort+0x175)[0x3bdce34085] >> /lib64/libc.so.6[0x3bdce2ba1e] >> /lib64/libc.so.6(__assert_perror_fail+0x0)[0x3bdce2bae0] >> /home/sabotf/sources/wgs/Linux-amd64/bin//(main+0x2cf2)[0x40e742] >> /lib64/libc.so.6(__libc_start_main+0xfd)[0x3bdce1ecdd] >> /home/sabotf/sources/wgs/Linux-amd64/bin//[0x40b989] >> >> Backtrace (demangled): >> >> [0] /home/sabotf/sources/wgs/Linux-amd64/bin//::AS_UTL_catchCrash(int, >> siginfo*, void*) + 0x27 [0x412797] >> [1] /lib64/libpthread.so.0() [0x3bdd20f500] >> [2] /lib64/libc.so.6::(null) + 0x35 [0x3bdce328a5] >> [3] /lib64/libc.so.6::(null) + 0x175 [0x3bdce34085] >> [4] /lib64/libc.so.6() [0x3bdce2ba1e] >> [5] /lib64/libc.so.6::(null) + 0 [0x3bdce2bae0] >> [6] /home/sabotf/sources/wgs/Linux-amd64/bin//::(null) + 0x2cf2 [0x40e742] >> [7] /lib64/libc.so.6::(null) + 0xfd [0x3bdce1ecdd] >> [8] /home/sabotf/sources/wgs/Linux-amd64/bin//() [0x40b989] >> >> GDB: >> >> >> Any idea ?? >> >> Francois >> >> On 27/09/2013 13:12, Francois Sabot wrote: >>> The script still continues to generate correction steps... >>> >>> Here are the asm.layout.err for run 25 and the output of run24 >>> >>> Francois >>> >>> PS: I changed the -t value >>> >>> On 25/09/2013 18:17, Serge Koren wrote: >>>> Hi, >>>> >>>> There is only one step in runCorrection.sh. If you are using CA 7.0, >>>> there is a known bug in the code that causes it to fail. The wiki >>>> documents a workaround: >>>> http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=PacBioToCA#Error_in_runCorrection.sh_Step >>>> >>>> Re-launching the script will restart the step from the beginning. The >>>> output of the step is in asm.layout.err. 
If you can share this, it might >>>> give more information on how far the step is and why it may be failing. >>>> >>>> Sergey >>>> >>>> On Sep 23, 2013, at 5:22 AM, Francois Sabot <fra...@ir... >>>> <mailto:fra...@ir...>> wrote: >>>> >>>>> Hi folks >>>>> >>>>> I am running the PacBioToCa on my entire set of data (coverage 12x >>>>> PacBio, 55x Illumina, genome size ~ 400-450 Mb). >>>>> >>>>> I have few weird errors of missed linked, and i re-launch regularly the >>>>> script to ensure its completion. >>>>> >>>>> My question s how can you calculate the numbers of steps in the >>>>> runCorrection.sh part ? It is already on the 19th one, and it ran for >>>>> days.... >>>>> >>>>> Can someone help me ? >>>>> >>>>> Francois >>>>> >>>>> -- >>>>> -------------------------------------------------------- >>>>> Francois Sabot, PhD >>>>> >>>>> Be realistic. Demand the Impossible. >>>>> http://bioinfo.mpl.ird.fr/ >>>>> http://www.mpl.ird.fr/rice >>>>> ----------------------------------------- >>>>> UMR DIversity, Adaptation & DEvelopment >>>>> Centre IRD >>>>> 911, Av Agropolis BP 64501 >>>>> 34394 Montpellier Cedex 5 >>>>> France >>>>> Phone: +33 4 67 41 64 18 >>>>> ----------------------------------------- >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> LIMITED TIME SALE - Full Year of Microsoft Training For Just $49.99! >>>>> 1,500+ hours of tutorials including VisualStudio 2012, Windows 8, >>>>> SharePoint >>>>> 2013, SQL 2012, MVC 4, more. BEST VALUE: New Multi-Library Power Pack >>>>> includes >>>>> Mobile, Cloud, Java, and UX Design. Lowest price ever! Ends 9/20/13. >>>>> http://pubads.g.doubleclick.net/gampad/clk?id=58041151&iu=/4140/ostg.clktrk_______________________________________________ >>>>> wgs-assembler-users mailing list >>>>> wgs...@li... 
>>>>> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users >>>> >>> >> >> -- >> -------------------------------------------------------- >> Francois Sabot, PhD >> >> Be realistic. Demand the Impossible. >> http://bioinfo.mpl.ird.fr/ >> http://www.mpl.ird.fr/rice >> ----------------------------------------- >> UMR DIversity, Adaptation & DEvelopment >> Centre IRD >> 911, Av Agropolis BP 64501 >> 34394 Montpellier Cedex 5 >> France >> Phone: +33 4 67 41 64 18 >> ----------------------------------------- >> > > > -- -------------------------------------------------------- Francois Sabot, PhD Be realistic. Demand the Impossible. http://bioinfo.mpl.ird.fr/ http://www.mpl.ird.fr/rice ----------------------------------------- UMR DIversity, Adaptation & DEvelopment Centre IRD 911, Av Agropolis BP 64501 34394 Montpellier Cedex 5 France Phone: +33 4 67 41 64 18 ----------------------------------------- |
From: Mayank M. <may...@ic...> - 2013-10-03 08:45:16
|
Hej Brian, wow! I was stuck for three weeks with this consensus problem. That's what I call extremely bad luck. But have you checked the GC content with the latest version? When I calculate it from the scf.fasta, it's 2% higher than what caqc.pl shows me. Also, Celera usually gives very stable assemblies when you do not change the data and parameters, but the new version is giving me slightly different assemblies. It seems that these builds are not completely reliable. Can you tell me which build I should choose and how to download it? Warm regards, Mayank On 30 sep 2013, at 15:04, "Walenz, Brian" <bw...@jc...> wrote: > Dang, I think you might have hit a stupid bug I introduced and then fixed > much later than I'm happy to admit. > > Broken in this: > > r4393 | brianwalenz | 2013-08-24 03:46:56 -0400 (Sat, 24 Aug 2013) | 2 lines > Add runCA option cgwPreserveConsensus. > > Fixed in this: > > r4406 | brianwalenz | 2013-09-06 18:42:08 -0400 (Fri, 06 Sep 2013) | 3 lines > The cgwPreserveConsensus default value was incorrect resulting in cgw always > retaining the consensus sequence, and 8-consensus never recomputing it. > > Does 'svn info' show a revision between those two? > > > > On 9/30/13 8:12 AM, "Mayank Mahajan" <may...@ic...> wrote: > >> Hello, >> I have been using the unstable release of WGS assembler. I have some >> really good assemblies as per my post assembly checks. >> The problem is that the {CCO tags in the asm file are all loosing the >> information whenever they have more than one unitig in the contig. >> Whenever there is more than one unitig in the contig the quality >> values of the whole consensus just become zero. The contigs loose all >> the gap information which takes into consideration the indels in the >> reads. And respectively, all the reads also loose the indel >> information in the contigs. >> >> Regards, >> Mayank Mahajan > |
From: Serge K. <se...@um...> - 2013-09-30 21:52:54
|
From your output, it looks like the script is stuck trying to run the correction step. There should be only one correction step. However, your step is failing and as a result, the script sees that correction did not run and tries to run it again, in an infinite loop. This bug should be fixed in the latest source code in the repository. As far as why the actual correction step is failing, it looks like the CA version that is included with smrtportal is not CA7.0 but a development version which had some bugs calculating which partition a sequence belongs to. Again, this should be fixed in the latest source code in the repository. As a workaround, you can try lowering the number of partitions (the -p parameter in runCorrection.sh). I would recommend setting it to #threads +1. You can then run the runCorrection.sh script by hand and once there is an asm.layout.success file, re-launch the full correction which will then pick up after this step. Sergey On Sep 30, 2013, at 4:46 AM, Francois Sabot <fra...@ir...> wrote: > Hi > > Still have a weird bug, it cannot find (again) bank-transact, even if > AMOS is in the path (I checked) > > Here is the end of my asm.layout error for this step > > Using partition 10 to re-estimate insert sizes, total partitions so far > 1/200 > Using partition 58 to re-estimate insert sizes, total partitions so far > 2/200 > Using partition 66 to re-estimate insert sizes, total partitions so far > 3/200 > Using partition 83 to re-estimate insert sizes, total partitions so far > 4/200 > Using partition 118 to re-estimate insert sizes, total partitions so far > 5/200 > Using partition 122 to re-estimate insert sizes, total partitions so far > 6/200 > Using partition 136 to re-estimate insert sizes, total partitions so far > 7/200 > Using partition 163 to re-estimate insert sizes, total partitions so far > 8/200 > Using partition 175 to re-estimate insert sizes, total partitions so far > 9/200 > Using partition 200 to re-estimate insert sizes, total 
partitions so far > 10/200 > openLayFile()-- Failed to open 'asm.200.olaps' for reading: No such file > or directory > Couldn't open 'asm.200.olaps' for read: No such file or directory from 0-0 > > IlluminaTog5681 > 400.00 +- 100.00 -> 315.38 +- 76.35 I 37774891/37919939 > samples external happy 34167579 sad 112016534 > N/A +- N/A -> 845.33 +- 1030.84 O 198321/218601 samples > N/A +- N/A -> 2521.31 +- 1937.73 N 65640/69560 samples > N/A +- N/A -> 2592.02 +- 2017.69 A 55226/58297 samples > > Tog5681_corrected > Filtering mates > openLayFile()-- Failed to open 'asm.200.olaps' for reading: No such file > or directory > Couldn't open 'asm.200.olaps' for write: No such file or directory from 0-0 > CorrectPacBio.cc:445: int main(int, char**): Assertion `system(command) > == 0' failed. > > Failed with 'Aborted' > > Backtrace (mangled): > > /home/sabotf/sources/wgs/Linux-amd64/bin//(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x412797] > /lib64/libpthread.so.0[0x3bdd20f500] > /lib64/libc.so.6(gsignal+0x35)[0x3bdce328a5] > /lib64/libc.so.6(abort+0x175)[0x3bdce34085] > /lib64/libc.so.6[0x3bdce2ba1e] > /lib64/libc.so.6(__assert_perror_fail+0x0)[0x3bdce2bae0] > /home/sabotf/sources/wgs/Linux-amd64/bin//(main+0x2cf2)[0x40e742] > /lib64/libc.so.6(__libc_start_main+0xfd)[0x3bdce1ecdd] > /home/sabotf/sources/wgs/Linux-amd64/bin//[0x40b989] > > Backtrace (demangled): > > [0] /home/sabotf/sources/wgs/Linux-amd64/bin//::AS_UTL_catchCrash(int, > siginfo*, void*) + 0x27 [0x412797] > [1] /lib64/libpthread.so.0() [0x3bdd20f500] > [2] /lib64/libc.so.6::(null) + 0x35 [0x3bdce328a5] > [3] /lib64/libc.so.6::(null) + 0x175 [0x3bdce34085] > [4] /lib64/libc.so.6() [0x3bdce2ba1e] > [5] /lib64/libc.so.6::(null) + 0 [0x3bdce2bae0] > [6] /home/sabotf/sources/wgs/Linux-amd64/bin//::(null) + 0x2cf2 [0x40e742] > [7] /lib64/libc.so.6::(null) + 0xfd [0x3bdce1ecdd] > [8] /home/sabotf/sources/wgs/Linux-amd64/bin//() [0x40b989] > > GDB: > > > Any idea ?? 
> > Francois > > On 27/09/2013 13:12, Francois Sabot wrote: >> The script still continues to generate correction steps... >> >> Here are the asm.layout.err for run 25 and the output of run24 >> >> Francois >> >> PS: I changed the -t value >> >> On 25/09/2013 18:17, Serge Koren wrote: >>> Hi, >>> >>> There is only one step in runCorrection.sh. If you are using CA 7.0, >>> there is a known bug in the code that causes it to fail. The wiki >>> documents a workaround: >>> http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=PacBioToCA#Error_in_runCorrection.sh_Step >>> >>> Re-launching the script will restart the step from the beginning. The >>> output of the step is in asm.layout.err. If you can share this, it might >>> give more information on how far the step is and why it may be failing. >>> >>> Sergey >>> >>> On Sep 23, 2013, at 5:22 AM, Francois Sabot <fra...@ir... >>> <mailto:fra...@ir...>> wrote: >>> >>>> Hi folks >>>> >>>> I am running the PacBioToCa on my entire set of data (coverage 12x >>>> PacBio, 55x Illumina, genome size ~ 400-450 Mb). >>>> >>>> I have few weird errors of missed linked, and i re-launch regularly the >>>> script to ensure its completion. >>>> >>>> My question s how can you calculate the numbers of steps in the >>>> runCorrection.sh part ? It is already on the 19th one, and it ran for >>>> days.... >>>> >>>> Can someone help me ? >>>> >>>> Francois >>>> >>>> -- >>>> -------------------------------------------------------- >>>> Francois Sabot, PhD >>>> >>>> Be realistic. Demand the Impossible. 
>>>> http://bioinfo.mpl.ird.fr/ >>>> http://www.mpl.ird.fr/rice >>>> ----------------------------------------- >>>> UMR DIversity, Adaptation & DEvelopment >>>> Centre IRD >>>> 911, Av Agropolis BP 64501 >>>> 34394 Montpellier Cedex 5 >>>> France >>>> Phone: +33 4 67 41 64 18 >>>> ----------------------------------------- >>>> >>>> ------------------------------------------------------------------------------ >>>> LIMITED TIME SALE - Full Year of Microsoft Training For Just $49.99! >>>> 1,500+ hours of tutorials including VisualStudio 2012, Windows 8, >>>> SharePoint >>>> 2013, SQL 2012, MVC 4, more. BEST VALUE: New Multi-Library Power Pack >>>> includes >>>> Mobile, Cloud, Java, and UX Design. Lowest price ever! Ends 9/20/13. >>>> http://pubads.g.doubleclick.net/gampad/clk?id=58041151&iu=/4140/ostg.clktrk_______________________________________________ >>>> wgs-assembler-users mailing list >>>> wgs...@li... >>>> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users >>> >> > > -- > -------------------------------------------------------- > Francois Sabot, PhD > > Be realistic. Demand the Impossible. > http://bioinfo.mpl.ird.fr/ > http://www.mpl.ird.fr/rice > ----------------------------------------- > UMR DIversity, Adaptation & DEvelopment > Centre IRD > 911, Av Agropolis BP 64501 > 34394 Montpellier Cedex 5 > France > Phone: +33 4 67 41 64 18 > ----------------------------------------- > |
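Sergey's workaround in this message (set the -p partition count to the number of threads plus one, run runCorrection.sh by hand, then wait for asm.layout.success before re-launching) can be written down as a small helper. This is only a sketch: the function names are invented here, and the assumption that asm.layout.success appears in the same temporary directory as runCorrection.sh is mine, not something stated in the thread.

```python
import os


def suggested_partitions(threads):
    """Partition count suggested in the reply above: threads + 1."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return threads + 1


def correction_finished(work_dir):
    """True once the marker file asm.layout.success exists.
    ASSUMPTION: the marker is written into work_dir, the temporary
    directory that also holds runCorrection.sh."""
    return os.path.isfile(os.path.join(work_dir, "asm.layout.success"))
```

For the 12-slot job shown earlier in the thread, suggested_partitions(12) gives 13, far fewer than the 200 partitions visible in the log. Only when correction_finished() is true would the full correction be re-launched.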
From: Walenz, B. <bw...@jc...> - 2013-09-30 13:05:45
|
Dang, I think you might have hit a stupid bug I introduced and then fixed much later than I'm happy to admit. Broken in this: r4393 | brianwalenz | 2013-08-24 03:46:56 -0400 (Sat, 24 Aug 2013) | 2 lines Add runCA option cgwPreserveConsensus. Fixed in this: r4406 | brianwalenz | 2013-09-06 18:42:08 -0400 (Fri, 06 Sep 2013) | 3 lines The cgwPreserveConsensus default value was incorrect resulting in cgw always retaining the consensus sequence, and 8-consensus never recomputing it. Does 'svn info' show a revision between those two? On 9/30/13 8:12 AM, "Mayank Mahajan" <may...@ic...> wrote: > Hello, > I have been using the unstable release of WGS assembler. I have some > really good assemblies as per my post assembly checks. > The problem is that the {CCO tags in the asm file are all loosing the > information whenever they have more than one unitig in the contig. > Whenever there is more than one unitig in the contig the quality > values of the whole consensus just become zero. The contigs loose all > the gap information which takes into consideration the indels in the > reads. And respectively, all the reads also loose the indel > information in the contigs. > > Regards, > Mayank Mahajan |
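The revision question above can be answered mechanically. The sketch below parses the "Revision:" line from the text printed by 'svn info' (assuming an English-locale client) and tests whether it falls in the window from r4393, where the bug was introduced, up to but not including r4406, where it was fixed. The helper names are illustrative and not part of the assembler.

```python
import re

BROKEN_IN = 4393  # revision that introduced the cgwPreserveConsensus bug
FIXED_IN = 4406   # revision that corrected the default value


def parse_revision(svn_info_text):
    """Return the working-copy revision from 'svn info' output, or None."""
    match = re.search(r"^Revision:\s*(\d+)\s*$", svn_info_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def affected(revision):
    """True if the revision contains the bug but not yet the fix."""
    return BROKEN_IN <= revision < FIXED_IN
```

In practice the text could come from running 'svn info' in the source checkout and passing its standard output to parse_revision(). Any revision for which affected() is true would need an update and a re-run of the consensus step.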
From: Mayank M. <may...@ic...> - 2013-09-30 12:31:50
|
Hello, I have been using the unstable release of WGS assembler. I have some really good assemblies as per my post-assembly checks. The problem is that the {CCO tags in the asm file are all losing their information whenever they have more than one unitig in the contig. Whenever there is more than one unitig in the contig, the quality values of the whole consensus just become zero. The contigs lose all the gap information, which takes into consideration the indels in the reads. Correspondingly, all the reads also lose the indel information in the contigs. Regards, Mayank Mahajan |
From: Francois S. <fra...@ir...> - 2013-09-30 08:48:25
|
Hi Still have a weird bug, it cannot find (again) bank-transact, even if AMOS is in the path (I checked) Here is the end of my asm.layout error for this step Using partition 10 to re-estimate insert sizes, total partitions so far 1/200 Using partition 58 to re-estimate insert sizes, total partitions so far 2/200 Using partition 66 to re-estimate insert sizes, total partitions so far 3/200 Using partition 83 to re-estimate insert sizes, total partitions so far 4/200 Using partition 118 to re-estimate insert sizes, total partitions so far 5/200 Using partition 122 to re-estimate insert sizes, total partitions so far 6/200 Using partition 136 to re-estimate insert sizes, total partitions so far 7/200 Using partition 163 to re-estimate insert sizes, total partitions so far 8/200 Using partition 175 to re-estimate insert sizes, total partitions so far 9/200 Using partition 200 to re-estimate insert sizes, total partitions so far 10/200 openLayFile()-- Failed to open 'asm.200.olaps' for reading: No such file or directory Couldn't open 'asm.200.olaps' for read: No such file or directory from 0-0 IlluminaTog5681 400.00 +- 100.00 -> 315.38 +- 76.35 I 37774891/37919939 samples external happy 34167579 sad 112016534 N/A +- N/A -> 845.33 +- 1030.84 O 198321/218601 samples N/A +- N/A -> 2521.31 +- 1937.73 N 65640/69560 samples N/A +- N/A -> 2592.02 +- 2017.69 A 55226/58297 samples Tog5681_corrected Filtering mates openLayFile()-- Failed to open 'asm.200.olaps' for reading: No such file or directory Couldn't open 'asm.200.olaps' for write: No such file or directory from 0-0 CorrectPacBio.cc:445: int main(int, char**): Assertion `system(command) == 0' failed. 
Failed with 'Aborted' Backtrace (mangled): /home/sabotf/sources/wgs/Linux-amd64/bin//(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x412797] /lib64/libpthread.so.0[0x3bdd20f500] /lib64/libc.so.6(gsignal+0x35)[0x3bdce328a5] /lib64/libc.so.6(abort+0x175)[0x3bdce34085] /lib64/libc.so.6[0x3bdce2ba1e] /lib64/libc.so.6(__assert_perror_fail+0x0)[0x3bdce2bae0] /home/sabotf/sources/wgs/Linux-amd64/bin//(main+0x2cf2)[0x40e742] /lib64/libc.so.6(__libc_start_main+0xfd)[0x3bdce1ecdd] /home/sabotf/sources/wgs/Linux-amd64/bin//[0x40b989] Backtrace (demangled): [0] /home/sabotf/sources/wgs/Linux-amd64/bin//::AS_UTL_catchCrash(int, siginfo*, void*) + 0x27 [0x412797] [1] /lib64/libpthread.so.0() [0x3bdd20f500] [2] /lib64/libc.so.6::(null) + 0x35 [0x3bdce328a5] [3] /lib64/libc.so.6::(null) + 0x175 [0x3bdce34085] [4] /lib64/libc.so.6() [0x3bdce2ba1e] [5] /lib64/libc.so.6::(null) + 0 [0x3bdce2bae0] [6] /home/sabotf/sources/wgs/Linux-amd64/bin//::(null) + 0x2cf2 [0x40e742] [7] /lib64/libc.so.6::(null) + 0xfd [0x3bdce1ecdd] [8] /home/sabotf/sources/wgs/Linux-amd64/bin//() [0x40b989] GDB: Any idea ?? Francois On 27/09/2013 13:12, Francois Sabot wrote: > The script still continues to generate correction steps... > > Here are the asm.layout.err for run 25 and the output of run24 > > Francois > > PS: I changed the -t value > > On 25/09/2013 18:17, Serge Koren wrote: >> Hi, >> >> There is only one step in runCorrection.sh. If you are using CA 7.0, >> there is a known bug in the code that causes it to fail. The wiki >> documents a workaround: >> http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=PacBioToCA#Error_in_runCorrection.sh_Step >> >> Re-launching the script will restart the step from the beginning. The >> output of the step is in asm.layout.err. If you can share this, it might >> give more information on how far the step is and why it may be failing. >> >> Sergey >> >> On Sep 23, 2013, at 5:22 AM, Francois Sabot <fra...@ir... 
>> <mailto:fra...@ir...>> wrote: >> >>> Hi folks >>> >>> I am running the PacBioToCa on my entire set of data (coverage 12x >>> PacBio, 55x Illumina, genome size ~ 400-450 Mb). >>> >>> I have few weird errors of missed linked, and i re-launch regularly the >>> script to ensure its completion. >>> >>> My question s how can you calculate the numbers of steps in the >>> runCorrection.sh part ? It is already on the 19th one, and it ran for >>> days.... >>> >>> Can someone help me ? >>> >>> Francois >>> >>> -- >>> -------------------------------------------------------- >>> Francois Sabot, PhD >>> >>> Be realistic. Demand the Impossible. >>> http://bioinfo.mpl.ird.fr/ >>> http://www.mpl.ird.fr/rice >>> ----------------------------------------- >>> UMR DIversity, Adaptation & DEvelopment >>> Centre IRD >>> 911, Av Agropolis BP 64501 >>> 34394 Montpellier Cedex 5 >>> France >>> Phone: +33 4 67 41 64 18 >>> ----------------------------------------- >>> >>> ------------------------------------------------------------------------------ >>> LIMITED TIME SALE - Full Year of Microsoft Training For Just $49.99! >>> 1,500+ hours of tutorials including VisualStudio 2012, Windows 8, >>> SharePoint >>> 2013, SQL 2012, MVC 4, more. BEST VALUE: New Multi-Library Power Pack >>> includes >>> Mobile, Cloud, Java, and UX Design. Lowest price ever! Ends 9/20/13. >>> http://pubads.g.doubleclick.net/gampad/clk?id=58041151&iu=/4140/ostg.clktrk_______________________________________________ >>> wgs-assembler-users mailing list >>> wgs...@li... >>> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users >> > -- -------------------------------------------------------- Francois Sabot, PhD Be realistic. Demand the Impossible. 
http://bioinfo.mpl.ird.fr/ http://www.mpl.ird.fr/rice ----------------------------------------- UMR DIversity, Adaptation & DEvelopment Centre IRD 911, Av Agropolis BP 64501 34394 Montpellier Cedex 5 France Phone: +33 4 67 41 64 18 ----------------------------------------- |
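One way to double-check the "cannot find bank-transact" symptom is to resolve the tool against the PATH that the job itself sees, since a batch job's environment can differ from the login shell's. The sketch below uses Python's standard shutil.which. The helper name locate_tools is hypothetical, and the idea that a PATH difference is the cause here is only a guess, not something established in this thread.

```python
import os
import shutil


def locate_tools(tools, path=None):
    """Return {tool name: resolved executable path or None}.
    With path=None the current process's PATH is searched."""
    return {tool: shutil.which(tool, path=path) for tool in tools}


def describe_environment(tools):
    """Build a short text report to print from inside the job script."""
    lines = ["PATH=" + os.environ.get("PATH", "")]
    for tool, location in locate_tools(tools).items():
        lines.append("%s -> %s" % (tool, location or "NOT FOUND"))
    return "\n".join(lines)
```

Printing describe_environment(["bank-transact"]) from within the submitted job, rather than from an interactive shell, shows whether the AMOS bin directory is actually visible where the correction runs.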
From: Francois S. <fra...@ir...> - 2013-09-27 11:14:49
|
The script still continues to generate correction steps... Here are the asm.layout.err for run 25 and the output of run24 Francois PS: I changed the -t value On 25/09/2013 18:17, Serge Koren wrote: > Hi, > > There is only one step in runCorrection.sh. If you are using CA 7.0, > there is a known bug in the code that causes it to fail. The wiki > documents a workaround: > http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=PacBioToCA#Error_in_runCorrection.sh_Step > > Re-launching the script will restart the step from the beginning. The > output of the step is in asm.layout.err. If you can share this, it might > give more information on how far the step is and why it may be failing. > > Sergey > > On Sep 23, 2013, at 5:22 AM, Francois Sabot <fra...@ir... > <mailto:fra...@ir...>> wrote: > >> Hi folks >> >> I am running the PacBioToCa on my entire set of data (coverage 12x >> PacBio, 55x Illumina, genome size ~ 400-450 Mb). >> >> I have few weird errors of missed linked, and i re-launch regularly the >> script to ensure its completion. >> >> My question s how can you calculate the numbers of steps in the >> runCorrection.sh part ? It is already on the 19th one, and it ran for >> days.... >> >> Can someone help me ? >> >> Francois >> >> -- >> -------------------------------------------------------- >> Francois Sabot, PhD >> >> Be realistic. Demand the Impossible. >> http://bioinfo.mpl.ird.fr/ >> http://www.mpl.ird.fr/rice >> ----------------------------------------- >> UMR DIversity, Adaptation & DEvelopment >> Centre IRD >> 911, Av Agropolis BP 64501 >> 34394 Montpellier Cedex 5 >> France >> Phone: +33 4 67 41 64 18 >> ----------------------------------------- >> >> ------------------------------------------------------------------------------ >> LIMITED TIME SALE - Full Year of Microsoft Training For Just $49.99! >> 1,500+ hours of tutorials including VisualStudio 2012, Windows 8, >> SharePoint >> 2013, SQL 2012, MVC 4, more. 
BEST VALUE: New Multi-Library Power Pack >> includes >> Mobile, Cloud, Java, and UX Design. Lowest price ever! Ends 9/20/13. >> http://pubads.g.doubleclick.net/gampad/clk?id=58041151&iu=/4140/ostg.clktrk_______________________________________________ >> wgs-assembler-users mailing list >> wgs...@li... >> https://lists.sourceforge.net/lists/listinfo/wgs-assembler-users > -- -------------------------------------------------------- Francois Sabot, PhD Be realistic. Demand the Impossible. http://bioinfo.mpl.ird.fr/ http://www.mpl.ird.fr/rice ----------------------------------------- UMR DIversity, Adaptation & DEvelopment Centre IRD 911, Av Agropolis BP 64501 34394 Montpellier Cedex 5 France Phone: +33 4 67 41 64 18 ----------------------------------------- |
From: Serge K. <se...@um...> - 2013-09-25 16:17:26
|
Hi, There is only one step in runCorrection.sh. If you are using CA 7.0, there is a known bug in the code that causes it to fail. The wiki documents a workaround: http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=PacBioToCA#Error_in_runCorrection.sh_Step Re-launching the script will restart the step from the beginning. The output of the step is in asm.layout.err. If you can share this, it might give more information on how far the step is and why it may be failing. Sergey On Sep 23, 2013, at 5:22 AM, Francois Sabot <fra...@ir...> wrote: > Hi folks > > I am running the PacBioToCa on my entire set of data (coverage 12x > PacBio, 55x Illumina, genome size ~ 400-450 Mb). > > I have few weird errors of missed linked, and i re-launch regularly the > script to ensure its completion. > > My question s how can you calculate the numbers of steps in the > runCorrection.sh part ? It is already on the 19th one, and it ran for > days.... > > Can someone help me ? > > Francois > > -- > -------------------------------------------------------- > Francois Sabot, PhD > > Be realistic. Demand the Impossible. > http://bioinfo.mpl.ird.fr/ > http://www.mpl.ird.fr/rice > ----------------------------------------- > UMR DIversity, Adaptation & DEvelopment > Centre IRD > 911, Av Agropolis BP 64501 > 34394 Montpellier Cedex 5 > France > Phone: +33 4 67 41 64 18 > ----------------------------------------- |
From: Francois S. <fra...@ir...> - 2013-09-23 09:42:24
|
Hi folks

I am running PacBioToCA on my entire data set (coverage 12x PacBio, 55x Illumina, genome size ~400-450 Mb).

I get a few weird errors about missed links, and I regularly re-launch the script to make sure it completes.

My question is: how can you calculate the number of steps in the runCorrection.sh part? It is already on the 19th one, and it has been running for days...

Can someone help me?

Francois

--
Francois Sabot, PhD
Be realistic. Demand the Impossible.
http://bioinfo.mpl.ird.fr/
http://www.mpl.ird.fr/rice
UMR DIversity, Adaptation & DEvelopment
Centre IRD
911, Av Agropolis BP 64501
34394 Montpellier Cedex 5
France
Phone: +33 4 67 41 64 18 |
From: Walenz, B. <bw...@jc...> - 2013-09-23 05:07:21
|
Hi, Ole-

You are correct in your interpretation of the size-20 gaps. Half seems kind of high, but I can’t say I’ve ever counted.

I had something that almost did what you’re after. I made a few mods (svn update!) to make it even closer. It will now output a list of the mates that are in the same scaffold but in different contigs (unitigs). The output file OUTPUT.mate.diff.ctgscf will contain these mates. There are three other similar outputs: one for mates in the same contig, and two for the same analysis on unitigs.

9-terminator% analyzePosMap -p ASM -o OUTPUT -g ../ASM.gkpStore -A libraryfate

For a pair of overlapping contigs, a simple grep will get all the mates between them. CAUTION! Mates can span multiple gaps!

I have no documentation for this. Feel free (and encouraged) to write some for me, even if it’s just an outline. Super easy to add this to runCA if useful... and documented (hint, hint).

b

On 9/22/13 7:44 AM, "Ole Kristian Tørresen" <o.k...@ib...> wrote:

Hi,

I've been thinking a bit. In some of my assemblies, half the gaps are of size 20. As I understand it, this means that the contigs on either side of the gap were supposed to overlap, but had too many differences to be merged (10%). Have I understood this correctly? Is it expected that half the gaps are of size 20?

I'm wondering whether these failed overlaps are caused by heterozygosity, for example because the wrong haplotype is tested for overlap. Could this be the case? I'm not sure how to test this, though. One possibility is that the haplotypes differ in length. If I can find the insert sizes of all the pairs that span a gap of 20 bases, I might see that they fall into two distinct groups. If the length difference is large, this should be pretty clear.

Is there a way to get the insert sizes of all the pairs that map across these gaps?

Thank you.

Ole |
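The "simple grep" for a pair of contigs could look roughly like the sketch below. The layout of OUTPUT.mate.diff.ctgscf is not documented, so this assumes only that each line mentions the contig names as whitespace-separated tokens. The function name `mates_between` and the contig names in the usage note are hypothetical.

```python
def mates_between(lines, contig_a, contig_b):
    """Return the lines that mention both contig names as whole tokens.

    Assumption: contig names appear as whitespace-separated tokens;
    the real column layout of the analyzePosMap output is not documented.
    """
    hits = []
    for line in lines:
        tokens = set(line.split())
        # Whole-token matching avoids a name matching a longer name's prefix.
        if contig_a in tokens and contig_b in tokens:
            hits.append(line.rstrip("\n"))
    return hits
```

It could be called as `mates_between(open("OUTPUT.mate.diff.ctgscf"), "ctgA", "ctgB")`. Because mates can span multiple gaps, a hit does not by itself mean the two contigs are adjacent.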
From: Ole K. T. <o.k...@ib...> - 2013-09-22 12:19:24
|
Hi,

I've been thinking a bit. In some of my assemblies, half the gaps are of size 20. As I understand it, this means that the contigs on either side of the gap were supposed to overlap, but had too many differences to be merged (10%). Have I understood this correctly? Is it expected that half the gaps are of size 20?

I'm wondering whether these failed overlaps are caused by heterozygosity, for example because the wrong haplotype is tested for overlap. Could this be the case? I'm not sure how to test this, though. One possibility is that the haplotypes differ in length. If I can find the insert sizes of all the pairs that span a gap of 20 bases, I might see that they fall into two distinct groups. If the length difference is large, this should be pretty clear.

Is there a way to get the insert sizes of all the pairs that map across these gaps?

Thank you.

Ole |
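One way to look for the two groups of insert sizes described in this message is a one-dimensional two-means split, sketched below. This is only an illustration under stated assumptions: the function name `split_two_groups` is hypothetical, the insert sizes are assumed to have been collected already for the mates spanning a given gap, and the numbers in the usage note are made up.

```python
def split_two_groups(sizes, iterations=50):
    """Split insert sizes into a low and a high group (1-D two-means)."""
    values = sorted(sizes)
    if len(values) < 2 or values[0] == values[-1]:
        return values, []  # nothing to split
    # Start with the extremes as the two group centers.
    low_center, high_center = float(values[0]), float(values[-1])
    for _ in range(iterations):
        midpoint = (low_center + high_center) / 2.0
        low = [v for v in values if v <= midpoint]
        high = [v for v in values if v > midpoint]
        new_low = sum(low) / len(low)
        new_high = sum(high) / len(high)
        if (new_low, new_high) == (low_center, high_center):
            break  # centers stopped moving
        low_center, high_center = new_low, new_high
    return low, high
```

For example, `split_two_groups([2900, 3000, 3100, 3050, 2950, 4900, 5000, 5100])` separates the five sizes near 3000 from the three near 5000. Note that the method always produces two groups, so the split is only suggestive when the distance between the group means is large compared with the spread of the library's insert sizes.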
From: Serge K. <se...@um...> - 2013-08-31 13:02:03
|
Yes, Jared is correct. You need to use gatekeeper to create the frg file. CA renames the sequences from a fastq file, which is why you are getting the error about undefined fragments. Dumping the frg file with gatekeeper will give you the correct names. cavalidate will run toAmos and the other scripts to create your bank. toAmos_new is recommended for larger projects as it is significantly faster than toAmos.

Sergey

On Aug 28, 2013, at 10:28 PM, "Decker, Jared Egan (MU-Student)" <je...@ma...> wrote:
> Diego,
> First I would run this:
> gatekeeper -dumpfrg PROJECT.gkpStore | grep -v 'No source' > PROJECT.frg
> In your case, it looks like SE-MT8 is your PROJECT prefix.
> Then run
> cavalidate PROJECT
> with your PROJECT prefix. This will create a bank that you can analyze with FRCurve and Hawkeye.
>
> If I am off base, hopefully one of the other list serve members can straighten me out. :-)
>
> Thanks,
> Jared
>
> Jared Decker
> Assistant Professor, Beef Genetics Extension and Computational Genomics
> Division of Animal Sciences
> University of Missouri
> S132B ASRC
> 920 East Campus Dr.
> Columbia, MO 65211
> Phone 573-882-2504
> http://www.linkedin.com/in/jarededecker
>
> From: diego [mailto:die...@gm...]
> Sent: Wednesday, August 28, 2013 5:30 PM
> To: wgs...@li...
> Subject: [wgs-assembler-users] problem with celera 7.0 and amos 3.1.0
>
> Hi
>
> I'm trying to visualize my Celera assembly with Hawkeye, but I get an error when I use the toAmos script to parse my .asm file.
>
> I tried two scripts, "toAmos" and "toAmos_new", but I get similar errors.
>
> Error with toAmos:
> $ toAmos -f ../C28.frg -a SE-MT8.asm -o - | bank-transact -m - -b example.bnk -c
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133168.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133175.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133182.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133189.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133196.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133203.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133210.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133217.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133224.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133231.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133238.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133245.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133252.
> Use of uninitialized value $iid in hash element at /projects3/ddiaz/amos/bin/toAmos line 1274, <IN> line 133259.
> ..
>
> I checked line 1274 of the toAmos script and it contains the statement "$seq_range{$iid} = $clrstr;".
> The error arises from the previous line, "my $iid = $seqids{$acc};", where $seqids{$acc} is undefined.
> I noticed that "$seqids{$acc}" is filled in by the sub "parseFrgFile", in the following code:
>
>     if ($type eq "FRG") {
>         my $id = getCAId($$fields{acc});
>         my $iid = $minSeqId++;
>         my $nm = $$fields{src};
>         my @lines = split('\n', $nm);
>         $nm = $lines[0]; # join('', @lines);
>         if ($byaccession || !defined $nm || $nm =~ /^\s*$/) {
>             $seqnames{$iid} = $id;
>         } else {
>             $seqnames{$iid} = $nm;
>             $seqids{$nm} = $iid;
>         }
>         $seqids{$id} = $iid;
>
> but there isn't any FRG string in the .asm file.
>
> Error with toAmos_new:
> $ toAmos_new ../data/trimmomatic_outputs/Vpkx_unmated.frg -a SE-MP-3.4.asm -b Vpkx.bank
> Error fragments 110000762732 are not defined
> Error fragments 110000596019 are not defined
> Error fragments 120000810529 are not defined
> Error fragments 200001924304 are not defined
> Error fragments 200001469500 are not defined
> Error fragments 200001648709 are not defined
> Error fragments 110000674424 are not defined
> Error fragments 110001085229 are not defined
> Error fragments 200001936657 are not defined
> Error fragments 110001088561 are not defined
> Error fragments 120001030615 are not defined
> Error fragments 120000286346 are not defined
> ....
>
> I believe this is similar to the previous error, caused by the absence of the FRG string in the .asm output.
> Despite this, I can get the FASTA file with the contigs and scaffolds of my assembly.
>
> First, I used the sff_extract script to get fastq files; then I converted these files to Celera inputs with fastqToCA; and finally I ran Celera.
>
> Could someone help me with this, please?
>
> I need to convert my Celera outputs to an AMOS bank to analyze them with Hawkeye.
>
> Thanks in advance!
>
> PS: sorry for my English.
> Diego Díaz. |
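To check whether a given .frg file actually defines the fragments that toAmos_new complains about, something like the following sketch could help. It assumes the .frg file stores each fragment as a `{FRG` message with its identifier on a following `acc:` line, as the quoted toAmos parser suggests, and it relies on the error wording shown in the thread. The function names are hypothetical.

```python
import re


def frg_accessions(frg_text):
    """Collect the acc: value of every {FRG message in a .frg file.

    Assumption: each fragment starts with a line beginning '{FRG' and
    its identifier is on a later line beginning 'acc:'.
    """
    accessions = set()
    in_frg = False
    for line in frg_text.splitlines():
        line = line.strip()
        if line.startswith("{FRG"):
            in_frg = True
        elif in_frg and line.startswith("acc:"):
            accessions.add(line[len("acc:"):].strip())
            in_frg = False
    return accessions


def undefined_fragments(error_log, frg_text):
    """IDs reported by toAmos_new that have no FRG record in the .frg text."""
    reported = re.findall(r"Error fragments (\S+) are not defined", error_log)
    known = frg_accessions(frg_text)
    return [frag_id for frag_id in reported if frag_id not in known]
```

If every reported ID comes back as undefined for the original fastq-derived .frg but not for the one dumped with `gatekeeper -dumpfrg`, that would be consistent with the renaming described above. The sketch reads the whole file into memory, so a large .frg may need a streaming variant.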