Hi there,
I'm using AMOS version 2.0.8, and when I run 'amosvalidate' it fails at step 710 with this repeated error message:
AFG ERROR: Invalid source format
could not parse 'FEA' message with iid:NULL, message ignored
Here’s the related section from my *.log file:
…
!!! 2008-09-17 19:42:24 Doing step 700: Analyzing singleton alignment breakpoints
!!! 2008-09-17 19:42:24 Running: /machine/home/ngenuser /apps/amos/bin/casm-breaks -b run2-lane6-paired.bnk -t 100 -c 2 -F run2-lane6-paired.break.fea run2-lane6-paired.delta
Building alignment graph
build
flag score
flag QLIS
clean
Fetching read and mate information
getting mate info from bank
getting read info from bank
looks like alignment uses IIDs
Placing unique reads
Placing repetitive reads
Outputting read paths
!!! 2008-09-17 19:46:36 Done! Elapsed time:0d 0h 4m 12s
!!! 2008-09-17 19:46:36 Doing step 710
!!! 2008-09-17 19:46:36 Running: /machine/home/ngenuser /apps/amos/bin/bank-transact -b run2-lane6-paired.bnk -m run2-lane6-paired.break.fea
START DATE: Wed Sep 17 19:46:36 2008
Bank is: run2-lane6-paired.bnk
0% 100%
AFG ERROR: Invalid source format
could not parse 'FEA' message with iid:NULL, message ignored
ERROR:Invalid source format
could not parse 'FEA' message with iid:NULL, message ignored
…
(ERROR MESSAGE IS REPEATED A TOTAL OF 3320 TIMES)
…
Messages read: 3320
Objects added: 0
Objects deleted: 0
Objects replaced: 0
END DATE: Wed Sep 17 19:46:36 2008
!!! 2008-09-17 19:46:36 Command: /machine/home/ngenuser /apps/amos/bin/bank-transact -b run2-lane6-paired.bnk -m run2-lane6-paired.break.fea exited with status: 1
!!! END - Elapsed time: 0d 0h 22m 52s
The feature file does indeed contain no iid entries:
head run2-lane6-paired.break.fea
{FEA
typ:B
src:gi|115315570|ref|AC_000021.2|,CTG
com:60 (60,0) B <-[gi|115315570|ref|AC_000021.2|]- 8294
clr:8295,8294
}
{FEA
typ:B
src:gi|115315570|ref|AC_000021.2|,CTG
com:136 (136,0) B <-[gi|115315570|ref|AC_000021.2|]- 7169
…
(TOTAL 3320 FEATURE ENTRIES IN FILE)
Inserting 'iid:1' into a dummy *.fea file gave the error:
ERROR: Invalid source format
could not parse 'FEA' message with iid:1, message ignored
It appears that there should not normally be an 'iid' entry in the *.fea file:
FORMAT OF .fea FILE (http://amos.sourceforge.net/forensics):
casm-breaks output file: asm.break.fea
The singleton reads are then aligned to the consensus sequences of the contigs and analyzed for shared breakpoints. casm-breaks reports positions where multiple reads all have the same breakpoint pattern. Unlike some of the other pipeline tools, casm-breaks writes an XML-like message file:
{FEA            Feature message
typ:B           Breakpoint feature
src:N,CTG       The breakpoint occurs in contig N
com:<string>    String linking all of the breakpoint features for a set of reads
clr:X,Y         Range in the contig where the read aligns
}               End of feature
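To make the layout above concrete, here is a minimal Python sketch that reads records of this shape into dictionaries. It is not part of AMOS: the function name parse_fea is made up for illustration, and it assumes every field fits on a single "key:value" line, which holds for the records shown in this thread but may not hold for AMOS messages in general.

```python
def parse_fea(text):
    """Parse '{FEA ... }' records into a list of dicts keyed by field name.

    Assumption: each field is a single 'key:value' line (no multi-line
    fields, no nested messages).
    """
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("{FEA"):
            current = {}                      # start of a new feature
        elif line == "}" and current is not None:
            records.append(current)           # end of the feature
            current = None
        elif current is not None and ":" in line:
            # Split on the first ':' only, since values may contain ':' or '|'.
            key, _, value = line.partition(":")
            current[key] = value.strip()
    return records


sample = """{FEA
typ:B
src:gi|115315570|ref|AC_000021.2|,CTG
com:60 (60,0) B <-[gi|115315570|ref|AC_000021.2|]- 8294
clr:8295,8294
}"""

records = parse_fea(sample)
print(records[0]["src"])
```

Run on the first record from my file, this prints the raw 'src' value, which makes it easy to eyeball what each field actually contains.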
This is the section of bank-transact.cc responsible for the error message:
//-- Parse the message
try {
  op -> readMessage (msg);
}
catch (const Exception_t & e) {
  cerr << "ERROR: " << e . what( ) << endl
       << " could not parse '" << Decode (ncode)
       << "' message with iid:"
       << (msg . exists (F_IID) ? msg . getField (F_IID) : "NULL")
       << ", message ignored" << endl;
  exitcode = EXIT_FAILURE;
I'm not sure where the call to op->readMessage() leads, but is the format of my *.fea file correct? Or is there something else I'm missing?
Thanks for your help.
Stuart.
I see the problem, and it is not the missing iid. Look at the 'src' field of the FEA record:
{FEA
typ:B
src:gi|115315570|ref|AC_000021.2|,CTG
com:60 (60,0) B <-[gi|115315570|ref|AC_000021.2|]- 8294
clr:8295,8294
}
The record lists the full name of the contig. The 'src' field must reference the iid of the contig, which is always an integer, so the parser fails when it finds a string there instead of an int. I'm not sure why a sequence id is showing up instead of an iid. The breakpoint code takes its ids from the fasta file produced by "bank2fasta" at command 100 of the amosvalidate script, and that command should use the contig iids by default. In the PREFIX.fasta file generated by amosvalidate, do you see integers or strings in the fasta headers?
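If it helps, both checks can be scripted. The Python sketch below is only an illustration, not an AMOS tool: the helper names non_integer_sources and fasta_header_ids are made up, the FASTA text is a hypothetical stand-in for your PREFIX.fasta, and it assumes the id in 'src' is everything before the final comma.

```python
import re


def non_integer_sources(fea_text):
    """Return the 'src' ids (the part before the final ',') that are not plain integers."""
    bad = []
    for line in fea_text.splitlines():
        line = line.strip()
        if line.startswith("src:"):
            # 'src:N,CTG' -> 'N'; split from the right in case the id has commas.
            ident = line[len("src:"):].rsplit(",", 1)[0]
            if not re.fullmatch(r"[0-9]+", ident):
                bad.append(ident)
    return bad


def fasta_header_ids(fasta_text):
    """Return the first whitespace-delimited token of each FASTA header line."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            ids.append(tokens[0] if tokens else "")
    return ids


fea_text = """{FEA
typ:B
src:gi|115315570|ref|AC_000021.2|,CTG
com:60 (60,0) B <-[gi|115315570|ref|AC_000021.2|]- 8294
clr:8295,8294
}"""

# Hypothetical FASTA content, standing in for the real PREFIX.fasta.
fasta_text = ">gi|115315570|ref|AC_000021.2|\nACGT\n"

print(non_integer_sources(fea_text))
print(fasta_header_ids(fasta_text))
```

An empty list from non_integer_sources would mean every 'src' looks like an iid. If the FASTA headers come back as strings rather than integers, that would point at the bank2fasta step as the place where the names are getting in.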
Best,
Adam