TripleV requires your data to be in a custom file format. The VizFileCreatorPackage included with the download has all the tools needed to get started and to convert common genomics file formats into the required format. At a bare minimum, you need to load a multiple alignment file that includes the reference sequence and a separate file called references.txt.
This package was designed for and tested on Unix-like systems.
The [TripleV file format description] gives an in-depth description of how the custom file format works.
File extensions (the part of a file name after the dot, e.g. ".txt") that must be used to designate each file type
The following file types must be named as shown in the "File name expression" column below. Names are not case sensitive. For example, valid names for the reference file include references.txt, REFERENCES.TXT, or even ReFeReNcEs.TxT. The sampleID in the reference file must exactly match the sampleID in the alignment, gene list annotation, variant, and other files.
| File type | Permitted extensions | File name expression | Notes |
|---|---|---|---|
| Reference Files | 'txt' | references.txt | |
| Alignment Files (DNA) | 'fa', 'fas', 'fsa', 'fasta', 'fna', 'frn', 'mfa', 'afa', 'aln', 'dna' | *DNA* | |
| Alignment Files (AA) | 'fa', 'fas', 'fsa', 'fasta', 'faa', 'frn', 'mfa', 'afa', 'aln', 'pep' | *AA* | |
| Variant Files (DNA) | 'txt' | *ntfreq.txt | These include the output files from vPhaser and vProfiler. |
| Variant Files (AA) | 'txt', 'xls' | codonfreq.txt or codonfreq.xls | |
| Gene List Files | 'txt' | *genelist.txt | |
| Muscle Path | 'txt' | muscle_path.txt | Due to the way that genomes containing genes with introns are spliced, it’s extremely difficult to use a user’s protein alignment since this will not match up perfectly with the variant file data. Therefore, we need to perform an alignment from the existing nucleotide data. |
| Epitope Files | 'fasta', 'fas', 'fa' | *EPITOPES.FASTA | The epitope file is a FASTA file in which each entry is the short amino acid sequence that makes up a peptide. We map all of these short protein fragments to the translated polypeptide. Note that a single peptide may map to multiple places if it can be aligned perfectly, without any gaps, to the reference at more than one locus. |
| Metadata Files | 'txt' | metadata.txt | |
The following example files are included in the VizFileCreatorPackage you downloaded.
| File type | Example file |
|---|---|
| Reference Files | references.txt |
| Alignment Files (DNA) | 9213_all_nuc_aligned_DNA.fas |
| Alignment Files (AA) | none |
| Variant Files (DNA) | 9213_165_ntfreq.txt |
| Variant Files (AA) | 9213_final_cleaned_9213_165_codonfreq.xls |
| Gene List Files | 9213_0_genelist.txt |
| Muscle Path | muscle_path.txt |
| Epitope Files | epitopes.fasta |
| Metadata Files | 9213_metadata.txt |
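
As a rough illustration of the naming rules above, the sketch below guesses the file type of a file from its name before it is handed to the VizFileCreatorPackage. It is not part of the package and is not how the package itself recognizes files. The `RULES` table and the `classify` function are hypothetical names introduced here. The sketch also assumes that a sample prefix is allowed in front of `codonfreq`, `epitopes`, and `metadata`, as the example file names suggest, and that any permitted epitope extension may follow `epitopes`. More specific patterns are checked first because `*DNA*` and `*AA*` are very broad.

```python
import fnmatch
from pathlib import PurePath

# (file type, permitted extensions, lowercase name pattern).
# Hypothetical table built from the documentation above, not from TripleV code.
# Leading wildcards on codonfreq, epitopes, and metadata are assumptions
# based on the example file names.
RULES = [
    ("Reference Files", {"txt"}, "references.txt"),
    ("Muscle Path", {"txt"}, "muscle_path.txt"),
    ("Metadata Files", {"txt"}, "*metadata.txt"),
    ("Gene List Files", {"txt"}, "*genelist.txt"),
    ("Variant Files (DNA)", {"txt"}, "*ntfreq.txt"),
    ("Variant Files (AA)", {"txt", "xls"}, "*codonfreq.*"),
    ("Epitope Files", {"fasta", "fas", "fa"}, "*epitopes.*"),
    ("Alignment Files (DNA)",
     {"fa", "fas", "fsa", "fasta", "fna", "frn", "mfa", "afa", "aln", "dna"},
     "*dna*"),
    ("Alignment Files (AA)",
     {"fa", "fas", "fsa", "fasta", "faa", "frn", "mfa", "afa", "aln", "pep"},
     "*aa*"),
]


def classify(file_name):
    """Return the first matching file type for file_name, or None."""
    name = PurePath(file_name).name.lower()         # names are not case sensitive
    extension = PurePath(name).suffix.lstrip(".")   # text after the last dot
    for file_type, extensions, pattern in RULES:
        if extension in extensions and fnmatch.fnmatchcase(name, pattern):
            return file_type
    return None
```

Calling `classify` on each file in your input directory and checking for `None` is a quick way to spot files whose names would not fit any row of the table.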