Name | Modified | Size | Downloads / Week |
---|---|---|---|
GLUVAB_v0.6.pl | 2019-11-07 | 27.9 kB | |
GLUVAB_README.txt | 2019-11-07 | 4.0 kB | |
LICENSE.txt | 2019-11-07 | 35.8 kB | |
GLUVAB_v0.5.pl | 2019-07-18 | 27.4 kB | |
Totals: 4 Items | | 95.1 kB | 0 |
GLUVAB
Genomic Lineages of Uncultured Viruses of Archaea and Bacteria
Copyright (C) 2019 Felipe Hernandes Coutinho (felipehcoutinho@gmail.com)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation version 3 of the License.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

Usage: perl GLUVAB_v0.6.pl

--help | Print this help message and exit.
--file_prefix | String to be added to the output files (Default: GLUVAB)
--threads | Number of threads to be used during Diamond search (Default: 1)

Lineage identification criteria:
--min_lineage_reps | Minimum number of representatives in a tree node to establish a valid lineage (Default: 50).
--max_lineage_reps | Maximum number of representatives in a tree node to establish a valid lineage (Default: 999999).
--node_variable | Which node variable to use when defining lineages. One of Average_Node_Distances, Node_Depth or Node_Height (Default: Node_Depth).
--max_cutoff | Maximum value of the selected node variable of a node to establish a valid lineage (Default: 100).
--min_cutoff | Minimum value of the selected node variable of a node to establish a valid lineage (Default: 0.0014).

Input files:
--genomes_file_1 | Fasta format file containing DNA sequences of viral genomes to be analyzed
--pegs_file_1 | Fasta format file containing protein sequences derived from viral genomes.
    Sequences MUST be named as the Id of the original genomic sequence followed by _SEQNUM.
    Example: Scaffold_1_1, Scaffold_1_2, Scaffold_1_3, GenomeA_1, GenomeA_2...
--m8_file | M8 format file containing the results of the all-versus-all protein search generated by Diamond.
    MUST use the same nomenclature as the pegs file.
--dice_file | tsv format file containing the Dice distances among genomes
--tree_file | Newick format file of the tree built based on the Dice distances
--node_stats_file | tsv format file containing the average Dice distances within each node of the tree
--ref_info_file | tsv format file containing the taxonomic classification or lineage assignment of sequences in dataset1, to be used when performing closest relative classification

The order in which analyses are performed is:
1) Identify protein encoding genes in the genomes file with Prodigal (provide --genomes_file_1).
2) Perform all-versus-all search of protein encoding genes using Diamond (provide --pegs_file_1).
3) Calculate Dice distances between genomes based on the output of the Diamond search (provide --m8_file).
4) Build Neighbor-Joining tree based on Dice distances (provide --dice_file).
5) Calculate statistics of Dice distances for each node of the tree (provide --tree_file and --dice_file).
6) Identify lineages in the tree (provide --tree_file and --node_stats_file).

#####################################################################################
Optional: To classify sequences in dataset2 according to their closest relative (CR) in dataset1, based on average amino acid identity (AAI) and percentage of matched protein encoding genes, provide (--genomes_file_1 and --genomes_file_2) or (--pegs_file_1 and --pegs_file_2).
Optionally provide --ref_info_file so that the output table includes the taxonomic classification / lineage assignment of the CR.
#####################################################################################
Providing a file other than --genomes_file_1 skips all of the preceding steps and starts the analysis at the step that takes the provided file.
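The relationship between the provided input files and the step at which the analysis starts can be sketched as follows. This is an illustrative Python sketch, not part of GLUVAB: the names STEPS and starting_step are hypothetical, and the rule of picking the latest step whose inputs are all present is an assumption, since the help text does not say how the script resolves several provided files.

```python
# Steps in the order listed in the help text, each paired with the option(s)
# the help text says to provide in order to start at that step.
STEPS = [
    (1, "Identify protein encoding genes with Prodigal", {"genomes_file_1"}),
    (2, "All-versus-all protein search with Diamond", {"pegs_file_1"}),
    (3, "Calculate Dice distances between genomes", {"m8_file"}),
    (4, "Build Neighbor-Joining tree", {"dice_file"}),
    (5, "Calculate Dice distance statistics per tree node", {"tree_file", "dice_file"}),
    (6, "Identify lineages in the tree", {"tree_file", "node_stats_file"}),
]


def starting_step(provided):
    """Return (number, description) of the latest step whose inputs are all provided.

    Assumption: when several files are given, the analysis starts at the
    latest step that they satisfy. GLUVAB itself may resolve this differently.
    """
    provided = set(provided)
    match = None
    for number, description, required in STEPS:
        if required <= provided:  # every required option was supplied
            match = (number, description)
    if match is None:
        raise ValueError("no step can start from the provided files")
    return match


if __name__ == "__main__":
    number, description = starting_step({"m8_file"})
    print(f"Start at step {number}: {description}")
```

For example, supplying only m8_file selects step 3, while supplying tree_file together with node_stats_file selects step 6.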
#####################################################################################
Dependencies:
BioPerl
Perl modules: Digest::MD5
Prodigal (v2.60)
DIAMOND (v0.9.14)
R (v3.2.5)
R libraries: phangorn
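
Because both the pegs file and the M8 file must name each protein as the genome Id followed by _SEQNUM, it may help to check sequence names before starting a run. The Python sketch below is not part of GLUVAB. The function names are hypothetical, and it assumes that SEQNUM is a run of ASCII digits after the last underscore.

```python
import re
from collections import defaultdict

# Assumed form of a protein name: <genome Id>_<SEQNUM>, where SEQNUM is the
# run of digits after the last underscore (e.g. Scaffold_1_2 -> Scaffold_1, 2).
PEG_NAME = re.compile(r"(.+)_([0-9]+)")


def split_peg_name(peg_name):
    """Return (genome_id, seqnum) for a protein name, or raise ValueError."""
    match = PEG_NAME.fullmatch(peg_name)
    if match is None:
        raise ValueError(f"{peg_name!r} is not of the form GenomeId_SEQNUM")
    return match.group(1), int(match.group(2))


def group_pegs_by_genome(peg_names):
    """Map each genome Id to the sorted list of its protein sequence numbers."""
    groups = defaultdict(list)
    for name in peg_names:
        genome_id, seqnum = split_peg_name(name)
        groups[genome_id].append(seqnum)
    return {genome_id: sorted(nums) for genome_id, nums in groups.items()}


if __name__ == "__main__":
    names = ["Scaffold_1_1", "Scaffold_1_2", "Scaffold_1_3", "GenomeA_1", "GenomeA_2"]
    print(group_pegs_by_genome(names))
```

Names that lack a trailing _SEQNUM raise a ValueError, which points to sequences that should be renamed before the Diamond search.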