Hamstr Wiki

A tool for directed ortholog search in ESTs and proteins

Brought to you by: ebersbi2

HaMStR Output

HaMStR Output files

When you run HaMStR, you will typically get a number of output files. Their content is briefly described below.

hamstrsearch.log This log file contains the basic information about the parameters of your HaMStR search. Note that it does not contain the information that is printed to your screen in the course of the search. To capture the latter, you could, for example, start HaMStR with nohup.
hamstrsearch_<YourTaxonName>_<YourCoreOrthologSet>_<refspec>[.strict].out This file contains all (co-)orthologs that HaMStR has identified in this run. The pipe-delimited fields in this file are the following:
1. The Id of the core ortholog group
2. The species that has been chosen as reference taxon. Note, if the strict option has been used for the HaMStR search, the primer taxon with the protein most similar to the identified ortholog will be inserted.
3. The taxon name for the query data set
4. The sequence id of the identified (co-)ortholog. Note, in some cases the length of this sequence is added after a '-'.
5. The representative flag. 1 denotes the representative ortholog, i.e. the one among two or more co-orthologs that is most similar to the reference protein. 0 denotes the non-representative co-orthologs. In the case that only a single sequence has been identified as ortholog it gets assigned a 1 automatically.
6. The amino acid sequence of the orthologous sequence
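The six fields listed above can be read with a few lines of Python. This is only a minimal sketch: it assumes one record per line with '|' as the field separator, and it allows for sequence ids that themselves contain a '|' by taking the first three fields from the left and the last two from the right. The names `OrthologRecord`, `parse_out_line` and `read_out_file` are made up for this illustration and are not part of HaMStR.

```python
from typing import List, NamedTuple


class OrthologRecord(NamedTuple):
    group_id: str         # 1. Id of the core ortholog group
    reference: str        # 2. reference (or, with strict, primer) taxon
    query_taxon: str      # 3. taxon name for the query data set
    sequence_id: str      # 4. sequence id, possibly with a '-<length>' suffix
    representative: bool  # 5. True for flag 1, False for flag 0
    sequence: str         # 6. amino acid sequence


def parse_out_line(line: str) -> OrthologRecord:
    """Split one record into its six fields (assumed '|'-delimited)."""
    parts = line.rstrip("\n").split("|")
    if len(parts) < 6:
        raise ValueError(f"expected at least 6 fields, got {len(parts)}")
    group_id, reference, query_taxon = parts[:3]
    flag, sequence = parts[-2], parts[-1]
    # Anything between field 3 and the flag is treated as the sequence id,
    # so ids that contain a '|' are kept intact.
    sequence_id = "|".join(parts[3:-2])
    if flag not in ("0", "1"):
        raise ValueError(f"unexpected representative flag: {flag!r}")
    return OrthologRecord(group_id, reference, query_taxon,
                          sequence_id, flag == "1", sequence)


def read_out_file(path: str) -> List[OrthologRecord]:
    """Parse every non-empty line of a hamstrsearch .out file."""
    with open(path) as handle:
        return [parse_out_line(line) for line in handle if line.strip()]
```

For example, `parse_out_line("111|homsa_1|mytaxon|contig42-1500|1|MKVL")` (an invented record) yields a record whose `representative` field is `True`. Filtering on that field keeps only the representative ortholog of each group.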
hamstrsearch_<YourTaxonName>_<YourCoreOrthologSet>_<refspec>_cds[.strict].out This file is generated only when you are analyzing transcript data. It contains the predicted CDS for the protein sequences in the corresponding output file. NOTE: Both the CDS and the amino acid sequence data have been taken as is from the genewise output. On rare occasions, the amino acid sequence that results from translating the CDS differs from the provided amino acid sequence. Contact me for further details.
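To spot the rare cases in which the translated CDS and the reported amino acid sequence disagree, one could translate each CDS and compare the result with the corresponding protein. The sketch below assumes the standard genetic code, translation in frame 1, and that you have already extracted the CDS and the protein as plain strings. The function names are hypothetical and nothing here reproduces how genewise or HaMStR derive their output.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS in frame 1; unknown codons become 'X'."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3)
    )


def cds_matches_protein(cds: str, protein: str) -> bool:
    """Compare translation and protein, ignoring case and a final stop."""
    return translate(cds).rstrip("*") == protein.upper().rstrip("*")
```

A pair for which `cds_matches_protein` returns `False` is a candidate for a closer look. It does not by itself tell you which of the two sequences is the one to trust.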
hmm_search_<YourTaxonName>_<YourCoreOrthologSet> This directory contains the output files of the individual hmmer searches.
fa_dir_<YourTaxonName>_<YourCoreOrthologSet>_<refspec>[_strict] This directory contains the unaligned sequence data, in multi-FASTA format, for each core ortholog group that HaMStR could extend with at least one ortholog from your query taxon. The header information for the sequences from the primer taxa is the following:
1. The Id of the core ortholog group
2. The taxon name of the primer taxon. Typically, we use the first three letters of the genus name followed by the first two letters of the species name, optionally followed by a '_' and an internal TaxonId.
3. The SequenceId of the primer taxon sequence
  The header information for the newly identified orthologs from the query taxon follows the field order of the pipe-delimited hamstrsearch output files described above.
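The two header layouts can be told apart by their number of fields. The following sketch reads a multi-FASTA file and classifies each header. It rests on two assumptions that the description above does not spell out: that primer-taxon headers are also '|'-delimited, and that query headers carry fields 1 to 5 of the hamstrsearch output (everything except the sequence). `iter_fasta` and `classify_header` are illustrative names only.

```python
from typing import Dict, Iterable, Iterator, Tuple


def iter_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a multi-FASTA file."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def classify_header(header: str) -> Dict[str, str]:
    """Label a header as 'primer' (3 fields) or 'query' (5 or more fields)."""
    fields = header.split("|")
    if len(fields) == 3:
        return {"kind": "primer", "group_id": fields[0],
                "taxon": fields[1], "sequence_id": fields[2]}
    if len(fields) >= 5:
        # Assumed layout: group | reference | query taxon | sequence id | flag
        return {"kind": "query", "group_id": fields[0],
                "reference": fields[1], "taxon": fields[2],
                "sequence_id": "|".join(fields[3:-1]),
                "representative": fields[-1]}
    raise ValueError(f"unrecognised header layout: {header!r}")
```

Applied to a file from the fa_dir directory, for example `with open(path) as fh: kinds = [classify_header(h)["kind"] for h, _ in iter_fasta(fh)]`, this separates the core ortholog sequences from the orthologs that HaMStR added from your query taxon.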
tmp This directory contains several auxiliary files that HaMStR generates and uses in the course of its analysis. It is not deleted automatically, as it can be helpful for debugging if you encounter any problems.
<YourSequenceFile>.mod HaMStR processes your input file by removing line breaks in the sequence. The .mod file contains this reformatted sequence information. NOTE: At the beginning of each run, HaMStR checks whether this file already exists. If so, it re-uses this file rather than modifying the original file again.
<YourSequenceFile>.mod.tc When you analyze transcript data, HaMStR stores a crude six-frame translation of your transcripts in this file. NOTE: At the beginning of each run, HaMStR checks whether this file already exists. If so, it re-uses this file rather than translating the original file again.
