File: PredictFP2.exe (352.3 kB, modified 2018-11-27)
Introduction:

1. Tool:

PredictFP2.exe identifies putative FPs in amino acid or DNA sequences of retroviruses. It was implemented in Microsoft Visual C++ 6.0 and runs on Windows. Input and output files are in FASTA format. Each FP annotation gives the start and end positions of the FP in the query sequence.

2. Datasets:

  benchmark.fasta: env sequences with FP annotations, forming the benchmark dataset used for training and testing.
  FP-benchmark.fasta: np-FP annotations predicted by scanning "benchmark.fasta".
  non-env.fasta: non-env (gag, pol, etc.) sequences used as an external test set.
  env.fasta: env sequences without FP annotations, used for np-FP prediction.
  env1.fasta: the env sequences after conversion of those containing the ambiguous characters B, Z, J or X.
  np-FP-AA.fasta: np-FP annotations predicted by scanning "env1.fasta".
  HERV-int.fasta: HERV gene sequences without FP annotations, used for np-FP prediction.
  np-FP-DNA.fasta: np-FP annotations predicted by scanning "HERV-int.fasta".
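
  The README does not say how the ambiguous characters were converted to produce "env1.fasta". The sketch below only locates them, so you can see which records in a FASTA file would need attention before scanning. The record names in the example are hypothetical, and the helper functions are illustrative rather than part of PredictFP2.exe.

```python
# Ambiguous amino acid codes named in the dataset description.
AMBIGUOUS = set("BZJX")


def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def ambiguous_positions(sequence):
    """Return the 1-based positions of B, Z, J or X in a sequence."""
    return [i for i, ch in enumerate(sequence.upper(), start=1) if ch in AMBIGUOUS]


# Hypothetical two-record input, not taken from the datasets.
example = ">seq1 hypothetical record\nMKXAB\nLLZ\n>seq2 hypothetical record\nMKAL\n"

for header, seq in read_fasta(example):
    print(header.split()[0], ambiguous_positions(seq))
```

  A record that reports an empty list contains none of the four characters.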
  
  An example of an FP annotation in a data file is shown below.
  >sp|O70902|ENV_HV190 Envelope glycoprotein gp160 OS=Human immunodeficiency virus type 1 group M subtype H (isolate 90CF056) GN=env PE=3 SV=1|501-521
  AVGMGASFLGFLGAAGSTMGA
  O70902: UniProt accession number
  ENV_HV190 Envelope glycoprotein gp160 OS=Human immunodeficiency virus type 1 group M subtype H (isolate 90CF056) GN=env PE=3 SV=1: description of the query sequence
  501-521: start and end positions of the new putative FP in the query sequence
  AVGMGASFLGFLGAAGSTMGA: the new putative FP subsequence
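
  The annotation header above can be split into its fields with a few string operations. The following is a minimal sketch, assuming the position range is always the last "|"-separated field and that positions are 1-based and inclusive (the example is consistent with this: 501-521 spans 21 residues, the length of the peptide). The names FPAnnotation and parse_fp_header are illustrative and do not come from the tool.

```python
from typing import NamedTuple


class FPAnnotation(NamedTuple):
    accession: str   # UniProt accession number, e.g. O70902
    definition: str  # description of the query sequence
    start: int       # start position of the putative FP
    end: int         # end position of the putative FP


def parse_fp_header(header):
    """Parse a header of the form >db|accession|definition|start-end."""
    line = header.lstrip(">").strip()
    # Split from the right so the range is isolated first.
    rest, span = line.rsplit("|", 1)
    start_text, end_text = span.split("-")
    _db, accession, definition = rest.split("|", 2)
    return FPAnnotation(accession, definition, int(start_text), int(end_text))


header = (
    ">sp|O70902|ENV_HV190 Envelope glycoprotein gp160 "
    "OS=Human immunodeficiency virus type 1 group M subtype H "
    "(isolate 90CF056) GN=env PE=3 SV=1|501-521"
)
peptide = "AVGMGASFLGFLGAAGSTMGA"

ann = parse_fp_header(header)
print(ann.accession, ann.start, ann.end)
# Under the 1-based inclusive assumption, the span length matches the peptide.
print(ann.end - ann.start + 1 == len(peptide))
```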

3. Example:

1) Run "PredictFP2.exe".
 
2) The program prompts: "Please select an input file in FASTA format: ".
If the program and the input file are in the same folder, you only need to type the file name, for example: benchmark.fasta.
If they are in different folders, type the path and name of the input file, for example: E:\\dataset\\benchmark.fasta.
The name and path of the input file must not contain spaces.

3) The program prompts: "Please select an output file for putative FPs:".
To write the output file to the same folder as the program, you only need to type the file name, for example: "FP-benchmark.fasta".
To write it to another folder, type the path and name of the output file, for example: "E:\\dataset\\FP-benchmark.fasta".
The name and path of the output file must not contain spaces.

4) The program prompts: "Amino acid or DNA sequence? A/D".
If the input file contains amino acid sequences, type "A".
If the input file contains DNA sequences, type "D".

5) Running...
Mission accomplished!
The output file now contains the predicted putative FPs.

6) The program prompts: "Do you want to input another file to predict putative FPs? y/n".
Type "y" to predict putative FPs in another file, or "n" to quit.
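
The prompts in steps 2) to 6) are answered in a fixed order, so the answers for several files can be prepared in advance. The sketch below builds that answer text and rejects paths containing spaces. It assumes PredictFP2.exe reads its answers from standard input, which the README does not confirm. The functions build_answers and run_predictfp2 are hypothetical helpers, not part of the tool.

```python
import subprocess


def check_path(path):
    """Reject paths with spaces, which the tool does not accept."""
    if " " in path:
        raise ValueError(f"path must not contain a space: {path!r}")
    return path


def build_answers(jobs):
    """Build the answer text for a list of (input, output, type) jobs.

    type is "A" for amino acid sequences or "D" for DNA sequences.
    """
    if not jobs:
        raise ValueError("at least one job is required")
    lines = []
    for index, (input_path, output_path, seq_type) in enumerate(jobs):
        if seq_type not in ("A", "D"):
            raise ValueError(f"sequence type must be 'A' or 'D': {seq_type!r}")
        lines += [check_path(input_path), check_path(output_path), seq_type]
        # Answer "y" to continue with the next file, "n" after the last one.
        lines.append("y" if index < len(jobs) - 1 else "n")
    return "\n".join(lines) + "\n"


def run_predictfp2(exe_path, jobs):
    """Run the tool with prepared answers (assumes it reads standard input)."""
    return subprocess.run([exe_path], input=build_answers(jobs), text=True, check=True)


answers = build_answers([
    ("benchmark.fasta", "FP-benchmark.fasta", "A"),
    ("HERV-int.fasta", "np-FP-DNA.fasta", "D"),
])
print(answers)
```

If the program turns out to read the keyboard directly rather than standard input, the prepared answers can still serve as a checklist to type by hand.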
Source: README.txt, updated 2018-11-27