The software is in the 'ISDTool' folder. Datasets are in the 'Datasets' folder, which contains both amino acid sequences and DNA sequences.
Introduction:
1.The software can be used to identify typical immunosuppressive domains (ISDs) in retroviruses, including HERV, HTLV, HIV, STLV, SIV and MLV.
2.The software consists of two modules, 'ISDFindera' for amino acid sequences and 'ISDFindern' for DNA sequences.
3.It accepts sequences or a file in FASTA format as input.
4.As output, an ISD annotation is appended to the description line of each input sequence. The annotation indicates the start and end positions of the ISD in the sequence.
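The input and output conventions above can be illustrated with a minimal sketch. It is not part of ISDTool: the function names 'read_fasta' and 'annotate' are hypothetical, and the 'ISD:start-end' suffix is an assumed placeholder. The real annotation format is defined by the software and its manual.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (description, sequence) pairs from FASTA-formatted lines."""
    description = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                yield description, "".join(chunks)
            description = line[1:]
            chunks = []
        elif description is not None:
            # Sequence data may span several lines; collect and join them.
            chunks.append(line)
    if description is not None:
        yield description, "".join(chunks)


def annotate(description: str, start: int, end: int) -> str:
    """Append start/end positions to a description line.

    The 'ISD:start-end' suffix is a hypothetical format chosen for
    illustration only; ISDTool's actual output may look different.
    """
    return f"{description} ISD:{start}-{end}"


if __name__ == "__main__":
    demo = [">seq1 demo", "ACGT", "TTAA", "", ">seq2", "MKV"]
    for desc, seq in read_fasta(demo):
        print(desc, len(seq))
```

The reader works on any iterable of lines, so an open file handle can be passed directly in place of the demo list.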
Datasets available on the site:
1.Input DNA sequences in the 'Datasets\DNA sequences for ISDFindern' folder
'HERV-int.fsa' - 94,671 RepeatMasker sequences 77.5MB
'HERV-LTR-LTR.fsa' - 148,402 DNA sequences 61.5MB
'HTLV1-env.fsa' - 737 DNA sequences 558.6KB
'HTLV1-gag.fsa' - 25 DNA sequences 15.4KB
'HTLV1-pol.fsa' - 20 DNA sequences 4.2KB
'HTLV1-LTR.fsa' - 869 DNA sequences 492.1KB
'HIV1-env.fsa' - 32,146 DNA sequences 236.0MB
'HIV1-gag.fsa' - 14,791 DNA sequences 44.0MB
'HIV1-pol.fsa' - 7,016 DNA sequences 30.9MB
'HIV1-5'LTR.fsa' - 657 DNA sequences 718.0KB
'Example.fsa' - A subset of 'HERV-int.fsa' 2.1MB
2.Input amino acid sequences in the 'Datasets\Amino acid sequences for ISDFindera' folder
'HERV-ISD.fsa' - 94 amino acid sequences 53.6KB
'HTLV-ISD-1234.fsa' - 635 amino acid sequences 237.9KB
'HIV-ISD.fsa' - 100 amino acid sequences 58.8KB
'STLV-ISD-1236.fsa' - 183 amino acid sequences 59.9KB
'SIV-ISD.fsa' - 90 amino acid sequences 83.9KB
'MLV-ISD.fsa' - 111 amino acid sequences 74.9KB
'STLV-env.fsa' - 157 amino acid sequences 27.7KB
'STLV-gag.fsa' - 14 amino acid sequences 3.6KB
'STLV-pol.fsa' - 42 amino acid sequences 11.1KB
'SIV-env.fsa' - 12,804 amino acid sequences 6.4MB
'SIV-gag.fsa' - 1,716 amino acid sequences 523.0KB
'SIV-pol.fsa' - 4,673 amino acid sequences 2.2MB
'MLV-env.fsa' - 97 amino acid sequences 47.4KB
'MLV-gag.fsa' - 66 amino acid sequences 35.5KB
'MLV-pol.fsa' - 22 amino acid sequences 21.5KB
3.Positive and negative training sample sequences in the 'Datasets\Training sample sequences' folder
'HERVHTLVHIVSTLVSIVMLV-ISD-P.fsa' - 191 positive training samples 9.2KB
'HERVHTLVHIVSTLVSIVMLV-ISD-N.fsa' - 11,109 negative training samples 277KB
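To check that a downloaded dataset matches the sequence counts listed above, the records in a FASTA file can be counted. The following is a minimal sketch, not part of ISDTool; the helper name 'count_fasta_records' is hypothetical, and it assumes each record begins with a '>' description line.

```python
import io
from typing import TextIO


def count_fasta_records(handle: TextIO) -> int:
    """Count FASTA records by counting description lines (those starting with '>')."""
    return sum(1 for line in handle if line.startswith(">"))


if __name__ == "__main__":
    # In-memory stand-in for a dataset file, so the sketch runs without any download.
    sample = io.StringIO(">a\nACGT\n>b\nTT\nGG\n")
    print(count_fasta_records(sample))
```

For a real file, open it in text mode and pass the handle, for example the 'Example.fsa' file from the DNA sequences folder, then compare the result with the count given in this list.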
When citing ISDTool 2.0, please refer to:
H. Lv, et al., "ISDTool 2.0: A computational model for predicting immunosuppressive domain of retroviruses," J Theor Biol, Jul 4 2014.
For details, refer to the manual included with the ISDTool software.