met-predictor - Browse Files at SourceForge.net

Name	Modified	Size
met-predictor	2017-05-21
Met-Predictor_12302019.zip	2019-12-31	1.0 GB
dataset.zip	2019-12-31	13.0 MB
licence.txt	2019-12-31	2.4 kB
readme.txt	2019-12-31	8.9 kB
Totals: 5 Items		1.1 GB
                            Met-predictor RELEASE NOTES
                            ===========================

Met-predictor program

by Qiqige Wuyun and Wei Zheng (wuyunqqg@163.com or jlspzw139@sina.com)

This predictor predicts lysine and arginine methylation sites using a 
support vector machine (SVM) classifier. It is supplied as source code along with the 
required data files and runs on Linux. The input is a protein sequence file in FASTA format.


How to use it?
First, download the Met-predictor zip archive from http://sourceforge.net/p/met-predictor

===================================================================================================================================================
===================================================================================================================================================

We provide 64-bit binaries.

Step 1. Install

To run Met-predictor, you need to download and install:
	gfortran
	python
	numpy
	scipy
	bioperl
	tcsh
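
A quick way to see which of the prerequisites above are already installed is sketched below. It is only an illustration: the helper names are hypothetical, the script is written for Python 3 (the readme does not say which Python version Met-predictor itself needs), and probing BioPerl by loading Bio::SeqIO is an assumption about how BioPerl was installed.

```python
import importlib.util
import shutil
import subprocess

# Names taken from the prerequisite list; how each one is probed is an assumption.
EXECUTABLES = ["gfortran", "python", "tcsh", "perl"]
PYTHON_MODULES = ["numpy", "scipy"]


def missing_executables(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def missing_modules(names):
    """Return the Python modules from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def bioperl_available():
    """Probe BioPerl by asking perl to load Bio::SeqIO (assumed to be installed with it)."""
    if shutil.which("perl") is None:
        return False
    result = subprocess.run(
        ["perl", "-MBio::SeqIO", "-e", "1"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def report():
    """Collect everything that appears to be missing into one list of names."""
    missing = missing_executables(EXECUTABLES) + missing_modules(PYTHON_MODULES)
    if not bioperl_available():
        missing.append("bioperl")
    return missing
```

Calling report() returns the names that still need to be installed; an empty list means every probe succeeded.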

The package bundles the following third-party tools:
	/mono [you should compile it according to its README]
	
	for sequence
	/lib/blast-2.2.26 
	/lib/disopred [you should compile it according to its README]
	/lib/hhsuite-2.0.16-linux-x86_64
	/lib/HSE
	/lib/libsvm-3.14 [you should compile it according to its README]
	/lib/psipred3.3 [you should compile it according to its README]
	/lib/SPIDER2_local
	/lib/spineX [you should compile it according to its README]
	
	for structure
	/lib/hhsuite-2.0.16-linux-x86_64/scripts/hhpred [you should compile it according to its README]
	MODELLER is included with hhpred, but you MUST register at https://salilab.org/modeller/registration.html
	to obtain a licence
	/lib/PfamScan [you should compile it according to its README]
	/lib/hmmer
	/lib/depth-1.0
	/lib/Structure [contains NACCESS, CHOPS, HSE, L1depth, kthCH and DSSP]

Step 2. Change variable

You should change the following paths:
		1. In Run_Metpredictor.py
		The line: "os.environ['Met_predictor_HOME']='/nfs/amino-home/zhengwei/wuyunqqg/Met-predictor';"
		$Met_predictor_HOME should be changed to your own path.
		
		2. In scripts/GetFeature.sh
		The line: "setenv METHOME /nfs/amino-home/zhengwei/wuyunqqg/Met-predictor"
		$METHOME should be changed to your own path.

		3. In scripts/GetStructureFeature.sh
		The line: "set METHOME = /nfs/amino-home/zhengwei/wuyunqqg/Met-predictor"
		$METHOME should be changed to your own path.
		
		4. In scripts/update-hhsearchpdb70.py
		The line: " os.environ['METHOME']='/nfs/amino-home/Met-Predictor' "
		$METHOME should be changed to your own path.
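
The four edits above can also be applied with a small script instead of by hand. The following is a minimal sketch, not part of Met-predictor: the function names are hypothetical, the main script is named as in the usage line of Step 4, and it assumes the hard-coded paths appear in the files exactly as quoted above. A .bak copy is written before a file is changed.

```python
import pathlib

# Hard-coded install prefixes quoted in Step 2.
OLD_HOMES = (
    "/nfs/amino-home/zhengwei/wuyunqqg/Met-predictor",
    "/nfs/amino-home/Met-Predictor",
)

# Files named in Step 2, relative to the unpacked package directory.
FILES = (
    "Run_Metpredictor.py",
    "scripts/GetFeature.sh",
    "scripts/GetStructureFeature.sh",
    "scripts/update-hhsearchpdb70.py",
)


def replace_home(text, new_home, old_homes=OLD_HOMES):
    """Replace every occurrence of the old install prefixes with new_home."""
    for old in old_homes:
        text = text.replace(old, new_home)
    return text


def patch_install(root, new_home=None):
    """Rewrite the hard-coded paths under `root`; return the files that changed."""
    root = pathlib.Path(root).resolve()
    new_home = new_home or str(root)
    changed = []
    for rel in FILES:
        path = root / rel
        if not path.is_file():
            continue  # skip files that are not where Step 2 says they are
        original = path.read_text(encoding="utf-8", errors="surrogateescape")
        updated = replace_home(original, new_home)
        if updated != original:
            backup = path.with_name(path.name + ".bak")
            backup.write_text(original, encoding="utf-8", errors="surrogateescape")
            path.write_text(updated, encoding="utf-8", errors="surrogateescape")
            changed.append(rel)
    return changed
```

For example, patch_install("/home/user/Met-predictor") would point all four files at that directory (the path here is only a placeholder).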
		
Step 3. Download database

Put the compiled nr database in db/blast_nr/nr 
The nr database can be downloaded from ftp://ftp.ncbi.nlm.nih.gov/blast/db/

Put the uniprot20 database in db/hhblits_db/uniprot20_2013_03 
or db/hhblits_db/uniprot20_2015_06, depending on which uniprot20 release you downloaded
The uniprot20 database can be downloaded from http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/

Put the hhsearch database in db/hhsearch_db/
You can download the hhsearch/hhpred database with /scripts/update-hhsearchpdb70.py

Put the Pfam database in db/pfam/
You can download the Pfam-A database from ftp://ftp.ebi.ac.uk/pub/databases/Pfam/releases/Pfam31.0/
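
Before a first run it can help to confirm that something is present at each of the database locations above. The sketch below is an illustration only: the function names are hypothetical, and treating entries without a trailing slash as file-name prefixes (for example nr.pal or nr.00.phr next to db/blast_nr/nr) is an assumption about how the databases are laid out on disk.

```python
import glob
import os

# Locations quoted in Step 3, relative to the Met-predictor directory.
# A trailing "/" means "any file inside this directory"; otherwise the entry
# is treated as a file-name prefix (an assumption, see the text above).
REQUIRED = ["db/blast_nr/nr", "db/hhsearch_db/", "db/pfam/"]

# Only one of the two uniprot20 releases is needed.
UNIPROT20 = [
    "db/hhblits_db/uniprot20_2013_03",
    "db/hhblits_db/uniprot20_2015_06",
]


def present(home, entry):
    """True if at least one path under `home` starts with `entry`."""
    pattern = os.path.join(glob.escape(home), entry) + "*"
    return bool(glob.glob(pattern))


def missing_databases(home):
    """Return the database locations for which nothing was found."""
    missing = [entry for entry in REQUIRED if not present(home, entry)]
    if not any(present(home, entry) for entry in UNIPROT20):
        missing.append(" or ".join(UNIPROT20))
    return missing
```

An empty list from missing_databases(home) only shows that files exist at the expected places; it does not verify that the databases are complete or correctly formatted.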

Step 4. Input format and Command
usage: [python] Run_Metpredictor.py -i input_fasta -o outfile -t type [-s structure -r isscale]
-i: Input FASTA file
		You should use the absolute path, and the filename must end in .fasta
		format:
		>xxxxxx
		AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

		For example: P0ADN2.fasta
		>P0ADN2
		MAESFTTTNRYFDNKHYPRGFSRHGDFTIKEAQLLERHGYAFNELDLGKREPVTEEEKLFVAVCRGEREPVTEAERVWSKYMTRIKRPKRFHTLSGGKPQVEGAEDYTDSDD

-o: Output file 
		You should use the absolute path
-t: residue type [K R]
		K for lysine or R for arginine
-s: Add structure features or not [1 0]
		1 to add structure features; 0 to omit them
-r: scaling or not in SVM [1 0]
		1 for scaling; 0 for not 
example: ./Run_Metpredictor.py -i /home/Met-predictor/example/P0CX53.fasta -o /home/Met-predictor/example/P0CX53.out -t R -s 1 -r 1
Note that the input FASTA file must be given as a full absolute path!
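
When the predictor is called from another script, the rules above (absolute paths, .fasta suffix, K or R, 1 or 0) can be checked before the command is launched. The following is a minimal sketch: build_command is a hypothetical helper, not part of Met-predictor, and invoking the script through an interpreter named "python" is an assumption based on the usage line.

```python
import os


def build_command(script, fasta, outfile, residue, structure=None, scale=None):
    """Validate the arguments described in Step 4 and return an argv list."""
    if not os.path.isabs(fasta):
        raise ValueError("input FASTA must be an absolute path")
    if not fasta.endswith(".fasta"):
        raise ValueError("input filename must end in .fasta")
    if not os.path.isabs(outfile):
        raise ValueError("output file must be an absolute path")
    if residue not in ("K", "R"):
        raise ValueError("residue type must be K (lysine) or R (arginine)")

    cmd = ["python", script, "-i", fasta, "-o", outfile, "-t", residue]

    # -s and -r are optional in the usage line, so they are only added on request.
    for flag, value in (("-s", structure), ("-r", scale)):
        if value is None:
            continue
        if value not in (0, 1):
            raise ValueError(flag + " must be 1 or 0")
        cmd += [flag, str(int(value))]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the Met-predictor directory; using an argv list rather than a shell string avoids quoting problems with unusual file names.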


Step 5. Result

The result file has the following columns: 
	The first column is the position of the predicted lysine (K) residue in the sequence.
	The second column is the predicted label of the lysine residue: +1 means the lysine is predicted as a positive sample (i.e., methylation site), while -1 means it is predicted as a negative sample (i.e., non-methylation site).
	The third column is the predicted probability of the positive sample.
	The 4th column is the predicted probability of the negative sample.
	The 5th column is the predicted MONO-methylation label of the lysine residue: +1 means the lysine is predicted as a positive MONO-methylation sample (i.e., mono-methylation site), while -1 means it is predicted as a negative MONO-methylation sample (i.e., non-mono-methylation site).
	The 6th column is the predicted probability of the positive MONO-methylation sample.
	The 7th column is the predicted probability of the negative MONO-methylation sample.
	The 8th column is the predicted DI-methylation label of the lysine residue: +1 means the lysine is predicted as a positive DI-methylation sample (i.e., di-methylation site), while -1 means it is predicted as a negative DI-methylation sample (i.e., non-di-methylation site).
	The 9th column is the predicted probability of the positive DI-methylation sample.
	The 10th column is the predicted probability of the negative DI-methylation sample.
	The 11th column is the predicted TRI-methylation label of the lysine residue: +1 means the lysine is predicted as a positive TRI-methylation sample (i.e., tri-methylation site), while -1 means it is predicted as a negative TRI-methylation sample (i.e., non-tri-methylation site).
	The 12th column is the predicted probability of the positive TRI-methylation sample.
	The 13th column is the predicted probability of the negative TRI-methylation sample.
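
The 13-column layout described above can be read into named fields as sketched below. This is an illustration under stated assumptions: the columns are assumed to be separated by whitespace, the file is assumed to have no header line, and the type and function names are hypothetical. The numbers in the usage note are made up to show the shape of a row and are not real predictor output.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """One label plus its two probabilities (three consecutive columns)."""
    label: int
    p_positive: float
    p_negative: float


class SiteRow(NamedTuple):
    """One line of the result file: position plus four predictions."""
    position: int
    overall: Call
    mono: Call
    di: Call
    tri: Call


def parse_line(line):
    """Split one whitespace-separated result line into a SiteRow."""
    fields = line.split()
    if len(fields) != 13:
        raise ValueError("expected 13 columns, found %d" % len(fields))
    position = int(fields[0])
    calls = [
        # int(float(...)) accepts "+1", "-1" and "1.0" style labels.
        Call(int(float(fields[i])), float(fields[i + 1]), float(fields[i + 2]))
        for i in (1, 4, 7, 10)
    ]
    return SiteRow(position, *calls)


def parse_results(lines):
    """Parse an iterable of lines, skipping blank ones."""
    return [parse_line(line) for line in lines if line.strip()]
```

For instance, parse_line("12 +1 0.83 0.17 1 0.6 0.4 -1 0.2 0.8 -1 0.1 0.9") gives a row whose overall.label is 1 and whose tri.label is -1. If the real output contains a header or uses another delimiter, parse_line would need to be adjusted.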



===================================================================================================================================================
===================================================================================================================================================
Please see the licence.txt file for the license terms of the software. It is
basically free for academic users, but a license fee applies to commercial
users. 

THE PUBLICATION OF RESEARCH USING OUR METHOD MUST INCLUDE AN APPROPRIATE
CITATION TO THE METHOD:

Improved Protein Methylation Sites Prediction Based on a Large Variety of Structure Features Set
Wei Zheng, Qiqige Wuyun, Micah Cheng, Gang Hu and Yanping Zhang

OTHERS:

The DISOPRED server for the prediction of protein disorder. 
Ward JJ, McGuffin LJ, Bryson K, Buxton BF, Jones DT (2004) Bioinformatics 20: 2138-2139.

HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. 
Remmert M, Biegert A, Hauser A, Söding J (2012) Nat Meth 9: 173-175.

Highly accurate sequence-based prediction of half-sphere exposures of amino acid residues in proteins. 
Heffernan R, Dehzangi A, Lyons J, Paliwal K, Sharma A, et al. (2015) Bioinformatics (Oxford, England).

The PSIPRED protein structure prediction server. 
McGuffin LJ, Bryson K, Jones DT (2000) Bioinformatics 16: 404-405.

SPINE X: Improving protein secondary structure prediction by multi-step learning coupled with prediction of solvent accessible surface area and backbone torsion angles. 
Faraggi E, Zhang T, Yang Y, Kurgan L, Zhou Y (2012) J Comput Chem 33: 259-267.

Residue depth: a novel parameter for the analysis of protein structure and stability, 
Chakravarty, S. and Varadarajan, R. (1999) Structure, 7, 723-732.

NACCESS, Computer Program, Department of Biochemistry and Molecular Biology, University College London.
Hubbard, S.J. and Thornton, J.M. (1993)

AAindex: amino acid index database, progress report 2008,
Kawashima, S., et al. (2008)  Nucleic Acids Research, 36, D202-D205.

Analysis of Conformational B-Cell Epitopes in the Antibody-Antigen Complex Using the Depth Function and the Convex Hull,
Zheng, W., et al. (2015)  PLoS ONE, 10, e0134835.

The Pfam protein families database,
Punta, M., et al. (2012)  Nucleic Acids Research, 40, D290-D301.

Accelerated profile HMM searches. 
Eddy SR (2011) PLoS Comput Biol 7: e1002195.

Automatic Prediction of Protein 3D Structures by Probabilistic Multi-template Homology Modeling.
Meier A., Söding J. (2015) PLoS Comput Biol. 11(10):e1004343. doi: 10.1371/journal.pcbi.1004343. PMID: 26496371
Source: readme.txt, updated 2019-12-31