CRUMp Code

A probabilistic prediction system of protein phosphorylation sites

Brought to you by: mmenor

File	Date	Author	Commit
README.txt	2012-06-11	unknown	[5a3d93] First commit
crum_models.mat	2012-06-11	unknown	[5a3d93] First commit
phospredict.m	2012-06-11	unknown	[5a3d93] First commit

Read Me

CRUMp for OCTAVE and MATLAB

0. REQUIREMENTS

1) OCTAVE or MATLAB. CRUMp was tested 
using OCTAVE 3.2.4 and MATLAB 7.12.0, but CRUMp may work 
with older versions.

2) OCTAVE requires the BIOINFO package, available 
at http://octave.sourceforge.net/. MATLAB requires the
Bioinformatics Toolbox from MathWorks.

1. INSTALLATION

1.1 OCTAVE INSTRUCTIONS

OCTAVE users may install CRUMp as a package. Download the
latest package and use the following OCTAVE command:

	pkg install crump-0.2.0.tar.gz
	
You may need to replace the version number with the one you
downloaded. If you receive an error that the BIOINFO package
is not installed, visit http://octave.sourceforge.net/ and 
download the latest BIOINFO package. Install it using the 
following command, replacing the version number as 
appropriate:

	pkg install bioinfo-0.1.2.tar.gz

Alternatively, you may download the zip of CRUMp's source 
code and unpack it in any directory. Then, in OCTAVE, either 
change to that directory using the "cd" command or add the 
directory to OCTAVE's load path, for example, using:

	addpath('~/myfolder/crump-0.2.0')
	
1.2 MATLAB INSTRUCTIONS

Download the zip of CRUMp's source code. Unpack the zip in 
the desired directory. Then, in MATLAB, add that directory 
to the search path, for example, using the command:

	addpath('~/myfolder/crump-0.2.0')
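
After either installation route, a quick check along the
following lines may help confirm that the path is set up. This
is only a sketch: it relies on the standard exist() function,
which returns 2 when a name resolves to a file on the search
path, and the status messages are illustrative wording.

```octave
% Check whether phospredict.m (the function file shipped with CRUMp)
% is visible on the current search path.
if exist('phospredict', 'file') == 2
  status = 'phospredict found on the search path';
else
  status = 'phospredict not found; check the directory given to addpath';
end
disp(status)
```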


2. USAGE

The input to CRUMp is a FASTA file of the protein sequences 
you want to analyze. To print the results to the screen, 
use the following command in OCTAVE or MATLAB:

	phospredict myfasta.fasta
	
If you want to write the results to a file instead, you may 
additionally specify the output filename, e.g.:

	phospredict myfasta.fasta myoutput.txt
	
The output report tells you, for each sequence, the position
number of each potential site, the type of site (S, T, or Y)
and the probability that the site is phosphorylatable.
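
To process several FASTA files in one go, the two-argument
call can be wrapped in a loop. The sketch below is only an
illustration: the input file names and the "_crump.txt"
output naming are hypothetical, and the call is skipped when
phospredict is not on the path. It uses the function-call
form, phospredict(in, out), which is equivalent to the
command form shown above.

```octave
% Hypothetical input files; replace with your own FASTA paths.
fasta_files = {'kinases.fasta', 'substrates.fasta'};
out_files = cell(size(fasta_files));

% Only call CRUMp if phospredict.m is actually on the search path.
have_crump = (exist('phospredict', 'file') == 2);

for i = 1:numel(fasta_files)
  % Derive an output name from the input name (naming is an assumption);
  % the report is written to the current directory.
  [folder, base] = fileparts(fasta_files{i});
  out_files{i} = [base '_crump.txt'];

  if have_crump
    phospredict(fasta_files{i}, out_files{i});
  else
    fprintf('skipping %s: phospredict is not on the path\n', fasta_files{i});
  end
end
```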

If you are a programmer using CRUMp in a script, it may be 
convenient to capture the returned cell array:

	results = phospredict('myfasta.fasta', 'myoutput.txt');

The returned cell array consists of sequence structures, one
for each of the n sequences in the FASTA file. Each sequence
structure consists of three site structures: s_sites, 
t_sites and y_sites. Each site structure, in turn, consists 
of the following fields:

	Name		Type			Description
	--------------------------------------------------------
	position	cell array		Position numbers of sites
	site		cell array		Protein sequence of sites
	matrix		matrix			Kernel matrix of sites
	pred		array			Posterior probabilities
	
For example, to access the probability that the first S site 
in the second sequence of the FASTA file is phosphorylatable, 
use:

	results{2}.s_sites.pred(1)
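
The same layout can be walked in a loop, for instance to
collect every site whose probability exceeds a cutoff. The
sketch below makes several assumptions: the results variable
is a hand-built stand-in with made-up values (so it runs
without CRUMp), only the position and pred fields are filled
in, each position cell is assumed to hold a numeric scalar,
and the 0.5 threshold is a hypothetical choice, not a CRUMp
default.

```octave
% Hand-built stand-in for the cell array returned by phospredict.
% All values are made up for illustration.
seq1 = struct();
seq1.s_sites.position = {5, 12};  seq1.s_sites.pred = [0.93 0.20];
seq1.t_sites.position = {8};      seq1.t_sites.pred = [0.71];
seq1.y_sites.position = {};       seq1.y_sites.pred = [];

seq2 = struct();
seq2.s_sites.position = {3};      seq2.s_sites.pred = [0.10];
seq2.t_sites.position = {};       seq2.t_sites.pred = [];
seq2.y_sites.position = {20};     seq2.y_sites.pred = [0.64];

results = {seq1, seq2};

threshold = 0.5;                  % hypothetical cutoff
types = {'s_sites', 't_sites', 'y_sites'};
hits = zeros(0, 3);               % rows: [sequence index, position, probability]

for i = 1:numel(results)
  for j = 1:numel(types)
    sites = results{i}.(types{j});        % dynamic field access
    for k = 1:numel(sites.pred)
      if sites.pred(k) >= threshold
        hits(end+1, :) = [i, sites.position{k}, sites.pred(k)];
      end
    end
  end
end

disp(hits)
```

With real output, replace the stand-in with the cell array
returned by phospredict.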
