Name | Modified | Size | Downloads / Week |
---|---|---|---|
gen-blosum-xx-1.zip | 2014-09-15 | 114.2 kB | |
README | 2014-09-15 | 5.6 kB | |
Totals: 2 Items | | 119.8 kB | 0 |
DESCRIPTION:
Program: GENBLOSUMxx - Generation of BLOCK substitution Matrix of least Identity Score xx
Version: version 1
Author: Rifat Nawaz UL Islam, Arnab Nayek, Buddhadev Mondal, Shyamashree Banerjee, Parth Sarthi Sen Gupta and Amal Kumar Bandyopadhyay*
Department of Biotechnology, The University of Burdwan
Interpreter: AWK Programming Language
------------------------
FEATURES:
The program "GENBLOSUMxx" analyzes block fasta files of any length, any width and any name
>>>>BUT WITH THE .fasta EXTENSION, e.g. block1.fasta, block2.fasta, u_xx_polymerase.fasta, ww-kk-telomerase.fasta etc.<<<<
It can be used for a single block or for multiple blocks (whatever is present in the current directory).
note: the example files are 7 block fasta files (each of length 30 and width 30).
The program
a. finds the minimum paired identity score for each block
b. computes the bit-score in 20x20 format and the properties of each block
c. computes the average BLOSUMxx matrix, where xx is the average minimum identity score (see point a)
d. lists the block properties and the 20x20 bit-score matrix in an excel table and on the screen
------------------------
INSTRUCTION:
The program runs from any UNIX/LINUX/CYGWIN shell.
The zip folder "gen-blosum-xx-1.zip" contains 5 items (APPENDIX 1).
step-by-step: open a shell and
a. extract the files into a directory. 'ls -asl' would show all files.
b. chmod 600 * (read & write on all files); chmod 100 GENBLOSUMxx.exe (execute for prog); check by 'ls -asl'
c. chown owner:group * (eg chown parth:root *); check by 'ls -asl'
d. run the program: ./GENBLOSUMxx dummy
-------------------------
IN A SUCCESSFUL RUN THE PROG SHOWS THE FOLLOWING DETAILS ON THE SCREEN
----------------------------------------------------------------------------------------------
))))Screen output((((
Program: GENBLOSUMxx-Generation of BLOCK substitution Matrix of least Identity Score xx
Version:version 1
Author:Rifat Nawaz UL Islam, Arnab Nayek, Buddhadev Mondal, Shyamashree Banerjee, Parth Sarthi Sen Gupta and Amal Kumar Bandyopadhyay*
Department of Biotechnology, The University of Burdwan
Interpreter: AWK Prog. Lang.
report bug: akbanerjee[at]biotech[dot]buruniv[dot]ac[dot]in;
-----------------------------
))))run update((((
processing 1] block1.fasta length 30 width 30 least identity score 10...
processing 2] block2.fasta length 30 width 30 least identity score 15...
processing 3] block3.fasta length 30 width 30 least identity score 14...
processing 4] block4.fasta length 30 width 30 least identity score 19...
processing 5] block5.fasta length 30 width 30 least identity score 20...
processing 6] block6.fasta length 30 width 30 least identity score 14...
processing 7] block7.fasta length 30 width 30 least identity score 11...
......done
))))AVERAGE MATRIX with average score (((( [also in excel format in output file]
BLOSUM-49: Average Matrix of 7 BLOCKs
   C  S  T  P  A  G  N  D  E  Q  H  R  K  M  I  L  V  F  Y  W
C  4
S  0  6
T  0  0  7
P  0 -1  0  5
A  0 -2 -2 -3  6
G -3 -3 -4  0 -4  6
N  0  0  0  0 -1 -2  6
D -1 -2 -1  1 -1 -3  0  5
E  0 -2  0  0 -1 -1  0  0  6
Q  0 -1 -2  0 -3 -3 -1 -2 -1  7
H  1  0  0  0  0  0  3  0  1  1  3
R  0  0  0 -3  0 -2 -1  0 -1  1  0  8
K  0  0 -1 -1  0 -1  0  0  0  0  0  1  7
M  0  0 -2  0 -1 -1  0  0  0 -1  0  0  0  4
I  1 -2  0 -1 -3  0 -1  0  0  0  0 -1 -1  0  6
L  0 -3 -3  0 -4 -2 -1  0 -2 -1  0 -1 -3  1 -2  6
V -1 -3 -1 -4 -3 -3 -1 -3 -3 -4  0 -2 -1 -2  2 -2  5
F  0 -1  0  0 -2  0  0 -1  0  0  0  0  0  0  0  0 -1  7
Y  0 -3  0  0  0  0  0  0 -1 -1  0 -1 -1  0  0 -2  0  5  5
W  0  0  0  0  0  0  0  0  0  1  0  0 -1  0  0  0  0  1  1  1
))))BLOCK Properties (((( [In excel both absolute and normalized values]
BLOCK properties -Absolute values
BLAA  BLOCK amino acids;
HET   Hetero-pairs;
HOMO  homo-pairs;
TOT   Total frequency;
HT_mx Maximum hetero pair frequency and its type for the given block;
HT_mn Minimum hetero pair frequency and type for the given block;
Hm_MX Maximum homo pair frequency and type for the given block;
HM_MN minimum homo pair frequency and type for the given block;
BL  Seqq  WDT  BLAA  HET   HOMO   TOT    HT_mx    Hm_MX     HT_Mn  HM_MN
1   30    30   900   5270  7780   13050  607(IV)  1205(KK)  1(CQ)  1(RR)
2   30    30   900   4180  8870   13050  973(IV)  1795(GG)  1(SP)  1(SS)
3   30    30   900   3278  9772   13050  804(IV)  1823(VV)  1(GH)  1(EE)
4   30    30   900   1908  11142  13050  315(DE)  2033(AA)  1(CS)  1(AA)
5   30    30   900   2397  10653  13050  231(SA)  2428(SS)  1(TF)  1(NN)
6   30    30   900   3306  9744   13050  245(IV)  2060(DD)  1(NW)  1(SS)
7   30    30   900   2289  10761  13050  262(AV)  2278(PP)  1(CS)  1(MM)
BLOSUM49 and BLOCK properties in >>output_dummy.xls<< file
-----------------------------------------------------------------------------------------------
Appendix 1 - Files supplied as part of gen-blosum-xx-1.zip
1) GENBLOSUMxx.exe (APPLICATION)
2) BLOCK_FASTA FILES : total 7 block fasta files of length 30 and width 30 (act as input files when present in the run directory)
3) A dummy (STARTER) file; if absent from the current directory, the program autogenerates it (no worry).
4) An output of these inputs "output_dummy.xls"
5) Readme (this file)
END
report bug: akbanerjee[at]biotech[dot]buruniv[dot]ac[dot]in;
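ILLUSTRATION (not part of the supplied files):
The figures in the sample run can be cross-checked with a few lines of Python. This is only a sketch under stated assumptions, not the AWK code shipped in the zip. It assumes that the "least identity score" of a block is the smallest count of identical aligned positions over all sequence pairs, that xx is the average of those scores expressed as a percentage of the block width, and that HOMO/HET count same-residue and different-residue pairs column by column. All function names are hypothetical.

```python
from itertools import combinations


def pair_identity(seq_a, seq_b):
    """Count aligned positions where two sequences carry the same residue."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b)


def least_identity(block):
    """Smallest pairwise identity count over all sequence pairs (assumed meaning of feature a)."""
    return min(pair_identity(a, b) for a, b in combinations(block, 2))


def pair_counts(block):
    """Return (homo, hetero) residue-pair counts, taken column by column (assumed meaning of HOMO/HET)."""
    homo = hetero = 0
    for column in zip(*block):
        for a, b in combinations(column, 2):
            if a == b:
                homo += 1
            else:
                hetero += 1
    return homo, hetero


def total_pairs(n_sequences, width):
    """Number of residue pairs in a block: one per sequence pair per column."""
    return width * n_sequences * (n_sequences - 1) // 2


def average_percent_identity(scores, width):
    """Average of the per-block least identity scores as a rounded percentage of the width (assumed xx)."""
    return round(100 * sum(scores) / (len(scores) * width))


if __name__ == "__main__":
    toy_block = ["ACDA", "ACDE", "GCDE"]  # made-up 3 x 4 block, not one of the example files
    print(least_identity(toy_block))      # 2
    print(pair_counts(toy_block))         # (8, 4)
    print(total_pairs(30, 30))            # 13050
    print(average_percent_identity([10, 15, 14, 19, 20, 14, 11], 30))  # 49
```

Under these assumptions the numbers agree with the sample run: 30 sequences of width 30 give 13050 pairs, which matches the TOT column, and the seven least identity scores average to 49 percent of the width, which matches the BLOSUM-49 label.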