| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| Java Sampling Program | 2009-12-19 | | |
| R Analysis Program | 2009-12-19 | | |
| README | 2009-12-19 | 1.1 kB | |
| misacylation_and_protein_structure.pdf | 2009-12-19 | 114.9 kB | |
| ProgressReport.pdf | 2009-11-23 | 60.7 kB | |
| Totals: 5 Items | | 176.8 kB | 0 |
Please refer to the individual READMEs in each zip file.

The Java sampling program can be run on multiple machines in parallel using a simple bash script. The script I used to generate my large dataset is shown below. It repeatedly calls the JAR file until more than a set number of PDB files (30 here) exist in the misacylation directory. The script can be run on multiple machines, but I suggest not running many processes on the same machine, because hitting NCBI too often may cause many IOExceptions.

```bash
#!/bin/bash

MISACYLATION_DIR=/home/rap/priv/misacylation
E_XCD=86          # Exit code: can't change directory
NUMBER_OF_PDB=30  # Keep sampling until more than this many PDB files exist
COUNT=0

# Go to the project directory and run the pipeline
cd "$MISACYLATION_DIR"

if [ "$(pwd)" != "$MISACYLATION_DIR" ]
then
  echo "Can't change to $MISACYLATION_DIR."
  exit $E_XCD
fi  # Double-check that we are in the right directory

COUNT=$(find . -name '*.pdb' | wc -l)

while [ "$COUNT" -le "$NUMBER_OF_PDB" ]
do
  find . -name '*.pdb' | wc -l  # Print the current PDB count as progress
  java -Xms1g -Xmx1g -jar getPDBs.jar >> output
  COUNT=$(find . -name '*.pdb' | wc -l)
done

exit 0
```
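The counting loop can also be factored into reusable functions with a pause between runs, which may help space out requests to NCBI. This is only a sketch and not part of the distributed programs: the function names `count_pdbs` and `run_until_enough` and the pause parameter are assumptions introduced here for illustration.

```bash
#!/bin/bash

# Count the *.pdb files under a directory (recursively).
# tr strips the padding that some wc implementations add.
count_pdbs() {
  find "$1" -type f -name '*.pdb' | wc -l | tr -d ' '
}

# Re-run a command until more than TARGET *.pdb files exist under DIR,
# sleeping PAUSE seconds after each run.
# Usage: run_until_enough DIR TARGET PAUSE COMMAND [ARGS...]
run_until_enough() {
  local dir=$1 target=$2 pause=$3
  shift 3
  while [ "$(count_pdbs "$dir")" -le "$target" ]; do
    # A failed run is reported on stderr and the loop simply tries again.
    "$@" || echo "run failed; retrying" >&2
    sleep "$pause"
  done
}
```

With these definitions, the original behavior with an assumed 60-second pause would look like `cd "$MISACYLATION_DIR" && run_until_enough . 30 60 java -Xms1g -Xmx1g -jar getPDBs.jar >> output`. As in the original script, the loop never ends if the command never produces new PDB files.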