Confused in Translation Files

Brought to you by: derek_aguiar

Name	Modified	Size	Downloads / Week
Java Sampling Program	2009-12-19		0
R Analysis Program	2009-12-19		0
README	2009-12-19	1.1 kB	0
misacylation_and_protein_structure.pdf	2009-12-19	114.9 kB	0
ProgressReport.pdf	2009-11-23	60.7 kB	0
Totals: 5 Items		176.8 kB	0

Please refer to the individual READMEs in each zip file.

I will note that the Java sampling program can be run on multiple machines in parallel using a simple bash script.
The script I used to generate my large dataset is given below. It repeatedly calls the JAR file until
the misacylation directory holds more than NUMBER_OF_PDB PDB files. It can be run on multiple machines, but I suggest
not running many processes on the same machine, because hitting NCBI too often may cause many IOExceptions.

#!/bin/bash

MISACYLATION_DIR=/home/rap/priv/misacylation
E_XCD=86          # Exit code: can't change directory
NUMBER_OF_PDB=30  # Keep sampling until more than this many PDB files exist

# Go to the project directory; stop if that fails.
if ! cd "$MISACYLATION_DIR"
then
  echo "Can't change to $MISACYLATION_DIR." >&2
  exit $E_XCD
fi

# Count the PDB files currently under the project directory.
count_pdb() {
  find . -name '*.pdb' | wc -l
}

COUNT=$(count_pdb)

while [ "$COUNT" -le "$NUMBER_OF_PDB" ]
do
  echo "$COUNT"   # Progress: PDB files found so far
  # JVM options go before -jar; program output is appended to ./output
  java -Xms1g -Xmx1g -jar getPDBs.jar >> output
  COUNT=$(count_pdb)
done

exit 0
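
The same loop can also be written as a reusable function, which makes it easier to add a pause between runs so that NCBI is not hit too often. This is only a sketch: the names run_until_count and count_pdb, the argument order, and the delay are assumptions of mine and are not part of the original script. Unlike the script above, it stops as soon as the directory holds at least TARGET files.

```shell
#!/bin/bash
# Sketch only: run_until_count and its arguments are hypothetical,
# not part of the original README.

# Count the *.pdb files under directory $1.
count_pdb() {
  find "$1" -name '*.pdb' | wc -l
}

# run_until_count DIR TARGET DELAY CMD [ARGS...]
# Re-run CMD until DIR holds at least TARGET *.pdb files,
# sleeping DELAY seconds after each run.
run_until_count() {
  local dir="$1" target="$2" delay="$3"
  shift 3
  while [ "$(count_pdb "$dir")" -lt "$target" ]
  do
    # A failed run is reported on stderr and simply retried.
    "$@" || echo "command failed (exit $?); retrying" >&2
    sleep "$delay"
  done
}
```

From inside the misacylation directory, a call such as `run_until_count . 30 60 java -Xms1g -Xmx1g -jar getPDBs.jar >> output` would keep sampling until 30 PDB files exist, waiting 60 seconds between runs. The 60-second pause is an arbitrary example value, not a documented NCBI limit.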

Source: README, updated 2009-12-19
