> *[svtoolkit-help] SVtoolkit question
> <http://sourceforge.net/mailarchive/message.php?msg_id=27148621>*
> From: Michael Parfenov <parfenov@ge...> - 2011-03-03 20:07
> Hi,
>
> Thanks for the SV toolkit!
> Could you tell me, please, how to run the ComputeGenomeMask utility? I see
> required arguments in the wiki page, but how to invoke the utility?
>
> best regards,
> Michael.
>
Hi, Michael,
ComputeGenomeMask is a regular java command line program and is in
SVToolkit.jar,
so basically you just need to do "java -cp
SVToolkit.jar:GenomeAnalysisTK.jar
org.broadinstitute.sv.apps.ComputeGenomeMask".
But there are a couple of wrinkles with ComputeGenomeMask in particular.
1. You need to index the reference for bwa alignment.
2. ComputeGenomeMask uses libbwa.so, so you need that on your library path.
3. To do any reasonably large genome, you need to run it in parallel and
then merge.
Here is a shell script I used to do a 75bp mask for hg19 that you can
use as an example.
It's designed to use the local lsf environment at the Broad and knows
where to find the reference fasta locally, etc.
It also assumes the reference has already been indexed with "samtools
faidx".
The script just runs the parallel jobs, it doesn't merge the results
back together
(you will need to cat together the resulting fasta files in the same
order as the reference).
At some point I will likely write a queue script to automate this, but
it's not a high priority right now because I'm also
working on some changes to the mask file format.
Hope this helps,
-Bob
#!/bin/bash
outdir=Homo_sapiens_assembly19/75
readLength=75
reference=/seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.fasta
export SV_DIR=/humgen/cnp04/bobh/sv/stable
# These executables must be on your path.
which java > /dev/null || exit 1
which bwa > /dev/null || exit 1
export LD_LIBRARY_PATH=${SV_DIR}/bwa/lib:${LD_LIBRARY_PATH}
classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar"
mkdir -p ${outdir}/work
localReference=${outdir}/work/`echo ${reference} | awk -F / '{ print $NF }'`
if [ ! -e ${localReference} ]; then
(cd ${outdir}/work && ln -s ${reference} .) || exit 1
(cd ${outdir}/work && ln -s ${reference}.fai .) || exit 1
fi
bwa index -a bwtsw ${localReference} || exit 1
chroms=`cat ${localReference}.fai | cut -f 1`
for chr in ${chroms}; do
bsub -o ${outdir}/work/svmask_${chr}.log \
-R "rusage[mem=5000]" \
java -cp ${classpath} -Xmx4g \
org.broadinstitute.sv.apps.ComputeGenomeMask \
-R ${localReference} \
-O ${outdir}/work/svmask_${chr}.fasta \
-readLength ${readLength} \
-sequence ${chr} \
|| exit 1
done
|