Tuque Wiki

Tools for mapping RNA-Seq reads to eukaryotic genomes

Brought to you by: ian-d-reid

UseTuque

Running the programs

tuqueSplice

tuqueSplice will find the splice junctions in a set of RNA-Seq reads and map the reads to a genome.

Command line: tuqueSplice -g Genome_sequence_filename [-j Predicted.juncs] [-m Memory_limit] [-o Output_directory] [-n Number_of_input_files] [-c Config_file_path] Reads_filename
Parameters:
-c Path to configuration file [tuque/tuque.cfg]
-g The name of a fasta file containing the genome sequence to be mapped against. Required.
-j Name of a file containing predicted splice junctions in .juncs format [None]
-m The maximum amount of RAM to use, in bytes [10000000000]
-o Path to a directory where output should be written [Genome/RNA-Seq]. It will be created if it does not exist.
-n The number of input reads files [1]; if n > 1 these files should be named Reads_filename.1 through Reads_filename.n

Only the genome filename and the reads filename need to be specified; all other parameters are optional and fall back to the default values shown in brackets.
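As an illustration of how these options combine, the Python sketch below assembles a tuqueSplice argument list and lists the reads files that -n implies. The helper names (`reads_files`, `tuque_splice_argv`) and the file names are hypothetical; only the flags come from the parameter list above. It assumes that with a single input file the reads filename is used exactly as given. The sketch only builds the command and does not run it.

```python
def reads_files(reads_filename, n=1):
    """Return the reads files implied by the -n option.

    Assumption: with n == 1 the name is used as given; with n > 1 the
    files are Reads_filename.1 through Reads_filename.n.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return [reads_filename]
    return [f"{reads_filename}.{i}" for i in range(1, n + 1)]


def tuque_splice_argv(genome, reads_filename, n=1, memory_bytes=None,
                      output_dir=None, juncs=None, config=None):
    """Assemble an argument list; options left as None keep their defaults."""
    argv = ["tuqueSplice", "-g", genome]
    if juncs is not None:
        argv += ["-j", juncs]
    if memory_bytes is not None:
        argv += ["-m", str(memory_bytes)]  # -m is expressed in bytes
    if output_dir is not None:
        argv += ["-o", output_dir]
    if n != 1:
        argv += ["-n", str(n)]
    if config is not None:
        argv += ["-c", config]
    argv.append(reads_filename)  # the reads filename always comes last
    return argv


# Hypothetical file names, for illustration only.
argv = tuque_splice_argv("genome.fa", "sample_reads", n=3,
                         memory_bytes=4 * 10**9)
print(" ".join(argv))
print(reads_files("sample_reads", n=3))
```

Passing the resulting list to `subprocess.run(argv, check=True)` would launch the run, assuming tuqueSplice is on the PATH.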

Outputs:
Filtered splice junction locations in classified.juncs
Read mappings in tuque_hits.bam
Coverage depths in tuque.coverage.wig.gz
tuqueSplice.log contains a complete record of the run and bowtie.log contains read mapping statistics.

tuqueSplice_ps

tuqueSplice_ps is a variant of tuqueSplice that is more efficient in genomes with numerous small scaffolds or contigs.

Command line: tuqueSplice_ps -g Genome_sequence_filename [-j Predicted.juncs] [-m Memory_limit] [-o Output_directory] [-n Number_of_input_files] [-c Config_file_path] Reads_filename

Outputs: Same as tuqueSplice

tuqueIndex

tuqueIndex will prepare a Bowtie index that includes spliced sequences and is suitable for use with tuqueMap.

Command line: tuqueIndex Genome_sequence_filename Read_length Splice_junctions.juncs Index_name_stem
Outputs: A set of Bowtie index files

tuqueMap

tuqueMap will map a set of RNA-Seq reads to a genome using known splice junctions.

Command line: tuqueMap [Options] bowtie_index Output_directory Reads_filename
Options:
-e The maximum sum of Phred quality values at all mismatched read positions [70]
-h The maximum number of hits allowed for one read [25]
-m The maximum amount of RAM to use, in bytes [10000000000]
-n The maximum number of mismatches allowed in the first 28 bases of the read [3]
-p The maximum number of processes to use [1]
-c Path to configuration file [tuque/tuque.cfg]
-a Path to a .bam file containing already-mapped reads [None]
-u Keep uncompressed intermediate files
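To make the -e and -n limits concrete, the following Python sketch checks one ungapped read alignment against both thresholds. It is an illustration only, not tuqueMap's implementation: the function names are hypothetical, qualities are assumed to be Phred+33 encoded, and any rounding of quality values that the underlying aligner may apply is not modelled.

```python
def mismatch_quality_sum(read, reference, qualities, offset=33):
    """Sum the Phred qualities at every position where read and reference differ."""
    if not (len(read) == len(reference) == len(qualities)):
        raise ValueError("read, reference and qualities must have equal length")
    return sum(ord(q) - offset
               for r, g, q in zip(read, reference, qualities)
               if r != g)


def seed_mismatches(read, reference, seed_length=28):
    """Count mismatches within the first seed_length bases of the read."""
    return sum(1 for r, g in zip(read[:seed_length], reference[:seed_length])
               if r != g)


def passes_limits(read, reference, qualities,
                  max_quality_sum=70, max_seed_mismatches=3):
    """Apply the -e style and -n style limits with the documented defaults."""
    return (mismatch_quality_sum(read, reference, qualities) <= max_quality_sum
            and seed_mismatches(read, reference) <= max_seed_mismatches)


# Two mismatches (positions 4 and 8) in a made-up 8-base read.
read = "ACGTACGT"
reference = "ACGAACGA"

# 'I' encodes Phred 40, so the mismatch sum is 80 and exceeds the limit of 70.
print(passes_limits(read, reference, "IIIIIIII"))

# '5' encodes Phred 20, so the mismatch sum is 40 and stays within the limit.
print(passes_limits(read, reference, "III5III5"))
```

The example shows why two high-quality mismatches can be rejected while the same two mismatches at low-quality positions are tolerated.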

Outputs:
Read mappings in tuque_hits.bam
Coverage depths in tuque.coverage.wig.gz
bowtie.log contains read mapping statistics.

tuqueCount

tuqueCount will count the reads mapping inside annotated sequence features in a set of BAM files.

Command line: tuqueCount -g Annotation_file [-o Output_filename] [-c Config_file_path] Mapped_reads_filenames
Options:
-c Path to configuration file [tuque/tuque.cfg]
-g The name of a GFF3 file containing the gene features to be counted. Required.
-o Path to a file where output should be written [counts.txt].

Output: A tab-delimited plain-text file in two sections. The first section gives read counts and the second gives FPKM values. Each section consists of a header line followed by one line per sequence feature; each of those lines holds the feature ID, the feature length, and then one data value per input BAM file.
The file can be imported by most spreadsheet programs.
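For readers comparing the two sections, the sketch below shows how a read count relates to an FPKM value under the standard definition (fragments per kilobase of feature per million mapped reads). This is an assumption about the normalization: the page does not state which read total tuqueCount uses as the denominator, and the function name and numbers are illustrative only.

```python
def fpkm(read_count, feature_length_bp, total_mapped_reads):
    """Standard FPKM: count scaled per kilobase of feature and per million reads."""
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total reads must be positive")
    # 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling.
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Made-up numbers: 500 reads on a 2,000 bp feature, 10 million mapped reads.
value = fpkm(500, 2000, 10_000_000)
print(value)
```

With these numbers the feature carries 250 reads per kilobase, which scales to 25 per kilobase per million mapped reads.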
