# Introduction

MaxInfo is a software program for simultaneous isoform discovery and abundance estimation from RNA-Seq data. It can be switched between an annotation-free mode and an annotation-dependent mode.

# Instructions

## Prerequisites
The core algorithms are implemented in C++. MaxInfo relies on two additional software programs for raw data pre-processing.
To align the sequencing reads to the genome sequences, the junction-sensitive alignment tool Tophat is required. The instructions for installing Tophat can be found here. SAMTools is then recommended for converting BAM files into TXT format.
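
Before going further, it can help to confirm that both prerequisites are reachable from the shell. The following is a minimal sketch, assuming the programs are installed under their usual executable names `tophat` and `samtools`; the helper name `missing_tools` is hypothetical and not part of MaxInfo.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every listed program
# that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Assumed executable names; adjust if your installation differs.
missing=$(missing_tools tophat samtools)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "Tophat and SAMTools are both available."
fi
```

If either program is reported as missing, install it or add its directory to PATH before running the alignment stage.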

## Installation

The MaxInfo source code has been compiled into an executable file, located in the bin directory. The executable can be used directly; no additional installation steps are required.

# Usage
Some preparation is needed before running MaxInfo for isoform discovery and abundance estimation.

# Stages:

1. Download the genome sequences. They can be downloaded from public databases such as the NCBI website or the UCSC website. If you want to conduct isoform discovery and abundance estimation with genome annotations, you also need to download the genome annotations.

2. Prepare the sequencing read files. Use SAMTools for any format conversion needed to get the reads ready for the read alignment stage.

3. Use a junction-sensitive alignment tool to align the sequencing reads to the genome sequences. Tophat is recommended here. The alignment results are generally in BAM format. Keep the junction identification results, which are usually written to junctions.bed. Use SAMTools to convert the .bam file to a file in TXT format (for example, samtools view accepted_hits.bam > read_data.txt).

4. Use MaxInfo for isoform discovery and abundance estimation. Change into the directory that contains the executable file and run one of the commands below, depending on the mode you need. The isoform identification and quantification results are written to the file results.gtf in the output path.
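
Stages 3 and 4 can be strung together in a short script. The sketch below is a dry run that only prints the commands it would execute. The names `genome_index`, `reads.fastq`, and `tophat_out` are hypothetical placeholders, and `accepted_hits.bam` is the alignment file name that recent Tophat releases write by default. The `-o` options for Tophat (output directory) and SAMTools (output file) should be checked against the installed versions.

```shell
#!/bin/sh
# Dry-run sketch of stages 3 and 4 in annotation-free mode.
# genome_index, reads.fastq and tophat_out are hypothetical placeholders.

print_pipeline() {
    index="$1"      # Bowtie index prefix built from the genome sequences
    reads="$2"      # sequencing reads (FASTQ)
    outdir="$3"     # directory for Tophat output and MaxInfo results

    # Stage 3: junction-sensitive alignment, then BAM to TXT conversion.
    echo "tophat -o $outdir $index $reads"
    echo "samtools view -o $outdir/read_data.txt $outdir/accepted_hits.bam"

    # Stage 4: isoform discovery and abundance estimation.
    echo "MaxInfo -i $outdir/junctions.bed $outdir/read_data.txt $outdir/"
}

print_pipeline genome_index reads.fastq tophat_out
```

Writing with `samtools view -o` is used here in place of redirecting standard output; both produce the same text file. Once the printed commands look right, they can be run one at a time.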

# Commands:

MaxInfo -i <junctions files> <read alignment files> <output path>
This command is used for isoform discovery and abundance estimation without genome annotations.

## Example:

Suppose that the sequencing read data and the auxiliary data (junction identification results or genome annotations) are kept in the current directory, and that the output path is also the current directory. The command is as follows:
MaxInfo -i ./junctions.bed ./read_data.txt ./

MaxInfo -d <genome annotations> <junctions files> <read alignment files> <output path>
This command is used for isoform discovery and abundance estimation with genome annotations. In this mode, novel isoforms beyond the known annotations are also discovered.

## Example:
MaxInfo -d ./genes.gtf ./junctions.bed ./read_data.txt ./

MaxInfo -t <genome annotations> <read alignment files> <output path>
This command is used for isoform identification and abundance estimation with genome annotations. In this mode, MaxInfo does not discover novel isoforms beyond the known isoforms. It identifies isoforms and estimates their abundances only within the provided genome annotations.

## Example:
MaxInfo -t ./genes.gtf ./read_data.txt ./
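
Because the three modes take different numbers of arguments, a small wrapper can catch a mismatched call before MaxInfo is started. This is a sketch only: the function `maxinfo_command` is hypothetical, the argument counts are taken from the examples above, and the function prints the command line instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: checks the argument count for each MaxInfo mode
# and prints the command line that would be run.

maxinfo_command() {
    mode="$1"
    shift
    case "$mode" in
        -i) expected=3 ;;   # junctions, alignments, output path
        -d) expected=4 ;;   # annotations, junctions, alignments, output path
        -t) expected=3 ;;   # annotations, alignments, output path
        *)  echo "unknown mode: $mode" >&2; return 2 ;;
    esac
    if [ "$#" -ne "$expected" ]; then
        echo "mode $mode expects $expected arguments, got $#" >&2
        return 1
    fi
    echo "MaxInfo $mode $*"
}

maxinfo_command -d ./genes.gtf ./junctions.bed ./read_data.txt ./
```

A call with the wrong number of arguments, such as `maxinfo_command -d ./genes.gtf ./read_data.txt ./`, returns a non-zero status and reports the expected count on standard error.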

# Compile the MaxInfo software:

1. Run "cmake ." to generate the makefile.
2. Run "make" to compile the source code.
3. The compiled binary is placed in the bin directory.
Source: readme, updated 2016-11-04