HTQC Code

Quality control and filtration for Illumina sequencing data

Status: Beta

Brought to you by: jiandingzhe

File	Date	Author	Commit
src	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
t	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
.gitignore	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
CMakeLists.txt	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
Changes.txt	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
FindHTIO1.cmake.in	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
INSTALL	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
README	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
README_ht-asm	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
README_ht-demul	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
README_ht-filter	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
README_ht-primer-trim	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
README_ht-rename	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
README_ht-sample	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
README_ht-stat	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
README_ht-trim	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball
README_htio	2015-06-27	Xi Yang	[3b5f34] initial commit to repo via 0.90.8 source tarball

Read Me

===========================================================
HTQC - a high-throughput sequencing quality control toolkit
===========================================================

------------
Introduction
------------

This is a read quality control toolkit for high-throughput sequencing. It
contains a program for quality statistics, and several programs for quality
filtration.

Currently, only the Illumina sequencing platform is supported.

-------------------
System requirements
-------------------

- Boost and zlib are required to build and run the programs.
- Perl and Gnuplot are required to run "ht-stat-draw.pl", which renders the
  output tables of "ht-stat" to charts.

If you build HTQC from source:
- CMake is used for cross-platform build configuration.

If your system doesn't have this software installed, please use your OS's
package management system (yum for Fedora, apt-get for Debian, etc.), or visit
the official websites:

  http://www.freedesktop.org/wiki/Software/pkg-config
  http://www.cmake.org

-------
Install
-------

See the "INSTALL" document.

----------------
List of Programs
----------------

- ht-demul        : separate reads into individual files by barcode sequence.

- ht-filter       : filter reads by quality / length / tile ID.

- ht-asm          : concatenate paired-end reads into single sequences.

- ht-primer-trim  : remove primer sequences from reads.

- ht-rename       : give sequences short names using an auto-incremented number
                    and a user-specified prefix and suffix.

- ht-sample       : randomly pick some sequences.

- ht-stat         : generate a read quality statistics report.

- ht-stat-draw.pl : draw charts from ht-stat output.

- ht-trim         : trim reads from start and/or end by quality.

For detailed descriptions, see the individual README_XXX file for each program.
Running a program with "-h" or "--help" shows its command-line options.

-------------
Typical usage
-------------

First, check whether the sequencing reads are good:
  $ ht-stat -P -i reads_R1_* reads_R2_* -o report_dir
  $ ht-stat-draw.pl --dir report_dir

Suppose the report shows that tiles 5 and 14 are bad. Remove reads from these tiles:
  $ ht-filter -P -i reads_R1_* reads_R2_* --filter tile --reject-tiles 5,14 -o tile_removal

Trim low-quality ends:
  $ ht-trim -i tile_removal_1.fastq -o trim_1.fastq
  $ ht-trim -i tile_removal_2.fastq -o trim_2.fastq

Remove reads that are too short:
  $ ht-filter --filter length -i trim_1.fastq trim_2.fastq -o long

Optionally, concatenate paired-end reads into longer sequences:
  $ ht-join -i trim_1.fastq trim_2.fastq -o joined.fastq -u unjoined
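
To make the tile-removal step above more concrete, here is a minimal Python
sketch of how a tile number can be read from an Illumina FASTQ header. It is
not taken from HTQC's source and may differ from what "ht-filter" does
internally. The function name "tile_of" is hypothetical, and the sketch assumes
one of two common header layouts: "instrument:run:flowcell:lane:tile:x:y" or
the older "instrument:lane:tile:x:y#index/end".

```python
def tile_of(header: str) -> int:
    """Return the tile number encoded in an Illumina FASTQ header line."""
    name = header.lstrip("@").split()[0]  # drop '@' and any comment after whitespace
    name = name.split("#")[0]             # drop the '#index/end' suffix of older headers
    fields = name.split(":")
    if len(fields) == 7:                  # instrument:run:flowcell:lane:tile:x:y
        return int(fields[4])
    if len(fields) == 5:                  # instrument:lane:tile:x:y
        return int(fields[2])
    raise ValueError(f"unrecognized Illumina header: {header!r}")


def keep_read(header: str, reject_tiles: set) -> bool:
    """True if the read does not come from one of the rejected tiles."""
    return tile_of(header) not in reject_tiles
```

With the tiles from the example above, keep_read(header, {5, 14}) would be
False for any read whose header names tile 5 or tile 14.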

------------------------
Single-end or paired-end
------------------------

Some programs handle single-end and paired-end reads differently. For those
programs, output files are specified by a prefix, and multiple files will be
generated. For "ht-filter", when one end of a pair is rejected but the other
end is accepted, the accepted read is stored in "PREFIX_s.fastq".
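
The routing rule described above can be sketched in a few lines of Python.
This is an illustration only, not HTQC's implementation: the names "Read" and
"split_pairs" and the "accept" callback are hypothetical, and reads are
modelled as simple (name, sequence, quality) tuples.

```python
from typing import Callable, Iterable, List, Tuple

Read = Tuple[str, str, str]  # (name, sequence, quality)


def split_pairs(
    pairs: Iterable[Tuple[Read, Read]],
    accept: Callable[[Read], bool],
) -> Tuple[List[Read], List[Read], List[Read]]:
    """Route each read pair by whether its two ends pass the filter.

    Returns (end-1 reads of intact pairs, end-2 reads of intact pairs,
    accepted reads whose mate was rejected).
    """
    paired_1, paired_2, single = [], [], []
    for r1, r2 in pairs:
        ok1, ok2 = accept(r1), accept(r2)
        if ok1 and ok2:
            # Both ends pass: the pair stays together.
            paired_1.append(r1)
            paired_2.append(r2)
        elif ok1:
            # Only end 1 passes: it becomes a single read.
            single.append(r1)
        elif ok2:
            # Only end 2 passes: it becomes a single read.
            single.append(r2)
        # If neither end passes, the pair is dropped entirely.
    return paired_1, paired_2, single
```

In this sketch the third list corresponds to what "ht-filter" writes to
"PREFIX_s.fastq", while the first two lists correspond to the per-end output
files that keep the pairs in matching order.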

Programs like "ht-trim" don't distinguish between paired-end and single-end
mode. They accept only one input file and one output file. For paired-end
reads, run them twice, once for the file of each end.

---------
Reference
---------

We would really appreciate it if you cite our article:

Yang X, Liu D, Liu F, Wu J, Zou J, Xiao X, Zhao F, Zhu B.
HTQC: a fast quality control toolkit for Illumina sequencing data.
BMC Bioinformatics. 2013 Jan 31;14:33

-------
Contact
-------

If you have any questions or find any bugs, please email me:
  yangx@im.ac.cn
