Print this usage summary.
The sequences to be processed for insertions.
A tab delimited file where each line contains the barcode and name of a library. Additionally, columns after the second column will be treated as metadata tags to be associated with the library.
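As an illustration of the layout described above, the following is a minimal Python sketch that reads such a file into a dictionary. It is not part of TAPDANCE: the function name read_library_map, the sample barcodes, the library names and the tags are all hypothetical, and the sketch assumes there is no header line.

```python
import csv
import io

def read_library_map(handle):
    """Map each barcode to its library name and any trailing metadata tags."""
    libraries = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        barcode, name, *tags = row
        # Columns after the second are kept as metadata tags; empty cells are dropped.
        libraries[barcode] = {"name": name, "tags": [t for t in tags if t]}
    return libraries

# Hypothetical sample content: barcode, library name, then optional tags.
sample = "ACGTACGT\tlib_A\tliver\tmale\nTTGGCCAA\tlib_B\n"
library_map = read_library_map(io.StringIO(sample))
```

Here lib_A carries two metadata tags and lib_B carries none.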
OPTIONAL. If a custom tapdance_base_config.txt is needed for a special case, use this parameter to specify it. This can be useful when, for example, distinct groups of users share the same TAPDANCE installation but work with separate mutagens.
A configuration file may be used rather than specifying options on the command line. Any options specified in the base config file will be overridden by values specified in this config file.
Use this option if the database configuration needs to be kept separate from other configuration information. This is most useful in Galaxy where the end user should not have the database user credentials exposed to them.
The name of the bowtie index to use for aligning individual sequences. This is only used during the first phase of TAPDANCE. It is important to note that the index name is not a single file. For instance, the mm9 index has several files named mm9.[0-9].ebwt and mm9.rev.[0-9].ebwt. The correct value for this parameter, however, would be /my/path/to/indexes/mm9
The sequence to match when determining whether the mutagen of interest is present. Sequences that do not match will not be used in the analysis, while those that do will have the mutagen trimmed prior to alignment. If, for instance, the mutagen for a particular project has a sequence of ACTG but the user also wants to remove up to two bases following the mutagen sequence, the wildcard character '_' can be used to specify a mutagen sequence of ACTG__.
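To make the distinction between an index name and an index file concrete, here is a small Python sketch that lists the .ebwt files sharing a given prefix. The helper name bowtie_index_files is hypothetical, and the demonstration uses empty placeholder files in a temporary directory rather than a real index.

```python
import glob
import os
import tempfile

def bowtie_index_files(index_prefix):
    """List the .ebwt files that share the given index prefix."""
    return sorted(glob.glob(glob.escape(index_prefix) + ".*.ebwt"))

# Demonstration with empty placeholder files (not a real bowtie index).
with tempfile.TemporaryDirectory() as index_dir:
    for name in ["mm9.1.ebwt", "mm9.2.ebwt", "mm9.rev.1.ebwt"]:
        open(os.path.join(index_dir, name), "w").close()

    prefix = os.path.join(index_dir, "mm9")
    # Passing the prefix finds every file belonging to the index.
    found = [os.path.basename(p) for p in bowtie_index_files(prefix)]
    # Passing a single file name instead of the prefix finds nothing.
    found_for_file = bowtie_index_files(prefix + ".1.ebwt")
```

A check like this can catch the common mistake of passing one of the .ebwt files instead of the prefix.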
Any number of mutagen sequences may be specified by entering multiple -mutagen entries on the command line. E.G. perl tapdance_runner.pl -mutagen ACGT -mutagen TGCA. This is useful when a mutagen has more than one common captured sequence in the data.
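The matching and trimming behaviour described for the mutagen option can be sketched in Python as follows. This is an illustration only, not TAPDANCE's own implementation: the function names are hypothetical, and the sketch assumes that each '_' stands for exactly one base, that the mutagen may occur anywhere in the read, and that everything up to the end of the match is trimmed.

```python
import re

def mutagen_to_regex(mutagen):
    """Translate a mutagen string into a regex, treating '_' as any single base."""
    return re.compile("".join("." if ch == "_" else re.escape(ch) for ch in mutagen))

def trim_mutagen(read, mutagens):
    """Return the read with the first matching mutagen trimmed, or None if none match.

    Assumption: everything up to and including the match is removed.
    """
    for mutagen in mutagens:
        match = mutagen_to_regex(mutagen).search(read)
        if match:
            return read[match.end():]
    return None  # read would be excluded from the analysis

trimmed_wildcard = trim_mutagen("GGACTGTTCCCC", ["ACTG__"])
trimmed_second = trim_mutagen("AATGCAGG", ["ACGT", "TGCA"])
unmatched = trim_mutagen("GGGG", ["ACTG"])
```

In the first call the two bases after ACTG are consumed by the wildcards; in the second, the read matches only the second of the two mutagen sequences.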
A name for the project, up to 255 chars.
For some projects it can be useful to remove the chromosome of the donor concatamer from the calculations in order to remove the effects of local hopping. The chromosomes can be specified as a comma delimited list and must match the names used in the reference genome. E.G. -omittedChromosomes chr1 -omittedChromosomes chr2
The location where execution will be performed.
DEFAULT:'./'
OPTIONAL. To specify metadata on libraries outside of the barcode to library mapping file, this parameter may be used. The file should contain the name of the library in one column and the metadata tag to affiliate with it in the second. Each library may have as many entries as needed.
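The two-column metadata file described above can be read with a short Python sketch such as the one below. The delimiter is not stated in this text, so a tab is assumed here; the function name read_library_metadata and the sample library names and tags are hypothetical.

```python
from collections import defaultdict
import io

def read_library_metadata(handle):
    """Collect every metadata tag listed for each library name."""
    tags = defaultdict(list)
    for line in handle:
        fields = line.rstrip("\n").split("\t")  # assumption: tab delimited
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or incomplete lines
        # A library may appear on several lines, one tag per line.
        tags[fields[0]].append(fields[1])
    return dict(tags)

# Hypothetical sample content: library name, then one metadata tag.
sample_metadata = "lib_A\tliver\nlib_A\tmale\nlib_B\tspleen\n"
metadata = read_library_metadata(io.StringIO(sample_metadata))
```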
lib_pct library_percent
CIS_tot_p CIS_total_pvalue
CIS_lib_p CIS_library_pvalue
CIS_reg_p CIS_region_pvalue
coCIS_thresh cocis_threshold
merge merge
Specify projects to be merged as the new project specified with -project_name. E.G. -merge my_first_project -merge my_second_project -project_name my_merged_project.
Specify the BED file to annotate CISes with. The default feature set is UCSC's mm9 refSeq genes.
To generate a list of inserts only, specify no_cis. This is useful in cases where a new set of data needs to be merged with a previous set of data. Use this option as a first step to prepare the new data. Use -merge to combine the resulting projects and call CISes on the new project.
OPTIONAL. If not specified, TAPDANCE will attempt to identify the input file type on its own. Valid options are 'tab', 'fasta' and 'fastq'.
OPTIONAL.