NOBEL89

Run SOAPfuse

SOAPfuse runs according to the settings in its config file, so prepare that file before starting the pipeline.

  1. Check the config file


    $ cd /PATH_WHERE_YOU_PUT_THE_PACKAGE/SOAPfuse-vX.X/config/
    $ less -S config.txt

    Note:

    • All lines prefixed by '#' are comments.
    • Each parameter name is separated from its value by '='; modify only the value after the '='.
    • Some values can be set to 'yes' or 'no', and some can be left at their defaults.
    • Check the prefix of each parameter.
      There are five kinds of prefix: 'DB', 'PG', 'PS', 'PD' and 'PA'.
        a. 'DB' means the info of DataBase.
        b. 'PG' means the info of ProGrams.
        c. 'PS' means the info of Pipeline Steps.
        d. 'PD' means the info of Pipeline Directories.
        e. 'PA' means the info of PArameters.
      # The 'DB', 'PG', 'PS' and 'PD' types are related to the database, so SOAPfuse can run successfully once
        these parameters are set accurately. The 'PA' type covers the parameters of each step; these
        already have default values, so you can ignore them on your first try.
        However, 'PA_all_fq_postfix', which defines the postfix of the RNA-Seq data files, must be set
        according to your RNA-Seq files before running.
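
    The notes above can be turned into a quick check of which settings are active for a given prefix. The following is a minimal sketch, not part of SOAPfuse: 'list_params' is a hypothetical helper name, and it assumes the config follows the 'name = value' layout with '#' comments described above.

```shell
# List active (non-comment) settings of one prefix from a SOAPfuse-style config.
# Hypothetical helper, not shipped with SOAPfuse.
# Usage: list_params <prefix> <config_file>    e.g. list_params PA config.txt
list_params() (
    prefix=$1
    config=$2
    # Drop comment lines, then keep only "name = value" lines whose name
    # starts with "<prefix>_" and print them as "name=value".
    grep -v '^[[:space:]]*#' "$config" |
        awk -F'=' -v p="${prefix}_" '
            index($0, "=") == 0 { next }          # skip lines without "="
            {
                name = $1
                sub(/^[ \t]+/, "", name); sub(/[ \t]+$/, "", name)
                if (index(name, p) != 1) next     # wrong prefix
                value = substr($0, index($0, "=") + 1)
                sub(/^[ \t]+/, "", value); sub(/[ \t]+$/, "", value)
                print name "=" value
            }'
)
```

    Running 'list_params PA config.txt' before the first run is one way to confirm that 'PA_all_fq_postfix' is present and holds the value you expect.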
  2. Modify the config file


    We presume that you have unpacked the SOAPfuse package and obtained the SOAPfuse-vX.X directory. We call the absolute path of this directory 'TOOL_DIR'.
    Download the database package ('hgXX-XX.for.SOAPfuse.tar.gz') from the links above and unpack it to get the hgXX-database directory. We call the absolute path of this directory 'DATABASE_DIR'.
    # You can also follow the guide to construct your own database files in DATABASE_DIR.

    Then, modify the config file as below:

    • Define 'DB' prefix info
      DB_db_dir = /DATABASE_DIR/
    • Define 'PG' prefix info
      PG_pg_dir = /TOOL_DIR/source/bin
    • Define 'PS' prefix info
      PS_ps_dir = /TOOL_DIR/source
    • Define 'PD' prefix info*
      PD_all_out = /out_directory/
    • Define 'PA_all_fq_postfix'
      PA_all_fq_postfix = PostFix

    * PD_all_out is the directory in which all SOAPfuse results are stored. You can also set it via
      the '-o' option of the main program, introduced below, which takes priority over the config file. SOAPfuse
      creates the sub-directories for each step in out_directory automatically when it runs.
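
    If you prefer to script these edits rather than make them by hand, the following is a minimal sketch. 'set_param' is a hypothetical helper, not part of SOAPfuse; it assumes each parameter sits on its own 'name = value' line, and it leaves commented-out lines untouched.

```shell
# Rewrite the value of one active parameter in a SOAPfuse-style config.
# Hypothetical helper, not shipped with SOAPfuse.
# Usage: set_param <name> <value> <config_file>
# Note: awk -v interprets backslash escapes, so avoid backslashes in <value>.
set_param() (
    name=$1
    value=$2
    config=$3
    tmp=$(mktemp) || exit 1
    awk -v n="$name" -v v="$value" '
        {
            key = $0
            sub(/=.*/, "", key)        # text before the first "="
            gsub(/[ \t]/, "", key)     # ignore surrounding whitespace
            if (index($0, "=") > 0 && key == n) {
                print n " = " v        # replace only the value
                next
            }
            print                      # comments and other lines pass through
        }' "$config" > "$tmp" || exit 1
    # Write back in place so the original file permissions are kept.
    cat "$tmp" > "$config" && rm -f "$tmp"
)

# Example (paths are placeholders from the text above):
#   set_param DB_db_dir /DATABASE_DIR/ config.txt
#   set_param PA_all_fq_postfix PostFix config.txt
```

    Keeping a copy of the original config.txt before scripted edits makes it easy to compare the result against the shipped defaults.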

  3. Run SOAPfuse


    You can find the main script 'SOAPfuse-RUN.pl' in TOOL_DIR. Use perl to run it.

    • From v1.27 on, SOAPfuse packages part of its functionality into a SOAPfuse Perl module,
      so you must set the Perl library path as this post describes.

      Command:
      $ perl SOAPfuse-RUN.pl -c <config_file> \
          -fd <WHOLE_SEQ-DATA_DIR> \
          -l <sample_list> \
          -o <out_directory> \
          [Other Options]

      Options:
      -c  [s] Config file for running this pipeline. <required>
      -fd [s] Directory which stores the paired-end sequenced read files. <required>
              Sequenced reads can be in fastq or fasta format.
              Files can be gzip-compressed or plain text.
      -l  [s] The information list of the sample(s) you want to process. <required>
              This list can include information on one or more samples.
              It is suggested to include one sample/patient in each sample list file.
      -o  [s] Directory which will store all results.
              It takes first priority; otherwise you should set 'PD_all_out' in the config file.
      -fs [i] The step you want to start from. [1]
      -es [i] The step you want to end at. [9]
              Step 9 is the last step of the SOAPfuse pipeline.
      -tp [s] The name-postfix of the temp directory*. [data +%s.'_'.int(rand(1000)+1)]
              Do not set the same string for different sample-info-list files.
              It is suggested to set this parameter to the SampleID, which makes it easy to
              distinguish the scripts of different samples in the general case where one
              sample-info-list file includes just one sample.
      -fm     Flag to enable Perl fork management. [disabled]
      -h      Display this help info.

      * We suggest setting -tp to the sample ID or patient ID so that the temp directory is easy to identify, since we also suggest preparing one list for each sample or patient (in somatic mode).
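
    To illustrate the one-list-per-sample suggestion, here is a minimal sketch that only prints the command for a given sample list, without running it. 'soapfuse_cmd' is a hypothetical helper, and the convention that each list file is named '<SampleID>.list' is an assumption made for this example; only the options documented above are used.

```shell
# Print (do not run) the SOAPfuse command for one sample list.
# Hypothetical helper; assumes the list file is named <SampleID>.list so that
# its basename can be reused as the -tp temp-directory postfix.
# Arguments must not contain spaces, because the output is not shell-quoted.
# Usage: soapfuse_cmd <config_file> <fq_dir> <sample_list> <out_dir>
soapfuse_cmd() (
    config=$1
    fq_dir=$2
    list=$3
    out_dir=$4
    sample_id=$(basename "$list" .list)
    printf 'perl SOAPfuse-RUN.pl -c %s -fd %s -l %s -o %s -tp %s\n' \
        "$config" "$fq_dir" "$list" "$out_dir" "$sample_id"
)

# Example: print one command per list file in a (hypothetical) lists/ directory.
#   for list in lists/*.list; do
#       soapfuse_cmd config.txt /path/to/fq "$list" /out_directory
#   done
```

    Printing the commands first lets you review the -tp value chosen for each sample before launching the runs from TOOL_DIR.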


    Other Command:

    • To check the version of SOAPfuse
      $ perl SOAPfuse-RUN.pl -c version
