Quick Start

john


1. Obtaining CStone

1.1 A zip file containing the CStone.jar file can be downloaded from the Files tab.

1.2 CStone has been tested on Ubuntu 20.04, Windows 10 and MacOS High Sierra, but it is usable on any operating system with a Java Runtime Environment (JRE) 8.0 or higher installed. To find out which version of Java is running, open a terminal window and type java -version. If an update is required, the latest JREs can be obtained from the Oracle website: https://www.oracle.com/java/technologies/javase-downloads.html
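The version check can also be scripted. The sketch below is one possible way to do it, not part of CStone: the helper name java_major_version is hypothetical, and it assumes the version string follows either the legacy "1.8.0_281" scheme or the newer "11.0.2" scheme.

```shell
# Extract the major Java version from a version string such as
# "1.8.0_281" (Java 8 and earlier) or "11.0.2" (Java 9 and later).
# java_major_version is a hypothetical helper, not part of CStone.
java_major_version() {
    ver=$1
    case "$ver" in
        1.*) ver=${ver#1.} ;;   # legacy scheme: 1.8.0_281 -> 8.0_281
    esac
    echo "${ver%%[!0-9]*}"      # keep only the leading digits
}

# java -version writes to stderr; the version is the quoted field on the first line.
if command -v java >/dev/null 2>&1; then
    raw=$(java -version 2>&1 | head -n 1 | cut -d '"' -f 2)
    major=$(java_major_version "$raw")
    if [ "${major:-0}" -ge 8 ]; then
        echo "Java $major found: OK"
    else
        echo "Java $raw found: JRE 8 or higher is needed"
    fi
else
    echo "java not found on PATH"
fi
```

If the first line of the java -version output is not the version line (for example when extra JVM options are announced first), the check reports that an update is needed, so inspect the raw output by hand in that case.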

1.3 Extract the contents of the .zip file and place the CStone.jar file in the desired directory, e.g. the directory where the reads will be analysed. Make sure that permissions on this file allow it to be executed. To do this, either right-click the file and use the properties tab, or chmod the file (sudo chmod +x).
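From a terminal, the permission change might look like the sketch below. The helper name make_executable is hypothetical, and the example assumes CStone.jar is in the current directory. sudo is only needed when the file belongs to another user.

```shell
# Add execute permission to a file and confirm that it took effect.
# make_executable is a hypothetical helper, not part of CStone.
make_executable() {
    chmod +x "$1" && [ -x "$1" ]
}

# Assumes CStone.jar sits in the current directory; adjust the path if not.
if [ -f CStone.jar ]; then
    make_executable CStone.jar && echo "CStone.jar is executable"
fi
```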

2. Running CStone

2.1 CStone is designed to run on servers with reasonable specifications for the analysis of next generation sequence data. For example, the machines that CStone was tested on had 32 cores and 128 GB of memory, and 24 cores and 64 GB of RAM. To run CStone, open a terminal window and type the command:
java -jar -Xmx64G path-to-CStone.jar -r1 path-to-forward-reads-r1.fq.gz -r2 path-to-reverse-reads-r2.fq.gz -gz y -p 32 -o path-to-output-directory

where:
-r1: (required) path to the forward reads (in fastq or compressed fastq format).
-r2: (optional) path to the reverse reads (in fastq or compressed fastq format).
-gz: (default = n) indicates whether the reads are gzip-compressed (.gz) (y=yes / n=no).
-p: (default = 24) indicates the number of cores to use.
-o: (required) path to output directory.

Note: if you are in the directory where CStone.jar resides, then the above can become:

java -jar -Xmx64G CStone.jar -r1 path-to-forward-reads-r1.fq.gz -r2 path-to-reverse-reads-r2.fq.gz -gz y -p 32 -o path-to-output-directory

Parameters specific to the Java Virtual Machine that must be set are:
-Xmx: which indicates the amount of heap space (general memory) to use for the assembly.
-jar: which indicates that the Java Virtual Machine will execute a jar file.
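The pieces above can be combined in a small wrapper that prints the command line for a given run. This is only a sketch: the function name cstone_cmd and its argument order are hypothetical, it uses only the flags documented above, it guesses -gz from the file extension (an assumption), and it does not handle paths containing spaces.

```shell
# Print a CStone command line. Arguments: jar, heap (e.g. 64G), cores,
# output directory, forward reads, and optionally reverse reads.
# cstone_cmd is a hypothetical helper, not part of CStone.
cstone_cmd() {
    jar=$1; heap=$2; cores=$3; out=$4; r1=$5; r2=${6:-}
    case "$r1" in
        *.gz) gz=y ;;   # assumption: a .gz extension means compressed reads
        *)    gz=n ;;
    esac
    cmd="java -jar -Xmx$heap $jar -r1 $r1"
    if [ -n "$r2" ]; then
        cmd="$cmd -r2 $r2"   # -r2 is optional, so add it only when given
    fi
    echo "$cmd -gz $gz -p $cores -o $out"
}

cstone_cmd CStone.jar 64G 32 cstone_out r1.fq.gz r2.fq.gz
```

Printing the command rather than running it makes it easy to check the heap size and core count against the machine before starting a long assembly.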

Advanced (optional) parameters specific to CStone, which we do not recommend altering, are:
-k: indicates kmer length. Default is 40. Max. value is 48, min. value is 24.
-s: indicates step size for initial kmer identification. Default value is 1, which means that all kmers in the read will be used. A read of length 100 will have 61 kmers of default size 40 (100 - 40 + 1). If this parameter is set to 2, it means that every second kmer will be used. The max. value for this parameter is 5 and the min. value is 1.
-ps: indicates the maximum number of contigs to output from a graph. Default is 3. Max. value is: 5, min value is 1.
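To see how -k and -s interact, the number of kmers taken from one read can be estimated with a little arithmetic. The sketch below uses the standard sliding-window count, (read length - k) / step + 1 with integer division, and assumes that sampling starts at the first base of the read. How CStone samples kmers internally is not documented here, so treat this as an illustration only. The function name kmers_per_read is hypothetical.

```shell
# Number of kmers sampled from one read, assuming sampling starts at the
# first base and advances by `step` bases each time (an assumption).
# kmers_per_read is a hypothetical helper, not part of CStone.
kmers_per_read() {
    read_len=$1; k=$2; step=${3:-1}
    if [ "$read_len" -lt "$k" ]; then
        echo 0   # the read is shorter than one kmer
    else
        echo $(( (read_len - k) / step + 1 ))
    fi
}

kmers_per_read 100 40 1   # every kmer: prints 61
kmers_per_read 100 40 2   # every second kmer: prints 31
```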

2.2 Output files will be placed within the user specified directory. On starting, if that directory already exists the user will be informed and the program will exit. The output files included are:

contigs.fasta: This file contains the assembled transcripts. The title line of each contig carries the underlying network classification, which indicates whether or not the contig can be guaranteed to be non-chimeric.

classificationsummary.txt: Summarizes the number of contigs occurring within each of the seven classification categories.

timestamplogs.txt: Log file containing the length of time taken to complete each step.
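A quick way to inspect the assembly is to count the title lines in contigs.fasta, since every FASTA record starts with a line beginning with '>'. The sketch below makes no assumption about how the classification is written inside the title; it only counts records and prints the first few titles. The helper name count_contigs is hypothetical and the output path is a placeholder.

```shell
# Count FASTA records: each record starts with a title line beginning with '>'.
# count_contigs is a hypothetical helper, not part of CStone.
count_contigs() {
    grep -c '^>' "$1"
}

# Placeholder path; substitute the directory that was given to -o.
out_dir=path-to-output-directory
if [ -f "$out_dir/contigs.fasta" ]; then
    echo "contigs: $(count_contigs "$out_dir/contigs.fasta")"
    grep '^>' "$out_dir/contigs.fasta" | head -n 5   # peek at the first few titles
fi
```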

3. Sample Data

Sample data can be downloaded from the Files tab. Download the .gz file titled test_data.gz. Within this archive are two files called r1_test_data.fq.gz and r2_test_data.fq.gz. Each file contains 1,986,401 reads, simulated from transcripts ranging in length from 300 to 5000 bp, taken from the Drosophila melanogaster cDNA library obtained from Ensembl. Read length is 200 and the insert size used was 300. The per-site coverage of the transcripts given by these reads is 14.2. Place these two files within the same directory as the CStone.jar. They can then be assembled using the command:

java -jar -Xmx64G path-to-CStone.jar -r1 r1_test_data.fq.gz -r2 r2_test_data.fq.gz -gz y -p 32

Output will be placed in the cstone_out directory, with the files described in the previous section.
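Before starting the assembly, it can be worth confirming that the download is intact by counting the reads in each file; both should report 1,986,401. The sketch below assumes the usual four-lines-per-record FASTQ layout with no wrapped sequence lines. The helper name count_fastq_reads is hypothetical.

```shell
# Count reads in a gzip-compressed FASTQ file, assuming the usual
# four-lines-per-record layout (no wrapped sequence lines).
# count_fastq_reads is a hypothetical helper, not part of CStone.
count_fastq_reads() {
    lines=$(gzip -dc "$1" | wc -l)
    echo $(( lines / 4 ))
}

for f in r1_test_data.fq.gz r2_test_data.fq.gz; do
    if [ -f "$f" ]; then
        echo "$f: $(count_fastq_reads "$f") reads"
    fi
done
```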

4. Obtaining Source Code

Alternatively the code can be downloaded from the Code tab, imported into an IDE, such as Netbeans, and recompiled as desired. The steps below are for the Netbeans IDE, but others will have a similar process. Note: this is not the recommended (nor required) path for obtaining the working software, unless there is a specific requirement to edit the code. Steps to do this are:

4.1 On the code tab of the project obtain the read only svn checkout link (svn://svn.code.sf.net/p/cstone/code/). There are three options: (i) SSH, (ii) HTTPS and (iii) RO. The read only option is RO and does not require a password later.

4.2 Open Netbeans and, under the Team menu, select the sub menu Subversion and then the sub-sub menu Checkout. This will open a small window with some fields to fill in.

4.3 In the field labelled Repository URL, place the RO svn checkout link obtained in step 4.1. The username and password can be left blank. Click next.

4.4 Use the browse button to browse the project Repository Folders and select the core folder. This contains all the code. Once OK is pressed select the local folder where you want to download the code to e.g. testFolder.

4.5 Click finish. All the code files and subfolders within the core folder will then be placed into the selected location.

4.6 These can be used to set up a new project within Netbeans, and you can then begin to edit and recompile the code. The easiest way to do this is to create a new project from scratch and then paste the core folder into the source directory of the new project.
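For those who prefer a terminal to the IDE dialog, the same read-only checkout can be sketched with the svn command-line client. This is an alternative route, not the procedure described in the steps above: it checks out the whole repository using the RO link from step 4.1 rather than selecting the core folder, and the function name checkout_cstone and the default folder name are hypothetical.

```shell
# Read-only checkout of the CStone repository into a local folder.
# The URL is the RO link from step 4.1; the target folder name is arbitrary.
# checkout_cstone is a hypothetical helper, not part of CStone.
checkout_cstone() {
    target=${1:-testFolder}
    svn checkout svn://svn.code.sf.net/p/cstone/code/ "$target"
}

# Usage (needs network access and the svn client installed):
#   checkout_cstone testFolder
```

After the checkout completes, look for the core folder inside the target directory; it can then be pasted into a new project as described in step 4.6.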

