Name	Modified	Size
edit_reference3.pl	2016-10-28	19.8 kB
main.pl	2016-05-02	11.7 kB
configuration_file.txt	2014-11-07	3.6 kB
readme.txt	2014-11-07	5.3 kB
reverse_complement.pl	2014-11-07	2.2 kB
SampleReference.class	2014-11-07	5.3 kB
SampleReference.java	2014-11-07	5.2 kB
Totals: 7 Items		53.0 kB

RECORD: Reference-Assisted Genome Assembly for Closely Related Genomes



1. Introduction
---------------

This software package contains the prototypical implementation of the approach
presented in the research paper titled "RECORD: Reference-Assisted Genome
Assembly for Closely Related Genomes" by K. Buza, B. Wilczynski and N. Dojer.

This file briefly documents the software.


2. Licence, terms and conditions of usage
-----------------------------------------

By using the software you agree that:

- the software is a research prototype, and therefore there is absolutely
no guarantee associated with the software, neither regarding its fitness for
any specific purpose, nor that it produces correct or accurate results,

- the fact that the software is a research prototype means that it may contain
far more bugs (errors) than typical (i.e., commercial) software. Such bugs may
cause the software to stop working unexpectedly or to produce incorrect
results; therefore, the software is neither designed nor suited for use in any
operational environment (including, but not limited to, industrial and medical
environments),

- you may only use this software or its components entirely on your own 
responsibility,

- the author of the software is NOT responsible in any way for ANY kind of
damage caused by the software or associated with its usage,

- it is absolutely forbidden to use the software in any application that does
not conform to the law or applicable regulations,

- the author definitely FORBIDS the use of this software in ANY application
that aims to infect humans or to cause diseases in humans in any other way, or
that might be associated with killing human persons (including abortion at any
stage of embryonic development),

- whenever you use the software, you should properly acknowledge its authors
and/or the authors of the aforementioned research paper describing the 
methodology behind the software (e.g. if you write a paper and use this 
software, you are kindly asked to refer to the website from which you obtained
the software, and/or to the aforementioned research paper).

If you agree to the above terms and conditions, you can freely use this software.


3. Installation
---------------

The prototypical implementation of the RECORD approach consists of a set of
scripts. Most of them are written in Perl, and one program is written in
Java. Therefore, in order to run RECORD, you must have Perl and Java
installed on your computer. RECORD also calls the genome assembler Velvet
and the genome aligner MUMmer, so you need to have these tools installed
as well.

Summary of the software tools that have to be installed to run RECORD:
- Perl,
- Java,
- genome assembler called "Velvet" (tested with version 1.2.08),
- genome alignment tool called "MUMmer" (tested with version 3.23). 

If the required software listed above is installed, installing RECORD
only requires copying the scripts into a new folder and making them
executable (e.g. using the command
chmod +x [name_of_the_script_file] ).
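
The installation checks described above can be sketched as a small shell
script. This is only an illustration and not part of RECORD: the function
names check_tools and make_scripts_executable are hypothetical, and the
executable names velveth and velvetg (Velvet) and nucmer (MUMmer) are
assumptions about which programs need to be on your PATH. This readme does
not state which MUMmer program RECORD calls.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given commands are on PATH.
# Returns 0 if all of them are found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Hypothetical helper: make every Perl script in the given folder executable.
make_scripts_executable() {
    for script in "$1"/*.pl; do
        [ -f "$script" ] && chmod +x "$script"
    done
    return 0
}

# Example calls (tool names for Velvet and MUMmer are assumptions):
#   check_tools perl java velveth velvetg nucmer
#   make_scripts_executable /path/to/record
```

If check_tools reports a missing tool, install it (or add it to your PATH)
before starting RECORD.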


4. Running RECORD
-----------------

Because RECORD has many parameters, they are provided in a structured
text file that is parsed by the program. To start RECORD, you only need
to type

./main.pl configuration_file.txt

An example configuration file, together with an explanation of each
parameter, is included with the software; see configuration_file.txt.

You have to prepare a workspace folder for each run of RECORD. In the 
configuration file, you will have to provide the name of the workspace 
folder. Please make sure that nothing else is stored in the workspace 
folder, because RECORD will produce the intermediate results in the 
workspace folder, and it may overwrite files that have the same name.

You have to place the reference genome in FASTA format in the
"ref" subfolder under the workspace folder:


[WORKSPACE]/ref/                       - this folder should contain the reference
   genome of the species, which is a necessary input of the RECORD pipeline
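
The workspace preparation described above can be sketched as follows. This
is a minimal illustration under assumptions, not part of RECORD: the
function name prepare_workspace is hypothetical, and the sketch assumes
that the reference file can keep its original file name inside the "ref"
subfolder. It refuses to reuse a non-empty folder, because RECORD may
overwrite files in the workspace.

```shell
#!/bin/sh
# Hypothetical helper: create a fresh workspace with a "ref" subfolder
# and copy the reference genome (FASTA) into it.
# Usage: prepare_workspace WORKSPACE REFERENCE_FASTA
prepare_workspace() {
    workspace=$1
    reference=$2
    if [ ! -f "$reference" ]; then
        echo "reference file not found: $reference" >&2
        return 1
    fi
    # Refuse to reuse a folder that already contains files.
    if [ -d "$workspace" ] && [ -n "$(ls -A "$workspace")" ]; then
        echo "workspace is not empty: $workspace" >&2
        return 1
    fi
    mkdir -p "$workspace/ref" && cp "$reference" "$workspace/ref/"
}

# Example call (both paths are placeholders):
#   prepare_workspace /data/record_run1 /data/genomes/reference.fasta
```

The name of the workspace folder still has to be entered in the
configuration file.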
  

The results of the intermediate steps of RECORD will appear
in the "results" subfolder of your workspace folder and its
subfolders.

In particular:

[WORKSPACE]/results/pseudoreads1.fastq - the first mate of the pseudoreads generated
   from the reference genome

[WORKSPACE]/results/pseudoreads2.fastq - the second mate of the pseudoreads generated 
   from the reference genome

[WORKSPACE]/results/velvet_assembly/   - the output of the genome assembler Velvet

[WORKSPACE]/results/alignment/         - this folder contains the alignment between
   the contigs produced by Velvet and the reference genome; RECORD uses MUMmer to
   obtain this alignment

[WORKSPACE]/results/edited_ref/        - this folder contains the edited reference,
   i.e., the primary output of RECORD in FASTA format, and an additional file,
   similar to MUMmer alignment files, which shows which parts of the reference
   were replaced while editing the reference
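
After a run, the output locations listed above can be checked with a short
shell function. This is only a sketch: the function name
check_record_outputs is hypothetical, and it tests only whether the files
and folders named in this readme exist, not whether their contents are
complete or correct.

```shell
#!/bin/sh
# Hypothetical helper: report whether each output location documented in
# this readme exists under WORKSPACE/results.
# Returns 0 only if all of them exist.
# Usage: check_record_outputs WORKSPACE
check_record_outputs() {
    results=$1/results
    rc=0
    for entry in pseudoreads1.fastq pseudoreads2.fastq \
                 velvet_assembly alignment edited_ref; do
        if [ -e "$results/$entry" ]; then
            echo "ok: $entry"
        else
            echo "missing: $entry"
            rc=1
        fi
    done
    return "$rc"
}

# Example call (the path is a placeholder):
#   check_record_outputs /data/record_run1
```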

 



5. Contact
----------

In case of further questions, you may contact the author of the software via
e-mail: Krisztian Buza, chrisbuza@yahoo.com

Good luck!

Source: readme.txt, updated 2014-11-07
