Square Tutorial
_________________________________________________________________________________________________________

Square is a user-friendly prokaryote genome annotation program with an easy installer and a graphical interface. This tutorial guides you through annotation in two simple steps. Square's features include:

Locate ORFs 
Locate tRNA
Add database information
Write output in GenBank format, with genes color-coded to help identify poor or missing information

Square is currently in beta. Please report errors to marcus.eslabao@yahoo.com.br; your reports help improve Square!


Installing Square

After downloading, double-click the file square_x.x_amd64.deb to open it in the Ubuntu Software Center. Click the Install button at the top of the window and enter the administrator password when asked; the installation will then complete. To run Square, open your programs menu, search for Square by name and click the Square icon.



Using Square

Annotation speed depends mainly on three variables: the size of the genome, the size of the database and the speed of the computer. The larger the genome and the database, the slower the annotation; the more CPU threads/cores are available and the faster they are, the faster the annotation. To choose the number of CPU threads/cores Square may use, click Options and pick a number in the box Number of CPU cores/threads. If you select more than your computer has, the maximum number available will be used. When Square starts, it asks for the Linux administrator password, which is needed to create files and run programs.
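
The thread-count rule described above can be pictured with a short Python sketch. This is only an illustration, not Square's actual code; the function name effective_threads is hypothetical.

```python
import os


def effective_threads(requested: int) -> int:
    """Clamp a requested thread count to the range [1, CPUs available].

    Hypothetical helper: it mirrors the behavior described in the text,
    where asking for more CPUs than the machine has falls back to the
    maximum available.
    """
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(requested, available))


print(effective_threads(2))       # 2 on a machine with at least 2 CPUs
print(effective_threads(10_000))  # the machine's own CPU count
```

The lower bound of 1 is an added assumption so that a request of 0 still yields a usable value.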


Step 1 – Database (DB)
To add information to your working genome you need a database (DB). Square can download and format a database automatically if you follow one of the procedures listed in the table below. Use the table to choose the best DB for your work, but remember that the larger and less specific the database, the slower the annotation. If you already have a database, proceed to Step 2.


Database            Knowledge of working organism   Annotation speed   Section
Swiss-Prot          Poor                            Average - Fast     1.1
TrEMBL              Poor                            Very slow          1.2
Search by species   Good                            Fast               1.3
Search by genus     Average                         Fast - Average     1.3
Own DB              Good                            Fast - Average     1.4
Table: choosing a database according to knowledge of the organism and annotation speed


1.1 – Downloading Swiss-Prot
Swiss-Prot is a manually curated database. This makes it very reliable, but it contains relatively little information. We recommend it for very basic annotations and for cases where you do not know which organism is most closely related to the genome you are working on. To download it, open Square, click Options and click the Download Swiss-Prot button.

1.2 – Downloading TrEMBL
TrEMBL is an uncurated database containing the unreviewed, automatically annotated protein entries in UniProt. It is extremely large, so it is slow to download and slow to annotate with. We recommend it only when little is known about the organism being annotated. To download it, open Square, click Options and click the Download TrEMBL button.

1.3 – Downloading by Genus or Species search
If you know the organism being annotated well and UniProt holds a good amount of data on it, Square can fetch the data for related organisms and build a database from them. Searching by species usually gives a smaller database and faster annotation; searching by genus gives a larger database than a species-based one, so annotation is slower. To download, open Square, click Options and go to the Options frame | Create my database from UniProt |, which offers the following options:

1.3.1 – Search by taxonomy ID
For this you need to know the code of the taxon you are working with. Visit http://www.ncbi.nlm.nih.gov/taxonomy and look up the desired organism in the search box. A list of possible organisms appears; click one of them and the information "Taxonomy ID: 000" is shown, where the number "000" is the taxonomic identification number to enter into Square. Click the button | Search by taxonomy ID | and a dialog window appears where you enter the taxon identification number. A second dialog box then lets you enter a name for the DB, which will be used later in the annotation. After that, the button shows how many kB of data have been downloaded; the download can take a while. When it finishes, a window tells you how many genes the search found and asks whether you want to save the data. If you do, the database is created automatically.
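
For readers who want to see what a taxonomy-based query can look like outside the graphical interface, here is a minimal Python sketch that only builds a download URL. It assumes the current public UniProt REST endpoint and its taxonomy_id query field; the README does not say how Square itself queries UniProt, so none of this should be read as Square's method. The function name and the example ID 173 are arbitrary, illustrative choices.

```python
from urllib.parse import urlencode

# Assumption: the public UniProt REST "stream" endpoint, not Square's internals.
UNIPROT_STREAM = "https://rest.uniprot.org/uniprotkb/stream"


def uniprot_fasta_url(taxonomy_id: int) -> str:
    """Build a URL requesting all UniProtKB entries for one taxon as FASTA."""
    if taxonomy_id <= 0:
        raise ValueError("taxonomy ID must be a positive integer")
    params = {"query": f"taxonomy_id:{taxonomy_id}", "format": "fasta"}
    return f"{UNIPROT_STREAM}?{urlencode(params)}"


# 173 is only an example value; use the ID shown on the NCBI taxonomy page.
print(uniprot_fasta_url(173))
# https://rest.uniprot.org/uniprotkb/stream?query=taxonomy_id%3A173&format=fasta
```

The sketch stops at the URL on purpose: the download itself depends on network access and on the size of the taxon.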

1.3.2 – Search by organism name
To search by the name of the organism (genus and/or species and/or isolate), click the button | Search by organism name | and a dialog window appears where you enter the name of the organism. A second dialog box then lets you enter a name for the DB, which will be used later in the annotation. After that, the button shows how many kB of data have been downloaded; the download can take a while. When it finishes, a window tells you how many genes the search found and asks whether you want to save the data. If you do, the database is created automatically.


1.4 – Create your own database from a FASTA file
You can format a FASTA file into a database usable by Square. Annotation speed depends on the size of this file: the smaller the file, the faster the annotation. To create the DB, open Square, click Options and click the button | Create my from fasta file database |. A window opens to select the FASTA file, and then another window opens to enter the name for the new DB.
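
Because database size drives annotation speed, it can help to check how large a FASTA file is before turning it into a DB. The following Python sketch counts records and residues; it is an independent illustration, not part of Square, and the name summarize_fasta is hypothetical.

```python
import io


def summarize_fasta(handle):
    """Return (number of records, total residues) for a FASTA text stream."""
    records = 0
    residues = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            records += 1  # each header line starts a new record
        elif records == 0:
            raise ValueError("sequence data found before the first '>' header")
        else:
            residues += len(line)
    return records, residues


example = io.StringIO(">protA\nMKT\nLLV\n>protB\nMSA\n")
print(summarize_fasta(example))  # (2, 9)
```

For a file on disk, pass an open file object instead of the in-memory example, for instance `with open(path) as fh: summarize_fasta(fh)`.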



Step 2 – Annotation
Annotation requires only two things: a file in FASTA format containing the genome to be annotated, and a database, which can be obtained in Step 1. To annotate a genome, open Square and click the Annotation button. A window opens to select the FASTA file containing the genome; a second window then lets you choose the database to annotate this genome with; finally, a dialog box asks where the annotation file should be saved. When the process ends, a window informs you that the annotation has finished.



Viewing the annotation data
When the annotation process ends you can view the data. We recommend Artemis: Genome Browser for this purpose, but any program that accepts the GenBank format may be used. When you open the annotation, genes are shown in three colors: green indicates high identity with the database (>= 95%) and an e-value of 0.0; yellow indicates identity of 80% to 95% and an e-value > 0.0; red indicates less than 80% identity and an e-value > 0.0.
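
The color rule above can be written as a small Python function. This is a sketch of the rule as the README states it, not Square's source code. The README does not say how a hit with identity >= 95% but an e-value above 0.0 is colored, nor a hit below 80% with an e-value of 0.0; the sketch assumes yellow and red respectively, and the name gene_color is hypothetical.

```python
def gene_color(identity: float, evalue: float) -> str:
    """Map a hit's percent identity and e-value to a display color."""
    if identity >= 95.0 and evalue == 0.0:
        return "green"   # high identity and e-value 0.0
    if identity >= 80.0:
        # 80% to 95% identity; also (assumption) >= 95% with e-value > 0.0
        return "yellow"
    return "red"         # less than 80% identity


print(gene_color(99.2, 0.0))    # green
print(gene_color(87.0, 1e-40))  # yellow
print(gene_color(42.5, 1e-5))   # red
```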


Source: README, updated 2015-10-22