CpDB tutorial Code

Hands-on automatic annotation tutorial in Linux

Brought to you by: asantosbioinfo

File	Date	Author	Commit
blastp	2013-04-27	asantosbioinfo	[r1] Initial Upload
curated	2013-04-27	asantosbioinfo	[r1] Initial Upload
fgenesB	2013-04-27	asantosbioinfo	[r1] Initial Upload
firsthit-parser	2021-11-27	asantosbioinfo	[r8] ncbi-blast+ (blastp) changed file output compar...
parseEMBLtoCpDB-code	2024-06-09	asantosbioinfo	[r19] Bug corrections: Adjusting to gcc (Ubuntu 11.4....
sql	2021-11-10	asantosbioinfo	[r7] Exporting gene and CDS features; signal and tmh...
CpI19v1.schema.dump	2023-05-28	asantosbioinfo	[r14] Green color cut-off relaxed
CpI19v2.fasta	2013-04-27	asantosbioinfo	[r1] Initial Upload
CpI19v2.surfg	2013-04-27	asantosbioinfo	[r1] Initial Upload
README.md	2023-05-22	asantosbioinfo	[r12] README.md
Tutorial_en.odt	2023-10-03	asantosbioinfo	[r16] gunzip * does not work. Corrected for gunzip *.gz
Tutorial_en.pdf	2023-10-03	asantosbioinfo	[r16] gunzip * does not work. Corrected for gunzip *.gz
Tutorial_pt.odt	2023-10-03	asantosbioinfo	[r16] gunzip * does not work. Corrected for gunzip *.gz
Tutorial_pt.pdf	2023-10-03	asantosbioinfo	[r16] gunzip * does not work. Corrected for gunzip *.gz
firsthitparser	2023-05-21	asantosbioinfo	[r11] Turning the 64 bit version the default
parseEMBLtoCpDB	2024-06-09	asantosbioinfo	[r20] Bug corrections: Adjusting to gcc (Ubuntu 11.4....
parseblastpfiles	2013-04-27	asantosbioinfo	[r1] Initial Upload
valifasta	2023-09-29	asantosbioinfo	[r15] My program 'valifasta' now is needed to deal wi...

Read Me

Introduction

This software creates a relational database in PostgreSQL that hosts complete bacterial genomes. Besides the database, it includes tools, such as a parser, that convert EMBL or GBK files to the CpDB relational schema. Once a genome is in CpDB, you can extract arbitrary reports from it using SQL. The software was developed as part of Anderson Santos's Ph.D. in Bioinformatics, under the Corynebacterium pseudotuberculosis (Cp) pangenome project. That project delivered fifteen bacterial strains to the scientific community, deposited in the GenBank database between 2009 and 2012. The thesis was written in Brazilian Portuguese; however, an English book chapter explaining the software is available at this address. CpDB is the backbone of the Pannotator software. Both programs are still actively maintained.

Downloading

svn checkout svn://svn.code.sf.net/p/cpdbtutorial/code/ cpdbtutorial-code

Note: The downloadable zip is a copy of the result of the checkout above.

Installing

For Ubuntu 10 or higher:

1. Install flex:

sudo apt install flex

2. Install bison:

sudo apt install bison

3. Install the build-essential package:

sudo apt install build-essential
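
Before compiling, it can help to confirm that the tools installed above are actually on the PATH. The following is a minimal sketch, not part of the CpDB repository: the function name `check_build_tools` is hypothetical, and the tool list passed to it (`flex bison gcc make`) assumes that build-essential supplies `gcc` and `make`, as it does on current Ubuntu releases.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with CpDB): report which of the given
# commands cannot be found on PATH.
check_build_tools() {
    missing=""
    for tool in "$@"; do
        # command -v succeeds only when the shell can resolve the command
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all tools found"
}
```

Usage: `check_build_tools flex bison gcc make`. A non-zero exit status means at least one of the install steps still needs to be run.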

Compiling

This project has four different parsers: GO, GBK, and two first-hit parsers. To compile each one, go to its directory and type:

./make

Enjoy it.

BUG 1) Large EMBL/GenBank text qualifiers (>255 characters) are not supported.
They will certainly cause a stack overflow. Workaround: remove from the
target EMBL/GenBank file any text qualifiers longer than 255
characters.
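
To locate the qualifiers that need to be removed, one option is a small awk scan of the flat file. This is only a sketch under stated assumptions, not part of CpDB: the function name `find_long_qualifiers` is hypothetical, the limit is assumed to apply to the quoted value after its continuation lines are joined with single spaces, and doubled quotes inside a value are not handled. It reports every quoted qualifier over the limit, including ones such as `/translation`; whether the parser's limit applies to those is not stated here.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with CpDB): list quoted qualifiers whose
# joined text is longer than a limit in an EMBL or GenBank flat file.
# Output columns (tab-separated): starting line, qualifier name, text length.
find_long_qualifiers() {
    # $1 = flat file, $2 = optional limit (defaults to 255)
    awk -v limit="${2:-255}" '
        function report() {
            if (length(value) > limit)
                printf "%d\t%s\t%d\n", start, name, length(value)
        }
        {
            line = $0
            sub(/^FT/, "", line)          # drop the EMBL feature-table prefix
            sub(/^[ \t]+/, "", line)      # drop leading indentation
            sub(/[ \t\r]+$/, "", line)    # drop trailing blanks and CR
        }
        !in_value && line ~ /^\/[A-Za-z_0-9]+="/ {
            # A new quoted qualifier starts on this line
            name = line;  sub(/=.*/, "", name)
            value = line; sub(/^[^=]*="/, "", value)
            start = NR
            if (value ~ /"$/) { sub(/"$/, "", value); report() }
            else in_value = 1
            next
        }
        in_value {
            # Continuation line: join with a single space (an assumption)
            value = value " " line
            if (line ~ /"$/) { sub(/"$/, "", value); in_value = 0; report() }
        }
    ' "$1"
}
```

Usage: `find_long_qualifiers genome.embl` (the file name is a placeholder), or `find_long_qualifiers genome.gbk 255` to pass the limit explicitly. Each reported line number points at the start of a qualifier to shorten or remove before running the parser.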