| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| poseidon.1.0.2.bin.tar.gz | 2015-11-01 | 787.2 kB | |
| README | 2015-11-01 | 4.0 kB | |
| Totals: 2 Items | | 791.2 kB | 1 |
===============================================================================
Poseidon(v1.0.2)
-------------------------------------------------------------------------------
by Euncheon Lim in WeigelWorld
Owing to a conflict of interest, the source code will remain unavailable until the algorithm
has been published. Only the binaries are provided.
Poseidon is a highly sensitive and efficient taxonomy classifier based on a population index,
which combines an FM-index with a higher-level layer describing the texts in a dataset. For
a taxonomy classifier, taxonomy information is used to describe the texts. The algorithm has
been under development since 6 Feb. 2015 and was presented at the CSHL Genome Informatics
conference on 30 Oct. 2015.
I especially thank Prof. Weigel for his care and support during my PhD study.
I was inspired by the aims of the MEGAN software written by my co-supervisor, Prof. Huson.
I also thank Dr. Jared Simpson for explaining the modules in SGA.
Alex Bowe nicely illustrated backward search on an FM-index.
The Burrows-Wheeler transformed text is currently generated by ropebwt2, which
was written by Heng Li. The basic string operations are performed by modules taken from
the SGA package written by Dr. Simpson and modified to support searching for ambiguous bases.
The taxonomy information is represented by a taxonomy tree and a taxonomy map.
This tool supports single-end FASTA and FASTQ files and uses parallel I/O.
1. Recommended Environment
RAM: 64 GB memory
Cores: 64
Compiler: Latest GCC
Location of FASTQ: SSDs or RAID enabled with parallel access
2. Minimum Environment
Arch: Linux 64-bit
RAM: 16 GB memory
Core(s): 1
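
The minimum environment above could be checked before running the tool with a sketch like the following. The function name check_min_env and the reading of 16 GB as 16,000,000 kB are assumptions made for illustration and are not part of Poseidon; the script relies on Linux-specific sources (getconf and /proc/meminfo).

```shell
#!/bin/sh
# Hypothetical pre-flight check against the minimum environment listed above.
# Assumption: 16 GB is read as 16,000,000 kB so that a machine sold as
# "16 GB", whose MemTotal is slightly below 16 GiB, still passes.
MIN_RAM_KB=16000000
MIN_CORES=1

# check_min_env BITS CORES RAM_KB
# Returns 0 when the word size, core count, and RAM (in kB) all suffice.
check_min_env() {
    [ "$1" -eq 64 ] && [ "$2" -ge "$MIN_CORES" ] && [ "$3" -ge "$MIN_RAM_KB" ]
}

# Gather the values for this machine.
bits=$(getconf LONG_BIT)
cores=$(getconf _NPROCESSORS_ONLN)
ram_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)

if check_min_env "${bits:-0}" "${cores:-0}" "${ram_kb:-0}"; then
    echo "minimum environment met"
else
    echo "below minimum: ${bits}-bit, ${cores} core(s), ${ram_kb} kB RAM"
fi
```

The same function can be reused for the recommended environment by raising MIN_RAM_KB and MIN_CORES.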
3. Usage
(1) Create a list file containing the paths of UNCOMPRESSED FASTA or FASTQ files.
Each line names one file to be classified, for example:
ebola_virus_infected_individual.fa
Corona_virus_infected_human.fq
Corona_virus_infected_bovine.fq
...
(2) poseidon -f file_list -r inc -e exc
Note: the -r parameter names a folder containing genomes of known bacteria, viruses, or fungi;
the -e parameter names a folder containing background genomes such as H. sapiens or A. thaliana.
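
As a minimal sketch of step (1), the list file could be generated rather than typed by hand. The helper name make_file_list, the set of file extensions, and the temporary demo directory are assumptions for illustration; only the poseidon command line in the final comment comes from this README.

```shell
#!/bin/sh
# Hypothetical helper: collect uncompressed FASTA/FASTQ files under a
# directory into a list file, one path per line.
# Compressed files (e.g. *.fq.gz) do not match the patterns and are skipped.
make_file_list() {
    reads_dir=$1
    list_file=$2
    find "$reads_dir" -type f \
        \( -name '*.fa' -o -name '*.fasta' -o -name '*.fq' -o -name '*.fastq' \) \
        | sort > "$list_file"
}

# Demo with empty placeholder files in a temporary directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/sample_a.fa" "$demo_dir/sample_b.fq" "$demo_dir/sample_c.fq.gz"
make_file_list "$demo_dir" "$demo_dir/file_list"
cat "$demo_dir/file_list"

# With real data, the list is then passed to poseidon as in step (2):
#   poseidon -f file_list -r inc -e exc
```

Writing absolute paths, as find does when given an absolute directory, keeps the list valid regardless of the directory from which poseidon is started.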
4. Library Installation
(0-1) Prerequisites (zlib in particular must be installed)
sudo apt-get install build-essential python-dev libzip-dev libbz2-dev zlib1g-dev libsparsehash-dev
(1-1) Download boost 1.55.0
wget http://sourceforge.net/projects/boost/files/boost/1.55.0/boost_1_55_0.tar.gz/download
mv download boost_1_55_0.tar.gz
(1-2) Uncompress and change the directory
tar xf boost_1_55_0.tar.gz
cd boost_1_55_0
(1-3) Install boost
./bootstrap.sh
sudo ./b2 install --prefix=/usr/local
(1-4) Export the Boost library path through the "LIBRARY_PATH" environment variable in .bashrc
export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/lib
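
One way to make step (1-4) repeatable is to append the export line only when it is not already present. The sketch below is an assumption about how this might be scripted, not part of the Poseidon distribution; the function name add_library_path is hypothetical, and the demo writes to a temporary file rather than to the real .bashrc.

```shell
#!/bin/sh
# Hypothetical helper: append the LIBRARY_PATH export to a shell rc file
# unless exactly that line is already there.
add_library_path() {
    rc_file=$1
    # Single quotes keep $LIBRARY_PATH unexpanded so it is evaluated at login.
    line='export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/lib'
    grep -qxF "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
}

# Demo on a temporary file standing in for .bashrc.
demo_rc=$(mktemp)
add_library_path "$demo_rc"
add_library_path "$demo_rc"   # second call finds the line and adds nothing
cat "$demo_rc"
```

For actual use, the function would be called with "$HOME/.bashrc", followed by opening a new shell so that the variable takes effect.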
5. Installation
(0) Download the compressed binary file (poseidon.1.0.2.bin.tar.gz)
(1) Run the toy example
./toy_example.sh
6. Licence:
Poseidon: A highly sensitive and efficient taxonomy classifier
Copyright (C) 2015- <Euncheon Lim>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.