
Taxonomy assignment of metazoans using a Python-based pipeline

Brought to you by: arvestad, sb2012, sbmv2012

Taxonomy assignment

Attachments

Virkki&Bourlat_Figure_1.pdf (90678 bytes)

The aim of this project is to create an automated pipeline for the taxonomic assignment of DNA sequences from environmental samples. In this study, we focus on DNA markers amplified from benthic sediment samples. Using a series of custom scripts written in Python, the DNA sequences were cleaned in three steps: removal of short sequences, removal of primer pairs, and reversal of reads to the correct orientation. The clean marker sequences were then clustered into operational taxonomic units (OTUs) and matched against the GenBank database. All sequences and associated data were stored in a BioSQL relational database, which was then queried to retrieve the taxonomy assignment for each cluster. The pipeline is illustrated in the attached figure (Virkki&Bourlat_Figure_1.pdf).
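To make the cleaning steps concrete, here is a minimal sketch of how a single read might be oriented, stripped of its primers, and filtered by length. It is not the project's actual code. The primer sequences, the length cutoff, and the function names are hypothetical placeholders, and "reversal to the correct orientation" is assumed to mean reverse-complementing. The sketch also simplifies in two ways: it accepts exact primer matches only, and it applies the length filter after primer removal rather than before.

```python
# Hypothetical primers and length cutoff, chosen only for illustration;
# they are not the values used in the project.
FORWARD_PRIMER = "ACGTAC"
REVERSE_PRIMER = "TTGCAG"
MIN_LENGTH = 10

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def clean_read(seq, fwd=FORWARD_PRIMER, rev=REVERSE_PRIMER,
               min_length=MIN_LENGTH):
    """Return the oriented, primer-free marker, or None if the read is rejected.

    Assumes exact primer matches; real reads would likely need
    mismatch-tolerant matching.
    """
    seq = seq.upper()
    rev_rc = reverse_complement(rev)
    # Try the read as given, then its reverse complement (orientation step).
    for candidate in (seq, reverse_complement(seq)):
        if candidate.startswith(fwd) and candidate.endswith(rev_rc):
            # Primer pair removal: keep only the region between the primers.
            marker = candidate[len(fwd):len(candidate) - len(rev_rc)]
            # Short sequence removal.
            return marker if len(marker) >= min_length else None
    # Neither orientation carries both primers: reject the read.
    return None
```

Calling `clean_read` on a read sequenced from either strand should return the same marker sequence, so downstream clustering sees every read in one orientation.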