From: Scott C. <sc...@sc...> - 2019-03-14 19:21:28
Hello,

I wanted to forward this to a few mailing lists where I think it might be of interest. Adhemar is proposing that a new project, a Django-based web front end to Chado, become a GMOD project. Please feel free to reply to this email with comments and questions.

Thanks,
Scott

---------- Forwarded message ---------
From: Adhemar <az...@gm...>
Date: Thu, Mar 14, 2019 at 10:09 AM
Subject: Nominate software: machado
To: <he...@gm...>

Nominate software: machado

https://github.com/lmb-embrapa/machado

machado is a Django 2-based application that contains tools to interact with Chado databases. It provides users with a framework to store, search and visualize biological data.

- Data loaders for the major bioinformatics formats: fasta, gff, obo, bibtex, blast, interproscan, orthomcl
- The machado API delivers data directly to the JBrowse genome browser
- The Haystack framework provides a very fast query interface using the Elasticsearch engine

*Requirements for the nomination process*

The development of machado was triggered by an ongoing research project at EMBRAPA, the largest public agricultural research institute in Brazil. In this project we will analyze over 50 plant genomes by integrating diverse data such as RNA-Seq, orthology, and functional annotation (blast, diamond, interproscan). As an initial approach, the project leaders chose to store the data in a well-established ontology-based database (Chado) using Python 3, and started to develop machado with some objectives in mind: i) to integrate it with JBrowse via an API, and ii) to implement generic data loading tools for some of the main bioinformatics data formats (fasta, gff, obo, bibtex, blast, interproscan, orthomcl). Future demands include the implementation of databases for mammalian species and the development of tools for additional data files such as vcf.

The software is at an early stage of development, and many features remain to be completed, most of them related to visualization and interfaces. It is open source under the GPL license. Comprehensive documentation covering software installation/configuration and data loading/visualization is available at http://machado.readthedocs.io

We have put extra effort into making it an inviting, open environment, as we are eager to attract new developers. It is hosted on GitHub and has a few quality control features, such as continuous integration and unit tests. The code contains type annotations and follows the coding styles specified in PEP 8 (Style Guide for Python Code) and PEP 20 (The Zen of Python). GitHub 'Issues' is the main communication channel for reporting bugs, requesting new features, and discussing code contributions.

We intend to use it extensively in further research projects and will be happy to support users and collaborators for the next few years. EMBRAPA is hosting a demo of machado containing only 4 genomes at http://www.machado.cnptia.embrapa.br

--
------------------------------------------------------------------------
Scott Cain, Ph. D.  scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)  216-392-3087
Ontario Institute for Cancer Research