In the era of omics 'big data', and in particular next-generation sequencing (NGS), gene prioritization is a crucial task. It involves integrating huge amounts of heterogeneous data, then selecting and analyzing the genes predicted to be involved in a specific biological process, such as a pathology. Large sets of genes must be scored and ranked according to their similarity to known genes and their potential viability as candidates for important applications, such as diagnostic/prognostic markers, drug targets, etc. The biomedical community urgently needs a customizable and extensible framework for gene selection that can handle large-scale biological information from public as well as private data resources. To our knowledge, no other open-source framework for gene prioritization has previously been developed.
[GEPETTO] (GEne PrioriTization ExTended TOol) is an original open-source framework, distributed under the LGPL license, for gene selection and prioritization on a desktop computer, which ensures the confidentiality of personal data. It takes advantage of the data integration capabilities of the SM2PH-Central knowledgebase, combined with gene prioritization methods developed in-house. It currently incorporates six prioritization modules, based on gene sequence, protein-protein interactions, gene expression, disease-causing probabilities, protein evolution and genomic context.
[GEPETTO] is written in Java/Python and built on a modular architecture, so users can easily modify and extend it to include alternative scoring methods and new public/private data sources. In the future, we intend to extend the system from gene-level prioritization to variant-level prioritization, by exploiting the variant data in the MSV3D database. The [GEPETTO] software is available from the download page or on the SM2PH website.
The local prioritization methods are both gene-based and protein-based, so we have also extended [GEPETTO] to protein prioritization.
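To illustrate what a modular, extensible prioritization design of this kind can look like, the following Python sketch defines a plug-in scoring interface and combines several modules into one ranking. It is not the [GEPETTO] API: the names `PrioritizationModule`, `LookupModule` and `rank_genes`, the gene identifiers and the scores are all hypothetical, and mean-rank aggregation is an assumption chosen for simplicity rather than the combination method used by the tool.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class PrioritizationModule(ABC):
    """Hypothetical plug-in interface: a module scores every candidate gene."""

    @abstractmethod
    def score(self, candidates: List[str]) -> Dict[str, float]:
        """Return a score per candidate; higher means more promising."""


class LookupModule(PrioritizationModule):
    """Toy module backed by a precomputed score table (illustrative only)."""

    def __init__(self, name: str, scores: Dict[str, float]) -> None:
        self.name = name
        self._scores = scores

    def score(self, candidates: List[str]) -> Dict[str, float]:
        # Genes missing from the table receive the lowest score, 0.0.
        return {gene: self._scores.get(gene, 0.0) for gene in candidates}


def rank_genes(
    candidates: List[str], modules: List[PrioritizationModule]
) -> List[Tuple[str, float]]:
    """Order candidates by their mean rank across modules (rank 1 = best).

    Mean-rank aggregation is an assumption made for this sketch.
    Ties are broken by the input order, because Python's sort is stable.
    """
    if not modules:
        raise ValueError("at least one prioritization module is required")

    rank_sums = {gene: 0.0 for gene in candidates}
    for module in modules:
        scores = module.score(candidates)
        ordered = sorted(candidates, key=lambda g: scores[g], reverse=True)
        for position, gene in enumerate(ordered, start=1):
            rank_sums[gene] += position

    mean_ranks = [(gene, rank_sums[gene] / len(modules)) for gene in candidates]
    return sorted(mean_ranks, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Placeholder gene identifiers and made-up scores.
    genes = ["GENE_A", "GENE_B", "GENE_C"]
    expression = LookupModule(
        "expression", {"GENE_A": 0.9, "GENE_B": 0.2, "GENE_C": 0.5}
    )
    interactions = LookupModule(
        "interactions", {"GENE_A": 0.7, "GENE_B": 0.1, "GENE_C": 0.8}
    )
    for gene, mean_rank in rank_genes(genes, [expression, interactions]):
        print(gene, mean_rank)
```

In a design like this, adding an alternative scoring method or a private data source amounts to writing one more subclass of the interface; the ranking step itself is unchanged.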