Showing 46 open source projects for "dna sequence analysis"

  • 1
    doccano

    Open source annotation tool for machine learning practitioners

    doccano is an open-source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequence-to-sequence tasks. So, you can create labeled data for sentiment analysis, named entity recognition, text summarization and so on. Just create a project, upload data and start annotating. You can build a dataset in hours.
    Downloads: 4 This Week
  • 2
    Evo 2

    Genome modeling and design across all domains of life

    Evo 2 is a DNA language model system designed for long-context genome modeling and biological sequence design across all domains of life. The project models DNA at single-nucleotide resolution and supports context windows of up to one million base pairs, which places it in a class of models built for very large genomic reasoning tasks. According to the repository, it uses the StripedHyena 2 architecture, was pretrained with Savanna, and was trained autoregressively on the OpenGenome2 dataset containing 8.8 trillion tokens. ...
    Downloads: 0 This Week
  • 3
    Plaso

    Super timeline all the things

    Plaso (Plaso Langar Að Safna Öllu), or "super timeline all the things," is a Python-based engine designed for automatic creation of timelines in digital forensic investigations. It processes various log files and artifacts to generate a chronological sequence of events, aiding analysts in understanding system activities.
    Downloads: 3 This Week
  • 4
    DeepVariant

    DeepVariant is an analysis pipeline that uses a deep neural network

    DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data. DeepVariant is a deep learning-based variant caller that takes aligned reads (in BAM or CRAM format), produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports the results in a standard VCF or gVCF file.
    Downloads: 0 This Week
  • 5
    Python Zero to Hero for DevOps Engineers

    Learn Python from a DevOps engineer's point of view

    Python Zero to Hero for DevOps Engineers is a structured course laid out as a day-by-day learning path. The repository is organized into Day-01 through Day-19 folders plus a small sample app, which makes it easy to follow in sequence like a bootcamp. The curriculum starts with Python installation, environment setup, and writing your first script, then quickly moves into data types, strings, regular expressions, variables, and functions. It places a strong emphasis on DevOps-specific use cases: environment variables, command-line arguments, configuration handling, and automating log analysis or user management tasks are all explicitly woven into the exercises. ...
    Downloads: 2 This Week
  • 6
    relax

    Molecular dynamics by NMR data analysis

    The software package 'relax' is designed for the study of molecular dynamics through the analysis of experimental NMR data. Organic molecules, proteins, RNA, DNA, sugars, and other biomolecules are all supported. It supports exponential curve fitting for the calculation of the R1 and R2 relaxation rates, calculation of the NOE, reduced spectral density mapping, the Lipari and Szabo model-free analysis, study of domain motions via the N-state model and frame order dynamics theories using anisotropic NMR parameters such as RDCs and PCSs, the investigation of stereochemistry in dynamic ensembles, and the analysis of relaxation dispersion data.
    Downloads: 9 This Week
  • 7
    quantitative

    Quantized transactions python3

    ...As an open-source educational resource, it’s designed for Python users interested in automatic trading, algorithmic strategies, and financial data analysis.
    Downloads: 0 This Week
  • 8
    EduData

    Datasets in Education and convenient interface for dataset

    Datasets in education and a convenient interface for downloading and preprocessing them. The CLI tools quickly convert the "raw" data of a dataset into "mature" data for the knowledge tracing task. The "mature" data is in JSON sequence format and can be modeled by XKT and TKT (TBA). The dataset analysis tool only supports the JSON sequence format and is used to check statistical indexes of the dataset. To better verify the effectiveness of a model, the dataset is usually divided into train/valid/test splits or partitioned with the k-fold method. Each item in the sequence represents one interaction. ...
    Downloads: 0 This Week
  • 9
    Peptide Vaccine Analysis Tool (PVAT) is an optimization software that predicts the best possible peptide stretches in a given protein sequence based on two factors: 1. the surface exposure of the peptide stretches, and 2. their susceptibility to mutation.
    Downloads: 0 This Week
  • 10
    MToolBox

    A bioinformatics pipeline to analyze mtDNA from NGS data

    MToolBox is a highly automated bioinformatics pipeline to reconstruct and analyze human mitochondrial DNA from high throughput sequencing data. MToolBox includes an updated computational strategy to assemble mitochondrial genomes from Whole Exome and/or Genome Sequencing (PMID: 22669646) and an improved fragment-classify tool (PMID: 22139932) for haplogroup assignment, functional and prioritization analysis of mitochondrial variants. MToolBox provides pathogenicity scores, profiles of genome variability and disease-associations for mitochondrial variants. ...
    Downloads: 4 This Week
  • 11
    NLP-Models-Tensorflow

    Gathers machine learning and Tensorflow deep learning models for NLP

    NLP-Models-Tensorflow is a collection of natural language processing model implementations built using the TensorFlow deep learning framework. The repository provides numerous examples of neural network architectures used in modern NLP research and applications, including text classification, language modeling, machine translation, and sentiment analysis. Each model implementation is designed to illustrate how common NLP architectures operate, such as recurrent neural networks, convolutional...
    Downloads: 0 This Week
  • 12
    jieba

    Stuttering Chinese word segmentation

    "Jieba" Chinese word segmentation aims to be the best Python Chinese word segmentation component. Four word segmentation modes are supported. Precise mode tries to cut the sentence most precisely and is suitable for text analysis. Full mode scans out all the words that can be formed in the sentence; it is very fast but cannot resolve ambiguity. Search engine mode builds on precise mode and splits long words again to improve recall, which suits word segmentation in search engines. Paddle mode uses the PaddlePaddle deep learning framework to train a sequence labeling (bidirectional GRU) network model to achieve word segmentation. ...
    Downloads: 0 This Week
  • 13
    TI2BioP mainly calculates topological indices (spectral moments) derived from inferred and artificial 2D structures of DNA, RNA and proteins, making it possible to carry out structure-function correlation irrespective of sequence alignments. TI2BioP version 3.0 is a Python platform with a graphical interface designed for Windows, Linux and Mac OS.
    Downloads: 2 This Week
  • 14
    FOSS license

    FOSS license and sentence token

    We propose a method to mark license comments as sentence-tokens. We use the term sentence-token to refer to a sentence of a known license. A license (whether by inclusion or by reference) is a sequence of sentence-tokens. Sentence-tokens are generalized using one or more regular expressions. We propose an idea for license identification based on the analysis of each sentence in the license statement of a source code file.
    Downloads: 0 This Week
  • 15
    OpenSeq2Seq

    Toolkit for efficient experimentation with Speech Recognition

    OpenSeq2Seq is a TensorFlow-based toolkit for efficient experimentation with sequence-to-sequence models across speech and NLP tasks. Its core goal is to give researchers a flexible, modular framework for building and training encoder–decoder architectures while fully leveraging distributed and mixed-precision training. The toolkit includes ready-made models for neural machine translation, automatic speech recognition, speech synthesis, language modeling, and additional NLP tasks such as sentiment analysis. ...
    Downloads: 0 This Week
  • 16
    D-Tailor

    D-Tailor: automated analysis and design of DNA sequences

    Recent advances in DNA cloning and synthesis technologies afford high throughput implementation of designed sequences into living cells. However, our ability to design sequences to interrogate multifactorial biological processes and further engineer biological functions is lagging behind. DNA-Tailor (D-Tailor) is a fully extendable software framework for biological sequence analysis and multi-objective sequence design.
    Downloads: 1 This Week
  • 17
    BioSeq

    A simple GUI for some biological sequence analysis tasks.

    Downloads: 0 This Week
  • 18
    The Deep Review

    A collaboratively written review paper on deep learning, genomics, etc

    This repository is home to the Deep Review, a review article on deep learning in precision medicine. The Deep Review is collaboratively written on GitHub using a tool called Manubot (see below). The project operates on an open contribution model, welcoming contributions from anyone. To see what's incoming, check the open pull requests. For project discussion and planning see the Issues. As of writing, we are aiming to publish an update of the deep review. We will continue to make project...
    Downloads: 0 This Week
  • 19
    P3BSseq

    Parallel processing pipeline for analysis of bisulfite sequencing data

    Bisulfite sequencing (BSseq) processing is among the most cumbersome next generation sequencing (NGS) applications. Though some BSseq processing tools are available, they are scattered, require puzzling parameters, and are demanding in running time and memory usage. We have developed P3BSseq, a parallel processing pipeline for fast, accurate and automatic analysis of BSseq reads that trims, aligns, annotates, records the intermediate results, performs bisulfite conversion quality assessment,...
    Downloads: 0 This Week
  • 20
    Matplotlib tutorial

    Matplotlib tutorial for beginners

    The Matplotlib tutorial repository is designed as a hands-on learning resource to help users — especially Python beginners — get started with Matplotlib for creating plots and charts. It provides a sequence of example scripts and notebooks that cover fundamental plotting tasks: line graphs, histograms, scatter plots, bar charts, customizing axes, labels, legends, and styling. This makes it ideal for someone learning data analysis or exploratory data visualization for the first time and needing concrete, runnable examples rather than abstract explanations. ...
    Downloads: 0 This Week
  • 21
    ChIP-RNA-seqPRO

    ChIP-RNA-sequencing-processing (ChIP-RNA-seqPRO)

    ...Runnable Python scripts packaged together with customized annotation libraries, demo data input and README guide. 9/26: v1.1 Updated MAIN_IV to debug an error thrown by Python pandas no longer supporting 'subset'. This code will no longer be actively maintained/updated here. A cloud-based resource for comparative analysis of epigenetic, sequence variation, and expression datasets is now available. Please visit the Cloudomics project for cloud-based resources: https://sourceforge.net/projects/cloudomics-for-aws/
    Downloads: 0 This Week
  • 22
    Stochastic Rule Builder (SRB)

    Modeling framework for capturing positional and temporal dynamics

    ...There is growing evidence that transcriptional regulation is a complex behavior that emerges not solely from the individual components, but rather from their collective behavior, including competition and cooperation. Our framework describes individual regulatory components using generic action-oriented descriptions of their biochemical interactions with a DNA sequence. All the possible actions are based on the current state of factors bound to the DNA. We developed a rule builder to automatically generate the complete set of biochemical interaction rules for any given DNA sequence. Off-the-shelf stochastic simulation engines can model the behavior of a system of rules, and the resulting changes in the configuration of bound factors can be visualized.
    Downloads: 0 This Week
  • 23
    MSCViewer

    A tool for visualization and analysis of logs as sequence diagrams

    MSCViewer is a tool intended for debugging control flows in concurrent, distributed systems. The tool loads logs generated by various entities in the system and visualizes a sequence diagram of events and interactions. The diagram is fully interactive: entities can be added to or removed from the diagram and shuffled; events can be filtered, searched, highlighted and annotated with comments. MSCViewer features integration with a Python interpreter which allows writing Python scripts...
    Downloads: 0 This Week
  • 24
    Maximum Common Genome Alignment (MCGA)

    Pipeline for creating core genome alignments for phylogenetic analysis

    The Maximum Common Genome Alignment (MCGA) tool is a bioinformatics analysis tool written in Python for generating core genome alignments for bacterial whole genome sequences, which can be used to construct phylogenetic trees.
    Downloads: 0 This Week
  • 25
    irit_diff_sequences

    Python tool to create lifespan sequences from Wikipedia edits history

    A Python tool that produces lifespan sequences from edit history. The tool was first developed for Wikipedia edit history but can easily be adapted for other applications. From a database containing, for each article, its list of revisions, it produces one CSV file per article containing authored sequences and lifespans. Output format: i,j,lifespan,author with - i: beginning of the character sequence - j: end of the character sequence - lifespan: number of edits the sequence has survived...
    Downloads: 0 This Week
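
The `i,j,lifespan,author` output format described for irit_diff_sequences can be read with Python's standard `csv` module. The following is only a minimal sketch and not code from the project: the sample rows, the author names, the function names, and the assumptions that the file has no header row and that `i`, `j`, and `lifespan` are integers are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the listed "i,j,lifespan,author" layout.
# The rows, the author names, and the absence of a header row are assumptions.
SAMPLE = """\
0,12,3,alice
12,40,1,bob
40,55,5,alice
"""

def parse_lifespans(stream):
    """Yield (start, end, lifespan, author) tuples from a CSV stream."""
    for row in csv.reader(stream):
        if not row:
            continue  # skip blank lines
        i, j, lifespan, author = row
        yield int(i), int(j), int(lifespan), author

def mean_lifespan_by_author(records):
    """Average the lifespan of each author's character sequences."""
    totals = defaultdict(lambda: [0, 0])  # author -> [lifespan sum, count]
    for _, _, lifespan, author in records:
        totals[author][0] += lifespan
        totals[author][1] += 1
    return {author: total / count for author, (total, count) in totals.items()}

records = list(parse_lifespans(io.StringIO(SAMPLE)))
means = mean_lifespan_by_author(records)
```

For a real per-article file, an open file handle would take the place of the `io.StringIO` wrapper; an author field containing unquoted commas would break the four-way unpacking in this sketch.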