Showing 13 open source projects for "duplicate"

  • 1

    ParDRe

    Parallel tool to remove duplicate DNA reads

    ParDRe is a parallel tool to remove duplicate reads. Duplicate reads can be seen as identical or nearly identical sequences with some mismatches. The tool lets users avoid analyzing unnecessary reads, reducing the time of subsequent procedures with the dataset (e.g., assemblies, mappings, etc.). It is implemented with MPI in order to exploit the parallel capabilities of multicore clusters.
    Downloads: 0 This Week
    See Project
  • 2

    BBMap

    BBMap short read aligner, and other bioinformatic tools.

    ...Handles Illumina, PacBio, 454, and other reads; very high sensitivity and tolerant of errors and numerous large indels. Very fast. BBNorm: Kmer-based error-correction and normalization tool. Dedupe: Simplifies assemblies by removing duplicate or contained subsequences that share a target percent identity. Reformat: Reformats reads between fasta/fastq/scarf/fasta+qual/sam, interleaved/paired, and ASCII-33/64, at over 500 MB/s. BBDuk: Filters, trims, or masks reads with kmer matches to an artifact/contaminant file. ...and more!
    Downloads: 547 This Week
    See Project
  • 3

    Burny1250 terminal

    Program for transferring G-code to Burny 1250 or Burny 1250+

    The program transfers G-code to the controller of Burny1250 plasma cutting machines. If a postprocessor already exists for a modern CAM program, the only remaining problem is transferring the data to the Burny1250 controller. The program may be needed for machines: B & W SYSTEMS CNC Plasma Cutter (BWPC01) LOCKFORMER Vulcan NT 2000 Aviator XLT Innerlogic Proline 2200 Precision Plasma Cutter Innerlogic SR-45i CNC plasma C&G Aviator XL CNC plasma...
    Downloads: 1 This Week
    See Project
  • 4

    MarDRe

    MapReduce-based tool to remove duplicate DNA reads

    MarDRe is a de novo MapReduce-based parallel tool to remove duplicate and near-duplicate DNA reads through the clustering of single-end and paired-end sequences from FASTQ/FASTA datasets. This tool allows bioinformaticians to avoid analyzing unnecessary reads, reducing the time of subsequent procedures with the dataset. MarDRe is the Big Data counterpart of ParDRe (link above), which employs HPC technologies (i.e., hybrid MPI/multithreading) to reduce runtime on multicore systems. ...
    Downloads: 0 This Week
    See Project
  • 5

    DataCleaner

    Data quality analysis, profiling, cleansing, duplicate detection +more

    DataCleaner is a data quality analysis application and a solution platform for DQ solutions. Its core is a strong data profiling engine, which is extensible and thereby adds data cleansing, transformations, enrichment, deduplication, matching and merging. Website: http://datacleaner.github.io
    Downloads: 11 This Week
    See Project
  • 6

    ParDRR-MPI

    Parallel Duplicate Read Remover with MPI

    ParDRR-MPI is a parallel tool to remove duplicate reads. Duplicate reads can be seen as identical or nearly identical sequences with some mismatches. The tool lets users avoid analyzing unnecessary reads, reducing the time of subsequent procedures with the dataset (e.g., assemblies, mappings, etc.). It is implemented with MPI in order to exploit the parallel capabilities of multicore clusters.
    Downloads: 0 This Week
    See Project
  • 7

    xorlisp

    Bit level lambda continuations and nothing else - Queue automata

    Not working yet. To deal with the Halting Problem, computing and data are navigated using debugger ops: linearForward and treeForward, which navigate an astronomically large bit string where 1 is ( and 0 is ). All pairs are derived from (). For example, true is represented as ((()())()), and false is (()(()())). It appears related to the Church encoding of lambda, where T chooses the first parameter and F chooses the second, of a pair. Continuations are nearly finished code and are represented as a...
    Downloads: 0 This Week
    See Project
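
The bit-string convention in the xorlisp description (1 is "(" and 0 is ")") can be illustrated with a minimal Python sketch. It is not taken from the xorlisp code base, which the listing describes as not working yet; the function names `parens_to_bits`, `bits_to_parens`, and `is_balanced` are hypothetical and only show the stated mapping applied to the two example values.

```python
# Hypothetical helpers illustrating the stated mapping: '(' -> 1, ')' -> 0.
# These names are assumptions for illustration, not part of xorlisp.

TRUE_EXPR = "((()())())"   # representation of true given in the description
FALSE_EXPR = "(()(()()))"  # representation of false given in the description


def parens_to_bits(expr: str) -> str:
    """Translate a parenthesis string into a bit string."""
    mapping = {"(": "1", ")": "0"}
    return "".join(mapping[ch] for ch in expr)


def bits_to_parens(bits: str) -> str:
    """Translate a bit string back into parentheses."""
    mapping = {"1": "(", "0": ")"}
    return "".join(mapping[b] for b in bits)


def is_balanced(expr: str) -> bool:
    """Check that every ')' closes an earlier '(' and nothing stays open."""
    depth = 0
    for ch in expr:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0


if __name__ == "__main__":
    for label, expr in (("true", TRUE_EXPR), ("false", FALSE_EXPR)):
        print(label, expr, parens_to_bits(expr), is_balanced(expr))
```

Both example values are balanced and ten characters long, so each maps to a ten-bit string that round-trips through `bits_to_parens`.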
  • 8
    RouteConverter is another route conversion tool. It helps me and I hope it could help you.
    Downloads: 1 This Week
    See Project
  • 9

    QUASR

    Cross-platform NGS processing and analysis pipeline in Python

    QUASR is a lightweight pipeline written to process and analyse next-generation sequencing (NGS) data from Illumina, 454, and Ion Torrent platforms. Although originally written for viral data, it is generic enough to work on any NGS dataset. Functions include: duplicate removal, demultiplexing, primer-removal, quality-assurance (QA) graphing, quality control (QC), consensus-generation, minority-variant determination, and minority-variant graphing. The main current version is 6.X, which is written in Python3. 7.X is my rewrite in Java, but is still a work in progress. Both are written to be as lightweight as possible so they can run with minimal memory-requirements on a desktop or laptop as well as on a compute cluster. ...
    Downloads: 0 This Week
    See Project
  • 10
    PyPedal is a Python module that provides tools for the manipulation of pedigrees, simple visualization of pedigrees, and the calculation of measures of genetic diversity from pedigrees.
    Downloads: 3 This Week
    See Project
  • 11
    PAICE is a rapid bioinformatics pathway visualization tool for KEGG-compatible accessions derived from Illumina Solexa next-gen and Affymetrix datasets. It colors KEGG pathways while appreciating detection-calls and duplicate gene copies.
    Downloads: 0 This Week
    See Project
  • 12
    A high-performance implementation of bloom filters, a lightweight duplicate detection algorithm.
    Downloads: 0 This Week
    See Project
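
For readers unfamiliar with the technique named in entry 12, here is a minimal sketch of Bloom-filter-based duplicate detection in Python. It is a generic illustration, not the listed project's implementation; the class `BloomFilter`, the function `find_probable_duplicates`, and the default sizes are assumptions chosen for brevity. A Bloom filter never misses an item it has seen, but it may occasionally report an unseen item as a duplicate.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter sketch: a bit array plus several salted hashes."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "probably seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def find_probable_duplicates(items, bloom=None):
    """Return items that the filter reports as already seen, in order."""
    if bloom is None:
        bloom = BloomFilter()
    duplicates = []
    for item in items:
        if item in bloom:
            duplicates.append(item)
        else:
            bloom.add(item)
    return duplicates


if __name__ == "__main__":
    print(find_probable_duplicates(["a", "b", "a", "c", "b"]))
```

The memory cost is fixed by `num_bits` regardless of how many items pass through, which is why the approach counts as lightweight; the trade-off is a false-positive rate that grows as the bit array fills.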
  • 13
    This utility can process multiple POI (Points Of Interest) files (format CSV, comma separated), merge POI lists, find and eliminate (or mark) duplicates (POIs with similar coordinates, e.g. POIs having < 10m distance between them). See docs & screenshots
    Downloads: 0 This Week
    See Project
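
The distance-based duplicate rule mentioned in entry 13 (POIs less than 10 m apart) can be sketched in Python as follows. This is an illustrative assumption about how such a check might work, not the utility's actual code; the function names, the `(name, lat, lon)` tuple layout, the haversine formula, and the spherical Earth radius are all choices made for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def drop_near_duplicates(pois, threshold_m: float = 10.0):
    """Keep a POI only if it is at least threshold_m from every kept POI.

    pois is an iterable of (name, lat, lon) tuples; the first POI in a
    cluster of near-duplicates wins.
    """
    kept = []
    for name, lat, lon in pois:
        if all(haversine_m(lat, lon, klat, klon) >= threshold_m
               for _, klat, klon in kept):
            kept.append((name, lat, lon))
    return kept


if __name__ == "__main__":
    sample = [
        ("A", 48.0, 11.0),
        ("A-copy", 48.00003, 11.0),  # a few metres north of A
        ("B", 48.01, 11.0),          # roughly a kilometre north of A
    ]
    print([name for name, _, _ in drop_near_duplicates(sample)])
```

The pairwise comparison is quadratic in the number of POIs, which is fine for small lists; merging large files would call for a spatial index or grid bucketing instead.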