Showing 186 open source projects for "robust"

  • 1
    MCPM.sh

    CLI MCP package manager & registry for all platforms and all clients

    ...With its advanced router and profile features, mcpm.sh simplifies the management of complex MCP environments, supporting clients like Claude Desktop, Cursor, and Windsurf. The tool is built with Python and leverages the Click framework for its CLI, ensuring a robust and user-friendly experience.
    Downloads: 0 This Week
  • 2
    Dshell

    Dshell is a network forensic analysis framework

    An extensible network forensic analysis framework. Enables rapid development of plugins to support the dissection of network packet captures. This is a major framework update to Dshell. Plugins written for the previous version are not compatible with this version, and vice versa. By extension, dpkt and pypcap have been replaced with Python3-friendly pypacker and pcapy (respectively). Enables development of external plugin packs, allowing the sharing and installation of new,...
    Downloads: 0 This Week
  • 3
    Omnilingual ASR

    Omnilingual ASR Open-Source Multilingual Speech Recognition

    ...It emphasizes modularity: acoustic modeling, language modeling, tokenization, and decoding are separable pieces you can swap or ablate. The repo is aimed at pushing practical multilingual ASR—robust to accents, code-switching, and domain shifts—rather than language-by-language systems. For practitioners, it’s a starting point to study transfer, zero-shot behavior, and trade-offs between model size, compute cost, and coverage.
    Downloads: 1 This Week
  • 4
    JEPA

    PyTorch code and models for V-JEPA self-supervised learning from video

    JEPA (Joint-Embedding Predictive Architecture) captures the idea of predicting missing high-level representations rather than reconstructing pixels, aiming for robust, scalable self-supervised learning. A context encoder ingests visible regions and predicts target embeddings for masked regions produced by a separate target encoder, avoiding low-level reconstruction losses that can overfit to texture. This makes learning focus on semantics and structure, yielding features that transfer well with simple linear probes and minimal fine-tuning. ...
    Downloads: 1 This Week
  • 5
    VGGT

    [CVPR 2025 Best Paper Award] VGGT

    ...The design emphasizes consistent geometric reasoning: outputs from one head (e.g., correspondences or tracks) reinforce others (e.g., pose or depth), making the system more robust to challenging viewpoints and textures. The repo provides inference pipelines to estimate geometry from monocular inputs, stereo pairs, or brief sequences, together with evaluation harnesses for common geometry benchmarks. Training utilities highlight data curation and augmentations that preserve geometric cues while improving generalization across scenes and cameras.
    Downloads: 1 This Week
  • 6
    MiniCPM-o

    A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming

    ...Capable of running on end-side devices such as smartphones and tablets, it provides powerful features like real-time speech conversation, video understanding, and multimodal live streaming. With 8 billion parameters, MiniCPM-o 2.6 surpasses its predecessors in versatility and efficiency, making it one of the most robust models available. It supports both text and audio inputs to generate outputs in various forms, including voice cloning, emotion control, and interactive role-playing.
    Downloads: 1 This Week
  • 7
    DocTR

    Library for OCR-related tasks powered by Deep Learning

    ...Seamlessly process documents for Natural Language Understanding tasks: we provide OCR predictors to parse textual information (localize and identify each word) from your documents. Robust 2-stage (detection + recognition) OCR predictors with pretrained parameters. User-friendly, 3 lines of code to load a document and extract text with a predictor. State-of-the-art performance on public document datasets, comparable with GoogleVision/AWS Textract. Easy integration (available templates for browser demo & API deployment). ...
    Downloads: 1 This Week
  • 8
    Qwen2.5-Omni

    Capable of understanding text, audio, vision, video

    Qwen2.5-Omni is an end-to-end multimodal flagship model in the Qwen series by Alibaba Cloud, designed to process multiple modalities (text, images, audio, video) and generate responses as both text and natural speech, streamed in real time. It uses a “Thinker-Talker” architecture and introduces innovations for aligning modalities over time (for example, synchronizing video and audio), robust speech generation, and low-VRAM/quantized versions that make the model more accessible. It achieves state-of-the-art performance on many multimodal benchmarks, particularly in spoken language understanding, audio reasoning, and image/video understanding, and often matches or outperforms single-modality models at a similar scale. ...
    Downloads: 4 This Week
  • 9
    IMS Toucan

    Controllable and fast Text-to-Speech for over 7000 languages

    IMS-Toucan is a toolkit for training, using, and teaching state-of-the-art text-to-speech systems, built at the Institute for Natural Language Processing (IMS), University of Stuttgart. It is the official home of ToucanTTS, a massively multilingual TTS system designed to support over 7,000 languages with a single unified framework. The toolkit focuses on being fast and controllable while not requiring huge amounts of compute, making it practical for research labs and smaller teams. It...
    Downloads: 1 This Week
  • 10
    OSS-Fuzz Gen

    LLM powered fuzzing via OSS-Fuzz

    ...Reports highlight what functions were targeted, how coverage evolved, and where manual hints could unlock more paths. The goal is pragmatic: shrink the gap between “we should fuzz this” and “we have robust fuzzing running in CI,” especially for understaffed maintainers.
    Downloads: 1 This Week
  • 11
    DeepSeek VL2

    Mixture-of-Experts Vision-Language Models for Advanced Multimodal

    ...While the internal architecture details are not fully documented publicly, the repo suggests that VL2 introduces enhancements over prior vision-language models (e.g. better scaling, cross-modal attention, more robust alignment) to improve grounding and multimodal understanding.
    Downloads: 1 This Week
  • 12
    Kaldi

    kaldi-asr/kaldi is the official location of the Kaldi project

    ...The toolkit is widely used in both academia and industry due to its flexibility, extensibility, and strong community support. Kaldi is designed for researchers who need a highly customizable environment to experiment with new algorithms, as well as for practitioners who want robust, production-ready ASR pipelines. It includes extensive tools for data preparation, feature extraction, acoustic and language modeling, decoding, and evaluation. With its modular design, Kaldi allows users to adapt the system to a wide range of languages and domains. As one of the most influential projects in speech recognition, it has become a foundation for much of the modern work in ASR.
    Downloads: 1 This Week
  • 13
    rate.sx

    Curl cryptocurrency exchange rates

    ...The service supports multiple coins, fiat conversions, and historical lookups so you can compare prices over time without leaving the terminal. Under the hood it aggregates price feeds and normalizes the results for robust querying, yet keeps the interface dead simple. Because it’s HTTP-based, you can integrate it into shell prompts, tmux status lines, CI logs, dashboards, or chat bots. It’s ideal for developers and power users who want quick answers and scriptable endpoints without installing a heavy client.
    Downloads: 0 This Week
  • 14
    Watermark Anything

    Official implementation of Watermark Anything with Localized Messages

    Watermark Anything (WAM) is an advanced deep learning framework for embedding and detecting localized watermarks in digital images. Developed by Facebook Research, it provides a robust, flexible system that allows users to insert one or multiple watermarks within selected image regions while maintaining visual quality and recoverability. Unlike traditional watermarking methods that rely on uniform embedding, WAM supports spatially localized watermarks, enabling targeted protection of specific image regions or objects. ...
    Downloads: 0 This Week
  • 15
    Nevergrad

    A Python toolbox for performing gradient-free optimization

    Nevergrad is a Python library for derivative-free optimization, offering robust implementations of many algorithms suited for black-box functions (i.e. functions where gradients are unavailable or unreliable). It targets hyperparameter search, architecture search, control problems, and experimental tuning—domains in which gradient-based methods may fail or be inapplicable. The library provides an easy interface to define an optimization problem (parameter space, loss function, budget) and then experiment with multiple strategies—evolutionary algorithms, Bayesian optimization, bandit methods, genetic algorithms, etc. ...
    Downloads: 0 This Week
  • 16
    whisper-timestamped

    Multilingual Automatic Speech Recognition with word-level timestamps

    Multilingual Automatic Speech Recognition with word-level timestamps and confidence. Whisper is a set of multi-lingual, robust speech recognition models trained by OpenAI that achieve state-of-the-art results in many languages. Whisper models were trained to predict approximate timestamps on speech segments (most of the time with 1-second accuracy), but they cannot originally predict word timestamps. This repository proposes an implementation to predict word timestamps and provide a more accurate estimation of speech segments when transcribing with Whisper models. ...
    Downloads: 0 This Week
  • 17
    Agently 4

    Build GenAI applications quickly and easily

    Agently is a Python framework for building generative-AI (“GenAI”) applications; it focuses on enabling developers to orchestrate AI agents, workflows, and event-driven logic in a robust, reusable way. With Agently, one can define agents that call different models, chain tasks, trigger workflows based on events, and switch models with minimal code changes. It abstracts away boilerplate around model API calls, tool usage, prompt management, and workflow state. The project aims at production-grade GenAI application development rather than just one-off scripts — you’ll find examples of news gathering, agentic workflows, control systems, etc. ...
    Downloads: 0 This Week
  • 18
    Double Conversion

    Efficient binary-decimal & decimal-binary conversion routines for IEEE

    Double Conversion is a high-performance C++ library that provides precise and efficient binary-decimal and decimal-binary conversion routines for IEEE 754 double-precision floating-point numbers. Originally extracted from the V8 JavaScript engine, it was refactored into a standalone library to make its robust number conversion algorithms easily reusable in other projects. The library ensures consistent and accurate results for converting between double values and their string representations, avoiding rounding errors and performance bottlenecks common in standard conversion routines. It is optimized for both speed and correctness, making it ideal for numerical computation libraries, serialization systems, and scripting engines. ...
    Downloads: 0 This Week
  • 19
    UpTrain

    Your open-source LLM evaluation toolkit

    ...UpTrain continuously monitors your application's performance on multiple evaluation criteria and alerts you to any regressions, with automatic root cause analysis. UpTrain enables fast and robust experimentation across multiple prompts, model providers, and custom configurations by calculating quantitative scores for direct comparison and optimal prompt selection. Hallucinations have plagued LLMs since their inception. By quantifying the degree of hallucination and the quality of retrieved context, UpTrain helps detect responses with low factual accuracy and block them before they are served to end users. ...
    Downloads: 0 This Week
  • 20
    requests-cache

    Persistent HTTP cache for Python requests

    ...Use Cache-Control and other standard HTTP headers, define your own expiration schedule, and keep your cache clutter-free with backends that natively support TTL or any combination of strategies. Works out of the box with zero config, but with a robust set of features for configuring and extending the library to suit your needs.
    Downloads: 0 This Week
  • 21
    DreamO

    A Unified Framework for Image Customization

    DreamO is a unified, open-source framework from ByteDance for advanced image customization and generation that consolidates multiple “image manipulation” tasks into a single system, rather than requiring separate specialized models. Built on a diffusion-transformer (DiT) backbone, it supports a diverse set of tasks — including identity preservation, virtual “try-on” (e.g. clothing, accessories), style transfer, IP adaptation (objects/characters), and layout/condition-aware customizations —...
    Downloads: 0 This Week
  • 22
    marqo

    Tensor search for humans

    A tensor-based search and analytics engine that seamlessly integrates with your applications, websites, and workflows. Marqo is a versatile and robust search and analytics engine that can be integrated into any website or application. Due to horizontal scalability, Marqo provides lightning-fast query times, even with millions of documents. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images. It can seamlessly handle image-to-image, image-to-text and text-to-image search and analytics. ...
    Downloads: 0 This Week
  • 23
    Stanza

    Stanford NLP Python library for many human languages

    Stanza is a collection of accurate and efficient tools for the linguistic analysis of many human languages. Starting from raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of...
    Downloads: 0 This Week
  • 24
    UI-TARS

    UI-TARS-desktop version that can operate on your local personal device

    ...This allows it to perform complex, multi-step tasks such as filling forms, downloading files, navigating applications, and even controlling in-game actions, all by understanding the UI as a human would. The project is open-source, supports deployment locally or remotely, and offers a foundation for building GUI automation agents that are more robust and adaptable.
    Downloads: 0 This Week
  • 25
    Step-Audio-EditX

    LLM-based Reinforcement Learning audio edit model

    ...This allows users to modify not only what is said (the text) but also how it's said: emotion, tone, speaking style, prosody, accent, even paralinguistic cues. Because the model is trained with a “large-margin learning” objective over many synthesized and natural speech samples, it gains robust control over expressive attributes, and can perform iterative editing: e.g. you could record a line, then ask the model to “make it sadder,” “speak slower,” or “change accent to X.”
    Downloads: 0 This Week