Showing 447 open source projects for "processing"

View related business solutions
  • Go from Code to Production URL in Seconds Icon
    Go from Code to Production URL in Seconds

    Cloud Run deploys apps in any language instantly. Scales to zero. Pay only when code runs.

    Skip the Kubernetes configs. Cloud Run handles HTTPS, scaling, and infrastructure automatically. Two million requests free per month.
    Try it free
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 1
    Keras Hub

    Keras Hub

    Pretrained model hub for Keras 3

    Keras Hub is a repository of pre-trained models for Keras 3, offering a collection of ready-to-use models for various machine-learning tasks. KerasHub is an extension of the core Keras API; KerasHub components are provided as Layer and Model implementations. If you are familiar with Keras, congratulations. You already understand most of KerasHub.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    Pyreft

    Pyreft

    ReFT: Representation Finetuning for Language Models

    PyreFT is a tool by Stanford NLP for fine-tuning transformer models with an emphasis on efficient, resource-conserving training and customizability for NLP tasks.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    AudioCraft

    AudioCraft

    Audiocraft is a library for audio processing and generation

    ...It also contains training code and recipes, so researchers can fine-tune on custom data or explore new objectives without building infrastructure from scratch. Example notebooks, CLI tools, and audio utilities help with prompt design, conditioning on reference audio, and post-processing to produce ready-to-share outputs.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 4
    AutoClip

    AutoClip

    AI-powered video clipping and highlight generation

    ...Once highlights are identified, AutoClip can automatically cut those segments and optionally assemble them into a compilation, thus greatly reducing manual video editing effort. It uses a modern web application stack with a front end (React + TypeScript) for user interaction and a back end that handles downloading, processing, clipping, and queue management, allowing real-time progress feedback and easy deployment, e.g. via Docker.
    Downloads: 12 This Week
    Last Update:
    See Project
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 5
    DOLMA

    DOLMA

    Data and tools for generating and inspecting OLMo pre-training data

    DOLMA (Data Optimization and Learning for Model Alignment) is a framework designed to manage large-scale datasets for training and fine-tuning language models efficiently.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    STORM

    STORM

    An LLM-powered knowledge curation system that researches topics

    STORM is an open-source virtual assistant framework developed by Stanford's OVAL lab. It is designed for creating natural language interfaces and assistants that can interact with APIs, databases, and services in a modular way.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Awesome Fraud Detection Research Papers

    Awesome Fraud Detection Research Papers

    A curated list of data mining papers about fraud detection

    A curated list of data mining papers about fraud detection from several conferences.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    llama.cpp Python Bindings

    llama.cpp Python Bindings

    Python bindings for llama.cpp

    llama-cpp-python provides Python bindings for llama.cpp, enabling the integration of LLaMA (Large Language Model Meta AI) language models into Python applications. This facilitates the use of LLaMA's capabilities in natural language processing tasks within Python environments.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    Lingua-Py

    Lingua-Py

    The most accurate natural language detection library for Python

    Its task is simple: It tells you which language some text is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages. Language detection is often done as part of large machine learning frameworks or natural language processing applications. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • AI-powered service management for IT and enterprise teams Icon
    AI-powered service management for IT and enterprise teams

    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity. Maximize operational efficiency with refreshingly simple, AI-powered Freshservice.
    Try it Free
  • 10
    Optax

    Optax

    Optax is a gradient processing and optimization library for JAX

    Optax is a gradient processing and optimization library for JAX. It is designed to facilitate research by providing building blocks that can be recombined in custom ways in order to optimize parametric models such as, but not limited to, deep neural networks. We favor focusing on small composable building blocks that can be effectively combined into custom solutions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    ...Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC. NGC collection of pre-trained speech processing models.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    Chinese-XLNet

    Chinese-XLNet

    Chinese XLNet pre-trained model

    Chinese-XLNet is a Chinese language pre-trained model based on the XLNet architecture, providing an advanced foundation for natural language processing tasks in Mandarin and other Chinese dialects. Unlike traditional masked language modeling, XLNet uses a permutation language modeling objective that captures bidirectional context more effectively by training over all possible token orderings, yielding richer contextual representations. This model is trained on large-scale Chinese text datasets to learn linguistic patterns, long-range dependencies, and semantic nuance typical of Chinese writing, making it useful for tasks like text classification, question answering, named entity recognition, and language generation. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 13
    LOTUS

    LOTUS

    AI-Powered Data Processing: Use LOTUS to process all of your datasets

    LOTUS is an open-source framework and query engine designed to enable efficient processing of structured and unstructured datasets using large language models. The system provides a declarative programming model that allows developers to express complex AI data operations using high-level commands rather than manually orchestrating model calls. It offers a Python interface with a Pandas-like API, making it familiar for data scientists and engineers already working with data analysis libraries. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    DeepSparse

    DeepSparse

    Sparsity-aware deep learning inference runtime for CPUs

    A sparsity-aware enterprise inferencing system for AI models on CPUs. Maximize your CPU infrastructure with DeepSparse to run performant computer vision (CV), natural language processing (NLP), and large language models (LLMs).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Detoxify

    Detoxify

    Trained models & code to predict toxic comments

    Detoxify is a deep learning-based tool for detecting and filtering toxic language in online conversations, leveraging Transformer models for high accuracy.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    OpenMed

    OpenMed

    Open source healthcare AI

    ...Its core purpose is to provide specialized medical entity extraction, PII detection and de-identification, assertion-aware analysis, and related healthcare text processing capabilities without locking users into a proprietary platform. The project includes a curated registry of more than a dozen medical NER models focused on areas such as diseases, drugs, anatomy, genes, and protected health information, and it is built to support both research and deployment scenarios. OpenMed can be used in three main ways: as a simple Python API for scripts and notebooks, as a Docker-friendly FastAPI service for backend integration, and as a batch-processing system for multi-document workflows.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 17
    Google Research: Language

    Google Research: Language

    Shared repository for open-sourced projects from the Google AI Lang

    Google Research: Language is a shared repository maintained by Google Research that contains open-source projects developed by the Google AI Language team. The repository hosts multiple subprojects related to natural language processing, machine learning, and large-scale language understanding systems. Many of the projects included in the repository correspond to research papers released by Google researchers and provide implementations of new NLP algorithms or experimental frameworks. These implementations often explore advanced techniques such as language modeling, semantic understanding, information retrieval, and multilingual text processing.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    python-small-examples

    python-small-examples

    Focus on creating classic Python small examples and cases

    python-small-examples is an open-source educational repository that contains hundreds of concise Python programming examples designed to illustrate practical coding techniques. The project focuses on teaching programming concepts through small, focused scripts that demonstrate common tasks in data processing, visualization, and general programming. Each example highlights a specific function or programming pattern so that learners can quickly understand how to apply Python features in real-world scenarios. The repository includes examples covering topics such as file processing, JSON manipulation, data visualization, and library usage. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    AI-Media2Doc

    AI-Media2Doc

    AI tool converting video/audio into structured documents instantly

    ...It is designed to transform multimedia inputs into formats such as knowledge notes, summaries, mind maps, and social-style articles, making content easier to review and reuse. AI-Media2Doc emphasizes privacy by processing media locally in the browser using WebAssembly-based ffmpeg, ensuring that original video files are not uploaded externally. It separates client-side media handling from backend AI processing, reducing data exposure while still enabling transcription and document generation. AI-Media2Doc supports flexible customization through prompts, allowing users to tailor output styles based on their needs. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20
    MineContext

    MineContext

    MineContext is your proactive context-aware AI partner

    ...Unlike traditional chat-based assistants, MineContext operates in the background and delivers proactive outputs such as daily summaries, task suggestions, and contextual reminders without requiring explicit prompts. It is built around a context engineering framework that manages the full lifecycle of data, including capture, processing, storage, retrieval, and consumption. The platform emphasizes privacy through a local-first architecture, allowing users to keep their data stored and processed on their own device rather than relying on external cloud services.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 21
    Natural Language Toolkit
    The Natural Language Toolkit (NLTK) is a widely used open-source Python library designed for working with human language data and building natural language processing (NLP) applications. It provides a comprehensive suite of modules, datasets, and tutorials that support both symbolic and statistical approaches to language processing. The toolkit includes implementations of many foundational NLP algorithms and utilities, enabling developers to perform tasks such as tokenization, stemming, parsing, classification, and semantic reasoning. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Datasets

    Datasets

    Hub of ready-to-use datasets for ML models

    Datasets is a library for easily accessing and sharing datasets, and evaluation metrics for Natural Language Processing (NLP), computer vision, and audio tasks. Load a dataset in a single line of code, and use our powerful data processing methods to quickly get your dataset ready for training in a deep learning model. Backed by the Apache Arrow format, process large datasets with zero-copy reads without any memory constraints for optimal speed and efficiency.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    VideoCaptioner

    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps.
    Downloads: 16 This Week
    Last Update:
    See Project
  • 24
    BEIR

    BEIR

    A Heterogeneous Benchmark for Information Retrieval

    BEIR is a benchmark framework for evaluating information retrieval models across various datasets and tasks, including document ranking and question answering.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25
    Haystack

    Haystack

    Haystack is an open source NLP framework to interact with your data

    Apply the latest NLP technology to your own data with the use of Haystack's pipeline architecture. Implement production-ready semantic search, question answering, summarization and document ranking for a wide range of NLP applications. Evaluate components and fine-tune models. Ask questions in natural language and find granular answers in your documents using the latest QA models with the help of Haystack pipelines. Perform semantic search and retrieve ranked documents according to meaning,...
    Downloads: 10 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB