Showing 127 open source projects for "wavelets image processing"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Cloud tools for web scraping and data extraction Icon
    Cloud tools for web scraping and data extraction

    Deploy pre-built tools that crawl websites, extract structured data, and feed your applications. Reliable web data without maintaining scrapers.

    Automate web data collection with cloud tools that handle anti-bot measures, browser rendering, and data transformation out of the box. Extract content from any website, push to vector databases for RAG workflows, or pipe directly into your apps via API. Schedule runs, set up webhooks, and connect to your existing stack. Free tier available, then scale as you need to.
    Explore 10,000+ tools
  • 1
    fastdup

    fastdup

    An unsupervised and free tool for image and video dataset analysis

    fastdup is a powerful free tool designed to rapidly extract valuable insights from your image & video datasets. Assisting you to increase your dataset images & labels quality and reduce your data operations costs at an unparalleled scale.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    POT

    POT

    Python Optimal Transport

    This open source Python library provides several solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    DreamCraft3D

    DreamCraft3D

    Official implementation of DreamCraft3D

    DreamCraft3D is DeepSeek’s generative 3D modeling framework / model family that likely extends their earlier 3D efforts (e.g. Shap-E or Point-E style models) with more capability, control, or expression. The name suggests a “dream crafting” metaphor—users probably supply textual or image prompts and generate 3D assets (point clouds, meshes, scenes). The repository includes model code, inference scripts, sample prompts, and possibly dataset preparation pipelines. It may integrate rendering or post-processing modules (e.g. mesh smoothing, texturing) to make the outputs more output-ready. Because 3D generation is hardware‐intensive, the repository likely also includes optimizations like quantization, pruning, or inference accelerations (e.g. using FlashMLA or DeepEP) to make the generation pipeline faster or more efficient. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4

    Image To Text tools

    ITTT is a Free tool designed to Scan and extract Text from Images.

    Image To Text Tools is a 100% Free user-friendly tool designed to Scan and extract containing text in images into editable text formats. Whether you need to extract text from scanned documents, photographs, or other image files, Image To Text Tools provides accurate and reliable Optical Character Recognition (OCR) capabilities to meet your needs.
    Downloads: 29 This Week
    Last Update:
    See Project
  • Grafana: The open and composable observability platform Icon
    Grafana: The open and composable observability platform

    Faster answers, predictable costs, and no lock-in built by the team helping to make observability accessible to anyone.

    Grafana is the open source analytics & monitoring solution for every database.
    Learn More
  • 5
    Readest

    Readest

    Readest is a modern, feature-rich ebook reader

    Readest is a project meant to facilitate reading, studying, or consuming content by integrating reading tools with AI-powered assistance. Although the repository is not as widely documented or popular as some, the idea is that Readest supports features to help with reading comprehension — likely combining OCR / text retrieval, translation, note-taking, or summarization for reading materials (eBooks, articles, PDFs). The goal appears to be to let users feed in arbitrary reading material and...
    Downloads: 27 This Week
    Last Update:
    See Project
  • 6
    Depth Pro

    Depth Pro

    Sharp Monocular Metric Depth in Less Than a Second

    Depth Pro is a foundation model for zero-shot metric monocular depth estimation, producing sharp, high-frequency depth maps with absolute scale from a single image. Unlike many prior approaches, it does not require camera intrinsics or extra metadata, yet still outputs metric depth suitable for downstream 3D tasks. Apple highlights both accuracy and speed: the model can synthesize a ~2.25-megapixel depth map in around 0.3 seconds on a standard GPU, enabling near real-time applications. The...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 7
    Weaviate

    Weaviate

    Weaviate is a cloud-native, modular, real-time vector search engine

    Weaviate in a nutshell: Weaviate is a vector search engine and vector database. Weaviate uses machine learning to vectorize and store data, and to find answers to natural language queries. With Weaviate you can also bring your custom ML models to production scale. Weaviate in detail: Weaviate is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers Semantic Search, Question-Answer-Extraction, Classification, Customizable...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 8
    AutoGluon

    AutoGluon

    AutoGluon: AutoML for Image, Text, and Tabular Data

    AutoGluon enables easy-to-use and easy-to-extend AutoML with a focus on automated stack ensembling, deep learning, and real-world applications spanning image, text, and tabular data. Intended for both ML beginners and experts, AutoGluon enables you to quickly prototype deep learning and classical ML solutions for your raw data with a few lines of code. Automatically utilize state-of-the-art techniques (where appropriate) without expert knowledge. Leverage automatic hyperparameter tuning, model selection/ensembling, architecture search, and data processing. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    MMDeploy

    MMDeploy

    OpenMMLab Model Deployment Framework

    ...It is a part of the OpenMMLab project. Models can be exported and run in several backends, and more will be compatible. All kinds of modules in the SDK can be extended, such as Transform for image processing, Net for Neural Network inference, Module for postprocessing and so on. Install and build your target backend. ONNX Runtime is a cross-platform inference and training accelerator compatible with many popular ML/DNN frameworks. Please read getting_started for the basic usage of MMDeploy.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Smart Business Texting that Generates Pipeline Icon
    Smart Business Texting that Generates Pipeline

    Create and convert pipeline at scale through industry leading SMS campaigns, automation, and conversation management.

    TextUs is the leading text messaging service provider for businesses that want to engage in real-time conversations with customers, leads, employees and candidates. Text messaging is one of the most engaging ways to communicate with customers, candidates, employees and leads. 1:1, two-way messaging encourages response and engagement. Text messages help teams get 10x the response rate over phone and email. Business text messaging has become a more viable form of communication than traditional mediums. The TextUs user experience is intentionally designed to resemble the familiar SMS inbox, allowing users to easily manage contacts, conversations, and campaigns. Work right from your desktop with the TextUs web app or use the Chrome extension alongside your ATS or CRM. Leverage the mobile app for on-the-go sending and responding.
    Learn More
  • 10
    Replicate Flux MCP

    Replicate Flux MCP

    MCP for Replicate Flux Model

    The Replicate Flux MCP is an advanced Model Context Protocol server that empowers AI assistants to generate high-quality images and vector graphics. It leverages Black Forest Labs' Flux Schnell model for raster images and Recraft's V3 SVG model for vector graphics via the Replicate API. ​
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    DataChain

    DataChain

    AI-data warehouse to enrich, transform and analyze unstructured data

    ...Datachain can persist features of Python objects returned by AI models, and enables vectorized analytical operations over them. The typical use cases are data curation, LLM analytics and validation, image segmentation, pose detection, and GenAI alignment. Datachain is especially helpful if batch operations can be optimized – for instance, when synchronous API calls can be parallelized or where an LLM API offers batch processing.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    HunyuanVideo

    HunyuanVideo

    HunyuanVideo: A Systematic Framework For Large Video Generation Model

    HunyuanVideo is a cutting-edge framework designed for large-scale video generation, leveraging advanced AI techniques to synthesize videos from various inputs. It is implemented in PyTorch, providing pre-trained model weights and inference code for efficient deployment. The framework aims to push the boundaries of video generation quality, incorporating multiple innovative approaches to improve the realism and coherence of the generated content. Release of FP8 model weights to reduce GPU...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 13

    BoofCV

    BoofCV is an open source Java library for real-time computer vision.

    ...Written from scratch for ease of use and high performance, it provides both basic and advanced features needed for creating a computer vision system. Functionality include optimized low level image processing routines (e.g. convolution, interpolation, gradient) to high level functionality such as image stabilization. Released under an Apache 2.0 license for both academic and commercial use.
    Leader badge
    Downloads: 25 This Week
    Last Update:
    See Project
  • 14
    GLM-4.5V

    GLM-4.5V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.5V is the preceding iteration in the GLM-V series that laid much of the groundwork for general multimodal reasoning and vision-language understanding. It embodies the design philosophy of mixing visual and textual modalities into a unified model capable of general-purpose reasoning, content understanding, and generation, while already supporting a wide variety of tasks: from image captioning and visual question answering to content recognition, GUI-based agents, video understanding,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    MegEngine

    MegEngine

    Easy-to-use deep learning framework with 3 key features

    MegEngine is a fast, scalable and easy-to-use deep learning framework with 3 key features. You can represent quantization/dynamic shape/image pre-processing and even derivation in one model. After training, just put everything into your model and inference it on any platform at ease. Speed and precision problems won't bother you anymore due to the same core inside. In training, GPU memory usage could go down to one-third at the cost of only one additional line, which enables the DTR algorithm. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 16
    Dolphin

    Dolphin

    Document Image Parsing via Heterogeneous Anchor Prompting”

    Dolphin — maintained by ByteDance — is a project aimed at providing a high-performance, robust, and extensible media or multimedia framework / player infrastructure (or possibly a streaming media solution), intended to meet modern demands for efficiency, flexibility, and integration in media-heavy applications. It seeks to combine performant media playback or handling (audio/video decoding, streaming, buffering) with a modular, developer-friendly API that allows easy embedding into larger...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    OAGI Python SDK

    OAGI Python SDK

    Python SDK for the Computer Use model Lux, developed by OpenAGI

    OAGI Python SDK is a Python client library for the Lux computer-use model that turns Lux into a programmable automation layer for operating human-facing software via vision and actions. It exposes the OAGI API in an ergonomic way, letting you trigger Lux in three main modes: Tasker for precise scripted sequences, Actor for fast one-shot tasks, and Thinker for open-ended, multi-step objectives. The SDK is designed around “computer use” as a paradigm, where the AI actually navigates...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    MiniMax-01

    MiniMax-01

    Large-language-model & vision-language-model based on Linear Attention

    MiniMax-01 is the official repository for two flagship models: MiniMax-Text-01, a long-context language model, and MiniMax-VL-01, a vision-language model built on top of it. MiniMax-Text-01 uses a hybrid attention architecture that blends Lightning Attention, standard softmax attention, and Mixture-of-Experts (MoE) routing to achieve both high throughput and long-context reasoning. It has 456 billion total parameters with 45.9 billion activated per token and is trained with advanced parallel...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    Jina

    Jina

    Build cross-modal and multimodal applications on the cloud

    Jina is a framework that empowers anyone to build cross-modal and multi-modal applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    SMILI

    SMILI

    Scientific Visualisation Made Easy

    ...The main sMILX application features for viewing n-D images, vector images, DICOMs, anonymizing, shape analysis and models/surfaces with easy drag and drop functions. It also features a number of standard processing algorithms for smoothing, thresholding, masking etc. images and models, both with graphical user interfaces and/or via the command-line. See our YouTube channel for tutorial videos via the homepage. The applications are all built out of a uniform user-interface framework that provides a very high level (Qt) interface to powerful image processing and scientific visualisation algorithms from the Insight Toolkit (ITK) and Visualisation Toolkit (VTK). ...
    Leader badge
    Downloads: 9 This Week
    Last Update:
    See Project
  • 21
    AI File Sorter

    AI File Sorter

    Local AI file organization with image-based rename suggestions

    AI File Sorter is a cross-platform desktop application that uses AI to organize all files and suggest better file names for image files, based on their visual content. The app can analyze picture files locally and suggest meaningful, human-readable names. For example, a generic file like IMG_2048.jpg can be renamed to clouds_over_lake.jpg. All rename and categorization suggestions are optional and must be reviewed and approved before anything is applied.
    Downloads: 280 This Week
    Last Update:
    See Project
  • 22
    PromptSniffer

    PromptSniffer

    View Extract & Remove AI generation metadata with right click

    A powerful tool for reading, extracting, and removing AI generation metadata from image files. Specifically designed to handle metadata from AI image generation tools like ComfyUI, Stable Diffusion, SwarmUI, InvokeAI, and more. Core Functionality Read EXIF/Metadata: Extract and display comprehensive metadata from images AI Metadata Detection: Automatically identify and highlight AI generation metadata Metadata Removal: Strip AI generation metadata while preserving image quality Batch Processing: Handle multiple files with wildcard patterns Cross-Platform: Works on Windows, macOS, and Linux AI Tool Support ComfyUI: Detects and extracts workflow JSON data Stable Diffusion: Identifies prompts, parameters, and generation settings SwarmUI/StableSwarmUI: Handles JSON-formatted metadata Midjourney, DALL-E, NovelAI: Recognizes generation signatures Automatic1111, InvokeAI: Extracts generation parameters
    Downloads: 7 This Week
    Last Update:
    See Project
  • 23
    ADAMS

    ADAMS

    ADAMS is a workflow engine for building complex knowledge workflows.

    ...This allows rapid development and easy maintenance of large workflows, with hundreds or thousands of operators. Operators include machine learning (WEKA, MOA, MEKA) and image processing (ImageJ, JAI, BoofCV, LIRE and Gnuplot). R available using Rserve. WEKA webservice allows other frameworks to use WEKA models. Fast prototyping with Groovy and Jython. Read/write support for various databases and spreadsheet applications.
    Leader badge
    Downloads: 4 This Week
    Last Update:
    See Project
  • 24
    cleanvideo-cli

    cleanvideo-cli

    CLI tool for removing watermarks from AI-generated videos using frame-

    cleanvideo-cli is a command-line tool designed to remove visible watermarks from AI-generated videos. It works by analyzing video frames and reconstructing the underlying pixels in watermark regions, without cropping or blurring the original content. This project is intended for developers, researchers, and creators who need a lightweight utility for cleaning preview or draft videos before further processing. Note: This tool does not bypass platform restrictions and should be used...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    Aphantasia

    Aphantasia

    CLIP + FFT/DWT/RGB = text to image/video

    This is a collection of text-to-image tools, evolved from the artwork of the same name. Based on CLIP model and Lucent library, with FFT/DWT/RGB parameterizes (no-GAN generation). Illustrip (text-to-video with motion and depth) is added. DWT (wavelets) parameterization is added. Check also colabs below, with VQGAN and SIREN+FFM generators. Tested on Python 3.7 with PyTorch 1.7.1 or 1.8.
    Downloads: 0 This Week
    Last Update:
    See Project