Search Results for "image processing" - Page 3

Showing 691 open source projects for "image processing"

1

reverse-SynthID

Reverse engineering Gemini's SynthID detection

Reverse-SynthID is a research-focused project that analyzes and reverse-engineers Google’s SynthID watermarking system used in AI-generated images. It leverages signal processing and spectral analysis techniques to identify hidden watermark patterns without access to proprietary encoding methods. The project introduces a multi-resolution “SpectralCodebook” that maps watermark characteristics across different image sizes. Using this approach, it can detect SynthID watermarks with high accuracy and selectively reduce or remove them through frequency-domain manipulation. ...

Downloads: 4 This Week

Last Update: 2 days ago
See Project
2

Scrimage

JVM - Java, Kotlin, Scala image processing library

Scrimage is an immutable, functional, and performant JVM library for the manipulation of images. The aim of this library is to provide a simple and concise way to do common image operations, such as resizing to fit a required width and height, converting between formats, applying filters, and so on. It is easy to use from any language on the JVM. A typical use case for this library would be creating thumbnails of images uploaded by users in a web app, bounding a set of product images so that...

Downloads: 4 This Week

Last Update: 2026-03-20
See Project
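
The thumbnail use case in the Scrimage description comes down to scaling an image so it fits inside a bounding box while keeping its aspect ratio. The sketch below shows only that arithmetic in plain Python. It is not Scrimage's API: the function name `bound_size` is hypothetical, and it assumes images already inside the box are left at their original size.

```python
def bound_size(width, height, max_width, max_height):
    """Return (w, h) scaled down to fit within the box, keeping aspect ratio.

    Hypothetical helper for illustration; assumes shrink-only behavior.
    """
    if min(width, height, max_width, max_height) <= 0:
        raise ValueError("all dimensions must be positive")
    # Pick the tighter of the two constraints; cap at 1.0 so small
    # images are never enlarged.
    scale = min(max_width / width, max_height / height, 1.0)
    # Round to whole pixels, never collapsing a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(bound_size(4000, 3000, 200, 200))
    print(bound_size(100, 50, 200, 200))
```

Taking the minimum of the two ratios guarantees that both sides fit; using the maximum instead would fill the box and overflow on one axis.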
3

PaddleNLP

Easy-to-use and powerful NLP library with Awesome model zoo

PaddleNLP is a natural language processing development library for PaddlePaddle with three major features: an easy-to-use text-domain API, application examples for multiple scenarios, and high-performance distributed training. It aims to improve the modeling and development efficiency of PaddlePaddle developers working with text, and provides rich examples of NLP applications. Provide rich industry-level pre-task capabilities...

Downloads: 4 This Week

Last Update: 2025-05-21
See Project
4

Benthos

Fancy stream processing made operationally mundane

Benthos is a high performance and resilient stream processor, able to connect various sources and sinks in a range of brokering patterns and perform hydration, enrichments, transformations and filters on payloads. It comes with a powerful mapping language, is easy to deploy and monitor, and ready to drop into your pipeline either as a static binary, docker image, or serverless function, making it cloud native as heck. Delivery guarantees can be a dodgy subject. Benthos processes and...

Downloads: 38 This Week

Last Update: 2 days ago
See Project
5

POT

Python Optimal Transport

This open source Python library provides several solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning.

Downloads: 20 This Week

Last Update: 2025-09-22
See Project
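
As a rough illustration of the kind of problem the POT solvers address, the sketch below computes the Wasserstein-1 distance between two equal-size one-dimensional samples with uniform weights. In that special case the optimal plan simply pairs sorted values, so no solver is needed. The code uses only the Python standard library rather than POT's API, and the function name `wasserstein_1d_uniform` is hypothetical.

```python
def wasserstein_1d_uniform(xs, ys):
    """W1 distance between two equal-size 1-D samples with uniform weights.

    Hypothetical helper for illustration; not part of the POT library.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the optimal transport plan matches the i-th
    # smallest point of xs with the i-th smallest point of ys.
    pairs = zip(sorted(xs), sorted(ys))
    # Each point carries mass 1/n, so the cost is the mean displacement.
    return sum(abs(x - y) for x, y in pairs) / len(xs)


if __name__ == "__main__":
    print(wasserstein_1d_uniform([0, 1, 2], [1, 2, 3]))
```

General problems (unequal weights, higher dimensions, regularization) do not reduce to sorting, which is where a dedicated solver library becomes useful.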
6

OpenAI Quickstart Node

Node.js example app from the OpenAI API quickstart tutorial

...The repository provides structured sample code for a variety of API endpoints, including chat completions, assistants, embeddings, fine-tuning, moderation, batch processing, and image generation. Each folder contains runnable scripts that demonstrate both basic usage and more advanced scenarios. By following the examples, developers can quickly understand how to authenticate with an API key, send requests, and handle responses within a Node.js environment. The project is a practical starting point for building AI-powered applications, serving as a foundation for experimentation and integration into larger projects. ...

Downloads: 5 This Week

Last Update: 1 day ago
See Project
7

Google Highway

Performance-portable, length-agnostic SIMD with runtime dispatch

...This portability is achieved through dynamic or static dispatch mechanisms that select the best available instruction set at runtime or compile time. The library is designed for developers who need to maximize CPU performance in domains such as image processing, compression, cryptography, and scientific computing.

Downloads: 4 This Week

Last Update: 4 days ago
See Project
8

ComfyUI Examples

Examples of ComfyUI workflows

ComfyUI_examples is the companion repository for ComfyUI that collects ready-made example workflows, nodes, and compositions to help users learn the node-based interface for AI image generation. Instead of starting from an empty graph, you can open an example and see how prompts, samplers, models, and image processing steps are wired together. This makes ComfyUI more approachable for people coming from “one text box” generators, because they can reverse-engineer complex pipelines visually. The examples also serve as references for best practices like model loading order, latent handling, upscaling chains, and conditioning. ...

Downloads: 4 This Week

Last Update: 2025-11-26
See Project
9

Python API for JMComic

Python crawler and API for downloading JMComic albums and images

JMComic-Crawler-Python is a Python library and crawler framework designed to programmatically access and download comic content from the JMComic platform. It provides a structured API that allows developers to retrieve albums, chapters, and images using simple Python code while handling the necessary network requests and data processing behind the scenes. It supports both web-based and mobile API interfaces, enabling flexible interaction with the platform depending on the available...

Downloads: 11 This Week

Last Update: 4 days ago
See Project
10

Depth Anything 3

Recovering the Visual Space from Any Views

Depth Anything 3 is a research-driven project that brings accurate and dense depth estimation to any input image or video, enabling foundational understanding of 3D structure from 2D visual content. Designed to work across diverse scenes, lighting conditions, and image types, it uses advanced neural networks trained on large, heterogeneous datasets, producing depth maps that reveal scene depth relationships and object surfaces with strong fidelity.

Downloads: 2 This Week

Last Update: 2026-03-21
See Project
11

Computer Vision in Action

A computer vision closed-loop learning platform

Computer Vision in Action is a practical, example-rich repository that demonstrates real-world applications of computer vision techniques and algorithms in Python, often using OpenCV, deep learning models, and related tooling. It serves as a hands-on companion for learners and engineers who want to understand not just the theory, but how computer vision is actually implemented for tasks like object detection, image classification, feature tracking, optical flow, and image segmentation. The...

Downloads: 3 This Week

Last Update: 2026-02-17
See Project
12

AI App Lab

Implementing large models into scenario-based applications

...The project focuses on helping developers bridge the gap between AI models and practical business use cases by offering a structured environment for creating production-ready AI systems. It includes a high-level SDK called Arkitect, which provides workflows and tools for integrating models, plugins, and multimodal capabilities such as text, image, and voice processing. The repository also contains a large collection of prototype applications that demonstrate how AI can be applied to scenarios such as customer service, education, content generation, and mobile automation. These examples allow developers to quickly replicate and customize solutions for their own business needs.

Downloads: 3 This Week

Last Update: 2026-03-17
See Project
13

fastdup

An unsupervised and free tool for image and video dataset analysis

fastdup is a free tool designed to rapidly extract insights from your image and video datasets. It helps you improve the quality of your dataset's images and labels and reduce your data operations costs at scale.

Downloads: 1 This Week

Last Update: 2024-08-16
See Project
14

DALI

A GPU-accelerated library containing highly optimized building blocks

The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provides a collection of highly optimized building blocks for loading and processing image, video and audio data. It can be used as a portable drop-in replacement for built-in data loaders and data iterators in popular deep learning frameworks. Deep learning applications require complex, multi-stage data processing pipelines that include loading, decoding, cropping, resizing, and many other augmentations. ...

Downloads: 1 This Week

Last Update: 2026-02-19
See Project
15

DeepSeek-OCR

Contexts Optical Compression

...It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. It supports local deployment, enabling organizations concerned about privacy or latency to run the pipeline on-premises rather than send sensitive documents to third-party cloud services. The codebase is written in Python with a focus on modularity: you can swap preprocessing, recognition, and post-processing components as needed for custom workflows.

Downloads: 3 This Week

Last Update: 2026-01-27
See Project
16

StableSwarmUI

Multi-user UI tool for managing and running Stable Diffusion workflows

StableSwarmUI is a web-based interface designed to manage and coordinate Stable Diffusion image generation workflows in a multi-user environment. It focuses on enabling multiple users to interact with shared resources, making it suitable for collaborative or server-based deployments. It provides a centralized system where users can submit, monitor, and manage generation tasks through a browser interface. It abstracts much of the complexity involved in running diffusion models by offering a structured environment for handling prompts, outputs, and processing queues. ...

Downloads: 5 This Week

Last Update: 2026-03-18
See Project
17

LocalAI

The free, Open Source alternative to OpenAI, Claude and others

...It acts as a drop-in replacement for APIs such as OpenAI, enabling developers to build AI-powered applications without relying on external cloud services. The platform supports a wide range of model types, including text generation, image creation, speech processing, and embeddings. LocalAI can run on consumer-grade hardware and does not necessarily require a GPU, making it accessible for local development and private deployments. It integrates with multiple backends like llama.cpp, transformers, and diffusers to support different AI workloads. With its self-hosted architecture and OpenAI-compatible API, LocalAI enables developers to build secure, local-first AI applications.

Downloads: 31 This Week

Last Update: 5 days ago
See Project
18

BookStack

Simple & Free Wiki Software

BookStack is a free and open source platform for storing and organising information and documentation. A self-hosted and opinionated wiki system, BookStack is simple and easy to use, giving even new users with just basic word-processing skills a pleasant out of the box experience. BookStack offers a relaxed, open and positive approach. While the platform can provide advanced power features to those who want them, it is primarily designed not to be extensible outside of its core purpose. That being said, BookStack already comes with plenty of powerful features, such as search and linking, cross-book sorting, image management and more. ...

Downloads: 4 This Week

Last Update: 6 days ago
See Project
19

RestorePhotos.io

Restoring old and blurry face photos with AI

RestorePhotos.io is an AI web app for restoring old, blurry, or low-quality face photos and bringing them back to life. It wraps the GFPGAN model (served via Replicate) behind a friendly Next.js front end, so non-technical users can upload an image and receive an enhanced version without ever touching ML code. The workflow is straightforward: you upload a photo, the serverless API route sends it to Replicate, and the restored image is returned and displayed in the UI. The project is production-oriented, not just a toy: it uses Bytescale for storage and image processing, Vercel for hosting and serverless functions, Auth.js + Neon for authentication and database, and Upstash Redis for rate limiting. ...

Downloads: 1 This Week

Last Update: 2025-11-19
See Project
20

MDCx

Movie metadata scraper and organizer for media libraries and NFO

...MDCx can download information such as titles, cast data, artwork, and other metadata, then generate standardized NFO files compatible with media management systems. It also supports image processing tasks such as downloading and cropping artwork used by media centers. It includes several interfaces, allowing users to operate it through a graphical desktop application, a browser-based web interface, or command-line utilities depending on their workflow. Its architecture separates core scraping logic from the user interfaces, allowing the same metadata processing system to be reused across different modes.

Downloads: 10 This Week

Last Update: 2026-03-10
See Project
21

HunyuanDiT

Diffusion Transformer with Fine-Grained Chinese Understanding

HunyuanDiT is a high-capability text-to-image diffusion transformer with bilingual (Chinese/English) understanding and multi-turn dialogue capability. It trains a diffusion model in latent space using a transformer backbone and integrates a Multimodal Large Language Model (MLLM) to refine captions and support conversational image generation. It supports adapters like ControlNet, IP-Adapter, LoRA, and can run under constrained VRAM via distillation versions. LoRA, ControlNet (pose, depth,...

Downloads: 0 This Week

Last Update: 2025-11-27
See Project
22

AutoGluon

AutoGluon: AutoML for Image, Text, and Tabular Data

AutoGluon enables easy-to-use and easy-to-extend AutoML with a focus on automated stack ensembling, deep learning, and real-world applications spanning image, text, and tabular data. Intended for both ML beginners and experts, AutoGluon enables you to quickly prototype deep learning and classical ML solutions for your raw data with a few lines of code. Automatically utilize state-of-the-art techniques (where appropriate) without expert knowledge. Leverage automatic hyperparameter tuning, model selection/ensembling, architecture search, and data processing. ...

Downloads: 2 This Week

Last Update: 2025-12-19
See Project
23

LandPPT

An LLM-based presentation generation platform

LandPPT is an open-source AI platform that automatically generates professional presentation slides using large language models. The system allows users to create complete PowerPoint presentations simply by entering a topic or uploading source documents such as PDFs, Word files, or Markdown notes. Using natural language processing and structured content generation, the platform produces presentation outlines and converts them into fully formatted slide decks. The application integrates...

Downloads: 3 This Week

Last Update: 2026-03-06
See Project
24

LearnOpenCV

C++ and Python Examples

LearnOpenCV is a large educational repository that provides practical computer vision and deep learning examples in both Python and C++. The project accompanies the LearnOpenCV blog and contains hundreds of hands-on tutorials covering topics such as object detection, image processing, pose estimation, and neural networks. It is structured as a learning resource where each directory corresponds to a specific article or technical walkthrough. The repository supports beginners and advanced practitioners by offering reproducible code that demonstrates real-world computer vision techniques. Many examples integrate popular frameworks like PyTorch, OpenCV, and ONNX to reflect modern AI workflows. ...

Downloads: 4 This Week

Last Update: 5 days ago
See Project
25

MATLAB Deep Learning Model Hub

Discover pretrained models for deep learning in MATLAB

Discover pretrained models for deep learning in MATLAB. Pretrained image classification networks have already learned to extract powerful and informative features from natural images. Use them as a starting point to learn a new task using transfer learning. Inputs are RGB images; the output is the predicted label and score.

Downloads: 1 This Week

Last Update: 2024-10-11
See Project