Search Results for "video image extractor" - Page 4

Showing 178 open source projects for "video image extractor"

  • 1
    Dolphin

    Document Image Parsing via Heterogeneous Anchor Prompting

    Dolphin, maintained by ByteDance, is a document image parsing project: it takes document images as input and extracts their content using heterogeneous anchor prompting.
    Downloads: 0 This Week
  • 2
    Paper2GUI

    Convert AI papers to GUI

    ...Make cutting-edge AI technology simple and convenient for everyone to use. Paper2GUI: an AI desktop app toolbox for ordinary people. It runs immediately without installation and already supports 40+ AI models, covering AI painting, speech synthesis, video frame interpolation, video super-resolution, object detection, image stylization, OCR, and other fields. It supports Windows, Mac, and Linux.
    Downloads: 0 This Week
  • 3
    GLM-4.6V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.6V represents the latest generation of the GLM-V family and marks a major step forward in multimodal AI by combining advanced vision-language understanding with native “tool-call” capabilities, long-context reasoning, and strong generalization across domains. Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and...
    Downloads: 2 This Week
  • 4
    ComfyUI-3D-Pack

    An extensive node suite that enables ComfyUI to process 3D inputs

    ...ComfyUI itself is a node-based interface for designing and executing generative AI pipelines, and this extension expands its capabilities by introducing nodes specifically designed for working with three-dimensional data. The package allows the platform to process inputs such as meshes and UV textures and integrate them into generative workflows similar to those used for image and video generation. It incorporates modern 3D generation technologies including neural radiance fields, Gaussian splatting, and other AI-driven reconstruction techniques. Through these nodes, users can convert images into 3D models, manipulate geometry, and experiment with generative 3D workflows inside the visual pipeline editor.
    Downloads: 2 This Week
  • 5
    Lyra 2

    Project Lyra: Open Generative 3D World Models

    The Lyra 2 project is a research-driven framework developed by NVIDIA that focuses on building open generative 3D world models using advanced diffusion-based techniques. It enables the creation of fully explorable 3D environments from minimal inputs such as a single image or video, leveraging self-distillation methods to generate consistent spatial representations. The system evolves across versions, with newer iterations introducing long-horizon generation and improved 3D consistency across frames. It combines elements of computer vision, generative modeling, and spatial intelligence to produce dynamic and navigable virtual worlds. ...
    Downloads: 0 This Week
  • 6
    JEPA

    PyTorch code and models for V-JEPA self-supervised learning from video

    JEPA (Joint-Embedding Predictive Architecture) captures the idea of predicting missing high-level representations rather than reconstructing pixels, aiming for robust, scalable self-supervised learning. A context encoder ingests visible regions and predicts target embeddings for masked regions produced by a separate target encoder, avoiding low-level reconstruction losses that can overfit to texture. This makes learning focus on semantics and structure, yielding features that transfer well...
    Downloads: 0 This Week
  • 7
    Deep Lake

    Data Lake for Deep Learning. Build, manage, and query datasets

    Deep Lake (formerly known as Activeloop Hub) is a data lake for deep learning applications. Our open-source dataset format is optimized for rapid streaming and querying of data while training models at scale, and it includes a simple API for creating, storing, and collaborating on AI datasets of any size. It can be deployed locally or in the cloud, and it enables you to store all of your data in one place, ranging from simple annotations to large videos. Deep Lake is used by Google, Waymo,...
    Downloads: 0 This Week
  • 8
    Jina

    Build cross-modal and multimodal applications on the cloud

    ...Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP, GraphQL protocols with TLS. Intuitive design pattern for high-performance microservices. Seamless Docker container integration: sharing, exploring, sandboxing, versioning and dependency control via Jina Hub. Fast deployment to Kubernetes, Docker Compose and Jina Cloud. ...
    Downloads: 0 This Week
  • 9
    HunyuanVideo-I2V

    A Customizable Image-to-Video Model based on HunyuanVideo

    HunyuanVideo-I2V is a customizable image-to-video generation framework developed by Tencent, extending the capabilities of HunyuanVideo. It creates high-quality video from still images, is built on PyTorch, and provides pre-trained model weights, inference code, and customizable training options. The system includes LoRA training code for adding special effects and enhancing video realism, aiming to offer versatile and scalable solutions for generating videos from static image inputs.
    Downloads: 4 This Week
  • 10
    linux-file-converter-addon

    Convert various image, audio and video formats from your context menu.

    Convert between various image, audio and video formats using the context menu. The addon is written in Python and available for the Nautilus, Nemo, Thunar and Dolphin file managers. It adds a new option to the context menu, giving an easy way to convert between a large number of file types. The program offers many options to customize the appearance of its context menu.
    Downloads: 3 This Week
  • 11
    Kemono Downloader

    Kemono Downloader - A cross-platform Python app built with PyQt6

    Welcome to Kemono Downloader, a versatile Python-based desktop application built with PyQt6, designed to download content from Kemono.su. This tool enables users to archive individual posts or entire creator profiles from services like Patreon, Fanbox, and more, supporting a wide range of file types with customizable settings and advanced features.
    Downloads: 1,542 This Week
  • 12
    VideoCrafter2

    Overcoming Data Limitations for High-Quality Video Diffusion Models

    VideoCrafter is an open-source video generation and editing toolbox designed to create high-quality video content. It features models for both text-to-video and image-to-video generation. The system is optimized for generating videos from textual descriptions or still images, leveraging advanced diffusion models. VideoCrafter2, an upgraded version, improves on its predecessor by enhancing motion dynamics and concept combinations, especially in low-data scenarios. ...
    Downloads: 1 This Week
  • 13
    Warlock-Studio

    AI Suite for upscaling, interpolating & restoring images/videos

    v6.0. Warlock-Studio is a Windows application that uses the Real-ESRGAN, BSRGAN, IRCNN, GFPGAN, RealESRNet, RealESRAnime and RIFE AI models to upscale, restore faces, interpolate frames and reduce noise in images and videos. The application supports GPU acceleration (including multi-GPU setups) and offers batch processing for large workloads. It includes drag-and-drop handling for single or multiple files, optional pre-resize functions, and an automatic tiling system...
    Downloads: 29 This Week
  • 14
    xSTUDIO

    xSTUDIO is a high performance playback and review tool.

    xSTUDIO is a high performance playback and review tool designed by and for Visual Effects, Animation and Post Production professionals. The application can load and play large collections of media files. The efficient playback engine allows you to quickly load and play high resolution image formats with a wide range of file formats and encoding. Intuitive tools allow you to create and organise playlists and media sub-sets within playlists to build interactive review sessions, image and video reference libraries. A multi-track timeline editing interface provides the facility for loading or creating edits from simple to complex.
    Downloads: 11 This Week
  • 15
    MLT Multimedia Framework
    A multimedia authoring and processing framework and a video playout server for television broadcasting.
    Downloads: 11 This Week
  • 16
    Auto Movie Assembler

    Automate making many trailer-like videos with a single click!

    This program can mass-create multiple promotional movies at once using only these elements: pre-recorded .mp4 video clips; a title card .png image file; an ending card .png image file; sound effect 1, which plays during the title card; and sound effect 2, which plays during the ending card. It joins the video clips in alphabetical order, applies a Fade from Black transition to each of them individually, and places a Title Card with its sound effect after the first clip and an Ending Card with a stylish Fade from White effect, also with its own sound effect. ...
    Downloads: 0 This Week
  • 17
    Qvid

    Stream low latency video from your desktop or webcam over TCP/IP

    Qvid is a demo video streaming application for Windows (macOS support is limited and currently broken), written in Python. It captures screenshots of your desktop, webcam, and selected Windows programs. The captured images are compressed and sent as a continuous stream over a TCP connection to a single machine. It was developed alongside https://sourceforge.net/projects/netjoy/ to allow off-site single-player and multiplayer gameplay over an internet connection.
    Downloads: 0 This Week
  • 18
    PyExe:YT thumbnail downloader (b) [ISA]

    PyExe: YouTube thumbnail downloader (type-b) [I.S.A]

    PyExe: YouTube thumbnail downloader (type-b) [Improved.Simplified.Alternative]. Downloads YouTube video thumbnails. Compatible only with Windows.
    Downloads: 0 This Week
  • 19
    Hiera

    A fast, powerful, and simple hierarchical vision transformer

    Hiera is a hierarchical vision transformer designed to be fast, simple, and strong across image and video recognition tasks. The core idea is to use straightforward hierarchical attention with a minimal set of architectural “bells and whistles,” achieving competitive or superior accuracy while being markedly faster at inference and often faster to train. The repository provides installation options (from source or Torch Hub), a model zoo with pre-trained checkpoints, and code for evaluation and fine-tuning on standard benchmarks. ...
    Downloads: 0 This Week
  • 20
    Conscious Artificial Intelligence

    It's possible for machines to become self-aware.

    This project is a quest for conscious artificial intelligence. A number of prototypes will be developed as the project progresses. This project has 2 subprojects: the Object Pascal based CAI NEURAL API (https://github.com/joaopauloschuler/neural-api) and the Python based K-CAI NEURAL API (https://github.com/joaopauloschuler/k-neural-api). A video of the first prototype has been made: http://www.youtube.com/watch?v=qH-IQgYy9zg The video shows a Popperian agent collecting mining ore from 3...
    Downloads: 0 This Week
  • 21
    Ascoos Web Extended Studio

    A portable web server suite for 64-bit Windows, for web development.

    Ascoos Web Extended Studio (AWES) is a portable, free 64-bit web server environment for Windows, designed for professional web developers and designers who need flexibility, modularity, and multi-version testing capabilities. It provides a complete local development stack based on technologies such as Apache, PHP, Node.js, Python, MariaDB, MongoDB, FileZilla, and other essential tools. 🔧 Key Features: - Multi-version support for PHP and MariaDB - Modular and upgrade-friendly...
    Downloads: 3 This Week
  • 22
    garysfm

    An advanced file manager with QSS themes and ISO and folder previews

    garysfm, which stands for Gary's File Manager, is a file manager with some advanced features. Those features include bulk renaming and folder image previews. It has rather advanced search functions and tabbed browsing that persists between launches. It remembers your folder sorting and view options in icon view, as well as your active tabs between sessions. It shows a progress dialog during large operations such as copying large files or folders with many files. The Python version works on...
    Downloads: 1 This Week
  • 23
    Computer vision projects

    computer vision projects | Fun AI projects related to computer vision

    ...The repository provides examples that combine machine learning models with real-world applications such as robotic arms, video analysis, and automated visual measurement systems.
    Downloads: 3 This Week
  • 24
    Aphantasia

    CLIP + FFT/DWT/RGB = text to image/video

    This is a collection of text-to-image tools, evolved from the artwork of the same name. It is based on the CLIP model and the Lucent library, with FFT/DWT/RGB parameterizations (no-GAN generation). Illustrip (text-to-video with motion and depth) has been added, as has DWT (wavelet) parameterization. See also the colabs below, with VQGAN and SIREN+FFM generators. Tested on Python 3.7 with PyTorch 1.7.1 or 1.8.
    Downloads: 0 This Week
  • 25
    YoloV3 Implemented in TensorFlow 2.0

    YoloV3 Implemented in TensorFlow 2.0

    This project is a TensorFlow 2.0 implementation of the popular YOLOv3 algorithm, which is widely used for real-time object detection in images and video streams. YOLOv3 works by dividing an image into grid regions and predicting bounding boxes and class probabilities simultaneously, allowing objects to be detected quickly and efficiently. The repository includes training scripts, inference tools, and configuration files that make it possible to train custom object detection models on user-defined datasets. ...
    Downloads: 0 This Week