pytorch free download

Showing 11 open source projects for "pytorch"

View related business solutions

AI Video Generators Linux Clear Filters & Widen Search

MongoDB Atlas runs apps anywhere
Deploy in 115+ regions with the modern database for every enterprise.

MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.

Start Free
Host LLMs in Production With On-Demand GPUs
NVIDIA L4 GPUs. 5-second cold starts. Scale to zero when idle.

Deploy your model, get an endpoint, pay only for compute time. No GPU provisioning or infrastructure management required.

Try Free
1

Phenaki - Pytorch

Implementation of Phenaki Video, which uses Mask GIT

Implementation of Phenaki Video, which uses Mask GIT to produce text-guided videos of up to 2 minutes in length, in Pytorch. It will also combine another technique involving a token critic for potentially even better generations. A new paper suggests that instead of relying on the predicted probabilities of each token as a measure of confidence, one can train an extra critic to decide what to iteratively mask during sampling. This repository will also endeavor to allow the researcher to train on text-to-image and then text-to-video. ...

Downloads: 1 This Week

Last Update: 2024-07-29
See Project
2

Video Diffusion - Pytorch

Implementation of Video Diffusion Models

Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation - in Pytorch. Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation - in Pytorch. It uses a special space-time factored U-net, extending generation from 2D images to 3D videos. 14k for difficult moving mnist (converging much faster and better than NUWA) - wip. Any new developments for text-to-video synthesis will be centralized at Imagen-pytorch. ...

Downloads: 1 This Week

Last Update: 2024-05-03
See Project
3

Wan2.2

Wan2.2: Open and Advanced Large-Scale Video Generative Model

Wan2.2 is a major upgrade to the Wan series of open and advanced large-scale video generative models, incorporating cutting-edge innovations to boost video generation quality and efficiency. It introduces a Mixture-of-Experts (MoE) architecture that splits the denoising process across specialized expert models, increasing total model capacity without raising computational costs. Wan2.2 integrates meticulously curated cinematic aesthetic data, enabling precise control over lighting,...

1 Review

Downloads: 111 This Week

Last Update: 2026-03-17
See Project
4

Make-A-Video - Pytorch (wip)

Implementation of Make-A-Video, new SOTA text to video generator

Implementation of Make-A-Video, new SOTA text to video generator from Meta AI, in Pytorch. They combine pseudo-3d convolutions (axial convolutions) and temporal attention and show much better temporal fusion. The pseudo-3d convolutions isn't a new concept. It has been explored before in other contexts, say for protein contact prediction as "dimensional hybrid residual networks". The gist of the paper comes down to, take a SOTA text-to-image model (here they use DALL-E2, but the same learning points would easily apply to Imagen), make a few minor modifications for attention across time and other ways to skimp on the compute cost, do frame interpolation correctly, get a great video model out. ...

Downloads: 1 This Week

Last Update: 2024-05-03
See Project
99.99% Uptime for MySQL and PostgreSQL Databases
Sub-second maintenance. 2x read/write performance. Built-in vector search for AI apps.

Cloud SQL Enterprise Plus delivers near-zero downtime with 35 days of point-in-time recovery. Supports MySQL, PostgreSQL, and SQL Server.

Try Free
5

HunyuanVideo

HunyuanVideo: A Systematic Framework For Large Video Generation Model

HunyuanVideo is a cutting-edge framework designed for large-scale video generation, leveraging advanced AI techniques to synthesize videos from various inputs. It is implemented in PyTorch, providing pre-trained model weights and inference code for efficient deployment. The framework aims to push the boundaries of video generation quality, incorporating multiple innovative approaches to improve the realism and coherence of the generated content. Release of FP8 model weights to reduce GPU memory usage / improve efficiency. ...

1 Review

Downloads: 6 This Week

Last Update: 2025-09-23
See Project
6

HunyuanVideo-I2V

A Customizable Image-to-Video Model based on HunyuanVideo

HunyuanVideo-I2V is a customizable image-to-video generation framework developed by Tencent, extending the capabilities of HunyuanVideo. It allows for high-quality video creation from still images, using PyTorch and providing pre-trained model weights, inference code, and customizable training options. The system includes a LoRA training code for adding special effects and enhancing video realism, aiming to offer versatile and scalable solutions for generating videos from static image inputs.

1 Review

Downloads: 5 This Week

Last Update: 2025-03-10
See Project
7

Recurrent Interface Network (RIN)

Implementation of Recurrent Interface Network (RIN)

Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in Pytorch. The author unawaredly reinvented the induced set-attention block from the set transformers paper. They also combine this with the self-conditioning technique from the Bit Diffusion paper, specifically for the latents. The last ingredient seems to be a new noise function based around the sigmoid, which the author claims is better than cosine scheduler for larger images. ...

Downloads: 2 This Week

Last Update: 2024-02-14
See Project
8

Aphantasia

CLIP + FFT/DWT/RGB = text to image/video

...Illustrip (text-to-video with motion and depth) is added. DWT (wavelets) parameterization is added. Check also colabs below, with VQGAN and SIREN+FFM generators. Tested on Python 3.7 with PyTorch 1.7.1 or 1.8. Generating massive detailed textures, a la deepdream, fullHD/4K resolutions and above, various CLIP models (including multi-language from SBERT), continuous mode to process phrase lists (e.g. illustrating lyrics), pan/zoom motion with smooth interpolation. Direct RGB pixels optimization (very stable) depth-based 3D look (courtesy of deKxi, based on AdaBins), complex queries: text and/or image as main prompts, separate text prompts for style and to subtract (avoid) topics. ...

Downloads: 0 This Week

Last Update: 2023-10-19
See Project
9

NÜWA - Pytorch

Implementation of NÜWA, attention network for text to video synthesis

Implementation of NÜWA, state of the art attention network for text-to-video synthesis, in Pytorch. It also contains an extension into video and audio generation, using a dual decoder approach. It seems as though a diffusion-based method has taken the new throne for SOTA. However, I will continue on with NUWA, extending it to use multi-headed codes + hierarchical causal transformer. I think that direction is untapped for improving on this line of work.

Downloads: 1 This Week

Last Update: 2023-03-22
See Project
Enterprise-grade ITSM, for every business
Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.

Try it Free
10

NWT - Pytorch (wip)

Implementation of NWT, audio-to-video generation, in Pytorch

Implementation of NWT, audio-to-video generation, in Pytorch. The paper proposes a new discrete latent representation named Memcodes, which can be succinctly described as a type of multi-head hard-attention to learned memory (codebook) key/values. They claim the need for less codes and smaller codebook dimensions in order to achieve better reconstructions.

Downloads: 1 This Week

Last Update: 2023-03-22
See Project
11

DCVGAN

DCVGAN: Depth Conditional Video Generation, ICIP 2019.

This paper proposes a new GAN architecture for video generation with depth videos and color videos. The proposed model explicitly uses the information of depth in a video sequence as additional information for a GAN-based video generation scheme to make the model understands scene dynamics more accurately. The model uses pairs of color video and depth video for training and generates a video using the two steps. Generate the depth video to model the scene dynamics based on the geometrical...

Downloads: 0 This Week

Last Update: 2023-03-22
See Project