Implementation of NWT, an audio-to-video generation model, in PyTorch. The paper proposes a new discrete latent representation named Memcodes, which can be succinctly described as a type of multi-head hard attention over learned memory (codebook) keys and values. The authors claim that this approach needs fewer codes and a smaller codebook dimension to achieve better reconstructions.
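To make the "multi-head hard attention over a codebook" idea concrete, here is a minimal, framework-free sketch in plain Python. It is not the repository's API: the function names (`hard_attention_head`, `memcodes_quantize`, `make_codebooks`), the dot-product scoring, and the choice to give values the same width as each head's query are all assumptions made for illustration. Each head takes its own slice of the input vector, scores it against that head's keys, and returns the value of the single best-matching code (an argmax instead of a softmax).

```python
import random


def hard_attention_head(query, keys, values):
    """Pick the one codebook entry whose key has the largest dot product with the query."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    best = max(range(len(keys)), key=scores.__getitem__)  # argmax, i.e. "hard" attention
    return best, values[best]


def memcodes_quantize(x, codebooks):
    """Quantize a flat vector x with one (keys, values) codebook per head.

    Returns the selected code index per head and the concatenated values.
    Assumption: len(x) is divisible by the number of heads.
    """
    heads = len(codebooks)
    if len(x) % heads != 0:
        raise ValueError("input length must be divisible by the number of heads")
    dim_head = len(x) // heads

    indices, output = [], []
    for h, (keys, values) in enumerate(codebooks):
        query = x[h * dim_head:(h + 1) * dim_head]  # this head's slice of the input
        idx, value = hard_attention_head(query, keys, values)
        indices.append(idx)
        output.extend(value)
    return indices, output


def make_codebooks(heads, num_codes, dim_head, seed=0):
    """Hypothetical helper: random keys/values standing in for learned parameters."""
    rng = random.Random(seed)

    def rand_vec():
        return [rng.gauss(0.0, 1.0) for _ in range(dim_head)]

    return [
        ([rand_vec() for _ in range(num_codes)], [rand_vec() for _ in range(num_codes)])
        for _ in range(heads)
    ]


if __name__ == "__main__":
    codebooks = make_codebooks(heads=2, num_codes=4, dim_head=3)
    indices, quantized = memcodes_quantize([0.1, -0.4, 0.7, 0.3, 0.2, -0.9], codebooks)
    print(indices, quantized)
```

With several heads, the discrete code for one input is a tuple of small indices, so the number of representable combinations grows as `num_codes ** heads` even when each codebook is small. A trainable version would additionally need a way to pass gradients through the argmax, which this sketch leaves out.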
Features
- Implementation of NWT
- Audio-to-video generation
- Implemented in PyTorch
- Multi-head hard-attention to learned memory (codebook) key / values
- Fewer codes and a smaller codebook dimension, as claimed in the paper
- Aimed at better reconstructions
License
MIT License