Implementation of a U-net complete with efficient attention
A simple but complete full-attention transformer
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Math-specific large language models from the Qwen2 series
Capable of understanding text, audio, vision, video
Qwen2.5-VL is the multimodal large language model series
A Systematic Framework for Interactive World Modeling
Easily turn large sets of image URLs into an image dataset
Open source large language model by Alibaba
The most powerful and modular diffusion model GUI, API, and backend
State-of-the-art 2D and 3D Face Analysis Project
Claude Code skill that researches any topic across Reddit + X
2^x Image Super-Resolution
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Book about interpretable machine learning
Python library for defining and optimizing mathematical expressions
A framework to enable multimodal models to operate a computer
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
As little as 1 minute of voice data can be used to train a good TTS model
C++ library for high performance inference on NVIDIA GPUs
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Client-side indecent content checking powered by TensorFlow.js
A no-frills ChatGPT client for Emacs
Java interface to OpenCV, FFmpeg, and more