learning vector quantization free download

turbovec

A vector index built on TurboQuant, written in Rust with Python

turbovec is a Rust-based vector index with Python bindings for fast similarity search. It is built around TurboQuant, a quantization approach designed to reduce vector storage while preserving useful distance information. The project targets workloads where embedding search needs to be compact, efficient, and practical to integrate into Python applications. It avoids a separate training phase for the quantizer, which can simplify setup compared with systems that require codebook learning. ...

Downloads: 0 This Week

Last Update: 2026-06-14

See Project

TurboQuant+

Implementation of TurboQuant (ICLR 2026)

...It is designed to be used in conjunction with modern machine learning workflows, particularly those involving large models that require optimization for deployment. TurboQuant Plus focuses on experimentation and performance tuning, allowing developers to test different configurations and evaluate trade-offs. Its architecture supports extensibility, enabling further development of quantization methods and integration with existing ML pipelines.

Downloads: 7 This Week

Last Update: 2026-05-04

See Project

TurboQuant PyTorch

From-scratch PyTorch implementation of Google's TurboQuant

TurboQuant PyTorch is a specialized deep learning optimization framework designed to accelerate neural network inference and training through advanced quantization techniques within the PyTorch ecosystem. The project focuses on reducing the computational and memory footprint of models by converting floating-point representations into lower-precision formats while preserving performance.

Downloads: 0 This Week

Last Update: 2026-04-23

See Project

bitsandbytes

Accessible large language models via k-bit quantization for PyTorch

bitsandbytes is an open-source library designed to make training and inference of large neural networks more efficient by dramatically reducing memory usage. Built primarily for the PyTorch ecosystem, the library introduces advanced quantization techniques that allow models to operate using reduced numerical precision while maintaining high accuracy. These optimizations enable large language models and other deep learning architectures to run on hardware with limited memory resources, including consumer-grade GPUs. The project includes specialized optimizers and quantized matrix operations that significantly reduce the memory footprint of training and inference workloads. ...

Downloads: 0 This Week

Last Update: 5 days ago

See Project

SparseML

Libraries for applying sparsification recipes to neural networks

SparseML is an optimization toolkit for training and deploying deep learning models using sparsification techniques like pruning and quantization to improve efficiency.

Downloads: 0 This Week

Last Update: 2025-06-02

See Project

AIMET

AIMET is a library that provides advanced quantization and compression

...Plus, an 8-bit model also has a 4x smaller memory footprint relative to a 32-bit model. However, quantizing a machine learning model (e.g., from 32-bit floating point to an 8-bit fixed-point value) often sacrifices model accuracy.

Downloads: 1 This Week

Last Update: 2 days ago

See Project
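
The AIMET entry above mentions the 4x memory saving of 8-bit over 32-bit values and the accuracy cost of quantizing. A minimal sketch of that trade-off follows, assuming a simple affine (min/max) scheme in plain Python. The function names are hypothetical, and this is not AIMET's API or its actual quantization method.

```python
import struct

def quantize_uint8(values):
    """Affine quantization: map floats onto the 256 levels of an 8-bit integer."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 when all inputs are equal
    codes = [min(255, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo

def dequantize_uint8(codes, scale, lo):
    """Recover approximate floats from the 8-bit codes."""
    return [c * scale + lo for c in codes]

weights = [-1.0, -0.5, 0.0, 0.25, 1.0]  # illustrative values only
codes, scale, lo = quantize_uint8(weights)
restored = dequantize_uint8(codes, scale, lo)

# Storage: 4 bytes per 32-bit float versus 1 byte per 8-bit code.
fp32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(bytes(codes))
ratio = fp32_bytes / int8_bytes

# Accuracy cost: the rounding error is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(ratio, max_error)
```

The error bound of half a step is why wider value ranges (larger `scale`) lose more precision under the same bit width.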

Machine learning algorithms

Minimal and clean examples of machine learning algorithms

Machine learning algorithms is an open-source repository that provides minimal and clean implementations of machine learning algorithms written primarily in Python. The project focuses on demonstrating how fundamental machine learning methods work internally by implementing them from scratch rather than relying on high-level libraries. This approach allows learners to study the mathematical and algorithmic details behind widely used models in a transparent and readable way. The repository...

Downloads: 0 This Week

Last Update: 2026-05-07

See Project

DLRM

An implementation of a deep learning recommendation model (DLRM)

...The implementation is optimized for performance at scale, supporting multi-GPU and multi-node execution, quantization, embedding partitioning, and pipelined I/O to feed huge embeddings efficiently. It includes data loaders for standard benchmarks (like Criteo), training scripts, evaluation tools, and capabilities like mixed precision, gradient compression, and memory fusion to maximize throughput.

Downloads: 0 This Week

Last Update: 2026-01-12

See Project

Tile Kernels

A kernel library written in tilelang

Tile Kernels is a DeepSeek kernel library written with TileLang for high-performance AI and machine-learning workloads. It contains specialized kernels for areas such as mixture-of-experts routing, quantization, batched transpose operations, Engram gating, and Manifold HyperConnection components. The project includes both optimized kernel implementations and PyTorch reference versions for comparison and validation. It is aimed at developers and researchers who work close to model internals and need efficient low-level building blocks. ...

Downloads: 0 This Week

Last Update: 2026-05-21

See Project

NNCF

Neural Network Compression Framework for enhanced OpenVINO

NNCF (Neural Network Compression Framework) is an optimization toolkit for deep learning models, designed to apply quantization, pruning, and other techniques to improve inference efficiency.

Downloads: 0 This Week

Last Update: 2026-06-01

See Project

Tencent-Hunyuan-Large

Open-source large language model family from Tencent Hunyuan

Tencent-Hunyuan-Large is the flagship open-source large language model family from Tencent Hunyuan, offering both pre-trained and instruct (fine-tuned) variants. It is designed with long-context capabilities, quantization support, and high performance on benchmarks across general reasoning, mathematics, language understanding, and Chinese / multilingual tasks. It aims to provide competitive capability with efficient deployment and inference. FP8 quantization support to reduce memory usage...

Downloads: 2 This Week

Last Update: 2025-09-24

See Project

SWIFT LLM

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs

SWIFT LLM is a comprehensive framework developed within the ModelScope ecosystem for training, fine-tuning, evaluating, and deploying large language models and multimodal models. The platform provides a full machine learning pipeline that supports tasks ranging from model pre-training to reinforcement learning alignment techniques. It integrates with popular inference engines such as vLLM and LMDeploy to accelerate deployment and runtime performance. The framework also includes support for...

Downloads: 5 This Week

Last Update: 2026-07-21

See Project

hls4ml

Machine learning on FPGAs using HLS

hls4ml is an open-source framework that enables machine learning models to be implemented directly on hardware such as FPGAs and ASICs using high-level synthesis techniques. The system converts trained neural network models from common machine learning frameworks into hardware description code suitable for ultra-low-latency inference. This approach allows machine learning algorithms to run directly on specialized hardware, making them suitable for applications that require extremely fast...

Downloads: 0 This Week

Last Update: 2026-03-20

See Project

NVIDIA Model Optimizer

A unified library of SOTA model optimization techniques

Model Optimizer is a unified library that provides state-of-the-art techniques for compressing and optimizing deep learning models to improve inference efficiency and deployment performance. It brings together multiple optimization strategies such as quantization, pruning, distillation, and speculative decoding into a single cohesive framework. The library is designed to reduce model size and computational requirements while maintaining accuracy, making it particularly valuable for deploying large models in production environments. ...

Downloads: 0 This Week

Last Update: 2026-07-06

See Project

TorchRec

PyTorch domain library for recommendation systems

TorchRec is a PyTorch domain library built to provide common sparsity & parallelism primitives needed for large-scale recommender systems (RecSys). It allows authors to train models with large embedding tables sharded across many GPUs. Its parallelism primitives enable easy authoring of large, performant multi-device/multi-node models using hybrid data-parallelism/model-parallelism. The TorchRec sharder can shard embedding tables with different sharding strategies including data-parallel,...

Downloads: 0 This Week

Last Update: 2026-06-22

See Project

Chronos Forecasting

Pretrained (Language) Models for Probabilistic Time Series Forecasting

Chronos is a family of pretrained time series forecasting models based on language model architectures. A time series is transformed into a sequence of tokens via scaling and quantization, and a language model is trained on these tokens using the cross-entropy loss. Once trained, probabilistic forecasts are obtained by sampling multiple future trajectories given the historical context. Chronos models have been trained on a large corpus of publicly available time series data, as well as...

Downloads: 1 This Week

Last Update: 2026-07-02

See Project
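
The Chronos entry above describes turning a time series into tokens via scaling and quantization. A rough sketch of that idea follows, assuming mean-absolute scaling and uniform bins. The function names, bin count, and value range are illustrative assumptions, not the Chronos implementation or its configuration.

```python
def tokenize_series(series, num_bins=10, low=-3.0, high=3.0):
    """Scale by the mean absolute value, then map each value to a uniform bin index."""
    mean_abs = sum(abs(x) for x in series) / len(series)
    scale = mean_abs or 1.0  # avoid dividing by zero for an all-zero series
    width = (high - low) / num_bins
    tokens = []
    for x in series:
        idx = int((x / scale - low) // width)
        tokens.append(min(num_bins - 1, max(0, idx)))  # clamp out-of-range values
    return tokens, scale

def detokenize(tokens, scale, num_bins=10, low=-3.0, high=3.0):
    """Map each token back to its bin center and undo the scaling."""
    width = (high - low) / num_bins
    return [(low + (t + 0.5) * width) * scale for t in tokens]

series = [10.0, 12.0, 8.0, 30.0, 11.0]  # illustrative values only
tokens, scale = tokenize_series(series)
approx = detokenize(tokens, scale)
print(tokens, approx)
```

In this sketch the token sequence is what a language model would be trained on, and sampled tokens would be mapped back to values with `detokenize`.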

FL4Health

Library to facilitate federated learning research

FL4Health is a Vector Institute toolkit for building modular, clinically focused federated learning (FL) pipelines. Tailored for healthcare, it supports privacy-preserving FL, heterogeneous data settings, integrated reporting, and clear API design.

Downloads: 0 This Week

Last Update: 2026-01-21

See Project

MiniSom

MiniSom is a minimalistic implementation of Self-Organizing Maps

MiniSom is a minimalistic, NumPy-based implementation of Self-Organizing Maps (SOM). A SOM is a type of artificial neural network that converts complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display. MiniSom is designed to allow researchers to easily build on top of it and to give students the ability to quickly grasp its details. The project initially aimed for a minimalistic implementation of...

Downloads: 0 This Week

Last Update: 2026-01-14

See Project

Z80-μLM

Z80-μLM is a 2-bit quantized language model

...A key deliverable is producing CP/M-compatible .COM binaries, enabling a genuinely vintage “chat with your computer” experience on real hardware or accurate emulators. The project sits at the intersection of machine learning and systems constraints, showing how model architecture, quantization, and inference code generation can be adapted to extreme memory and compute limits. It also functions as an educational reference for how to reduce inference to operations that fit an old-school instruction set and runtime environment.

1 Review

Downloads: 0 This Week

Last Update: 2026-01-27

See Project

DocArray

The data structure for multimodal data

...Data in transit: optimized for network communication, ready-to-wire at any time with fast and compressed serialization in Protobuf, bytes, base64, JSON, CSV, DataFrame. Perfect for streaming and out-of-memory data. One-stop k-NN: unified and consistent API for mainstream vector databases.

Downloads: 0 This Week

Last Update: 2025-03-21

See Project

txtai

Build AI-powered semantic search applications

txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings).

Downloads: 0 This Week

Last Update: 2026-07-01

See Project

Jina

Build cross-modal and multimodal applications on the cloud

Jina is a framework that empowers anyone to build cross-modal and multimodal applications on the cloud. It turns a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, and PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, WebSockets, HTTP,...

Downloads: 0 This Week

Last Update: 2024-11-12

See Project

Intel Extension for PyTorch

A Python package for extending the official PyTorch

Intel® Extension for PyTorch extends PyTorch with up-to-date feature optimizations for an extra performance boost on Intel hardware. Optimizations take advantage of Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Vector Neural Network Instructions (VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX) on Intel CPUs, as well as Intel Xe Matrix Extensions (XMX) AI engines on Intel discrete GPUs. Moreover, Intel® Extension for PyTorch provides easy GPU acceleration for Intel...

Downloads: 0 This Week

Last Update: 2025-08-08

See Project

Ludwig AI

Low-code framework for building custom LLMs, neural networks

...Think building blocks for deep learning.

Downloads: 0 This Week

Last Update: 3 days ago

See Project

MatMul-Free LM

Implementation for MatMul-free LM

...Since matrix multiplication is one of the most computationally expensive components of modern language models, the project explores alternative computational strategies that reduce hardware requirements while maintaining comparable performance. The architecture relies on quantization-aware training and lightweight operations to replace conventional dense matrix multiplications with more efficient alternatives. These optimizations can significantly reduce memory consumption and potentially improve computational efficiency during both training and inference. The repository provides implementations of models at several parameter scales and includes tools for experimenting with the architecture using modern machine learning frameworks.

Downloads: 0 This Week

Last Update: 2026-03-05

See Project

Search Results for "learning vector quantization"

Showing 79 open source projects for "learning vector quantization"
