cuda gpu free download

how-to-optim-algorithm-in-cuda

How to optimize algorithms in CUDA

...These examples show how different optimization techniques influence performance on modern GPU hardware and allow readers to experiment with real implementations. The repository also contains extensive learning notes that summarize CUDA programming concepts, GPU architecture details, and performance engineering strategies.

Downloads: 0 This Week

Last Update: 2 days ago


llama.cpp

LLM inference in C/C++

...It provides command-line tools, a server mode with an OpenAI-compatible API, model conversion utilities, and extensive backend acceleration options. llama.cpp runs on CPUs and GPUs, with support for Apple silicon, x86, RISC-V, CUDA, HIP, Vulkan, SYCL, Metal, and hybrid CPU-GPU execution. Its main value is making practical LLM inference accessible across consumer machines, servers, and specialized deployment environments.

Downloads: 15 This Week

Last Update: 8 hours ago


Infinity

Low-latency REST API for serving text embeddings

Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting all sentence-transformer models and frameworks. Infinity is developed under the MIT License. Infinity powers inference behind Gradient.ai and other embedding API providers.

Downloads: 0 This Week

Last Update: 2025-08-22


Punica

Serving multiple LoRA fine-tuned LLMs as one

...The system includes specialized CUDA kernels that enable batched GPU operations across different LoRA models simultaneously. This design allows a single GPU cluster to host many task-specific models while maintaining high throughput and minimal latency. The architecture also includes scheduling mechanisms that coordinate requests from multiple tenants and distribute workloads efficiently across available resources.

Downloads: 0 This Week

Last Update: 2026-03-09


DomE

Implements a reference architecture for creating information systems

DomE Experiment is an implementation of a reference architecture for creating information systems from the automated evolution of the domain model. The architecture comprises elements that guarantee user access through automatically generated interfaces for various devices, integration with external information sources, data and operations security, automatic generation of analytical information, and automatic control of business processes. All these features are generated from the domain...

Downloads: 0 This Week

Last Update: 2023-03-22


Search Results for "cuda gpu"

Showing 5 open source projects for "cuda gpu"

how-to-optim-algorithm-in-cuda

llama.cpp

Infinity

Punica

DomE

