Search Results for "compute"

Showing 16 open source projects for "compute"

Filters applied: Large Language Models (LLM), Linux

1

tt-metal

TT-NN operator library and TT-Metalium low-level kernel programming

...The project is designed for developers who need direct access to the company’s Tensix processor architecture, exposing a programming model that is closer to hardware control than high-level inference frameworks. Instead of following a traditional GPU model centered on massive thread parallelism, the platform is built around a grid of specialized compute nodes called Tensix cores, each with local SRAM, dedicated compute units, and multiple RISC-V control processors. The SDK provides the abstractions and APIs needed to manage data movement, compute kernels, memory coordination, and execution flow across this architecture.

Downloads: 2 This Week

Last Update: 4 days ago
2

MiniMax-M1

Open-weight, large-scale hybrid-attention reasoning model

...It is built on the MiniMax-Text-01 foundation and keeps the same massive parameter budget, but reworks the attention and training setup for better reasoning and test-time compute scaling. Architecturally, it combines Mixture-of-Experts layers with lightning attention, enabling the model to support a native context length of 1 million tokens while using far fewer FLOPs than comparable reasoning models for very long generations. The team emphasizes efficient scaling of test-time compute: at 100K-token generation lengths, M1 reportedly uses only about 25 percent of the FLOPs of some competing models, making extended “think step” traces more feasible. ...

Downloads: 0 This Week

Last Update: 2025-12-01
3

WebLLM

Bringing large-language models and chat to web browsers

WebLLM is a modular, customizable JavaScript package that brings language model chats directly to web browsers with hardware acceleration. Everything runs inside the browser with no server support and is accelerated with WebGPU. This opens up opportunities to build AI assistants for everyone and to preserve privacy while enjoying GPU acceleration. WebLLM offers a minimalist and modular interface to access the chatbot in the browser. The WebLLM package itself does not come...

Downloads: 4 This Week

Last Update: 2026-04-24
4

GLM-4.5

GLM-4.5: Open-source LLM for intelligent agents by Z.ai

GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for...

1 Review

Downloads: 54 This Week

Last Update: 2026-02-01
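
The parameter counts quoted for GLM-4.5 imply that only a small share of each model's weights is active per token. As a rough illustration using only the figures in the description above (the helper name `active_fraction` is hypothetical), the ratio can be worked out like this:

```python
def active_fraction(total_params_b, active_params_b):
    """Share of parameters that are active, given counts in billions."""
    if total_params_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_params_b / total_params_b

# Figures taken from the listing text: total vs. active parameters (billions)
glm_45 = active_fraction(355, 32)      # flagship GLM-4.5
glm_45_air = active_fraction(106, 12)  # compact GLM-4.5-Air

print(f"GLM-4.5:     {glm_45:.1%} of parameters active")
print(f"GLM-4.5-Air: {glm_45_air:.1%} of parameters active")
```

Both variants activate roughly a tenth of their total parameters, which is the usual appeal of sparsely activated designs: compute per token tracks the active count rather than the total.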
5

Paddler

Open-source LLM load balancer and serving platform for hosting LLMs

...The system acts as a specialized load balancer and serving layer for language models, enabling organizations to run inference workloads without relying on external API providers. It supports running models locally through engines such as llama.cpp while distributing requests across multiple compute nodes to improve performance and reliability. The architecture is designed with privacy and cost control in mind, making it suitable for organizations that handle sensitive data or require predictable operational costs. Paddler also includes tools for monitoring, request buffering, and autoscaling integration so that deployments can adapt dynamically to changing workloads. ...

Downloads: 5 This Week

Last Update: 2026-04-30
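
To make the idea of distributing requests across compute nodes with request buffering more concrete, here is a minimal sketch of a least-busy dispatcher. It is not Paddler's actual algorithm or API; the `Node` and `LeastBusyBalancer` names and the slot-based capacity model are assumptions made purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class Node:
    """Hypothetical inference node with a fixed number of request slots."""
    name: str
    slots: int
    busy: int = 0

class LeastBusyBalancer:
    """Illustrative dispatcher: send each request to the least-loaded node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def acquire(self):
        # Only nodes with a free slot are candidates.
        free = [n for n in self.nodes if n.busy < n.slots]
        if not free:
            return None  # a real system would buffer the request here
        # Pick the node with the lowest utilisation; ties go to the first node.
        node = min(free, key=lambda n: n.busy / n.slots)
        node.busy += 1
        return node

    def release(self, node):
        # Free a slot once the request has finished.
        if node.busy > 0:
            node.busy -= 1

balancer = LeastBusyBalancer([Node("a", slots=2), Node("b", slots=1)])
assigned = [balancer.acquire() for _ in range(4)]
names = [n.name if n is not None else None for n in assigned]
print(names)  # the fourth request finds no free slot and yields None
```

Returning `None` when every slot is taken keeps the sketch short; a serving layer would more likely queue the request and retry once `release` frees a slot.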
6

node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama

...By using native bindings and optimized model execution, the framework allows developers to integrate advanced language model capabilities into desktop applications, server software, and command-line tools. The system automatically detects the available hardware on a machine and selects the most appropriate compute backend, including CPU or GPU acceleration. Developers can use the library to perform tasks such as text generation, conversational chat, embedding generation, and structured output generation. Because it runs models locally, the platform is particularly useful for privacy-sensitive environments or offline AI deployments.

Downloads: 2 This Week

Last Update: 2026-03-17
7

Ling

Ling is a MoE LLM provided and open-sourced by inclusionAI

Ling is a Mixture-of-Experts (MoE) large language model (LLM) provided and open-sourced by inclusionAI. The project offers different sizes (Ling-lite, Ling-plus) and emphasizes flexibility and efficiency: the ability to scale, adapt expert activation, and perform across a range of natural-language and reasoning tasks. The codebase includes inference pipelines, example scripts, models, documentation, and model download infrastructure. As more developers and...

Downloads: 1 This Week

Last Update: 2025-09-30
8

NVIDIA NeMo

Toolkit for conversational AI

...Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI architectures are typically large and require a lot of data and compute for training. NeMo uses PyTorch Lightning for easy and performant multi-GPU/multi-node mixed-precision training. Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC. NGC collection of pre-trained speech processing models.

Downloads: 3 This Week

Last Update: 2026-04-22
9

BertViz

BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

...The model view shows a bird's-eye view of attention across all layers and heads. The neuron view visualizes individual neurons in the query and key vectors and shows how they are used to compute attention.

Downloads: 0 This Week

Last Update: 2025-06-01
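
For readers unfamiliar with how query and key vectors are used to compute attention, the sketch below shows standard scaled dot-product attention weights in plain Python. It is not BertViz code; the function names and the toy two-dimensional vectors are assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over several keys."""
    d_k = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    return softmax(scores)

# Toy vectors: keys 0 and 2 point the same way as the query, key 1 is orthogonal
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
weights = attention_weights(query, keys)
print([round(w, 3) for w in weights])
```

The two keys aligned with the query receive equal, larger weights than the orthogonal key, and the weights sum to 1.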
10

VibeThinker

Diversity-driven optimization and large-model reasoning ability

VibeThinker is a compact but high-capability open-source language model released by WeiboAI (Sina AI Lab). It contains about 1.5 billion parameters, far smaller than many “frontier” models, yet it is explicitly optimized for reasoning, mathematics, and code generation tasks rather than general open-domain chat. The innovation lies in its training methodology: the team uses what they call the Spectrum-to-Signal Principle (SSP), where a first stage emphasizes diversity of reasoning paths (the...

Downloads: 1 This Week

Last Update: 2025-11-19
11

LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

LLMs-from-scratch is an educational codebase that walks through implementing modern large-language-model components step by step. It emphasizes building blocks—tokenization, embeddings, attention, feed-forward layers, normalization, and training loops—so learners understand not just how to use a model but how it works internally. The repository favors clear Python and NumPy or PyTorch implementations that can be run and modified without heavyweight frameworks obscuring the logic. Chapters...

Downloads: 0 This Week

Last Update: 2026-04-16
12

Granite 3.0 Language Models

New set of lightweight state-of-the-art, open foundation models

This repository introduces Granite 3.0 language models as lightweight, state-of-the-art open foundation models built to natively support multilinguality, coding, reasoning, and tool usage. A central goal is efficient deployment, including the potential to run on constrained compute resources while remaining useful for a broad span of enterprise tasks. The repo positions the models for both research and commercial use under an Apache-2.0 license, signaling permissive adoption paths. Documentation highlights the capability mix (reasoning, tool use, code) and points to model artifacts and guidance for evaluation. ...

Downloads: 0 This Week

Last Update: 2025-10-08
13

SentenceTransformers

Multilingual sentence & image embeddings with BERT

SentenceTransformers is a Python framework for state-of-the-art sentence, text, and image embeddings. The initial work is described in our paper Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. You can use this framework to compute sentence and text embeddings for more than 100 languages. These embeddings can then be compared, e.g. with cosine similarity, to find sentences with a similar meaning. This can be useful for semantic textual similarity, semantic search, or paraphrase mining. The framework is based on PyTorch and Transformers and offers a large collection of pre-trained models tuned for various tasks. ...

Downloads: 0 This Week

Last Update: 2026-04-14
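
The comparison step mentioned in the description, cosine similarity between embeddings, can be sketched in plain Python. This does not use the SentenceTransformers API; the three-dimensional vectors below are hypothetical stand-ins for real embeddings, which typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings" for illustration only
emb_query = [1.0, 2.0, 3.0]
emb_paraphrase = [2.0, 4.0, 6.0]   # same direction as the query
emb_unrelated = [3.0, 0.0, -1.0]   # orthogonal to the query

print(cosine_similarity(emb_query, emb_paraphrase))
print(cosine_similarity(emb_query, emb_unrelated))
```

Because cosine similarity ignores vector length, the scaled copy scores as a near-perfect match while the orthogonal vector scores close to zero.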
14

llm

An ecosystem of Rust libraries for working with large language models

...Text generation can be done as a one-off based on a prompt, or interactively, through REPL or chat modes. The CLI can also be used to serialize (print) decoded models, quantize GGML files, or compute the perplexity of a model. It can be downloaded from the latest GitHub release or by installing it from crates.io.

Downloads: 0 This Week

Last Update: 2023-08-21
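
Perplexity, which the description says the CLI can compute, is conventionally defined as the exponential of the average negative log-likelihood per token. The sketch below shows that definition in Python rather than the project's Rust; it is not the `llm` crate's implementation, and the per-token log-probabilities are made-up inputs.

```python
import math

def perplexity(token_log_probs):
    """exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# Hypothetical case: the model assigns probability 0.25 to every token
log_probs = [math.log(0.25)] * 4
ppl = perplexity(log_probs)
print(ppl)
```

A model that is uniformly unsure among four choices at every step has a perplexity of about 4, which is why the metric is often read as an effective branching factor.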
15

DeepSeek-V4-Pro

Flagship MoE model for advanced reasoning, coding, and agents

...The model supports an ultra-long context window of up to 1 million tokens, making it highly suitable for long-document reasoning, large codebases, and complex multi-step tasks. Architecturally, it introduces optimizations to reduce compute and memory costs while improving stability across long sequences. DeepSeek-V4-Pro is positioned as the high-end variant of the V4 family, outperforming most open-source models in areas such as agentic coding, STEM reasoning, and world knowledge, and approaching the performance of leading closed-source systems. It also supports advanced reasoning modes and tool-based workflows, enabling autonomous task execution.

Downloads: 0 This Week

Last Update: 2026-04-24
16

ZAYA1-8B

Efficient MoE reasoning model for coding and math workloads

...The model contains 8.4B total parameters with around 760M active during inference, allowing it to achieve strong reasoning, mathematics, and coding performance while remaining lightweight enough for efficient local or on-device deployment. ZAYA1-8B is optimized for long-form reasoning and test-time compute workflows, making it particularly effective for mathematical problem solving, coding tasks, and advanced reasoning chains. It introduces architectural innovations such as Compressed Convolutional Attention, a novel MLP-based expert router, and learned residual scaling to improve routing stability and inference efficiency. The model was trained entirely on AMD infrastructure and refined through supervised fine-tuning and multi-stage reinforcement learning focused on reasoning and coding.

Downloads: 0 This Week

Last Update: 7 hours ago

© 2026 Slashdot Media. All Rights Reserved.