Showing 173 open source projects for "transformers"

  • 1
    BitNet

    BitNet: Scaling 1-bit Transformers for Large Language Models

    BitNet is a machine learning research implementation that explores extremely low-precision neural network architectures designed to dramatically reduce the computational cost of large language models. The project implements the BitNet architecture described in research on scaling transformer models using extremely low-bit quantization techniques. In this approach, neural network weights are quantized to approximately one bit per parameter, allowing models to operate with far lower memory...
    Downloads: 6 This Week
    Last Update:
    See Project
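    As a hedged, toy illustration of the 1-bit idea (not the project's actual code), weights can be binarized to ±1 and rescaled by their mean absolute value, with a straight-through estimator so gradients still flow to the full-precision weights:

```python
# Illustrative 1-bit weight quantization in the spirit of BitNet (toy sketch, not the project's code).
import torch

def binarize_weights(w: torch.Tensor) -> torch.Tensor:
    """Quantize weights to {-1, +1} scaled by their mean absolute value."""
    alpha = w.abs().mean()                    # per-tensor scale factor
    w_bin = torch.sign(w) * alpha             # roughly one bit of information per weight
    return w + (w_bin - w).detach()           # straight-through estimator for training

w = torch.randn(4096, 4096, requires_grad=True)
x = torch.randn(8, 4096)
y = x @ binarize_weights(w).t()               # linear layer using binarized weights
```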
  • 2
    Karpathy

    An agentic Machine Learning Engineer

    karpathy is an experimental agentic machine learning engineer framework designed to automate many aspects of the ML development workflow. The project sets up a sandboxed environment where an AI agent can access datasets, run experiments, and generate machine learning artifacts through a web interface. Its startup script automatically prepares the environment by creating a sandbox directory, installing key ML libraries, and launching the agent interface. The system is tightly integrated with...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    Transformer Engine

    A library for accelerating Transformer models on NVIDIA GPUs

    ...TE provides a collection of highly optimized building blocks for popular Transformer architectures and an automatic mixed precision-like API that can be used seamlessly with your framework-specific code. TE also includes a framework-agnostic C++ API that can be integrated with other deep-learning libraries to enable FP8 support for Transformers. As the number of parameters in Transformer models continues to grow, training and inference for architectures such as BERT, GPT, and T5 become very memory- and compute-intensive. Most deep learning frameworks train with FP32 by default, but full FP32 precision is not essential for many deep learning models to reach full accuracy.
    Downloads: 3 This Week
    Last Update:
    See Project
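    A hedged sketch of the mixed-precision-style API mentioned above, assuming the module and argument names from Transformer Engine's published PyTorch examples (verify against your installed version):

```python
# Sketch of FP8 execution with Transformer Engine's PyTorch API (names per its docs; treat as assumptions).
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

layer = te.Linear(1024, 1024, bias=True).cuda()          # drop-in replacement for nn.Linear
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.E4M3)

x = torch.randn(16, 1024, device="cuda")
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)                                         # matmuls run in FP8 where supported
y.sum().backward()
```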
  • 4
    LLM-Finetuning

    LLM Finetuning with peft

    LLM-Finetuning is an open educational repository that provides practical notebooks and tutorials for fine-tuning large language models using modern machine learning frameworks. The project focuses on parameter-efficient fine-tuning methods such as LoRA and QLoRA, which allow large models to be adapted to new tasks without requiring full retraining. Instead of requiring specialized hardware or complex training pipelines, many examples are designed to run in cloud notebook environments such as...
    Downloads: 1 This Week
    Last Update:
    See Project
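    The parameter-efficient approach covered by these notebooks can be sketched with Hugging Face's peft library; the base model and target modules below are placeholders chosen for illustration:

```python
# Minimal LoRA setup with peft (illustrative; base model and target modules are placeholders).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")      # placeholder base model
lora_cfg = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["c_attn"],                           # GPT-2's fused attention projection
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()                       # only the small adapter matrices train
```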
  • 5
    Flower

    Flower: A Friendly Federated Learning Framework

    ...Different machine learning frameworks have different strengths. Flower can be used with any machine learning framework, for example, PyTorch, TensorFlow, Hugging Face Transformers, PyTorch Lightning, scikit-learn, JAX, TFLite, MONAI, fastai, MLX, XGBoost, Pandas for federated analytics, or even raw NumPy for users who enjoy computing gradients by hand.
    Downloads: 1 This Week
    Last Update:
    See Project
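    A minimal client-side sketch, assuming Flower's documented NumPyClient pattern (a toy weight vector stands in for a real model; recent releases may prefer start_client over start_numpy_client):

```python
# Toy Flower federated client (illustrative; a NumPy vector stands in for real model weights).
import numpy as np
import flwr as fl

class ToyClient(fl.client.NumPyClient):
    def __init__(self):
        self.weights = [np.zeros(10, dtype=np.float32)]

    def get_parameters(self, config):
        return self.weights

    def fit(self, parameters, config):
        self.weights = parameters                 # a real client would run local training here
        return self.weights, 1, {}

    def evaluate(self, parameters, config):
        return 0.0, 1, {"accuracy": 1.0}          # a real client would compute real metrics

fl.client.start_numpy_client(server_address="127.0.0.1:8080", client=ToyClient())
```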
  • 6
    Giskard

    Collaborative & Open-Source Quality Assurance for all AI models

    ...Giskard automatically generates relevant tests based on the vulnerabilities detected by the scan. You can easily customize the tests depending on your use case by defining domain-specific data slicers and transformers as fixtures of your test suites.
    Downloads: 1 This Week
    Last Update:
    See Project
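    A hedged sketch of the scan workflow described above, assuming Giskard's documented Model/Dataset/scan entry points (the classifier is a stub; check the current API before relying on these names):

```python
# Hedged sketch of a Giskard scan (API names assumed from its docs; the model is a stub).
import pandas as pd
import giskard

df = pd.DataFrame({"text": ["great product", "terrible service"], "label": [1, 0]})

def predict(batch: pd.DataFrame):
    return [[0.3, 0.7]] * len(batch)               # stub probabilities for [negative, positive]

model = giskard.Model(model=predict, model_type="classification",
                      classification_labels=[0, 1], feature_names=["text"])
dataset = giskard.Dataset(df, target="label")

report = giskard.scan(model, dataset)              # automatic vulnerability scan
suite = report.generate_test_suite("demo suite")   # tests generated from detected issues
```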
  • 7
    MiniMax-M2.1

    MiniMax M2.1, a SOTA model for real-world dev & agents.

    MiniMax-M2.1 is an open-source, state-of-the-art agentic language model released to democratize high-performance AI capabilities. It goes beyond a simple parameter upgrade, delivering major gains in coding, tool use, instruction following, and long-horizon planning. The model is designed to be transparent, controllable, and accessible, enabling developers to build autonomous systems without relying on closed platforms. MiniMax-M2.1 excels in real-world software engineering tasks, including...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8
    TorchDistill

    A coding-free framework built on PyTorch

    torchdistill (formerly kdkit) offers various state-of-the-art knowledge distillation methods and enables you to design (new) experiments simply by editing a declarative yaml config file instead of Python code. Even when you need to extract intermediate representations from teacher/student models, you will NOT need to reimplement the models (which often means changing the interface of the forward pass); instead, you specify the module path(s) in the yaml file. In addition to knowledge distillation, this...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    flair

    A very simple framework for state-of-the-art NLP

    ...A text embedding library. Flair has simple interfaces that allow you to use and combine different word and document embeddings, including our proposed Flair embeddings and various transformers. A PyTorch NLP framework. Our framework builds directly on PyTorch, making it easy to train your own models and experiment with new approaches using Flair embeddings and classes.
    Downloads: 0 This Week
    Last Update:
    See Project
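    A small usage sketch of the embedding interface described above (the Hugging Face model id is a placeholder):

```python
# Embedding a sentence with flair's transformer embeddings (model id is a placeholder).
from flair.data import Sentence
from flair.embeddings import TransformerWordEmbeddings

embedding = TransformerWordEmbeddings("bert-base-uncased")
sentence = Sentence("Flair combines word and document embeddings.")
embedding.embed(sentence)

for token in sentence:
    print(token.text, token.embedding.shape)       # one contextual vector per token
```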
  • 10
    Super comprehensive deep learning notes

    Super Comprehensive Deep Learning Notes

    ...The repository contains hundreds of Jupyter notebooks that are richly annotated and organized by topic, progressing from basic Python and PyTorch fundamentals to advanced neural network designs like ResNet, transformers, and object detection algorithms. It’s not just a dry code repository; it includes theoretical explanations alongside hands-on examples, loss function explorations, optimization routines, and full end-to-end experiments on real datasets, making it highly suitable for both self-study and classroom use.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    HY-MT

    Hunyuan Translation Model Version 1.5

    HY-MT (Hunyuan Translation) is a high-quality multilingual machine translation model suite developed to support mutual translation across dozens of languages with strong performance even at smaller model scales. It ships with both a 1.8B-parameter model and a larger 7B model, the latter optimized not only for direct translation but also for formatted and contextualized output, allowing better handling of terminology and mixed-language content. The project emphasizes both speed and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    LoggingExtras.jl

    Composable Loggers for the Julia Logging StdLib

    LoggingExtras allows routing logged information to different places when constructing complicated "log plumbing" systems. It is built on the concept of simple parts composed together, giving you a powerful and flexible way to define your logging system without needing to write your own AbstractLogger subtypes. Composability here means that the composition of any set of Loggers is itself a Logger; LoggingExtras is a composable logging system.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Tokenizers

    Fast State-of-the-Art Tokenizers optimized for Research and Production

    Fast State-of-the-art tokenizers, optimized for both research and production. Tokenizers provides an implementation of today’s most used tokenizers, with a focus on performance and versatility. These tokenizers are also used in Transformers. Train new vocabularies and tokenize, using today’s most used tokenizers. Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB of text on a server’s CPU. Easy to use, but also extremely versatile. Designed for both research and production. Full alignment tracking. ...
    Downloads: 0 This Week
    Last Update:
    See Project
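    A short sketch of training and using a BPE tokenizer with the library (the corpus path is a placeholder):

```python
# Train a BPE tokenizer and encode text (corpus path is a placeholder).
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer
from tokenizers.pre_tokenizers import Whitespace

tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

trainer = BpeTrainer(special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"])
tokenizer.train(files=["corpus.txt"], trainer=trainer)

encoding = tokenizer.encode("Fast tokenizers, optimized for research and production.")
print(encoding.tokens, encoding.offsets)           # offsets give full alignment tracking
```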
  • 14
    Laravel Fractal

    An easy to use Fractal wrapper built for Laravel and Lumen

    ...Imagine you want to add some stats to the metadata of your request, you can do so without cluttering your code. You can run the make:transformer command to quickly generate a dummy transformer. By default it will be stored in the app\Transformers directory.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    DeepSpeed

    Deep learning optimization library making distributed training easy

    ...With just a single GPU, DeepSpeed's ZeRO-Offload can train models with over 10B parameters, 10x bigger than the state of the art, democratizing multi-billion-parameter model training so that many deep learning scientists can explore bigger and better models. DeepSpeed's sparse attention powers input sequences an order of magnitude longer and achieves up to 6x faster execution compared with dense transformers.
    Downloads: 0 This Week
    Last Update:
    See Project
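    A hedged sketch of enabling ZeRO with CPU offload through a DeepSpeed config (config keys follow its documentation; the model is a toy stand-in):

```python
# Sketch of ZeRO-Offload via deepspeed.initialize (config keys per DeepSpeed docs; toy model).
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)
ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu"},    # keep optimizer state in CPU memory
    },
}
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
```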
  • 16
    Deep-Learning-Interview-Book

    Interview guide for machine learning, mathematics, and deep learning

    Deep-Learning-Interview-Book collects structured notes, Q&A, and concept summaries tailored to deep-learning interviews, turning scattered study into a coherent playbook. It spans the core math (linear algebra, probability, optimization) and the practitioner topics candidates actually face, like CNNs, RNNs/Transformers, attention, regularization, and training tricks. Explanations emphasize intuition first, then key formulas and common pitfalls, so you can reason through unseen questions rather than memorize trivia. Many entries connect theory to implementation details, including how choices in activation, initialization, or normalization affect convergence and stability. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    LLaMA-Mesh

    Unifying 3D Mesh Generation with Language Models

    LLaMA-Mesh is a research framework that extends large language models so they can understand and generate 3D mesh data alongside text. The system introduces a method for representing 3D meshes in a textual format by encoding vertex coordinates and face definitions as sequences that can be processed by a language model. By serializing 3D geometry into text tokens, the approach allows existing transformer architectures to generate and interpret 3D models without requiring specialized visual...
    Downloads: 1 This Week
    Last Update:
    See Project
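    The text-serialization idea can be illustrated with a generic OBJ-style encoding (a toy example only, not necessarily the exact token format the project uses):

```python
# Toy example: serializing a triangle mesh into plain text a language model could consume.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(1, 2, 3)]                                # 1-indexed references into the vertex list

lines = [f"v {x:.2f} {y:.2f} {z:.2f}" for x, y, z in vertices]
lines += [f"f {a} {b} {c}" for a, b, c in faces]
mesh_text = "\n".join(lines)
print(mesh_text)                                   # "v 0.00 0.00 0.00" ... "f 1 2 3"
```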
  • 18
    ESPnet

    End-to-end speech processing toolkit

    ESPnet is a comprehensive end-to-end speech processing toolkit covering a wide spectrum of tasks, including automatic speech recognition (ASR), text-to-speech (TTS), speech translation (ST), speech enhancement, speaker diarization, and spoken language understanding. It uses PyTorch as its deep learning engine and adopts a Kaldi-style data processing pipeline for features, data formats, and experimental recipes. This combination allows researchers to leverage modern neural architectures while...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    SHAP

    A game theoretic approach to explain the output of ml models

    SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions. While SHAP can explain the output of any machine learning model, we have developed a high-speed exact algorithm for tree ensemble methods. Fast C++ implementations are supported for XGBoost, LightGBM, CatBoost, scikit-learn and pyspark...
    Downloads: 1 This Week
    Last Update:
    See Project
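    A typical usage sketch for the tree-ensemble fast path (the dataset and model are illustrative choices):

```python
# Explain an XGBoost classifier with SHAP's TreeExplainer (dataset/model chosen for illustration).
import shap
import xgboost
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = xgboost.XGBClassifier(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)              # fast exact algorithm for tree ensembles
shap_values = explainer.shap_values(X)             # per-feature contribution for each prediction
shap.summary_plot(shap_values, X)                  # global view of feature importance
```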
  • 20
    rust-bert

    Rust native ready-to-use NLP pipelines and transformer-based models

    rust-bert is a Rust-based implementation of transformer-based natural language processing models that provides ready-to-use pipelines for tasks such as text classification, summarization, and question answering. The project ports many capabilities of the Hugging Face Transformers ecosystem into the Rust programming language. It allows developers to run state-of-the-art NLP models like BERT, GPT-2, and DistilBERT directly within Rust applications while maintaining high performance and memory efficiency. The library integrates with Rust machine learning infrastructure using crates such as tch-rs and ONNX Runtime for model execution. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    MatMul-Free LM

    Implementation for MatMul-free LM

    MatMul-Free LM is an experimental implementation of a large language model architecture designed to eliminate traditional matrix multiplication operations used in transformer networks. Since matrix multiplication is one of the most computationally expensive components of modern language models, the project explores alternative computational strategies that reduce hardware requirements while maintaining comparable performance. The architecture relies on quantization-aware training and...
    Downloads: 0 This Week
    Last Update:
    See Project
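    The core intuition can be shown with a toy NumPy sketch (a conceptual illustration only, not the project's implementation): with ternary weights in {-1, 0, +1}, a matrix multiply collapses into selective additions and subtractions.

```python
# Conceptual illustration: ternary weights turn a matmul into additions/subtractions (not the project's code).
import numpy as np

x = np.array([0.5, -1.2, 2.0, 0.3])
W = np.array([[ 1, 0, -1,  1],
              [ 0, 1,  1, -1]])                    # ternary weight matrix

y_matmul = W @ x                                   # standard dense matrix multiply
y_addsub = np.array([x[row == 1].sum() - x[row == -1].sum() for row in W])
assert np.allclose(y_matmul, y_addsub)             # same result, no multiplications needed
```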
  • 22
    Intel LLM Library for PyTorch

    Accelerate local LLM inference and finetuning

    ...IPEX-LLM supports a wide range of popular models, including architectures such as LLaMA, Mistral, Qwen, and other transformer-based systems. The library can integrate with common AI frameworks and serving tools such as Hugging Face Transformers, LangChain, and vLLM, allowing developers to incorporate optimized inference into existing pipelines.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Vision Transformer Pytorch

    Implementation of Vision Transformer, a simple way to achieve SOTA

    ...Because it stays close to vanilla PyTorch, you can integrate custom datasets and training loops without framework lock-in. It’s widely used as an educational reference for people learning transformers in vision and as a lightweight baseline for research prototypes. The project encourages experimentation—swap optimizers, change augmentations, or plug the transformer backbone into downstream tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
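    A usage sketch, assuming the entry refers to the lucidrains-style vit_pytorch package (constructor arguments follow its README; treat them as assumptions):

```python
# Hedged ViT usage sketch (assumes the vit_pytorch package; arguments per its README).
import torch
from vit_pytorch import ViT

model = ViT(
    image_size=256, patch_size=32, num_classes=1000,
    dim=1024, depth=6, heads=16, mlp_dim=2048,
)
img = torch.randn(1, 3, 256, 256)
logits = model(img)                                # shape (1, 1000): class predictions
```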
  • 24
    xFormers

    Hackable and optimized Transformers building blocks

    xformers is a modular, performance-oriented library of transformer building blocks, designed to allow researchers and engineers to compose, experiment, and optimize transformer architectures more flexibly than monolithic frameworks. It abstracts components like attention layers, feedforward modules, normalization, and positional encoding, so you can mix and match or swap optimized kernels easily. One of its key goals is efficient attention: it supports dense, sparse, low-rank, and...
    Downloads: 0 This Week
    Last Update:
    See Project
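    A short sketch of the memory-efficient attention entry point (function name per the xFormers docs; requires a CUDA build of the library):

```python
# Memory-efficient attention with xFormers (requires a GPU build of the library).
import torch
import xformers.ops as xops

# q, k, v have shape (batch, sequence, heads, head_dim)
q = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = xops.memory_efficient_attention(q, k, v)     # dispatches to an optimized kernel
```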
  • 25
    NeuralForecast

    Scalable and user friendly neural forecasting algorithms.

    NeuralForecast offers a large collection of neural forecasting models focusing on their performance, usability, and robustness. The models range from classic networks like RNNs to the latest transformers: MLP, LSTM, GRU, RNN, TCN, TimesNet, BiTCN, DeepAR, NBEATS, NBEATSx, NHITS, TiDE, DeepNPTS, TSMixer, TSMixerx, MLPMultivariate, DLinear, NLinear, TFT, Informer, AutoFormer, FedFormer, PatchTST, iTransformer, StemGNN, and TimeLLM. There is a shared belief in neural forecasting methods' capacity to improve the accuracy and efficiency of forecasting pipelines. ...
    Downloads: 0 This Week
    Last Update:
    See Project
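    An illustrative end-to-end sketch (column names follow the library's expected long-format schema; the toy series and hyperparameters are placeholders):

```python
# Fit an NBEATS model and forecast with NeuralForecast (toy monthly series for illustration).
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import NBEATS

df = pd.DataFrame({
    "unique_id": ["series_1"] * 24,
    "ds": pd.date_range("2020-01-01", periods=24, freq="MS"),
    "y": range(24),
})
nf = NeuralForecast(models=[NBEATS(input_size=12, h=6, max_steps=50)], freq="MS")
nf.fit(df)
forecast = nf.predict()                            # 6 months ahead for each series
```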