Showing 278 open source projects for "transformer"

  • 1
    Intel LLM Library for PyTorch

    Accelerate local LLM inference and finetuning

    Intel LLM Library for PyTorch is an open-source acceleration library developed to optimize large language model inference and fine-tuning on Intel hardware platforms. Built as an extension of the PyTorch ecosystem, the library enables developers to run modern transformer models efficiently on Intel CPUs, GPUs, and specialized AI accelerators. The framework provides hardware-aware optimizations and low-precision computation techniques that significantly improve the performance of large language models while reducing memory consumption. IPEX-LLM supports a wide range of popular models, including architectures such as LLaMA, Mistral, Qwen, and other transformer-based systems. ...
    Downloads: 0 This Week
  • 2
    xFormers

    Hackable and optimized Transformers building blocks

    xformers is a modular, performance-oriented library of transformer building blocks, designed to allow researchers and engineers to compose, experiment, and optimize transformer architectures more flexibly than monolithic frameworks. It abstracts components like attention layers, feedforward modules, normalization, and positional encoding, so you can mix and match or swap optimized kernels easily.
    Downloads: 0 This Week
  • 3
    HunyuanDiT

    Diffusion Transformer with Fine-Grained Chinese Understanding

    HunyuanDiT is a high-capability text-to-image diffusion transformer with bilingual (Chinese/English) understanding and multi-turn dialogue capability. It trains a diffusion model in latent space using a transformer backbone and integrates a Multimodal Large Language Model (MLLM) to refine captions and support conversational image generation. It supports adapters like ControlNet, IP-Adapter, LoRA, and can run under constrained VRAM via distillation versions.
    Downloads: 0 This Week
  • 4
    TRIBE v2

    A multimodal model for brain response prediction

    ...It is designed for in-silico neuroscience, enabling researchers to model how the brain responds to complex real-world inputs. The system integrates state-of-the-art encoders—including LLaMA for text, V-JEPA for video, and Wav2Vec-BERT for audio—into a unified Transformer architecture. This combined representation is mapped onto the cortical surface to predict fMRI responses across thousands of brain regions. TRIBE v2 allows researchers to simulate and analyze brain activity without requiring direct human experiments. Overall, it provides a powerful tool for studying perception, cognition, and multimodal processing in the brain.
    Downloads: 4 This Week
  • 5
    LLM101n

    LLM101n: Let's build a Storyteller

    LLM101n is an educational repository that walks you through building and understanding large language models from first principles. It emphasizes intuition and hands-on implementation, guiding you from tokenization and embeddings to attention, transformer blocks, and sampling. The materials favor compact, readable code and incremental steps, so learners can verify each concept before moving on. You’ll see how data pipelines, batching, masking, and positional encodings fit together to train a small GPT-style model end to end. The repo often complements explanations with runnable notebooks or scripts, encouraging experimentation and modification. ...
    Downloads: 1 This Week
  • 6
    alphageometry

    AI-driven neuro-symbolic solver for high-school geometry problems

    ...The repository provides the full implementation of DDAR (a symbolic engine combining a deduction database with algebraic reasoning) and AlphaGeometry, two automated geometry solvers described in the 2024 Nature paper “Solving Olympiad Geometry without Human Demonstrations.” AlphaGeometry integrates a symbolic deduction engine with a transformer-based language model to propose and validate geometric constructions in a stepwise proof process. The DDAR solver focuses purely on rule-based reasoning, while AlphaGeometry enhances this by using a learned model to suggest auxiliary constructions when logical reasoning alone is insufficient. The repository includes pre-trained weights, vocabulary files, and detailed configuration options for reproducing experiments.
    Downloads: 8 This Week
  • 7
    VOID

    Video Object and Interaction Deletion

    ...Unlike traditional inpainting methods that only erase pixels or simple artifacts, VOID models the full interaction dynamics between objects and their environment, including shadows, reflections, and even physical consequences such as movement or balance changes. Built on top of transformer-based architectures and fine-tuned for video inpainting tasks, the system uses interaction-aware mask conditioning to ensure temporal consistency across frames. One of its most notable capabilities is its ability to simulate realistic scene behavior after object removal, such as causing an object to fall naturally if its support is removed, which significantly enhances realism.
    Downloads: 4 This Week
  • 8
    imodelsX

    Interpretable prompting and models for NLP

    Interpretable prompting and models for NLP (using large language models). It can generate a prompt that explains patterns in data, explain the difference between two distributions, find a natural-language prompt using input gradients, fit a better linear model by using an LLM to extract embeddings, fit better decision trees by using an LLM to expand features, and finetune a single linear layer on top of LLM embeddings. Use these just like a scikit-learn model. During training, they fit better...
    Downloads: 0 This Week
  • 9
    HunyuanVideo-Foley

    Multimodal Diffusion with Representation Alignment

    ...The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. Produces high-quality 48 kHz audio output suitable for professional use. Hybrid architecture combining multimodal transformer blocks and unimodal refinement blocks. Temporal alignment via frame-level synchronization modules (e.g. Synchformer).
    Downloads: 3 This Week
  • 10
    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    ...It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. ...
    Downloads: 73 This Week
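    The multitask interface described above is driven by special tokens that prompt the decoder. A rough sketch of how such a token sequence is assembled (simplified; Whisper's real tokenizer maps these marker strings to integer ids):

```python
# Sketch of Whisper's multitask decoder prompt as a sequence of special
# tokens (simplified illustration; the actual model works on token ids).
def build_decoder_prompt(language: str, task: str, timestamps: bool = False) -> list[str]:
    assert task in ("transcribe", "translate")
    prompt = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        prompt.append("<|notimestamps|>")
    return prompt

# e.g. Spanish speech translated into English text:
print(build_decoder_prompt("es", "translate"))
```

    Changing one task token switches the same model between transcription and translation, which is what lets a single decoder replace several pipeline stages.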
  • 11
    Attention Residuals (AttnRes)

    Drop-in replacement for standard residual connections in Transformers

    Attention Residuals is a research-driven architectural innovation for transformer-based models that replaces traditional residual connections with an attention-based mechanism to improve information flow across layers. In standard transformers, residual connections simply sum outputs from previous layers, which can lead to uncontrolled growth of hidden states and dilution of early-layer information in deep networks.
    Downloads: 1 This Week
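    To make the contrast concrete, here is a toy pure-Python sketch of a plain additive residual versus an attention-weighted combination over earlier layer outputs. This is only an illustration of the general idea, not the repository's exact formulation:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def standard_residual(prev_layers, f_out):
    # Plain residual: add the new sublayer output to the latest state.
    return [a + b for a, b in zip(prev_layers[-1], f_out)]

def attention_residual(prev_layers, f_out):
    # Illustrative attention-style residual: weight every earlier layer's
    # output by its similarity to the new sublayer output, so early-layer
    # information is not diluted in deep stacks.
    scores = [sum(a * b for a, b in zip(h, f_out)) for h in prev_layers]
    w = softmax(scores)
    mix = [sum(wi * h[d] for wi, h in zip(w, prev_layers)) for d in range(len(f_out))]
    return [m + b for m, b in zip(mix, f_out)]
```

    With a single previous layer the two coincide; the difference only appears once several layers compete for the residual stream.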
  • 12
    x-transformers

    A simple but complete full-attention transformer

    A simple but complete full-attention transformer with a set of promising experimental features from various papers. One such feature proposes adding learned memory key/values prior to attending; its authors were able to remove feedforwards altogether and attain performance similar to the original transformer. The repository’s maintainer has found that keeping the feedforwards and adding the memory key/values leads to even better performance.
    Downloads: 1 This Week
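    The memory key/value idea can be illustrated with single-query dot-product attention in plain Python, where a few learned (key, value) pairs are simply prepended to the sequence before attending (a sketch under simplified assumptions, not the library's implementation):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(query, keys, values, mem_kv=None):
    # Single-query dot-product attention. mem_kv is a list of learned
    # (key, value) pairs prepended to the sequence, as in the memory
    # key/value technique described above (illustrative sketch only).
    if mem_kv:
        keys = [k for k, _ in mem_kv] + keys
        values = [v for _, v in mem_kv] + values
    scores = [sum(qi * ki for qi, ki in zip(query, k)) for k in keys]
    w = softmax(scores)
    dim = len(values[0])
    return [sum(wi * v[d] for wi, v in zip(w, values)) for d in range(dim)]
```

    Because the memory pairs sit in the same softmax as the real tokens, every position can attend to them at no extra sequence cost.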
  • 13
    LLMs-from-scratch

    Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

    ...The repository favors clear Python and NumPy or PyTorch implementations that can be run and modified without heavyweight frameworks obscuring the logic. Chapters and notebooks progress from tiny toy models to more capable transformer stacks, including sampling strategies and evaluation hooks. The focus is on readability, correctness, and experimentation, making it ideal for students and practitioners transitioning from theory to working systems. By the end, you have a grounded sense of how data pipelines, optimization, and inference interact to produce fluent text.
    Downloads: 4 This Week
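    Sampling strategies like those covered in the later chapters reduce to a few lines. Below is a pure-Python sketch of temperature scaling with optional top-k filtering (illustrative; the book implements these in PyTorch):

```python
import math
import random

def top_k_filter(logits, k):
    # Keep the k largest logits, mask the rest to -inf
    # (ties at the threshold may keep slightly more than k).
    thresh = sorted(logits, reverse=True)[k - 1]
    return [l if l >= thresh else float("-inf") for l in logits]

def sample(logits, temperature=1.0, k=None, rng=random):
    if k is not None:
        logits = top_k_filter(logits, k)
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    probs = [math.exp(s - m) for s in scaled]
    total = sum(probs)
    probs = [p / total for p in probs]
    # Inverse-CDF sampling over the categorical distribution.
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1
```

    Low temperature sharpens the distribution toward the argmax; k=1 makes sampling fully greedy.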
  • 14
    TabPFN

    Foundation Model for Tabular Data

    TabPFN is an open-source machine learning system that introduces a foundation model designed specifically for tabular data analysis. The model is based on transformer architectures and implements a prior-data fitted network that can perform supervised learning tasks such as classification and regression with minimal configuration. Unlike many traditional machine learning workflows that require extensive hyperparameter tuning and training cycles, TabPFN is pre-trained to perform inference directly on tabular datasets. ...
    Downloads: 0 This Week
  • 15
    RWKV Runner

    A RWKV management and startup tool, full automation, only 8MB

    RWKV (pronounced RwaKuv) is an RNN with GPT-level LLM performance that can also be trained directly like a GPT transformer (parallelizable). It combines the best of RNNs and transformers: great performance, fast inference, fast training, low VRAM usage, “infinite” context length, and free text embeddings. Moreover, it is 100% attention-free. The default configs enable custom CUDA kernel acceleration, which is much faster and consumes much less VRAM. If you encounter compatibility issues, go to the Configs page and turn off “Use Custom CUDA kernel to Accelerate”.
    Downloads: 3 This Week
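    The "RNN that trains like a transformer" claim rests on the fact that RWKV's core mixing operator can be written either as a recurrence with O(1) state or as an attention-like sum over the whole prefix. The following deliberately simplified pure-Python caricature (omitting RWKV's per-token bonus term and other details) shows the two views computing the same thing:

```python
import math

def wkv_recurrent(ks, vs, w):
    # Simplified caricature of RWKV's "wkv" operator: an exponentially
    # decayed, key-weighted average of past values, computed as an RNN
    # with O(1) state per step.
    a = b = 0.0
    outs = []
    for k, v in zip(ks, vs):
        a = math.exp(-w) * a + math.exp(k) * v
        b = math.exp(-w) * b + math.exp(k)
        outs.append(a / b)
    return outs

def wkv_direct(ks, vs, w):
    # The same quantity written as an attention-like weighted sum over
    # the whole prefix -- the parallelizable, GPT-style training view.
    outs = []
    for t in range(len(ks)):
        num = sum(math.exp(-w * (t - i) + ks[i]) * vs[i] for i in range(t + 1))
        den = sum(math.exp(-w * (t - i) + ks[i]) for i in range(t + 1))
        outs.append(num / den)
    return outs
```

    Training uses the parallel form across the sequence; inference uses the recurrent form, which is why VRAM stays flat as context grows.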
  • 16
    llm.c

    LLM training in simple, raw C/CUDA

    llm.c is a minimalist, systems-level implementation of a small transformer-based language model in C that prioritizes clarity and educational value. By stripping away heavy frameworks, it exposes the core math and memory flows of embeddings, attention, and feed-forward layers. The code illustrates how to wire forward passes, losses, and simple training or inference loops with direct control over arrays and buffers.
    Downloads: 1 This Week
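    As an example of the kind of layer llm.c spells out with raw loops and flat buffers, here is a LayerNorm forward pass over a flat list, transliterated into Python (an illustrative sketch, not the repository's C code):

```python
import math

def layernorm_forward(x, gamma, beta, eps=1e-5):
    # LayerNorm over a flat list, written loop-by-loop in the spirit of
    # llm.c's C kernels (illustrative sketch, not the repo's code).
    n = len(x)
    mean = sum(x) / n
    var = sum((xi - mean) ** 2 for xi in x) / n
    rstd = 1.0 / math.sqrt(var + eps)
    # Normalize, then apply the learned per-feature scale and shift.
    return [(xi - mean) * rstd * g + b for xi, g, b in zip(x, gamma, beta)]
```

    The C version does exactly this arithmetic, just with pointers into one large activation buffer instead of Python lists.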
  • 17
    ML Retreat

    Machine Learning Journal for Intermediate to Advanced Topics

    ...Rather than functioning as a traditional tutorial series, the repository is organized as a learning journey that progressively explores increasingly advanced subjects. Topics include large language models, graph neural networks, mechanistic interpretability, transformer architectures, and emerging research areas such as quantum machine learning. The repository includes references to influential research papers, lectures, and educational content from well-known machine learning educators.
    Downloads: 0 This Week
  • 18
    llm_interview_note

    Mainly record the knowledge and interview questions

    ...The project compiles structured notes, conceptual explanations, and curated interview questions related to modern NLP and generative AI systems. It covers fundamental topics such as the historical evolution of language models, tokenization methods, word embeddings, and the architectural foundations of transformer-based models. The repository also explores practical engineering concerns including distributed training strategies, dataset construction, model parameters, and scaling techniques used in large-scale machine learning systems. By organizing topics in a hierarchical documentation format, it enables readers to progress from basic NLP concepts to advanced topics like mixture-of-experts architectures and large-scale training frameworks.
    Downloads: 0 This Week
  • 19
    Heretic

    Fully automatic censorship removal for language models

    Heretic is an open-source Python tool that automatically removes the built-in censorship or “safety alignment” from transformer-based language models so they respond to a broader range of prompts with fewer refusals. It works by applying directional ablation techniques and a parameter optimization strategy to adjust internal model behaviors without expensive post-training or altering the core capabilities. Designed for researchers and advanced users, Heretic makes it possible to study and experiment with uncensored model responses in a reproducible, automated way. ...
    Downloads: 0 This Week
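    Directional ablation itself is simple linear algebra: subtract from each hidden state its component along a chosen direction (for example, an empirically estimated "refusal direction"). A minimal pure-Python sketch, not Heretic's actual implementation:

```python
def ablate_direction(h, d):
    # Directional ablation: remove the component of hidden state h that
    # lies along direction d, leaving everything orthogonal to d intact.
    # (Illustrative sketch; d would be estimated from model activations.)
    norm2 = sum(x * x for x in d)
    proj = sum(a * b for a, b in zip(h, d)) / norm2
    return [a - proj * b for a, b in zip(h, d)]
```

    Applying this projection at each layer removes the targeted behavior direction without retraining any weights, which is why the method is cheap compared to fine-tuning.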
  • 20
    LTX-2.3

    Official Python inference and LoRA trainer package

    ...Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes. The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while simultaneously producing corresponding audio elements such as speech, music, ambient sound, or effects. This unified approach allows creators to generate complete multimedia sequences where motion, timing, and sound are aligned automatically. LTX-2 is designed for both research and production workflows and can generate high-resolution video clips with precise control over structure, motion, and camera behavior.
    Downloads: 189 This Week
  • 21
    ModernBERT

    Bringing BERT into modernity via both architecture changes and scaling

    ModernBERT is an open-source research project that modernizes the classic BERT encoder architecture by incorporating recent advances in transformer design, training techniques, and efficiency improvements. The goal of the project is to bring BERT-style models up to date with the capabilities of modern large language models while preserving the strengths of bidirectional encoder architectures used for tasks such as classification, retrieval, and semantic search. ModernBERT introduces architectural improvements that enhance both training efficiency and inference performance, making the model more suitable for modern large-scale machine learning pipelines. ...
    Downloads: 1 This Week
  • 22
    TigerBot

    TigerBot: A multi-language multi-task LLM

    ...The project focuses on building high-performance models capable of handling both English and Chinese tasks while maintaining strong reasoning and conversational abilities. TigerBot models are based on modern transformer architectures and are trained on large datasets that cover multiple domains and languages. The project provides both base models and chat-optimized variants that can be used for dialogue systems, question answering, and general language understanding tasks. In addition to model weights, the repository includes training scripts, inference tools, and configuration files that allow researchers and developers to reproduce experiments or fine-tune the models for specific applications.
    Downloads: 1 This Week
  • 23
    Oasis

    Inference script for Oasis 500M

    Open-Oasis provides inference code and released weights for Oasis 500M, an interactive world model that generates gameplay frames conditioned on user keyboard input. Instead of rendering a pre-built game world, the system produces the next visual state via a diffusion-transformer approach, effectively “imagining” the world’s response to your actions in real time. The project focuses on enabling action-conditional frame generation so developers can experiment with interactive, model-generated environments rather than static video generation alone. Because it’s an inference-focused repository, it’s especially useful as a practical reference for running the model, wiring inputs, and producing the autoregressive sequence of gameplay frames. ...
    Downloads: 1 This Week
  • 24
    HY-Motion 1.0

    HY-Motion model for 3D character animation generation

    HY-Motion 1.0 is an open-source, large-scale AI model suite developed by Tencent’s Hunyuan team that generates high-quality 3D human motion from simple text prompts, enabling the automatic production of fluid, diverse, and semantically accurate animations without manual keyframing or rigging. Built on advanced deep learning architectures that combine Diffusion Transformer (DiT) and flow matching techniques, HY-Motion scales these approaches to the billion-parameter level, resulting in strong instruction-following capabilities and richer motion outputs compared to existing open-source models. The training strategy for the HY-Motion series includes extensive pre-training on thousands of hours of varied motion data, fine-tuning on curated high-quality datasets, and reinforcement learning with human feedback, which improves both the plausibility and adaptability of generated motion sequences.
    Downloads: 1 This Week
  • 25
    HunyuanVideo-Avatar

    Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model

    HunyuanVideo-Avatar is a multimodal diffusion transformer (MM-DiT) model by Tencent Hunyuan for animating static avatar images into dynamic, emotion-controllable, and multi-character dialogue videos, conditioned on audio. It addresses challenges of motion realism, identity consistency, and emotional alignment. Innovations include a character image injection module, an Audio Emotion Module for transferring emotion cues, and a Face-Aware Audio Adapter to isolate audio effects on faces, enabling multiple characters to be animated in a scene. ...
    Downloads: 1 This Week