gpt free download - SourceForge

Megatron

Ongoing research training transformer models at scale

...This repository is for ongoing research on training large transformer language models at scale. We developed efficient model-parallel (tensor, sequence, and pipeline) and multi-node pre-training of transformer-based models such as GPT, BERT, and T5 using mixed precision. Megatron is also used in NeMo Megatron, a framework that helps enterprises build and train sophisticated natural language processing models with billions or trillions of parameters. Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.

Downloads: 0 This Week

Last Update: 4 days ago

See Project
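
The tensor model parallelism mentioned in the Megatron description can be illustrated with a small sketch. This is not Megatron's code: it is a toy, pure-Python example that assumes a single weight matrix split by columns across a hypothetical number of devices (`parts`), with all helper names invented for illustration. Each shard computes its slice of the output independently, and the slices are concatenated at the end.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(w, parts):
    """Split weight matrix w into `parts` equal column shards (hypothetical helper)."""
    n_cols = len(w[0])
    assert n_cols % parts == 0, "columns must divide evenly across shards"
    step = n_cols // parts
    return [[row[p * step:(p + 1) * step] for row in w] for p in range(parts)]


def column_parallel_matmul(x, w, parts):
    """Compute x @ w by giving each simulated device one column shard of w."""
    shards = split_columns(w, parts)
    # In a real system each of these products would run on a separate device.
    partials = [matmul(x, shard) for shard in shards]
    # Concatenate the per-shard outputs along the column dimension.
    return [sum((partial[r] for partial in partials), []) for r in range(len(x))]


x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
result = column_parallel_matmul(x, w, parts=2)
```

Here `result` matches the unsharded product `matmul(x, w)`, which is the property that lets a large layer be spread over several devices without changing its output.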

Colossal-AI

Making large AI models cheaper, faster and more accessible

...This runs up against the memory wall of current accelerator hardware such as GPUs. It is never ideal to train large models such as Vision Transformer, BERT, and GPT on a single GPU or a single machine, so there is an urgent demand for training models in a distributed environment. However, distributed training, especially model parallelism, often requires domain expertise in computer systems and architecture, and it remains a challenge for AI researchers to implement complex distributed training solutions for their models. ...

Downloads: 0 This Week

Last Update: 2025-05-28

See Project

RWKV

RNN with great LLM performance

...It presents RWKV as an attention-free RNN-style model that aims to reach transformer-level language model performance. The project is built around the idea that a model can be trained in a parallelizable way like a GPT-style transformer while running inference with recurrent efficiency. This gives RWKV important advantages for long-context use, including lower memory pressure and no traditional key-value cache requirement. The repository includes training code, model notes, research material, and references to current RWKV weights. Its main value is providing the foundation for experimenting with efficient large language models that combine transformer-like scalability with RNN-like runtime behavior.

Downloads: 0 This Week

Last Update: 2026-06-10

See Project
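
The idea in the RWKV description, that one model can be evaluated in a parallel form for training and in a recurrent form for inference, can be sketched with a toy example. This is not RWKV's actual formulation: it assumes a simple scalar decayed sum, and the function names and the `decay` parameter are hypothetical. The parallel form computes every position directly from its prefix, while the recurrent form carries a single running state and never stores past inputs.

```python
import math


def parallel_form(xs, decay):
    """Each output depends only on the input prefix, so positions are independent."""
    return [
        sum(decay ** (t - i) * xs[i] for i in range(t + 1))
        for t in range(len(xs))
    ]


def recurrent_form(xs, decay):
    """Same outputs, computed step by step with one fixed-size running state."""
    state = 0.0
    outputs = []
    for x in xs:
        state = decay * state + x  # constant memory, no history kept
        outputs.append(state)
    return outputs


xs = [1.0, 2.0, 3.0]
par = parallel_form(xs, decay=0.5)
rec = recurrent_form(xs, decay=0.5)
same = all(math.isclose(p, r) for p, r in zip(par, rec))
```

Both forms give the same outputs (`same` is `True`), but the recurrent version needs only the single `state` value per step, which is the kind of property behind the "no traditional key-value cache" claim.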

GPT-NeoX

Implementation of model parallel autoregressive transformers on GPUs

...If you are not looking to train models with billions of parameters from scratch, this is likely the wrong library to use. For generic inference needs, we recommend you use the Hugging Face transformers library instead, which supports GPT-NeoX models.

Downloads: 3 This Week

Last Update: 2023-03-23

See Project

Search Results for "gpt"

Showing 4 open source projects for "gpt"
