Showing 2 open source projects for "wise memory optimizer"

View related business solutions
  • Fully Managed MySQL, PostgreSQL, and SQL Server Icon
    Fully Managed MySQL, PostgreSQL, and SQL Server

    Automatic backups, patching, replication, and failover. Focus on your app, not your database.

    Cloud SQL handles your database ops end to end. Migrate from on-prem or other clouds with free migration tools.
    Try Free
  • $300 in Free Credit Across 150+ Cloud Services Icon
    $300 in Free Credit Across 150+ Cloud Services

    VMs, containers, AI, databases, storage | build anything. No commitment to start.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale with Google Cloud.
    Start Building Free
  • 1
    FairScale

    FairScale

    PyTorch extensions for high performance and large scale training

    FairScale is a collection of PyTorch performance and scaling primitives that pioneered many of the ideas now used for large-model training. It introduced Fully Sharded Data Parallel (FSDP) style techniques that shard model parameters, gradients, and optimizer states across ranks to fit bigger models into the same memory budget. The library also provides pipeline parallelism, activation checkpointing, mixed precision, optimizer state sharding (OSS), and auto-wrapping policies that reduce boilerplate in complex distributed setups. Its components are modular, so teams can adopt just the sharding optimizer or the pipeline engine without rewriting their training loop. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Fairseq

    Fairseq

    Facebook AI Research Sequence-to-Sequence Toolkit written in Python

    Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We provide reference implementations of various sequence modeling papers. Recent work by Microsoft and Google has shown that data parallel training can be made significantly more efficient by sharding the model parameters and optimizer state across data parallel workers. These ideas are encapsulated in the...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB