6 projects for "server performance" with 2 filters applied:

  • 1
    DramaBox

    Super-expressive prompting model based on ltx2.3

    ...It generates speech from prompts that control not only the spoken text, but also speaker identity, emotion, delivery style, laughs, sighs, pauses, and transitions. Users can optionally provide a voice reference of around 10 seconds or more to clone the target timbre while still guiding performance through scene-style prompting. The project includes a warm inference server, a CLI workflow, and a Gradio app for interactive generation. It also supports additional LoRA training on top of DramaBox, making it possible to adapt the model for a specific speaker, language flavor, or performance style. DramaBox is aimed at developers, researchers, and audio creators who need highly expressive English TTS for character dialogue, narrative audio, prototyping, or voice experimentation.
    Downloads: 2 This Week
  • 2
    MiMo-V2-Flash

    MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation

    MiMo-V2-Flash is a large Mixture-of-Experts language model designed to deliver strong reasoning, coding, and agentic-task performance while keeping inference fast and cost-efficient. It uses an MoE setup where a very large total parameter count is available, but only a smaller subset is activated per token, which helps balance capability with runtime efficiency. The project positions the model for workflows that require tool use, multi-step planning, and higher throughput, rather than only...
    Downloads: 8 This Week
  • 3
    HY-MT

    Hunyuan Translation Model Version 1.5

    HY-MT (Hunyuan Translation) is a high-quality multilingual machine translation model suite that supports mutual translation across dozens of languages, with strong performance even at smaller model scales. It ships with both a 1.8B-parameter model and a larger 7B model; the latter is optimized not only for direct translation but also for formatted and contextualized output, allowing better handling of terminology and mixed-language content. The project emphasizes both speed and quality: the smaller model can be quantized and deployed on edge devices for real-time translation without requiring large server infrastructure. ...
    Downloads: 0 This Week
  • 4
    GLM-130B

    GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

    ...It is designed for large-scale inference and supports both left-to-right generation and blank filling, making it versatile across NLP tasks. Trained on over 400 billion tokens (200B English, 200B Chinese), it achieves performance surpassing GPT-3 175B, OPT-175B, and BLOOM-176B on multiple benchmarks, while also showing significant improvements on Chinese datasets compared to other large models. The model supports efficient inference via INT8 and INT4 quantization, reducing hardware requirements from 8× A100 GPUs to as little as a single server with 4× RTX 3090s. ...
    Downloads: 1 This Week
  • 5
    Mistral Large 3 675B Instruct 2512 Eagle

    Speculative-decoding accelerator for the 675B Mistral Large 3

    ...Built on the same frontier-scale multimodal Mixture-of-Experts architecture, it complements a system featuring 41B active parameters and a 2.5B-parameter vision encoder. The Eagle variant is specialized rather than standalone, serving as a performance accelerator for production-grade assistants, agentic workflows, long-context applications, and retrieval-augmented reasoning pipelines. It supports the same multilingual, system-prompt-aligned, and function-calling behavior as the main instruct model when used in the recommended server-client configuration.
    Downloads: 0 This Week
  • 6
    Grok-2.5

    Large-scale xAI model for local inference with SGLang

    Grok-2.5 is a large-scale AI model developed and released by xAI in 2024, made available through Hugging Face for research and experimentation. The model is distributed as raw weights that require specialized infrastructure to run, rather than being hosted by inference providers. To use it, users must download over 500 GB of files and set them up locally with the SGLang inference engine. Grok-2.5 supports advanced inference with multi-GPU configurations, requiring at least 8 GPUs with more...
    Downloads: 0 This Week
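
The GLM-130B entry above says INT8 and INT4 quantization cut the hardware requirement from 8× A100 GPUs to a single server with 4× RTX 3090s. The arithmetic behind that claim can be checked roughly. This is only a sketch: it counts weight memory alone (no activations or KV cache), takes the parameter count as 130 billion from the model name, and assumes 40 GB A100s and 24 GB RTX 3090s. None of those figures are stated in the listing.

```python
# Back-of-the-envelope weight memory for a 130B-parameter model.
# Assumption: exactly 130e9 parameters, inferred from the model name.
PARAMS = 130e9

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Assumed per-GPU memory in GB (not given in the listing).
A100_GB = 40      # assumption: the 40 GB A100 variant
RTX3090_GB = 24


def weight_gb(precision: str, params: float = PARAMS) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


def fits(precision: str, gpu_gb: float, n_gpus: int) -> bool:
    """True if the weights fit in the combined memory of n_gpus cards."""
    return weight_gb(precision) <= gpu_gb * n_gpus


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(
            f"{precision}: {weight_gb(precision):.0f} GB of weights, "
            f"fits 8x A100: {fits(precision, A100_GB, 8)}, "
            f"fits 4x RTX 3090: {fits(precision, RTX3090_GB, 4)}"
        )
```

Under these assumptions only the INT4 weights fit within four RTX 3090s, which is consistent with the entry's "as little as" wording. Real deployments need extra headroom beyond the weights.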