Showing 154 open source projects for "gpu max performance"

View related business solutions
  • Application Monitoring That Won't Slow Your App Down Icon
    Application Monitoring That Won't Slow Your App Down

    AppSignal's Rust-based agent is lightweight and stable. Already running in thousands of production apps.

    Full APM with errors, performance, logs, and uptime monitoring. 99.999% uptime SLA on the platform itself.
    Start Free
  • Stop Storing Third-Party Tokens in Your Database Icon
    Stop Storing Third-Party Tokens in Your Database

    Auth0 Token Vault handles secure token storage, exchange, and refresh for external providers so you don't have to build it yourself.

    Rolling your own OAuth token storage can be a security liability. Token Vault securely stores access and refresh tokens from federated providers and handles exchange and renewal automatically. Connected accounts, refresh exchange, and privileged worker flows included.
    Try Auth0 for Free
  • 1
    Grok-2.5

    Grok-2.5

    Large-scale xAI model for local inference with SGLang, Grok-2.5

    ...The model is distributed as raw weights that require specialized infrastructure to run, rather than being hosted by inference providers. To use it, users must download over 500 GB of files and set them up locally with the SGLang inference engine. Grok-2.5 supports advanced inference with multi-GPU configurations, requiring at least 8 GPUs with more than 40 GB of memory each for optimal performance. It integrates with the SGLang framework to enable serving, testing, and chat-style interactions. The model comes with a post-training architecture and requires the correct chat template to function properly. It is released under the Grok 2 Community License Agreement, encouraging community experimentation and responsible use.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    granite-timeseries-ttm-r2

    granite-timeseries-ttm-r2

    Tiny pre-trained IBM model for multivariate time series forecasting

    granite-timeseries-ttm-r2 is part of IBM’s TinyTimeMixers (TTM) series—compact, pre-trained models for multivariate time series forecasting. Unlike massive foundation models, TTM models are designed to be lightweight yet powerful, with only ~805K parameters, enabling high performance even on CPU or single-GPU machines. The r2 version is pre-trained on ~700M samples (r2.1 expands to ~1B), delivering up to 15% better accuracy than the r1 version. TTM supports both zero-shot and fine-tuned forecasting, handling minutely, hourly, daily, and weekly resolutions. It can integrate exogenous variables, static categorical features, and perform channel-mixing for richer multivariate forecasting. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Mistral Large 3 675B Instruct 2512 NVFP4

    Mistral Large 3 675B Instruct 2512 NVFP4

    Quantized 675B multimodal instruct model optimized for NVFP4

    Mistral Large 3 675B Instruct 2512 NVFP4 is a frontier-scale multimodal Mixture-of-Experts model featuring 675B total parameters and 41B active parameters, trained from scratch on 3,000 H200 GPUs. This NVFP4 checkpoint is a post-training-activation quantized version of the original instruct model, created through a collaboration between Mistral AI, vLLM, and Red Hat using llm-compressor. It retains the same instruction-tuned behavior as the FP8 model, making it ideal for production...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Ministral 3 3B Instruct 2512

    Ministral 3 3B Instruct 2512

    Ultra-efficient 3B multimodal instruct model built for edge deployment

    ...As an FP8 instruct-fine-tuned model, it is optimized for chat, instruction following, and compact agentic tasks while maintaining strong adherence to system prompts. Despite its small size, it delivers efficient real-time performance and can run locally on a single 8GB GPU, with further memory reductions through quantization. It supports dozens of languages across major global regions, making it well-suited for multilingual and embedded applications. The model also provides function calling, clean JSON output, and stable tool-use behavior, enabling it to serve as a small but effective agentic system.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Build Securely on AWS with Proven Frameworks Icon
    Build Securely on AWS with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
MongoDB Logo MongoDB