• $300 Free Credits for Your Google Cloud Projects Icon
    $300 Free Credits for Your Google Cloud Projects

    Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

    Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • Ship Agents Faster Icon
    Ship Agents Faster

    Transform your applications and workflows into powerful agentic systems at global scale.

    Gemini Enterprise Agent Platform lets you rapidly build, scale, govern and optimize production-ready agents grounded in your organization's data. The platform enables developers to build custom or pre-built agents for virtually any use case. New customers get $300 in free credits.
    Get Started Free
  • 1
    How to Train Your GPT

    How to Train Your GPT

    Build a modern LLM from scratch. Every line commented

    ...The project covers the same broad family of architecture behind systems such as GPT-style models, LLaMA-style models, Claude-style systems, and Mistral-style models. It includes chapters and topic explainers on tokenizers, embeddings, attention, RoPE, RMSNorm, SwiGLU, KV cache, AdamW, mixed precision, training loops, and inference. The guide emphasizes writing every important component manually rather than only calling high-level APIs. Its purpose is to make the internals of language models understandable through runnable code and step-by-step explanations.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    GLM-4

    GLM-4

    GLM-4 series: Open Multilingual Multimodal Chat LMs

    GLM-4 is a family of open models from ZhipuAI that spans base, chat, and reasoning variants at both 32B and 9B scales, with long-context support and practical local-deployment options. The GLM-4-32B-0414 models are trained on ~15T high-quality data (including substantial synthetic reasoning data), then post-trained with preference alignment, rejection sampling, and reinforcement learning to improve instruction following, coding, function calling, and agent-style behaviors. The...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 3
    tiny-llm

    tiny-llm

    A course of learning LLM inference serving on Apple Silicon

    tiny-llm is an educational open-source project designed to teach system engineers how large language model inference and serving systems work by building them from scratch. The project is structured as a guided course that walks developers through the process of implementing the core components required to run a modern language model, including attention mechanisms, token generation, and optimization techniques. Rather than relying on high-level machine learning frameworks, the codebase uses...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Qwen2.5-Omni

    Qwen2.5-Omni

    Capable of understanding text, audio, vision, video

    Qwen2.5-Omni is an end-to-end multimodal flagship model in the Qwen series by Alibaba Cloud, designed to process multiple modalities (text, images, audio, video) and generate responses both as text and natural speech in streaming real-time. It supports “Thinker-Talker” architecture, and introduces innovations for aligning modalities over time (for example synchronizing video/audio), robust speech generation, and low-VRAM/quantized versions to make usage more accessible. It holds...
    Downloads: 0 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    DeepSeek LLM

    DeepSeek LLM

    DeepSeek LLM: Let there be answers

    ...The model is trained from scratch, reportedly on a vast multilingual + code + reasoning dataset, and competes with other open or open-weight models. The architecture mirrors established decoder-only transformer families: pre-norm structure, rotational embeddings (RoPE), grouped query attention (GQA), and mixing in languages and tasks. It supports both “Base” (foundation model) and “Chat” (instruction / conversation tuned) variants.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    QwQ-32B

    QwQ-32B

    QwQ-32B is a reasoning-focused language model for complex tasks

    ...Recommended usage involves prompts starting with <think>\n, non-greedy sampling strategies, and support for standardized outputs on math and multiple-choice tasks. For long input handling, it supports YaRN (Yet another RoPE Namer) for context scaling.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Qwen2.5-14B-Instruct

    Qwen2.5-14B-Instruct

    Powerful 14B LLM with strong instruction and long-text handling

    ...It demonstrates improved performance in coding, mathematics, and multilingual understanding across over 29 languages. Qwen2.5-14B-Instruct is built on a transformer backbone with RoPE, SwiGLU, RMSNorm, and attention QKV bias. It’s resilient to varied prompt styles and is especially effective for JSON and tabular data generation. The model is instruction-tuned and supports chat templating, making it ideal for chatbot and assistant use cases.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Qwen3-Next

    Qwen3-Next

    Qwen3-Next: 80B instruct LLM with ultra-long context up to 1M tokens

    ...With 80B total parameters and 3B activated at a time, it leverages hybrid attention (Gated DeltaNet + Gated Attention) and a high-sparsity Mixture-of-Experts architecture to achieve exceptional efficiency. The model natively supports a context length of 262K tokens and can be extended up to 1 million tokens using RoPE scaling (YaRN), making it highly capable for processing large documents and extended conversations. Multi-Token Prediction (MTP) boosts both training and inference, while stability optimizations such as weight-decayed and zero-centered layernorm ensure robustness. Benchmarks show it performs comparably to larger models like Qwen3-235B on reasoning, coding, multilingual, and alignment tasks while requiring only a fraction of the training cost.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Devstral Small 2

    Devstral Small 2

    Lightweight 24B agentic coding model with vision and long context

    Devstral Small 2 is a compact agentic language model designed for software engineering workflows, excelling at tool usage, codebase exploration, and multi-file editing. With 24B parameters and FP8 instruct tuning, it delivers strong instruction following while remaining lightweight enough for local and on-device deployment. The model achieves competitive performance on SWE-bench, validating its effectiveness for real-world coding and automation tasks. It introduces vision capabilities,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • Previous
  • You're on page 1
  • Next
Auth0 Logo