Showing 22 open source projects for "math training"

  • 1
    LLM Datasets

    Curated list of datasets and tools for post-training

    ...The repository aims to make datasets easy to inspect and transform, with scripts for downloading, deduping, cleaning, and converting to formats like JSONL that slot into training pipelines. It highlights instruction-tuning and conversation-style corpora while also pointing to code, math, or domain-specific sets for targeted capabilities. Quality is a recurring theme: examples and utilities help filter low-value samples, enforce length limits, and split train/validation consistently so results are comparable. ...
    Downloads: 2 This Week
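    The cleaning steps named in the description above (deduping, length limits, a consistent train/validation split, JSONL output) can be sketched as follows. This is a minimal illustration, not code from the repository: the function names `clean_and_split` and `to_jsonl`, the `"text"` field, and the default thresholds are all assumptions made for the example.

```python
import hashlib
import json


def clean_and_split(samples, max_chars=2000, val_fraction=0.1):
    """Dedupe, length-filter, and split samples deterministically.

    Hypothetical helper: the "text" field and the defaults are assumptions.
    """
    seen = set()
    train, val = [], []
    for sample in samples:
        text = sample["text"].strip()
        # Drop empty and over-length samples.
        if not text or len(text) > max_chars:
            continue
        # Exact-match dedup on case- and whitespace-normalized text.
        normalized = " ".join(text.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        # Hash-based split: the same text always lands in the same split,
        # so repeated runs stay comparable.
        bucket = int(key[:8], 16) / 0xFFFFFFFF
        (val if bucket < val_fraction else train).append({"text": text})
    return train, val


def to_jsonl(records):
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


samples = [
    {"text": "Hello world"},
    {"text": "hello   world"},   # duplicate after normalization
    {"text": ""},                # empty, dropped
    {"text": "x" * 3000},        # too long, dropped
    {"text": "Solve 2+2"},
]
train, val = clean_and_split(samples)
jsonl = to_jsonl(train + val)
```

    Splitting on a content hash rather than a random shuffle is one way to keep the split stable when the dataset is re-downloaded or extended.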
  • 2
    DeepSeek V2

    Strong, Economical, and Efficient Mixture-of-Experts Language Model

    DeepSeek-V2 is the second major iteration of DeepSeek’s foundation language model (LLM) series. This version likely includes architectural improvements, training enhancements, and expanded dataset coverage compared to V1. The repository includes model weight artifacts, evaluation benchmarks across a broad suite (e.g. reasoning, math, multilingual), configuration files, and possibly tokenization / inference scripts. The V2 model is expected to support more advanced features like better context window handling, more efficient inference, better performance on challenging tasks, and stronger alignment with human feedback. ...
    Downloads: 18 This Week
  • 3
    llm.c

    LLM training in simple, raw C/CUDA

    llm.c is a minimalist, systems-level implementation of a small transformer-based language model in C that prioritizes clarity and educational value. By stripping away heavy frameworks, it exposes the core math and memory flows of embeddings, attention, and feed-forward layers. The code illustrates how to wire forward passes, losses, and simple training or inference loops with direct control over arrays and buffers. Its compact design makes it easy to trace execution, profile hotspots, and understand the cost of each operation. Portability is a goal: it aims to compile with common toolchains and run on modest hardware for small experiments. ...
    Downloads: 0 This Week
  • 4
    Tencent-Hunyuan-Large

    Open-source large language model family from Tencent Hunyuan

    Tencent-Hunyuan-Large is the flagship open-source large language model family from Tencent Hunyuan, offering both pre-trained and instruct (fine-tuned) variants. It is designed with long-context capabilities, quantization support, and high performance on benchmarks across general reasoning, mathematics, language understanding, and Chinese / multilingual tasks. It aims to provide competitive capability with efficient deployment and inference. FP8 quantization support to reduce memory usage...
    Downloads: 6 This Week
  • 5
    Deep Learning Is Nothing

    Deep learning concepts in an approachable style

    Deep-Learning-Is-Nothing presents deep learning concepts in an approachable, from-scratch style that demystifies the stack behind modern models. It typically begins with linear algebra, calculus, and optimization refreshers before moving to perceptrons, multilayer networks, and gradient-based training. Implementations favor small, readable examples—often NumPy first—to show how forward and backward passes work without depending solely on high-level frameworks. Once the fundamentals are...
    Downloads: 0 This Week
  • 6
    LLMs-from-scratch

    Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

    LLMs-from-scratch is an educational codebase that walks through implementing modern large-language-model components step by step. It emphasizes building blocks—tokenization, embeddings, attention, feed-forward layers, normalization, and training loops—so learners understand not just how to use a model but how it works internally. The repository favors clear Python and NumPy or PyTorch implementations that can be run and modified without heavyweight frameworks obscuring the logic. Chapters...
    Downloads: 3 This Week
  • 7
    Kimi K2

    Kimi K2 is the large language model series developed by Moonshot AI

    ...With its high-dimensional attention mechanisms and expert routing, Kimi-K2 excels across benchmarks in live coding, math reasoning, and problem solving.
    Downloads: 32 This Week
  • 8
    Deep-Learning-Interview-Book

    Interview guide for machine learning, mathematics, and deep learning

    Deep-Learning-Interview-Book collects structured notes, Q&A, and concept summaries tailored to deep-learning interviews, turning scattered study into a coherent playbook. It spans the core math (linear algebra, probability, optimization) and the practitioner topics candidates actually face, like CNNs, RNNs/Transformers, attention, regularization, and training tricks. Explanations emphasize intuition first, then key formulas and common pitfalls, so you can reason through unseen questions rather than memorize trivia. Many entries connect theory to implementation details, including how choices in activation, initialization, or normalization affect convergence and stability. ...
    Downloads: 0 This Week
  • 9
    Kimi k1.5

    Scaling Reinforcement Learning with LLMs

    Kimi-k1.5 is an advanced open-source multimodal large-language model project that explores scaling reinforcement learning with long-context chains of thought, achieving performance that rivals or surpasses state-of-the-art models on benchmarks like LiveCodeBench, AIME, and MATH-500. The project emphasizes a simple yet powerful framework where the context window scales up to 128k tokens, enabling reasoning that resembles planning, reflection, and correction over a much longer sequence than typical models handle. By using techniques like partial rollouts to improve training efficiency and applying sophisticated policy optimization methods, the developers demonstrate that strong ability can emerge without relying on complex solutions like Monte Carlo tree search or value functions. ...
    Downloads: 1 This Week
  • 10
    MiniMax-M1

    Open-weight, large-scale hybrid-attention reasoning model

    MiniMax-M1 is presented as the world’s first open-weight, large-scale hybrid-attention reasoning model, designed to push the frontier of long-context, tool-using, and deeply “thinking” language models. It is built on the MiniMax-Text-01 foundation and keeps the same massive parameter budget, but reworks the attention and training setup for better reasoning and test-time compute scaling. Architecturally, it combines Mixture-of-Experts layers with lightning attention, enabling the model to...
    Downloads: 0 This Week
  • 11
    Super comprehensive deep learning notes

    Super comprehensive deep learning notes is a massive, well-structured collection of deep learning notebooks that serves as a comprehensive study resource for anyone wanting to learn or reinforce concepts in computer vision, natural language processing, deep learning architectures, and even large-model agents. The repository contains hundreds of Jupyter notebooks that are richly annotated and organized by topic, progressing from basic Python and PyTorch fundamentals to advanced neural...
    Downloads: 0 This Week
  • 12
    Gen.jl

    A general-purpose probabilistic programming system

    ...Users can also hand-code parts of their models that demand better performance. Neural network inference is fast, but can be inaccurate on out-of-distribution data, and requires expensive training.
    Downloads: 1 This Week
  • 13
    Skywork-R1V4

    Skywork-R1V is an advanced multimodal AI model series

    Skywork-R1V is an open-source multimodal reasoning model designed to extend the capabilities of large language models into vision-language tasks that require complex logical reasoning. The project introduces a model architecture that transfers the reasoning abilities of advanced text-based models into visual domains so the system can interpret images and perform multi-step reasoning about them. Instead of retraining both language and vision models from scratch, the framework uses a...
    Downloads: 0 This Week
  • 14
    GLM-4

    GLM-4 series: Open Multilingual Multimodal Chat LMs

    GLM-4 is a family of open models from ZhipuAI that spans base, chat, and reasoning variants at both 32B and 9B scales, with long-context support and practical local-deployment options. The GLM-4-32B-0414 models are trained on ~15T high-quality data (including substantial synthetic reasoning data), then post-trained with preference alignment, rejection sampling, and reinforcement learning to improve instruction following, coding, function calling, and agent-style behaviors. The...
    Downloads: 4 This Week
  • 15
    DeepSeek Prover V2

    Advancing Formal Mathematical Reasoning via Reinforcement Learning

    DeepSeek-Prover-V2 is DeepSeek’s specialized model for formal theorem proving, particularly targeting proofs in Lean 4. The repository describes a recursive proof decomposition pipeline: DeepSeek-V3 is prompted to break complex theorems into subgoals and synthesize proof sketches, which are then combined to bootstrap training data. The model is then fine-tuned via reinforcement learning with binary correct/incorrect feedback to integrate informal reasoning with formal proof behavior. The repo releases...
    Downloads: 0 This Week
  • 16
    ToRA

    Tool-integrated Reasoning LLM Agents

    ToRA is an open-source framework developed by Microsoft for building tool-integrated reasoning agents powered by large language models. The project focuses on improving the ability of AI systems to solve complex mathematical and analytical problems by combining natural language reasoning with external computational tools. Instead of relying solely on text generation, the system dynamically invokes tools such as symbolic solvers or programming libraries when deeper computation is required....
    Downloads: 1 This Week
  • 17
    DALL-E in Pytorch

    Implementation / replication of DALL-E, OpenAI's Text to Image

    ...In contrast to OpenAI's VAE, it also has an extra layer of downsampling, so the image sequence length is 256 instead of 1024 (this will lead to a 16x reduction in training costs, when you do the math).
    Downloads: 0 This Week
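    The "math" behind the cost reduction quoted above can be worked out in a few lines. This sketch assumes the cost in question is dominated by self-attention, which scales with the square of the sequence length; the function name `attention_cost_ratio` is made up for the example, and a full training-cost estimate would also include terms that scale linearly with length.

```python
def attention_cost_ratio(long_len, short_len):
    """Relative self-attention cost of two sequence lengths.

    Assumption: cost grows quadratically with sequence length,
    so the ratio is (long_len / short_len) squared.
    """
    return (long_len / short_len) ** 2


# 1024 image tokens vs. 256 image tokens:
# the sequence is 4x shorter, so quadratic attention cost drops by 4^2.
ratio = attention_cost_ratio(1024, 256)
print(ratio)  # 16.0
```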
  • 18
    Grade School Math

    8.5K high quality grade school math problems

    The grade-school-math repository (sometimes called GSM8K) is a curated dataset of 8,500 high-quality grade school math word problems intended for evaluating mathematical reasoning capabilities of language models. It is structured into 7,500 training problems and 1,000 test problems. These aren’t trivial exercises — many require multi-step reasoning, combining arithmetic operations, and handling intermediate steps (e.g.
    Downloads: 0 This Week
  • 19
    micrograd

    A tiny scalar-valued autograd engine and a neural net library

    micrograd is a tiny, educational automatic differentiation engine focused on scalar values, built to show how backpropagation works end to end with minimal code. It constructs a dynamic computation graph as you perform math operations and then computes gradients by walking that graph backward, making it an approachable “from scratch” autograd reference. On top of the core autograd “Value” concept, the project includes a small neural network library that lets you define and train simple...
    Downloads: 0 This Week
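    The idea described above, a scalar value that records the graph of operations and then walks it backward to compute gradients, can be sketched in a few dozen lines. This is a simplified illustration written for this listing, not micrograd's actual source: it supports only addition and multiplication, and the attribute names are assumptions.

```python
class Value:
    """A scalar that remembers how it was computed (illustrative sketch)."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule
        # from the output back to the inputs.
        order, visited = [], set()

        def visit(node):
            if node not in visited:
                visited.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


a, b = Value(2.0), Value(3.0)
c = a * b + a   # c = 2*3 + 2 = 8
c.backward()    # dc/da = b + 1 = 4, dc/db = a = 2
```

    Accumulating with `+=` rather than assigning matters here: `a` feeds into `c` along two paths, and its gradient is the sum of both contributions.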
  • 20
    Hermes 4

    Hermes 4 FP8: hybrid reasoning Llama-3.1-405B model by Nous Research

    ...It introduces a hybrid reasoning mode with explicit <think> segments, enabling the model to deliberate deeply when needed and switch to faster responses when desired. Post-training improvements include a vastly expanded corpus with ~60B tokens, boosting performance across math, code, STEM, logic, creativity, and structured outputs. The model is designed for schema adherence, producing valid JSON and repairing malformed outputs, making it highly suitable for tool use and function calling. Hermes 4 is engineered for superior steerability with reduced refusal rates, aligning responses to user values while preserving assistant quality. ...
    Downloads: 0 This Week
  • 21
    Ministral 3 8B Reasoning 2512

    Efficient 8B multimodal model tuned for advanced reasoning tasks.

    ...It combines an 8.4B-parameter language model with a 0.4B vision encoder, enabling it to process both text and images for advanced reasoning tasks. This version is specifically post-trained for reasoning, making it well-suited for math, coding, and STEM applications requiring multi-step logic and problem-solving. Despite its reasoning-focused training, the model remains edge-optimized and can run locally on a single 24GB GPU in BF16, or under 12GB when quantized. It supports dozens of languages, adheres reliably to system prompts, and provides native function calling and structured JSON output—key capabilities for agentic and automation workflows. ...
    Downloads: 0 This Week
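    The memory figures quoted above are consistent with a back-of-the-envelope estimate of weight storage. The sketch below counts weights only and ignores activations, the KV cache, and runtime overhead; the description does not say which quantization format gets under 12GB, so the 8-bit and 4-bit cases are assumptions for illustration, as is the function name `weight_memory_gb`.

```python
def weight_memory_gb(params_billion, bytes_per_param):
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives GB directly.
    """
    return params_billion * bytes_per_param


total_params = 8.4 + 0.4  # language model + vision encoder, in billions

bf16_gb = weight_memory_gb(total_params, 2)    # BF16: 2 bytes per parameter
int8_gb = weight_memory_gb(total_params, 1)    # assumed 8-bit quantization
int4_gb = weight_memory_gb(total_params, 0.5)  # assumed 4-bit quantization

print(f"BF16 ~{bf16_gb:.1f} GB, 8-bit ~{int8_gb:.1f} GB, 4-bit ~{int4_gb:.1f} GB")
```

    Roughly 17.6 GB of BF16 weights leaves some headroom on a 24GB GPU for activations and cache, which matches the single-GPU claim.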
  • 22
    VaultGemma

    VaultGemma: 1B DP-trained Gemma variant for private NLP tasks

    VaultGemma is a sub-1B parameter variant of Google’s Gemma family that is pre-trained from scratch with Differential Privacy (DP), providing mathematically backed guarantees that its outputs do not reveal information about any single training example. Using DP-SGD with a privacy budget across a large English-language corpus (web documents, code, mathematics), it prioritizes privacy over raw utility. The model follows a Gemma-2–style architecture, outputs text from up to 1,024 input tokens,...
    Downloads: 0 This Week
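    For readers unfamiliar with DP-SGD, the core of one update step is to clip each example's gradient to a fixed norm and add Gaussian noise before averaging. The sketch below shows that generic step in plain Python; it is not VaultGemma's training code, and the function name, parameter names, and values are assumptions chosen for the example.

```python
import math
import random


def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """One generic DP-SGD gradient estimate (illustrative sketch).

    Each per-example gradient is clipped to L2 norm `clip_norm`, the clipped
    gradients are summed, Gaussian noise with standard deviation
    `noise_multiplier * clip_norm` is added, and the result is averaged.
    """
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Scale down only gradients whose norm exceeds the clipping bound.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, g in enumerate(grad):
            total[i] += g * scale
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [n / len(per_example_grads) for n in noisy]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.6, 0.8]]  # L2 norms 5.0 and 1.0

# With the noise switched off, only the clipping effect is visible.
clipped_mean = dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=0.0, rng=rng)

# With noise on, the estimate is perturbed on every call.
private_mean = dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
```

    Clipping bounds how much any single example can move the update, and the noise scale is tied to that bound; together they are what a privacy budget is accounted against.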