Search Results for "learning language" - Page 2

Showing 384 open source projects for "learning language"

View related business solutions
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • Auth0 B2B Essentials: SSO, MFA, and RBAC Built In Icon
    Auth0 B2B Essentials: SSO, MFA, and RBAC Built In

    Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.

    Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.
    Sign Up Free
  • 1
    Machine Learning Engineering Open Book

    Machine Learning Engineering Open Book

    Machine Learning Engineering Open Book

    Machine Learning Engineering Open Book is an open “living book” that captures practical methodologies, tooling advice, and operational knowledge for successfully training and deploying large language models and multimodal systems. The repository functions as a field guide compiled from real-world experience, particularly from work on large-scale models such as BLOOM-176B and IDEFICS-80B.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Advanced NLP with spaCy

    Advanced NLP with spaCy

    Advanced NLP with spaCy: A free online course

    ...It also demonstrates how spaCy pipelines work and how developers can extend them with custom components and training data. The course is structured as a hands-on learning environment where students can run code examples, experiment with NLP techniques, and build practical language processing applications. Because spaCy is widely used in production environments, the course emphasizes industrial-strength NLP workflows and best practices.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    AiLearning-Theory-Applying

    AiLearning-Theory-Applying

    Quickly get started with AI theory and practical applications

    AiLearning-Theory-Applying is a comprehensive educational repository designed to help learners quickly understand artificial intelligence theory and apply it in practical machine learning and deep learning projects. The repository provides extensive tutorials covering mathematical foundations, machine learning algorithms, deep learning concepts, and modern large language model architectures. It includes well-commented notebooks, datasets, and implementation examples that allow learners to reproduce experiments and understand the inner workings of various algorithms. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Ludwig AI

    Ludwig AI

    Low-code framework for building custom LLMs, neural networks

    Declarative deep learning framework built for scale and efficiency. Ludwig is a low-code framework for building custom AI models like LLMs and other deep neural networks. Declarative YAML configuration file is all you need to train a state-of-the-art LLM on your data. Support for multi-task and multi-modality learning. Comprehensive config validation detects invalid parameter combinations and prevents runtime failures. Automatic batch size selection, distributed training (DDP, DeepSpeed),...
    Downloads: 7 This Week
    Last Update:
    See Project
  • Fully Managed MySQL, PostgreSQL, and SQL Server Icon
    Fully Managed MySQL, PostgreSQL, and SQL Server

    Automatic backups, patching, replication, and failover. Focus on your app, not your database.

    Cloud SQL handles your database ops end to end, so you can focus on your app.
    Try Free
  • 5
    spaCy

    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. Since its inception it was designed to be used for real world applications-- for building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with an accuracy within 1% of the best available. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    ReCall

    ReCall

    Learning to Reason with Search for LLMs via Reinforcement Learning

    ...Instead of relying purely on static knowledge stored inside the model, ReCall allows the language model to dynamically decide when it should retrieve information or invoke external capabilities during the reasoning process. The framework uses reinforcement learning to train models to perform these tool calls effectively while solving multi-step reasoning tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    PRIME

    PRIME

    Scalable RL solution for advanced reasoning of language models

    PRIME is an open-source reinforcement learning framework designed to improve the reasoning capabilities of large language models through process-level rewards rather than relying only on final outputs. The system introduces the concept of process reinforcement through implicit rewards, allowing models to receive feedback on intermediate reasoning steps instead of evaluating only the final answer.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Colossal-AI

    Colossal-AI

    Making large AI models cheaper, faster and more accessible

    The Transformer architecture has improved the performance of deep learning models in domains such as Computer Vision and Natural Language Processing. Together with better performance come larger model sizes. This imposes challenges to the memory wall of the current accelerator hardware such as GPU. It is never ideal to train large models such as Vision Transformer, BERT, and GPT on a single GPU or a single machine.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Diffgram

    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    ...Training Data is the art of supervising machines through data. This includes the activities of annotation, which produces structured data; ready to be consumed by a machine learning model. Annotation is required because raw media is considered to be unstructured and not usable without it. That’s why training data is required for many modern machine learning use cases including computer vision, natural language processing and speech recognition.
    Downloads: 5 This Week
    Last Update:
    See Project
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    Access competitive interest rates on your digital assets.

    Generate interest, borrow against your crypto, and trade a range of cryptocurrencies — all in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 10
    How to Train Your GPT

    How to Train Your GPT

    Build a modern LLM from scratch. Every line commented

    How to Train Your GPT is an interactive textbook that teaches users how to build, train, and run a modern language model from scratch. It is written for learners with minimal machine-learning background, using simple explanations, commented code, and practical examples. The project covers the same broad family of architecture behind systems such as GPT-style models, LLaMA-style models, Claude-style systems, and Mistral-style models. It includes chapters and topic explainers on tokenizers, embeddings, attention, RoPE, RMSNorm, SwiGLU, KV cache, AdamW, mixed precision, training loops, and inference. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 11
    Awesome Fraud Detection Research Papers

    Awesome Fraud Detection Research Papers

    A curated list of data mining papers about fraud detection

    A curated list of data mining papers about fraud detection from several conferences.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    llms-from-scratch-cn

    llms-from-scratch-cn

    Build a large language model from 0 only with Python foundation

    llms-from-scratch-cn is an educational open-source project designed to teach developers how to build large language models step by step using practical code and conceptual explanations. The repository provides a hands-on learning path that begins with the fundamentals of natural language processing and gradually progresses toward implementing full GPT-style architectures from the ground up. Rather than focusing on using pre-trained models through APIs, the project emphasizes understanding the internal mechanisms of modern language models, including tokenization, attention mechanisms, transformer architecture, and training workflows. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Agents 2.0

    Agents 2.0

    An Open-source Framework for Data-centric Language Agents

    Agents is an open-source framework designed to build and train autonomous language agents through a data-centric and learning-oriented architecture. The project introduces a concept known as agent symbolic learning, which treats an agent pipeline similarly to a neural network computational graph. In this framework, each node in the pipeline represents a step in the reasoning or action process, while prompts and tools act as adjustable parameters analogous to neural network weights. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    PyTorch-Tutorial-2nd

    PyTorch-Tutorial-2nd

    CV, NLP, LLM project applications, and advanced engineering deployment

    PyTorch-Tutorial-2nd is an open-source educational repository that provides structured tutorials for learning deep learning with the PyTorch framework. The project serves as a practical companion to a second edition of a PyTorch learning guide and is designed to help learners understand neural network concepts through hands-on coding examples. The repository covers a wide range of topics including tensor operations, neural network construction, model training workflows, and optimization...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    OSWorld

    OSWorld

    Benchmarking Multimodal Agents for Open-Ended Tasks

    OSWorld is an open-source synthetic world environment designed for embodied AI research and multi-agent learning. It provides a richly simulated 3D world where multiple agents can interact, perform tasks, and learn complex behaviors. OSWorld emphasizes multi-modal interaction, enabling agents to process visual, auditory, and symbolic data for grounded learning in a simulated world.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 16
    machine_learning_examples

    machine_learning_examples

    A collection of machine learning examples and tutorials

    machine_learning_examples is an open-source repository that provides a large collection of machine learning tutorials and practical code examples. The project aims to teach machine learning concepts through hands-on programming rather than purely theoretical explanations. It includes implementations of many machine learning algorithms and neural network architectures using Python and popular libraries such as TensorFlow and NumPy. The repository covers a wide range of topics including supervised learning, unsupervised learning, reinforcement learning, and natural language processing. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    LLMs-Zero-to-Hero

    LLMs-Zero-to-Hero

    From nobody to big model (LLM) hero

    LLMs-Zero-to-Hero is an open-source educational project designed to guide learners through the complete process of understanding and building large language models from the ground up. The repository presents a structured learning pathway that begins with fundamental concepts in machine learning and progresses toward advanced topics such as model pre-training, fine-tuning, and deployment. Rather than relying entirely on existing frameworks, the project encourages readers to implement important components themselves in order to gain a deeper understanding of how modern language models work internally. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    RLHF-Reward-Modeling

    RLHF-Reward-Modeling

    Recipes to train reward model for RLHF

    RLHF-Reward-Modeling is an open-source research framework focused on training reward models used in reinforcement learning from human feedback for large language models. In RLHF pipelines, reward models are responsible for evaluating generated responses and assigning scores that guide the model toward outputs that better match human preferences. The repository provides training recipes and implementations for building reward and preference models using modern machine learning frameworks. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Llama-Chinese

    Llama-Chinese

    Llama Chinese community, real-time aggregation

    ...The community maintains educational materials and technical documentation that help researchers understand the process of training and deploying Chinese-optimized large language models. In addition to model development, the project collects learning resources and open research contributions related to LLM technology in Chinese environments. Overall, Llama-Chinese acts as both a technical ecosystem and knowledge hub dedicated to advancing Chinese-language large model development.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    MedicalGPT

    MedicalGPT

    MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training

    MedicalGPT training medical GPT model with ChatGPT training pipeline, implementation of Pretraining, Supervised Finetuning, Reward Modeling and Reinforcement Learning. MedicalGPT trains large medical models, including secondary pre-training, supervised fine-tuning, reward modeling, and reinforcement learning training.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    SHAP

    SHAP

    A game theoretic approach to explain the output of ml models

    SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions. While SHAP can explain the output of any machine learning model, we have developed a high-speed exact algorithm for tree ensemble methods. Fast C++ implementations are supported for XGBoost, LightGBM, CatBoost, scikit-learn and pyspark...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Key-book

    Key-book

    Proofs, cases, concept supplements, and reference explanations

    The book "Introduction to Machine Learning Theory" (hereinafter referred to as "Introduction") written by Zhou Zhihua, Wang Wei, Gao Wei, and other teachers fills the regret of the lack of introductory works on machine learning theory in China. This book attempts to provide an introductory guide for readers interested in learning machine learning theory and researching machine learning theory in an easy-to-understand language.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 23
    PKU Beaver

    PKU Beaver

    Constrained Value Alignment via Safe Reinforcement Learning

    PKU Beaver is an open-source research project focused on improving the safety alignment of large language models through reinforcement learning from human feedback under explicit safety constraints. The framework introduces techniques that separate helpfulness and harmlessness signals during training, allowing models to optimize for useful responses while minimizing harmful behavior. To support this process, the project provides datasets containing human-labeled examples that encode both performance preferences and safety constraints across multiple dimensions. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Retrobios

    Retrobios

    Complete BIOS and firmware packs for RetroArch, Batocera, Recalbox

    ...It emphasizes simplicity and transparency, allowing developers to study and modify core boot processes at a granular level. The architecture is typically close to bare metal, requiring knowledge of assembly language and hardware interfaces. It may also serve as a foundation for building custom operating systems or experimenting with low-level system design. Overall, retrobios functions as a learning and experimentation platform for understanding foundational computing concepts.
    Downloads: 158 This Week
    Last Update:
    See Project
  • 25
    FlashAttention

    FlashAttention

    Fast and memory-efficient exact attention

    FlashAttention is a high-performance deep learning optimization library that reimplements the attention mechanism used in transformer models to be significantly faster and more memory-efficient than standard implementations. It achieves this by using IO-aware algorithms that minimize memory reads and writes, reducing the quadratic memory overhead typically associated with attention operations.
    Downloads: 56 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB