Showing 103 open source projects for "learning language"

View related business solutions
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.
    Start Free
  • Stop Storing Third-Party Tokens in Your Database Icon
    Stop Storing Third-Party Tokens in Your Database

    Auth0 Token Vault handles secure token storage, exchange, and refresh for external providers so you don't have to build it yourself.

    Rolling your own OAuth token storage can be a security liability. Token Vault securely stores access and refresh tokens from federated providers and handles exchange and renewal automatically. Connected accounts, refresh exchange, and privileged worker flows included.
    Try Auth0 for Free
  • 1
    OpenOutreach

    OpenOutreach

    Linkedin Automation Tool

    ...The system generates search queries, evaluates candidate profiles, and learns over time which contacts best match the ideal customer profile. According to the repository, it combines large language model classification with a Bayesian machine learning layer based on profile embeddings, which helps it shift from broad exploration to more confident qualification as it gathers more decisions. It is designed to automate personalized outreach as well, including connection requests and follow-up messaging, while keeping deployment under the user’s control through a local or self-hosted setup.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 2
    ModernBERT

    ModernBERT

    Bringing BERT into modernity via both architecture changes and scaling

    ModernBERT is an open-source research project that modernizes the classic BERT encoder architecture by incorporating recent advances in transformer design, training techniques, and efficiency improvements. The goal of the project is to bring BERT-style models up to date with the capabilities of modern large language models while preserving the strengths of bidirectional encoder architectures used for tasks such as classification, retrieval, and semantic search. ModernBERT introduces architectural improvements that enhance both training efficiency and inference performance, making the model more suitable for modern large-scale machine learning pipelines. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    xLSTM

    xLSTM

    Neural Network architecture based on ideas of the original LSTM

    xLSTM is an open-source machine learning architecture that reimagines the classic Long Short-Term Memory (LSTM) network for modern large-scale language modeling and sequence processing tasks. The project introduces a new recurrent neural network design that incorporates exponential gating mechanisms and enhanced memory structures to overcome limitations of traditional LSTM models.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    HuixiangDou

    HuixiangDou

    Overcoming Group Chat Scenarios with LLM-based Technical Assistance

    ...This design allows the system to participate in group discussions without flooding the chat with unnecessary messages. The assistant uses retrieval and ranking methods along with language model reasoning to produce accurate answers for technical topics such as computer vision and machine learning projects. It can be integrated into messaging platforms such as WeChat or other team collaboration tools to assist developer communities.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 5
    MatMul-Free LM

    MatMul-Free LM

    Implementation for MatMul-free LM

    MatMul-Free LM is an experimental implementation of a large language model architecture designed to eliminate traditional matrix multiplication operations used in transformer networks. Since matrix multiplication is one of the most computationally expensive components of modern language models, the project explores alternative computational strategies that reduce hardware requirements while maintaining comparable performance.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Sparrow

    Sparrow

    Structured data extraction and instruction calling with ML, LLM

    Sparrow is an open-source platform designed to extract structured information from documents, images, and other unstructured data sources using machine learning and large language models. The system focuses on transforming complex documents such as invoices, receipts, forms, and scanned pages into structured formats like JSON that can be processed by downstream applications. It combines several components, including OCR pipelines, vision-language models, and LLM-based reasoning modules to identify and extract meaningful data fields from heterogeneous document layouts. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Agent Behavior Monitoring

    Agent Behavior Monitoring

    The open source post-building layer for agents

    Agent Behavior Monitoring is an open-source framework designed to monitor, evaluate, and improve the behavior of AI agents operating in real or simulated environments. The system focuses on agent behavior monitoring by collecting interaction data and analyzing how agents perform across different scenarios and tasks. Developers can use the framework to observe agent actions in both online production environments and offline evaluation settings, making it useful for debugging and performance...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    MetaScreener

    MetaScreener

    AI-powered tool for efficient abstract and PDF screening

    ...The system helps researchers analyze large collections of academic abstracts and research papers to determine which studies are relevant for inclusion in evidence synthesis projects. Instead of manually reviewing hundreds or thousands of documents, researchers can use MetaScreener to apply machine learning techniques that assist with classification and prioritization of candidate papers. The platform can analyze both abstracts and full PDF documents, enabling automated filtering based on research criteria defined by the user. By incorporating natural language processing techniques, the system can identify potentially relevant studies and reduce the workload associated with manual screening.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    nano-graphrag

    nano-graphrag

    A simple, easy-to-hack GraphRAG implementation

    nano-graphrag is a lightweight implementation of the GraphRAG approach designed to simplify experimentation with graph-based retrieval-augmented generation systems. GraphRAG expands traditional RAG pipelines by constructing knowledge graphs from documents and using relationships between entities to improve the quality and reasoning of AI responses. The nano-GraphRAG project focuses on reducing complexity by providing a compact and readable codebase that preserves the core functionality of...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Go from Code to Production URL in Seconds Icon
    Go from Code to Production URL in Seconds

    Cloud Run deploys apps in any language instantly. Scales to zero. Pay only when code runs.

    Skip the Kubernetes configs. Cloud Run handles HTTPS, scaling, and infrastructure automatically. Two million requests free per month.
    Try it free
  • 10
    Anything to NotebookLM

    Anything to NotebookLM

    Multi-source content processor for NotebookLM

    ...It is best suited for researchers, students, content curators, and knowledge workers who regularly turn scattered information into organized learning assets.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    LLMs-from-scratch

    LLMs-from-scratch

    Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

    LLMs-from-scratch is an educational codebase that walks through implementing modern large-language-model components step by step. It emphasizes building blocks—tokenization, embeddings, attention, feed-forward layers, normalization, and training loops—so learners understand not just how to use a model but how it works internally. The repository favors clear Python and NumPy or PyTorch implementations that can be run and modified without heavyweight frameworks obscuring the logic. Chapters...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    Agentic RAG for Dummies

    Agentic RAG for Dummies

    A modular Agentic RAG built with LangGraph

    Agentic RAG for Dummies is an educational repository that demonstrates how to build retrieval-augmented generation systems combined with autonomous AI agents. The project explains the principles behind agentic retrieval pipelines where language models can dynamically decide when to retrieve information, analyze results, and plan further actions. Instead of relying on static retrieval pipelines, the system shows how agents can orchestrate retrieval, reasoning, and tool usage in a more...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    All-in-RAG

    All-in-RAG

    Big Model Application Development Practice 1

    ...Alongside theoretical explanations, the repository includes hands-on exercises and example projects that demonstrate how to build production-ready RAG systems. These projects guide developers through the process of integrating vector databases, embedding models, and large language models into a unified application.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    SimpleLLM

    SimpleLLM

    950 line, minimal, extensible LLM inference engine built from scratch

    SimpleLLM is a minimal, extensible large language model inference engine implemented in roughly 950 lines of code, built from scratch to serve both as a learning tool and a research platform for novel inference techniques. It provides the core components of an LLM runtime—such as tokenization, batching, and asynchronous execution—without the abstraction overhead of more complex engines, making it easier for developers and researchers to understand and modify.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    handy-ollama

    handy-ollama

    Implement CPU from scratch and play with large model deployments

    handy-ollama is an open-source educational project designed to help developers and AI enthusiasts learn how to deploy and run large language models locally using the Ollama platform. The repository serves as a structured tutorial that explains how to install, configure, and use Ollama to run modern language models on personal hardware without requiring advanced infrastructure. A key focus of the project is enabling users to run large models even without GPUs by leveraging optimized CPU-based...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    GLM-4.5

    GLM-4.5

    GLM-4.5: Open-source LLM for intelligent agents by Z.ai

    GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for...
    Downloads: 47 This Week
    Last Update:
    See Project
  • 17
    GLM-4

    GLM-4

    GLM-4 series: Open Multilingual Multimodal Chat LMs

    GLM-4 is a family of open models from ZhipuAI that spans base, chat, and reasoning variants at both 32B and 9B scales, with long-context support and practical local-deployment options. The GLM-4-32B-0414 models are trained on ~15T high-quality data (including substantial synthetic reasoning data), then post-trained with preference alignment, rejection sampling, and reinforcement learning to improve instruction following, coding, function calling, and agent-style behaviors. The...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 18
    ai-cookbook

    ai-cookbook

    Examples and tutorials to help developers build AI systems

    ai-cookbook is an open-source repository that provides practical tutorials, code examples, and reusable snippets designed to help developers build real-world artificial intelligence applications quickly. The project focuses on delivering hands-on engineering guidance rather than theoretical explanations, allowing developers to copy, adapt, and integrate working code directly into their own systems. The repository contains examples that demonstrate how to build AI workflows using modern tools...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    MiniMax-M1

    MiniMax-M1

    Open-weight, large-scale hybrid-attention reasoning model

    ...The team emphasizes efficient scaling of test-time compute: at 100K-token generation lengths, M1 reportedly uses only about 25 percent of the FLOPs of some competing models, making extended “think step” traces more feasible. M1 is further trained with large-scale reinforcement learning over diverse tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    AI Powered Knowledge Graph Generator

    AI Powered Knowledge Graph Generator

    AI Powered Knowledge Graph Generator

    AI-Powered Knowledge Graph is an open-source project focused on building knowledge graph systems that integrate artificial intelligence and machine learning to represent complex relationships between data entities. Knowledge graphs organize information as networks of nodes and relationships, allowing applications to analyze connections between concepts, datasets, or real-world entities. By incorporating AI techniques such as natural language processing and semantic reasoning, the project enables systems to automatically extract relationships and insights from large volumes of data. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Bespoke Curator

    Bespoke Curator

    Synthetic data curation for post-training and data extraction

    Curator is an open-source Python library designed to build synthetic data pipelines for training and evaluating machine learning models, particularly large language models. The system helps developers generate, transform, and curate high-quality datasets by combining automated generation with structured validation and filtering. It supports workflows where models are used to produce synthetic examples that can later be refined into reliable training datasets for reasoning, question answering, or structured information extraction tasks. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    MaiBot

    MaiBot

    Maimaibot, a (more focused) multi-platform intelligent agent

    MaiBot is an open-source conversational AI agent designed to participate in group chats and behave like a socially aware digital persona. The project focuses on creating a more human-like interactive experience by combining large language models with behavioral planning and contextual awareness. Instead of functioning as a traditional command-driven chatbot, the system attempts to simulate natural social participation within group conversations. It can generate responses that imitate human...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    WeClone

    WeClone

    One-stop solution for creating your digital avatar from chat history

    WeClone is an open source AI project designed to replicate a person’s conversational style and personality by training models on chat history data. The system analyzes message patterns, linguistic style, and contextual behavior in order to generate responses that resemble the original user’s communication style. It is intended primarily as an experimental exploration of digital personality modeling and conversational AI personalization. By processing large volumes of conversation data,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    llm.c

    llm.c

    LLM training in simple, raw C/CUDA

    ...Portability is a goal: it aims to compile with common toolchains and run on modest hardware for small experiments. Rather than delivering a production-grade stack, it serves as a reference and learning scaffold for people who want to “see the metal” behind LLMs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    E2B Cookbook

    E2B Cookbook

    Examples of using E2B

    E2B Cookbook is an open-source collection of example projects, guides, and reference implementations demonstrating how to build applications using the E2B platform. The repository acts as a practical learning resource for developers who want to integrate AI agents with secure cloud execution environments that allow large language models to run code and interact with tools. The examples illustrate how developers can build AI workflows capable of performing tasks such as data analysis, code execution, and application generation inside isolated sandbox environments. ...
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB