Showing 440 open source projects for "natural language processing"

  • 1
    ROSA

    AI agent designed to interact with ROS1- and ROS2-based robotics systems

    ROSA, short for Robot Operating System Agent, is an AI-powered software assistant developed by NASA’s Jet Propulsion Laboratory to simplify interaction with robotic systems that use the Robot Operating System (ROS). The project provides a natural language interface that allows developers and operators to interact with robots by issuing commands or queries in conversational language. Built on top of frameworks such as LangChain and modern large language models, ROSA translates user instructions into actions that can be executed within ROS1 or ROS2 environments. This capability enables users to inspect system status, diagnose issues, and control robot behavior without manually navigating complex command-line tools or configuration files. ...
    Downloads: 4 This Week
  • 2
    AppAgent

    Multimodal Agents as Smartphone Users, an LLM-based multimodal agent

    AppAgent is an open-source multimodal agent framework designed to enable large language models to operate smartphone applications through natural interactions with graphical user interfaces. The system allows an AI agent to interpret visual information from the screen and translate natural language instructions into actions such as tapping, swiping, and navigating between application screens. Instead of requiring backend access to application APIs, the framework interacts with apps the same way a human user would, making it compatible with a wide variety of mobile applications. ...
    Downloads: 3 This Week
  • 3
    Skyvern

    Automate browser-based workflows with LLMs and Computer Vision

    Skyvern uses a combination of computer vision and AI to understand content on a webpage, making it adaptable to any website. Skyvern takes instructions in natural language, allowing it to execute complex objectives with simple commands. Skyvern is an API-first product. Workflows execute in the cloud, allowing it to run hundreds of workflows at the same time. Skyvern's AI decisions come with built-in explanations, providing clear summaries and justifications for every action. It also supports proxies, with country, state, or even precise zip-code level targeting. ...
    Downloads: 5 This Week
  • 4
    Vibe-Trading

    Vibe-Trading: Your Personal Trading Agent

    Vibe-Trading is an AI-powered multi-agent financial workspace that converts natural language inputs into executable trading strategies and market analysis. It allows users to describe investment ideas in plain language, which are then translated into code, backtested, and evaluated across global markets. The platform integrates multiple data sources, including equities, crypto, and derivatives, with automatic fallback mechanisms.
    Downloads: 12 This Week
  • 5
    LISA

    LISA: Reasoning Segmentation via Large Language Model

    LISA is an open-source multimodal AI system designed to enable language models to perform pixel-level reasoning and segmentation tasks on images. The project introduces a framework where a large language model can interpret natural language instructions and produce segmentation masks that highlight relevant regions in an image. Instead of relying solely on predefined object categories, the model is capable of reasoning about complex textual queries and translating them into visual segmentation outputs. ...
    Downloads: 0 This Week
  • 6
    AI-Researcher

    AI-Researcher: Autonomous Scientific Innovation

    AI-Researcher is an open-source system designed to automate complex research tasks end-to-end using large language models and structured workflows, aiming to replicate parts of a human research assistant’s capabilities. It lets users input high-level research goals or questions in natural language and then automatically plans, decomposes, and executes tasks such as literature surveying, summarization, synthesis, experiment design, and draft generation.
    Downloads: 5 This Week
  • 7
    VoxCPM2

    Tokenizer-Free TTS for Multilingual Speech Generation

    ...It also introduces voice design capabilities, allowing users to generate entirely new voices from natural language descriptions without requiring reference audio.
    Downloads: 15 This Week
  • 8
    Paperless-AI

    AI-powered document analysis and tagging for Paperless-ngx

    ...It integrates with multiple OpenAI-compatible services as well as local models, giving users flexibility in how document intelligence is handled. A key capability is its use of retrieval-augmented generation, which enables semantic search and natural language interaction across an entire document archive. Users can ask contextual questions about their files and receive precise answers based on full document understanding rather than simple keyword matching. Paperless-AI also includes a web interface for manual review and tagging, allowing greater control when handling sensitive or complex documents.
    Downloads: 11 This Week
  • 9
    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. ...
    Downloads: 74 This Week
  • 10
    PandasAI

    PandasAI is a Python library that integrates generative AI

    ...It is designed to be used in conjunction with pandas, and is not a replacement for it. PandasAI makes pandas (and the most widely used data analysis libraries) conversational, allowing you to ask questions of your data in natural language. For example, you can ask PandasAI to find all the rows in a DataFrame where the value of a column is greater than 5, and it will return a DataFrame containing only those rows.
    Downloads: 2 This Week
  • 11
    LOTUS

    AI-Powered Data Processing: Use LOTUS to process all of your datasets

    LOTUS is an open-source framework and query engine designed to enable efficient processing of structured and unstructured datasets using large language models. The system provides a declarative programming model that allows developers to express complex AI data operations using high-level commands rather than manually orchestrating model calls. It offers a Python interface with a Pandas-like API, making it familiar for data scientists and engineers already working with data analysis libraries. ...
    Downloads: 1 This Week
  • 12
    Sparrow

    Structured data extraction and instruction calling with ML, LLM

    ...The architecture is modular, allowing developers to build customizable processing pipelines that integrate with external tools and data extraction frameworks. Sparrow also includes workflow orchestration tools that allow multiple extraction tasks to be combined into automated pipelines for large-scale document processing.
    Downloads: 5 This Week
  • 13
    ComfyUI-Copilot

    AI assistant for ComfyUI workflow generation, debugging, and tuning

    ComfyUI-Copilot is an AI-powered assistant designed to extend the capabilities of ComfyUI by simplifying and automating complex workflow development tasks. It functions as a custom node integrated directly into the ComfyUI environment, allowing users to interact with workflows through natural language and intelligent suggestions. ComfyUI-Copilot focuses on reducing the complexity of building node-based pipelines for generative AI tasks such as image generation, making it more accessible to both beginners and experienced users. It supports the entire workflow lifecycle, including generation, debugging, rewriting, and parameter optimization, helping users iterate more efficiently. ...
    Downloads: 4 This Week
  • 14
    Index

    The SOTA Open-Source Browser Agent

    Index is an open-source browser automation agent designed to autonomously perform complex tasks across websites by transforming web interfaces into programmable APIs. The system enables developers to instruct an AI agent to interact with web pages using natural language rather than traditional automation scripts. Instead of writing detailed browser automation code, users can describe the desired task and allow the agent to interpret the page structure, interact with elements, and complete multi-step workflows automatically. The project is built to integrate easily with applications through a simple programming interface, allowing developers to embed browser automation capabilities directly into their software systems. ...
    Downloads: 3 This Week
  • 15
    AutoAgent

    AutoAgent: Fully-Automated and Zero-Code LLM Agent Framework

    AutoAgent is a fully automated, zero-code LLM agent framework that lets users create agents and workflows using natural language instead of manual coding and configuration. It is structured around modes that cover both “use” and “build” scenarios: a user mode for running a ready-made multi-agent research assistant, plus editors for creating individual agents or multi-agent workflows from conversational requirements. The framework emphasizes self-managing workflow generation, where it can infer steps, refine them, and adapt plans even when users cannot fully specify implementation details up front. ...
    Downloads: 1 This Week
  • 16
    Devon

    Open source AI pair programmer for coding, debugging, automation

    Devon is an open source AI-powered pair programming tool designed to assist developers with software engineering tasks through natural language interaction. It operates as an agent-based system that can explore codebases, edit files, and execute development workflows with minimal manual intervention. Devon uses a client-server architecture with a Python backend and multiple user interfaces, including a terminal interface and an Electron-based desktop application.
    Downloads: 3 This Week
  • 17
    LLM-Aided OCR Project

    Enhances Tesseract OCR output using LLMs (local or API)

    LLM Aided OCR is an open-source system designed to improve optical character recognition accuracy by combining traditional OCR tools with large language models. The project addresses common OCR challenges such as distorted text, unusual fonts, historical documents, and complex layouts that often produce inaccurate results with standard OCR pipelines. The system first extracts raw text using OCR engines and then applies language models to analyze and correct recognition errors based on context. ...
    Downloads: 3 This Week
  • 18
    FastAgency

    The fastest way to bring multi-agent workflows to production

    FastAgency is a framework that simplifies the creation and deployment of AI-driven automation agents. It provides a structured environment for developing AI assistants capable of handling various business and technical tasks.
    Downloads: 2 This Week
  • 19
    MCP for Unity

    AI bridge enabling assistants to control and automate Unity Editor

    ...It exposes Unity functionality as callable tools so that AI systems can understand and manipulate game development workflows programmatically. This approach allows developers to control Unity using natural language prompts and automated workflows rather than manual editor interaction. Unity MCP supports various AI assistants and development tools that implement MCP clients, enabling flexible integration with existing AI development environments.
    Downloads: 10 This Week
  • 20
    MarkPDFDown

    A high-quality PDF-to-Markdown tool based on large language models

    MarkPDFdown is an open-source document processing tool designed to convert PDF files into structured Markdown output that can be easily used for documentation, content pipelines, and AI processing workflows. The project focuses on extracting text, formatting, and structural information from complex PDF documents and transforming that information into clean Markdown that preserves the original hierarchy of headings, paragraphs, tables, and lists.
    Downloads: 3 This Week
  • 21
    LLaMA-Mesh

    Unifying 3D Mesh Generation with Language Models

    ...The project includes a supervised fine-tuning dataset composed of interleaved text and mesh data, allowing the model to learn relationships between textual descriptions and 3D structures. As a result, the model can generate mesh models directly from text prompts, explain mesh structures in natural language, or output mixed text-and-mesh sequences. This unified representation enables a single model to operate across both textual and spatial domains.
    Downloads: 0 This Week
  • 22
    Kaggle Solutions

    Collection of Kaggle Solutions and Ideas

    ...Each competition entry typically includes information about the dataset, evaluation metrics, modeling strategies, and techniques used by high-ranking competitors. The repository also highlights important machine learning concepts such as feature engineering, cross-validation strategies, ensemble modeling, and post-processing methods commonly used in winning solutions. Because the content is organized by competition categories such as computer vision, natural language processing, tabular data, and time-series forecasting, users can explore techniques relevant to specific problem types.
    Downloads: 2 This Week
  • 23
    npcpy

    The AI toolkit for the AI developer

    npcpy is a Python-based agent framework and command-line toolkit (the NPC Shell) for developers to build, test, and integrate AI agents into their workflows, including both command-line and GUI interfaces via NPC Studio. It is the core library of the NPC Toolkit, which supercharges natural language processing pipelines and agent tooling, and a flexible framework for building state-of-the-art applications and conducting novel research with LLMs. The structure of npcpy also lets you pass an npc to get_llm_response in addition to using the NPC's wrapped method, which keeps implementation and testing flexible.
    Downloads: 5 This Week
  • 24
    Unstructured.IO

    Open source libraries and APIs to build custom preprocessing pipelines

    The unstructured library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. The use cases of unstructured revolve around streamlining and optimizing the data processing workflow for LLMs. Its modular bricks and connectors form a cohesive system that simplifies data ingestion and pre-processing, making it adaptable to different platforms and efficient at transforming unstructured data into...
    Downloads: 2 This Week
  • 25
    SuggestArr

    Request recommended movies, TV shows and anime to Jellyseerr/Overseerr

    ...The application includes a web interface that allows users to configure integrations, schedule automated recommendation jobs, and monitor system logs in real time. More recent versions also introduce optional large language model integration, enabling AI-driven personalized recommendations and natural language search for discovering content.
    Downloads: 1 This Week