natural language processing free download

STORM

An LLM-powered knowledge curation system that researches topics

STORM is an open-source knowledge curation system developed by Stanford's OVAL lab. It uses large language models to research a topic through Internet search and then writes a Wikipedia-style report with citations.

Downloads: 3 This Week

Last Update: 2025-01-23


llama.cpp Python Bindings

Python bindings for llama.cpp

llama-cpp-python provides Python bindings for llama.cpp, enabling the integration of LLaMA (Large Language Model Meta AI) language models into Python applications. This facilitates the use of LLaMA's capabilities in natural language processing tasks within Python environments.

Downloads: 3 This Week

Last Update: 11 hours ago


NVIDIA NeMo

Toolkit for conversational AI

NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI architectures are typically large and require a lot of data and compute for training. ...

Downloads: 1 This Week

Last Update: 2026-04-22


MetaScreener

AI-powered tool for efficient abstract and PDF screening

...Instead of manually reviewing hundreds or thousands of documents, researchers can use MetaScreener to apply machine learning techniques that assist with classification and prioritization of candidate papers. The platform can analyze both abstracts and full PDF documents, enabling automated filtering based on research criteria defined by the user. By incorporating natural language processing techniques, the system can identify potentially relevant studies and reduce the workload associated with manual screening.

Downloads: 1 This Week

Last Update: 2026-05-08


DeepBI

LLM based data scientist, AI native data application

DeepBI is an AI-native data analysis platform. DeepBI leverages the power of large language models to explore, query, visualize, and share data from any data source. Users can use DeepBI to gain data insight and make data-driven decisions.

Downloads: 3 This Week

Last Update: 2024-11-30


Megatron

Ongoing research training transformer models at scale

Megatron is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA. This repository is for ongoing research on training large transformer language models at scale. It provides efficient model-parallel (tensor, sequence, and pipeline) and multi-node pre-training of transformer-based models such as GPT, BERT, and T5 using mixed precision. Megatron is also used in NeMo Megatron, a framework to help enterprises overcome the challenges of building and training sophisticated natural language processing models with billions and trillions of parameters. ...

Downloads: 1 This Week

Last Update: 2026-05-25


CodeGen

Open-source model for program synthesis

CodeGen is a family of open-source large language models designed specifically for program synthesis and code generation tasks. Developed by Salesforce Research, the models are trained on large datasets containing both natural language and programming language content. This allows them to translate natural language descriptions into functional code across a variety of programming languages.

Downloads: 3 This Week

Last Update: 2026-06-02


WeClone

One-stop solution for creating your digital avatar from chat history

WeClone is an open source AI project designed to replicate a person’s conversational style and personality by training models on chat history data. The system analyzes message patterns, linguistic style, and contextual behavior in order to generate responses that resemble the original user’s communication style. It is intended primarily as an experimental exploration of digital personality modeling and conversational AI personalization. By processing large volumes of conversation data,...

Downloads: 2 This Week

Last Update: 2026-03-04


chatd

Chat with your documents using local AI

chatd is an open-source desktop application that allows users to interact with their documents through a locally running large language model. The software focuses on privacy and security by ensuring that all document processing and inference occur entirely on the user’s computer without sending data to external cloud services. It includes a built-in integration with the Ollama runtime, which provides a cross-platform environment for running large language models locally. ...

Downloads: 0 This Week

Last Update: 2026-03-09


SkillOpt

Text-space optimizer that trains reusable natural-language skills

SkillOpt is a Microsoft research project for improving frozen LLM agents by optimizing reusable natural-language skill documents. Instead of changing model weights, it treats a compact skill file as the trainable state of the agent. The system learns from agent rollouts, reflection, bounded edits, and validation gates to produce better instructions over time. Its output is a deployable best_skill.md artifact that can be reused across agent tasks.

Downloads: 1 This Week

Last Update: 2026-06-02


Fun Audio Chat

Large Audio Language Model built for natural interactions

Fun Audio Chat is an interactive voice-first conversational AI platform designed to let users engage in natural spoken dialogue with large language models in real time, turning speech into context-aware responses while maintaining a smooth back-and-forth experience. It combines speech recognition, audio processing, and AI generation so users can speak simply and receive spoken replies, enabling applications such as virtual assistants, voice bots, and hands-free chat interfaces. ...

Downloads: 0 This Week

Last Update: 2026-02-27


Happy-LLM

Large Language Model Principles and Practice Tutorial from Scratch

Happy-LLM is an open-source educational project created by the Datawhale AI community that provides a structured and comprehensive tutorial for understanding and building large language models from scratch. The project guides learners through the entire conceptual and practical pipeline of modern LLM development, starting with foundational natural language processing concepts and gradually progressing to advanced architectures and training techniques. It explains the Transformer architecture, pre-training paradigms, and model scaling strategies while also providing hands-on coding examples so readers can implement and experiment with their own models. ...

Downloads: 0 This Week

Last Update: 2026-03-04


Qwen

The official repo of Qwen chat & pretrained large language model

Qwen is a series of large language models developed by Alibaba Cloud, consisting of various pretrained versions like Qwen-1.8B, Qwen-7B, Qwen-14B, and Qwen-72B. These models, which range from smaller to larger configurations, are designed for a wide range of natural language processing tasks. They are openly available for research and commercial use, with Qwen's code and model weights shared on GitHub.

1 Review

Downloads: 7 This Week

Last Update: 2026-03-05


HolmesGPT

CNCF Sandbox Project

HolmesGPT is an open-source AI agent designed to help DevOps and site reliability engineering teams diagnose and resolve production incidents. The system aggregates signals from observability tools such as logs, metrics, alerts, and distributed traces, then analyzes them using large language models to identify potential root causes. Rather than requiring engineers to manually correlate large volumes of monitoring data, HolmesGPT automatically synthesizes evidence and presents explanations in natural language. The project is developed by Robusta and has been accepted as a Cloud Native Computing Foundation Sandbox project, highlighting its relevance to the cloud-native ecosystem. ...

Downloads: 6 This Week

Last Update: 1 day ago


HumanEval

Code for the paper "Evaluating Large Language Models Trained on Code"

human-eval is a benchmark dataset and evaluation framework created by OpenAI for measuring the ability of language models to generate correct code. It consists of hand-written programming problems with unit tests, designed to assess functional correctness rather than superficial metrics like text similarity. Each task includes a natural language prompt and a function signature, requiring the model to generate an implementation that passes all provided tests.
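The idea of functional correctness described above can be illustrated with a minimal sketch. This is not the human-eval package's own API: the function name `passes_tests`, the toy task, and the `check(candidate)` test convention used here are assumptions made for illustration, and the sketch runs code without any sandboxing, which a real harness for untrusted model output would need.

```python
from typing import Any, Dict


def passes_tests(candidate_src: str, test_src: str, entry_point: str) -> bool:
    """Return True if the candidate implementation passes every assertion.

    Hypothetical helper: executes the candidate source and the test source
    in one shared namespace, then calls check(candidate).
    WARNING: exec() on untrusted code is unsafe; this sketch has no sandbox.
    """
    namespace: Dict[str, Any] = {}
    try:
        exec(candidate_src, namespace)  # defines the candidate function
        exec(test_src, namespace)       # defines check(candidate)
        namespace["check"](namespace[entry_point])
    except Exception:
        # Any failed assertion, syntax error, or runtime error counts as a fail.
        return False
    return True


# A toy task (hypothetical, not taken from the benchmark):
# a signature plus a natural language docstring as the prompt.
PROMPT = 'def add(a: int, b: int) -> int:\n    """Return the sum of a and b."""\n'

# Unit tests that decide correctness, independent of how the code is written.
TESTS = (
    "def check(candidate):\n"
    "    assert candidate(2, 3) == 5\n"
    "    assert candidate(-1, 1) == 0\n"
)

good_completion = PROMPT + "    return a + b\n"
bad_completion = PROMPT + "    return a - b\n"

if __name__ == "__main__":
    print(passes_tests(good_completion, TESTS, "add"))
    print(passes_tests(bad_completion, TESTS, "add"))
```

The two completions differ by a single character, so a text-similarity metric would score them almost identically, while the unit tests separate them cleanly.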

Downloads: 3 This Week

Last Update: 2 days ago


Vanna 2.0

Chat with your SQL database

Vanna is an open-source Python framework that enables natural language interaction with databases by converting user questions into executable SQL queries using large language models. The framework uses a retrieval-augmented generation architecture that learns from database schemas, documentation, and past query examples to generate accurate queries tailored to a specific dataset. Vanna can be integrated into many environments, including notebooks, web applications, messaging platforms, and data dashboards, making it flexible for analytics and data exploration workflows. ...

Downloads: 5 This Week

Last Update: 2026-03-04


TigerBot

TigerBot: A multi-language multi-task LLM

TigerBot is an open-source family of large language models designed to support multilingual and multi-task natural language processing applications. The project focuses on building high-performance models capable of handling both English and Chinese tasks while maintaining strong reasoning and conversational abilities. TigerBot models are based on modern transformer architectures and are trained on large datasets that cover multiple domains and languages. ...

Downloads: 0 This Week

Last Update: 2026-03-06


SAG

SQL-Driven RAG Engine

SAG is an open-source SQL-driven retrieval-augmented generation engine that dynamically constructs knowledge graphs during query processing. Instead of relying on a static knowledge graph prepared in advance, the system automatically builds relational structures between entities while processing user queries. Documents are first decomposed into atomic semantic events, which are then represented using multidimensional natural language vectors. These vectors allow the system to identify relationships between concepts and construct a graph representation of knowledge at runtime. ...

Downloads: 0 This Week

Last Update: 2026-03-09


UFO³

Weaving the Digital Agent Galaxy

UFO is an open-source framework developed by Microsoft for building intelligent agents that automate interactions with graphical user interfaces on the Windows operating system. The system allows users to issue natural language instructions that are translated into automated actions across multiple desktop applications. Using a dual-agent architecture, the framework analyzes both visual interface elements and system control structures in order to understand how applications should be manipulated. This enables the agent to navigate complex software environments and perform tasks that normally require manual interaction. ...

Downloads: 8 This Week

Last Update: 4 days ago


Shell-AI

LangChain powered shell command generator and runner CLI

Shell-AI is an open-source command-line interface utility that allows users to generate and execute shell commands using natural language prompts. Instead of requiring users to remember complex command syntax, the tool lets them describe their intent in plain English and automatically suggests commands that accomplish the task. The system is powered by large language models and integrates with frameworks such as LangChain to interpret user requests and translate them into executable shell instructions. ...

Downloads: 4 This Week

Last Update: 2026-03-09


llms-from-scratch-cn

Build a large language model from 0 only with Python foundation

llms-from-scratch-cn is an educational open-source project designed to teach developers how to build large language models step by step using practical code and conceptual explanations. The repository provides a hands-on learning path that begins with the fundamentals of natural language processing and gradually progresses toward implementing full GPT-style architectures from the ground up. Rather than focusing on using pre-trained models through APIs, the project emphasizes understanding the internal mechanisms of modern language models, including tokenization, attention mechanisms, transformer architecture, and training workflows. ...

Downloads: 0 This Week

Last Update: 2026-03-26


DATAGEN

AI-driven multi-agent research assistant automating hypothesis

...Instead of requiring users to manually orchestrate each stage of a research process, the platform allows these agents to coordinate automatically and handle the workflow end-to-end. The project integrates several modern AI frameworks including LangChain, LangGraph, and large language models to manage reasoning and data processing tasks. Through this architecture, the system can combine structured data analysis with natural language reasoning to generate insights and research outputs. The platform is designed for researchers, analysts, and developers who want to accelerate data exploration and automate parts of the research lifecycle.

Downloads: 0 This Week

Last Update: 1 day ago


ROSA

AI agent designed to interact with ROS1- and ROS2-based robotics systems

ROSA, short for Robot Operating System Agent, is an AI-powered software assistant developed by NASA’s Jet Propulsion Laboratory to simplify interaction with robotic systems that use the Robot Operating System (ROS). The project provides a natural language interface that allows developers and operators to interact with robots by issuing commands or queries in conversational language. Built on top of frameworks such as LangChain and modern large language models, ROSA translates user instructions into actions that can be executed within ROS1 or ROS2 environments. This capability enables users to inspect system status, diagnose issues, and control robot behavior without manually navigating complex command-line tools or configuration files. ...

Downloads: 4 This Week

Last Update: 2026-03-17


AppAgent

Multimodal Agents as Smartphone Users, an LLM-based multimodal agent

AppAgent is an open-source multimodal agent framework designed to enable large language models to operate smartphone applications through natural interactions with graphical user interfaces. The system allows an AI agent to interpret visual information from the screen and translate natural language instructions into actions such as tapping, swiping, and navigating between application screens. Instead of requiring backend access to application APIs, the framework interacts with apps the same way a human user would, making it compatible with a wide variety of mobile applications. ...

Downloads: 3 This Week

Last Update: 2026-03-04


LISA

LISA: Reasoning Segmentation via Large Language Model

LISA is an open-source multimodal AI system designed to enable language models to perform pixel-level reasoning and segmentation tasks on images. The project introduces a framework where a large language model can interpret natural language instructions and produce segmentation masks that highlight relevant regions in an image. Instead of relying solely on predefined object categories, the model is capable of reasoning about complex textual queries and translating them into visual segmentation outputs. ...

Downloads: 0 This Week

Last Update: 2026-03-06


Search Results for "natural language processing"

Showing 103 open source projects for "natural language processing"
