Showing 440 open source projects for "natural language processing"

  • 1
    Functionary

    Chat language model that can use tools and interpret the results

    Functionary is an open-source large language model specifically designed for interpreting and executing structured functions or external tools within conversational AI systems. The model extends traditional chat-based language models by enabling them to determine when external functions should be called and how to extract the necessary parameters from natural language input.
    Downloads: 0 This Week
  • 2
    WebArena

    Code repo for WebArena, to build autonomous agents

    WebArena is a realistic web environment designed for building and testing autonomous agents, providing a platform for developing web-based AI agents.
    Downloads: 0 This Week
  • 3
    RWKV

    RNN with great LLM performance

    RWKV-LM is the main research and training repository for the RWKV language model architecture. It presents RWKV as an attention-free RNN-style model that aims to reach transformer-level language model performance. The project is built around the idea that a model can be trained in a parallelizable way like a GPT-style transformer while running inference with recurrent efficiency. This gives RWKV important advantages for long-context use, including lower memory pressure and no traditional...
    Downloads: 4 This Week
  • 4
    DB-GPT-Hub

    A repository that contains models, datasets, and fine-tuning

    DB-GPT-Hub is an open-source repository that provides datasets, models, and training tools designed to improve large language models for database interaction tasks, particularly Text-to-SQL. The project serves as a specialized extension of the broader DB-GPT ecosystem, focusing on the preparation and evaluation of models capable of translating natural language questions into structured database queries. It offers a modular framework that supports data preparation, model fine-tuning, benchmarking, and inference for Text-to-SQL systems. ...
    Downloads: 0 This Week
  • 5
    towhee

    Framework that is dedicated to making neural data processing

    Towhee is an open-source machine-learning pipeline that helps you encode your unstructured data into embeddings. You can use our Python API to build a prototype of your pipeline and use Towhee to automatically optimize it for production-ready environments. From images to text to 3D molecular structures, Towhee supports data transformation for nearly 20 different unstructured data modalities. We provide end-to-end pipeline optimizations, covering everything from data decoding/encoding, to...
    Downloads: 0 This Week
  • 6
    EvaDB

    Database system for building simpler and faster AI-powered applications

    Over the last decade, AI models have radically changed the world of natural language processing and computer vision. They are accurate on various tasks ranging from question answering to object tracking in videos. To use an AI model, the user needs to program against multiple low-level libraries, such as PyTorch, Hugging Face, and OpenAI. This tedious process often leads to a complex AI app that glues together these libraries to accomplish the given task.
    Downloads: 2 This Week
  • 7
    Langcorn

    Serving LangChain LLM apps automagically with FastAPI

    LangCorn is an API server that enables you to serve LangChain models and pipelines with ease, leveraging the power of FastAPI for a robust and efficient experience.
    Downloads: 0 This Week
  • 8
    UniEM

    Unified embedding model

    UniEM is a unified embedding model designed to create high-quality text embeddings for various natural language processing tasks.
    Downloads: 1 This Week
  • 9
    funNLP

    Resources, corpora, and tools for Chinese natural language processing

    FunNLP is a large, curated collection of resources, corpora, and tools for Chinese natural language processing (NLP). It aggregates datasets, lexicons, wordlists, sentiment dictionaries, knowledge graphs, and pretrained model references, serving as a one-stop resource hub for Chinese NLP practitioners. The repository is organized into categories such as sentiment analysis, text classification, named entity recognition, knowledge graphs, and various lexicons (e.g. sensitive words, emotion dictionaries, stopwords). ...
    Downloads: 0 This Week
  • 10
    GPT-2 Output Dataset

    Dataset of GPT-2 outputs for research in detection, biases, and more

    The GPT-2 Output Dataset is a large collection of model-generated text, released by OpenAI alongside the GPT-2 research paper to study the behaviors and limitations of large language models. It contains 250,000 samples of GPT-2 outputs, generated with different sampling strategies such as top-k truncation, to highlight the diversity and quality of model completions. The dataset also includes corresponding human-written text for comparison, enabling researchers to explore methods for...
    Downloads: 3 This Week
  • 11
    vits_chinese

    Best practice TTS based on BERT and VITS

    ...VITS is a model combining variational autoencoders (VAEs), normalizing flows, adversarial learning, and a stochastic duration predictor — a design that enables generation of natural, expressive speech, capturing variations in rhythm and prosody. By customizing or porting VITS for Chinese, this project aims to produce high-quality TTS outputs in a language that can be challenging due to tones, pronunciation variability, and prosody. The repository offers full training and inference pipelines: preprocessing, mel-spectrogram generation, training scripts, and audio synthesis. ...
    Downloads: 0 This Week
  • 12
    find-similar

    User-friendly library to find similar objects

    The mission of the FindSimilar project is to provide a powerful and versatile open source library that empowers developers to efficiently find similar objects and perform comparisons across a variety of data types. Whether dealing with texts, images, audio, or more, our project aims to simplify the process of identifying similarities and enhancing decision-making. https://github.com/findsimilar/find-similar - GitHub repo http://demo.findsimilar.org/ - Demo project and...
    Downloads: 0 This Week
  • 13
    Promptify

    Use GPT or other prompt-based models to get structured output

    Promptify is an open-source Python library designed to simplify prompt engineering and the development of natural language processing pipelines using large language models. The project provides tools that help developers generate structured prompts for different NLP tasks and apply them across multiple generative AI systems. Instead of manually crafting prompts for each task, Promptify introduces a unified architecture that combines prompt templates, language model interfaces, and processing pipelines into a single framework. ...
    Downloads: 0 This Week
  • 14
    CogView

    Text-to-Image generation. The repo for NeurIPS 2021 paper

    CogView is a large-scale pretrained text-to-image transformer model, introduced in the NeurIPS 2021 paper CogView: Mastering Text-to-Image Generation via Transformers. With 4 billion parameters, it was one of the earliest transformer-based models to successfully generate high-quality images from natural language descriptions in Chinese, with partial support for English via translation. The model incorporates innovations such as PB-relax and Sandwich-LN to enable stable training of very deep transformers without NaN loss issues. CogView supports multiple tasks beyond text-to-image, including image captioning, post-selection (ranking candidate images by relevance to a prompt), and super-resolution (upscaling model-generated images). ...
    Downloads: 3 This Week
  • 15
    Medusa

    Framework for Accelerating LLM Generation with Multiple Decoding Heads

    Medusa is a framework aimed at accelerating the generation capabilities of Large Language Models (LLMs) by employing multiple decoding heads. This approach allows for parallel processing during text generation, significantly enhancing throughput and reducing response times. Medusa is designed to be simple to implement and integrates with existing LLM infrastructures, making it a practical solution for scaling LLM applications.
    Downloads: 0 This Week
  • 16
    Prime QA

    State-of-the-art Multilingual Question Answering research

    PrimeQA is a public open source repository that enables researchers and developers to train state-of-the-art models for question answering (QA). With PrimeQA, a researcher can replicate the experiments from papers published at recent NLP conferences, download pre-trained models from an online repository, and run them on their own custom data. PrimeQA is built on top of the Transformers toolkit and uses datasets and models that are directly...
    Downloads: 0 This Week
  • 17
    LangChain Apps on Production with Jina

    LangChain Apps in Production with Jina & FastAPI

    Jina is an open-source framework for building scalable multi-modal AI apps in production. LangChain is another open-source framework for building applications powered by LLMs. langchain-serve helps you deploy your LangChain apps on Jina AI Cloud in a matter of seconds. You can benefit from the scalability and serverless architecture of the cloud without sacrificing the ease and convenience of local development. And if you prefer, you can also deploy your LangChain apps on your own...
    Downloads: 0 This Week
  • 18
    AI-powered enterprise search engine

    AI-powered enterprise search engine

    ...It enables users to search across sources such as Slack, Confluence, Jira, Google Drive, and other enterprise systems, consolidating fragmented knowledge into a single, unified search experience. By leveraging natural language processing, Gerev allows users to query information in plain English, making it easier to find answers without needing exact keywords or knowing where the data is stored. The platform indexes content from connected systems rather than relying on their native search capabilities, resulting in faster and more relevant results across large datasets. ...
    Downloads: 0 This Week
  • 19
    wukong-robot

    Chinese voice dialogue robot/smart speaker project

    wukong-robot is a Chinese voice assistant / smart speaker project built to let makers and hackers design highly customizable voice-controlled devices. It combines wake-word detection, automatic speech recognition, natural language understanding, and text-to-speech into a single framework aimed at the Chinese-speaking ecosystem. The project is positioned as a simple, flexible, and elegant platform that can run on devices like Raspberry Pi and other Linux-based boards, making it suitable for DIY smart speakers and home-automation hubs. It supports multi-turn conversational capabilities powered by ChatGPT or other large language models, letting users have continuous dialogues rather than one-shot commands. ...
    Downloads: 0 This Week
  • 20
    textacy

    NLP, before and after spaCy

    textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals (tokenization, part-of-speech tagging, dependency parsing, etc.) delegated to another library, textacy focuses primarily on the tasks that come before and follow after.
    Downloads: 0 This Week
  • 21
    Transformers-Interpret

    Model explainability that works seamlessly with Hugging Face

    Transformers-Interpret is an interpretability tool for Transformer-based NLP models, providing insights into attention mechanisms and feature importance.
    Downloads: 0 This Week
  • 22
    picoGPT

    An unnecessarily tiny implementation of GPT-2 in NumPy

    ...The project uses a small amount of code to illustrate the essential mathematical operations involved in training and running a transformer-based neural network. Because the code is intentionally lightweight, it is often used as a teaching resource for students learning about natural language processing and deep learning architectures. Developers can explore the repository to understand how language models generate text and how transformer components interact within the architecture.
    Downloads: 0 This Week
  • 23
    Chameleon LLM

    Code for Chameleon: Plug-and-Play Compositional Reasoning

    Discover Chameleon, our cutting-edge compositional reasoning framework designed to enhance large language models (LLMs) and overcome their inherent limitations, such as outdated information and lack of precise reasoning. By integrating various tools such as vision models, web search engines, Python functions, and rule-based modules, Chameleon delivers more accurate, up-to-date, and precise responses, making it a game-changer in the natural language processing landscape. ...
    Downloads: 0 This Week
  • 24
    OpenDelta

    A plug-and-play library for parameter-efficient tuning

    OpenDelta is an open-source toolkit for parameter-efficient tuning methods (which the authors call delta tuning). It lets users flexibly assign or add a small number of parameters to update while keeping most parameters of a large pre-trained model frozen. With OpenDelta, users can easily implement prefix-tuning, adapters, LoRA, or any other types of delta tuning...
    Downloads: 1 This Week
  • 25
    Emb-GAM

    An interpretable and efficient predictor using pre-trained models

    ...Across a variety of natural-language-processing datasets, Emb-GAM achieves strong prediction performance without sacrificing interpretability.
    Downloads: 0 This Week