Search Results for "language processing" - Page 22

Showing 961 open source projects for "language processing"

  • 1
    fastText

    Library for fast text classification and representation

    FastText is an open-source, free, lightweight library that allows users to learn text representations and text classifiers. It works on standard, generic hardware. Models can later be reduced in size to even fit on mobile devices. Text classification is a core problem in many applications, like spam detection, sentiment analysis or smart replies. In this tutorial, we describe how to build a text classifier with the fastText tool. The goal of text classification is to assign documents (such as...
    Downloads: 0 This Week
  • 2
    NLP-progress

    Repository to track the progress in Natural Language Processing (NLP)

    Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks. This document aims to track the progress in Natural Language Processing (NLP) and give an overview of the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets. It aims to cover both traditional and core NLP tasks such as dependency parsing and part-of-speech tagging as well as more recent ones such as reading comprehension and natural language inference. ...
    Downloads: 0 This Week
  • 3
    YouTokenToMe

    Unsupervised text tokenizer focused on computational efficiency

    YouTokenToMe is a fast and efficient unsupervised text tokenization library designed for training subword embeddings, particularly useful for NLP models.
    Downloads: 0 This Week
  • 4
    Weld

    High-performance runtime for data analytics applications

    ...Weld is particularly useful for workloads involving large-scale data processing in frameworks such as NumPy, Spark, and TensorFlow. The language includes built-in constructs for expressing data-parallel operations, enabling efficient execution on modern hardware architectures. By combining operations from multiple libraries into a single optimized execution plan, Weld can significantly improve performance in analytics and machine learning pipelines.
    Downloads: 1 This Week
  • 5
    PyTorch Natural Language Processing

    Basic Utilities for PyTorch Natural Language Processing (NLP)

    PyTorch-NLP is a library for Natural Language Processing (NLP) in Python. It’s built with the very latest research in mind, and was designed from day one to support rapid prototyping. PyTorch-NLP comes with pre-trained embeddings, samplers, dataset loaders, metrics, neural network modules and text encoders. It’s open-source software, released under the BSD3 license. With your batch in hand, you can use PyTorch to develop and train your model using gradient descent.
    Downloads: 0 This Week
  • 6
    cocoNLP

    A Chinese information extraction tool

    cocoNLP is a lightweight natural-language processing toolkit geared toward practical information extraction from raw text, especially for Chinese and mixed Chinese–English content. Instead of requiring a heavy pipeline, it focuses on quick wins such as extracting names, places, organizations, emails, phone numbers, and dates directly from unstructured sentences. The project blends pattern-based methods with NLP heuristics, giving developers dependable results for real-world texts like chats, comments, and user-generated content. ...
    Downloads: 0 This Week
  • 7
    Top deep learning Github repositories

    Top 200 deep learning Github repositories sorted by stars

    ...Instead of providing its own machine learning models or frameworks, the project functions as an organized index that helps users discover high-quality deep learning repositories across different application domains. The repository categorizes projects related to neural networks, computer vision, natural language processing, reinforcement learning, and other areas of artificial intelligence. By collecting popular open-source implementations in one place, the project simplifies the process of exploring cutting-edge tools and research implementations for deep learning practitioners. The curated lists are particularly helpful for developers who want to quickly identify well-maintained projects with strong community support.
    Downloads: 0 This Week
  • 8
    PyResParser

    A simple resume parser used for extracting information from resumes

    PyResParser is a simple resume parser that extracts information from resumes, aiding in the automation of resume-processing tasks.
    Downloads: 0 This Week
  • 9
    PRMLT

    Matlab code of machine learning algorithms in book PRML

    This Matlab package implements machine learning algorithms described in the great textbook: Pattern Recognition and Machine Learning by C. Bishop (PRML). It is written purely in the Matlab language. It is self-contained. There is no external dependency. This package requires Matlab R2016b or later, since it utilizes a new Matlab syntax called implicit expansion (a.k.a. broadcasting). It also requires Statistics Toolbox (for some simple random number generators) and Image Processing Toolbox (for reading image data). The code is extremely compact. ...
    Downloads: 0 This Week
  • 10
    Dragonfire

    The open-source virtual assistant for Ubuntu based Linux distributions

    Dragonfire is the open-source virtual assistant project for Ubuntu-based Linux distributions. Her main objective is to serve as a command and control interface for the helmet user, so that you can give orders using only voice commands and eye movements. That makes the helmet hands-free. We are planning to ship Dragonfire as a preinstalled software package on the DragonOS Linux distribution. DragonOS will be a Linux distribution specially designed for the helmet. It will...
    Downloads: 0 This Week
  • 11
    Texar

    Toolkit for Machine Learning, Natural Language Processing

    Texar is a toolkit that aims to support a broad set of machine learning tasks, especially natural language processing and text generation. Texar provides a library of easy-to-use ML modules and functionalities for composing arbitrary models and algorithms. The tool is designed for both researchers and practitioners for fast prototyping and experimentation. Texar was originally developed by, and is actively contributed to by, Petuum and CMU in collaboration with other institutes.
    Downloads: 0 This Week
  • 12

    KSUCCA Corpus

    A 50-million-token corpus of Classical Arabic.

    ... • Arabic computational linguistics, which includes: lexical, morphological, syntactic, semantic and pragmatic research including their various applications. • Arabic language teaching for both Arabs and non-Arabs. • Artificial intelligence. • Natural language processing. • Information retrieval. • Question answering. • Machine translation.
    Downloads: 1 This Week
  • 13
    Go-Guerrilla SMTP Daemon

    Mini SMTP server written in golang

    ...It's an SMTP server written in Go, for the purpose of receiving large volumes of email. It started as a project for GuerrillaMail.com which processes millions of emails every day, and needed a daemon with less bloat & written in a more memory-safe language that can take advantage of modern multi-core architectures. The purpose of this daemon is to grab the email, save it, and disconnect as quickly as possible, essentially performing the services of a Mail Transfer Agent (MTA) without the sending functionality. The software also includes a modular backend implementation, which can extend the email processing functionality to whatever needs you may require.
    Downloads: 0 This Week
  • 14
    Network Function Framework for Go

    NFF-Go: Network Function Framework for Go (formerly YANFF)

    NFF-Go is a set of libraries for creating and deploying cloud-native Network Functions (NFs). It simplifies the creation of network functions without sacrificing performance. NFF-Go now supports AF_XDP and (almost) supports getting packets directly from Linux, so you do not need to write three different applications to process packets coming from different types of drivers of PMDs. You just write everything in NFF-Go, and it can dynamically use whatever you would like underneath....
    Downloads: 0 This Week
  • 15
    Deep Learning Drizzle

    Drench yourself in Deep Learning, Reinforcement Learning

    Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures! Optimization courses which form the foundation for ML, DL, RL. Computer Vision courses which are DL & ML heavy. Speech recognition courses which are DL heavy. Structured Courses on Geometric, Graph Neural Networks. Section on Autonomous Vehicles. Section on Computer Graphics with ML/DL focus.
    Downloads: 0 This Week
  • 16
    Chatito

    Dataset generation for AI chatbots, NLP tasks

    Chatito is a tool that helps generate datasets for training and validating chatbot models using a simple domain-specific language (DSL).
    Downloads: 3 This Week
  • 17
    Netstack

    IPv4 and IPv6 userland network stack

    netstack is a userspace TCP/IP networking stack written in Go that implements core IPv4/IPv6 protocols with a focus on correctness, isolation, and testability. By running entirely in user space, it avoids kernel dependencies and can be embedded into sandboxes, virtualized environments, or custom appliances. Its architecture models NICs, link endpoints, route tables, and protocol engines as composable interfaces, enabling precise control over packet flow and easy mocking in tests. The stack...
    Downloads: 0 This Week
  • 18
    playSMS

    Free and Open Source SMS Gateway Software. Not A Free SMS Service.

    playSMS is a free and open source SMS management software, a web interface for SMS gateways and bulk SMS services.
    Downloads: 8 This Week
  • 19
    KonsolScript: Automate and Orchestrate

    Embeddable scripting runtime for live behavior, AI, and automation.

    ...Key capabilities:
    - Embed into any C++ app with a single header
    - Hot-reload scripts at runtime without restarting
    - AI-safe: validate or reject scripts before execution
    - Orchestrate LLMs (OpenAI, Claude, Gemini, Ollama) in .ks scripts
    - Built-in: String, File, JSON, CSV, Math, Regex, and more
    - Plugins: HTTP, SQLite, MySQL, TCP, Redis, Crypto, JWT, Zip
    - Push behavior updates to remote instances over TCP
    Use cases:
    - Scriptable game engines (hot-patch rules mid-session)
    - AI event bridges (natural language to live app behavior)
    - Automation pipelines (CI, log triage, file processing)
    - LLM orchestration workflows
    Docs: https://konsolscript.sf.net/kookbook.html
    Downloads: 5 This Week
  • 20
    Twint

    An advanced Twitter scraping & OSINT tool written in Python

    Twint is an advanced open-source Twitter scraping and OSINT tool written in Python that extracts tweets, user data, followers, likes, and more—without relying on Twitter’s API—making it highly useful for researchers, analysts, and hobbyists who want to bypass rate limits and access public Twitter data.
    Downloads: 1 This Week
  • 21
    Budou

    Budou is an auto organizer tool for beautiful line breaking in CJK

    ...These spans can be styled with CSS to ensure smooth, visually coherent line breaks without splitting words or phrases. The tool supports multiple segmentation backends, including Google Cloud Natural Language API, MeCab, and TinySegmenter, enabling flexibility for both cloud-based and offline processing. Budou can be used via command line, in Python scripts, or integrated into web applications, and it provides advanced options such as caching and entity recognition for improved segmentation accuracy.
    Downloads: 0 This Week
  • 22
    gpt2-client

    Easy-to-use TensorFlow Wrapper for GPT-2 117M, 345M, 774M, etc.

    GPT-2 is a Natural Language Processing model developed by OpenAI for text generation. It is the successor to the GPT (Generative Pre-trained Transformer) model and was trained on 40GB of text from the internet. It features a Transformer model that was brought to light by the "Attention Is All You Need" paper in 2017. The model has 4 versions - 124M, 345M, 774M, and 1558M - that differ in terms of the amount of training data fed to them and the number of parameters they contain.
    Downloads: 0 This Week
  • 23
    Rasa-UI

    Rasa UI is a frontend for the Rasa Framework

    Rasa UI is a web application built on top of, and for Rasa. Rasa UI provides a web application to quickly and easily be able to create and manage bots, NLU components (Regex, Examples, Entities, Intents, etc.) and Core components (Stories, Actions, Responses, etc.) through a web interface. It also provides some convenience features for Rasa, like training and loading your models, monitoring usage or viewing logs.
    Downloads: 2 This Week
  • 24
    artext

    Probabilistic Noising of Natural Language

    Artext is a project for injecting noise into text without affecting the core meaning for a human reader. This kind of data can be useful for many NLP tasks, particularly to make models robust to erroneous text. This is a work in progress, and we will publish the results of our experiments soon. Meanwhile, if you use artext in your research please cite this repository. GitHub: https://github.com/nlpcl-lab/artext
    Downloads: 1 This Week
  • 25
    FalaBrasil

    Resources for speech processing in Brazilian Portuguese

    The FalaBrasil Group provides free tools and resources for speech and natural language processing in Brazilian Portuguese, most of them under the BSD license. Tools include mainly scripts to do all sorts of things with audio and text, whereas resources include ready-to-use acoustic and language models, phonetic dictionaries, etc.
    Downloads: 0 This Week