Showing 519 open source projects for "language processing"

  • 1
    CC-Net

    Tools to download and clean up Common Crawl data

    cc_net provides tools to download, segment, clean, and filter Common Crawl to build large-scale text corpora, including monolingual datasets and the multilingual CC-100 collection introduced in the associated paper. It includes pipelines to fetch snapshots, extract text, de-duplicate, identify language, and apply quality filtering based on heuristics and language models. The outputs are intended for pretraining language models and for creating standardized corpora that can be reproduced or...
    Downloads: 0 This Week
  • 2
    fastNLP

    fastNLP: A Modularized and Extensible NLP Framework

    fastNLP is a lightweight framework for natural language processing (NLP) whose goal is to make it quick to implement NLP tasks and build complex models. A unified tabular data container simplifies data preprocessing. The built-in Loader and Pipe for multiple datasets eliminate the need for preprocessing code. It also offers various convenient NLP tools, such as embedding loading (including ELMo and BERT) and an intermediate data cache.
    Downloads: 0 This Week
  • 3
    NLP.js

    An NLP library for building bots

    ...Natural Language Processing Classifier, to classify an utterance into intents. NLP Manager, a tool able to manage several languages, the Named Entities for each language, the utterances, and intents for the training of the classifier, and for a given utterance return the entity extraction, the intent classification and the sentiment analysis.
    Downloads: 0 This Week
  • 4
    DeText

    A Deep Neural Text Understanding Framework

    DeText is a Deep Text understanding framework for NLP-related ranking, classification, and language generation tasks. It leverages semantic matching using deep neural networks to understand member intents in search and recommender systems. As a general NLP framework, DeText can be applied to many tasks, including search & recommendation ranking, multi-class classification and query understanding tasks.
    Downloads: 2 This Week
  • 5
    Delta ML

    Deep learning based natural language and speech processing platform

    DELTA is a deep learning-based end-to-end natural language and speech processing platform. DELTA aims to make it easy and fast to use, deploy, and develop natural language processing and speech models for both academic and industry use cases. DELTA is mainly implemented using TensorFlow and Python 3. DELTA has been used to develop several state-of-the-art algorithms for publications and to deliver production systems that serve millions of users. ...
    Downloads: 0 This Week
  • 6
    NLP-Models-Tensorflow

    Gathers machine learning and TensorFlow deep learning models for NLP

    NLP-Models-Tensorflow is a collection of natural language processing model implementations built using the TensorFlow deep learning framework. The repository provides numerous examples of neural network architectures used in modern NLP research and applications, including text classification, language modeling, machine translation, and sentiment analysis. Each model implementation is designed to illustrate how common NLP architectures operate, such as recurrent neural networks, convolutional models for text processing, and transformer-style attention mechanisms. ...
    Downloads: 0 This Week
  • 7
    Leseratte is a Java parser for written German. It currently contains a German lexicon (based on Wiktionary), inflexion rules, a grammar, and a parser; a semantics component is planned. It can be used as a Java library and also provides a graphical UI.
    Downloads: 0 This Week
  • 8
    PyText

    A natural language modeling framework based on PyTorch

    PyText is a deep-learning based NLP modeling framework built on PyTorch. PyText addresses the often-conflicting requirements of enabling rapid experimentation and of serving models at scale. It achieves this by providing simple and extensible interfaces and abstractions for model components, and by using PyTorch’s capabilities of exporting models for inference via the optimized Caffe2 execution engine. We use PyText at Facebook to iterate quickly on new modeling ideas and then seamlessly...
    Downloads: 0 This Week
  • 9
    DeepLearning

    Deep Learning (Flower Book) mathematical derivation

    ...At the same time, it also introduces deep learning techniques used by practitioners in the industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methods, and surveys applications such as natural language processing, speech recognition, computer vision, online recommender systems, bioinformatics, and video games. Finally, the Deep Learning book provides research directions covering theoretical topics including linear factor models, autoencoders, representation learning, structured probabilistic models, etc.
    Downloads: 6 This Week
  • 10
    NLP Best Practices

    Natural Language Processing Best Practices & Examples

    In recent years, natural language processing (NLP) has seen rapid growth in quality and usability, which has helped drive business adoption of artificial intelligence (AI) solutions. In the last few years, researchers have been applying newer deep learning methods to NLP. Data scientists have started moving from traditional methods to state-of-the-art (SOTA) deep neural network (DNN) algorithms that use language models pretrained on large text corpora.
    Downloads: 0 This Week
  • 11
    fastText

    Library for fast text classification and representation

    FastText is an open-source, free, lightweight library that allows users to learn text representations and text classifiers. It works on standard, generic hardware. Models can later be reduced in size to fit even on mobile devices. Text classification is a core problem in many applications, like spam detection, sentiment analysis or smart replies. In this tutorial, we describe how to build a text classifier with the fastText tool. The goal of text classification is to assign documents (such as...
    Downloads: 0 This Week
  • 12
    NLP-progress

    Repository to track the progress in Natural Language Processing (NLP)

    Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks. This document aims to track the progress in Natural Language Processing (NLP) and give an overview of the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets. It aims to cover both traditional and core NLP tasks such as dependency parsing and part-of-speech tagging as well as more recent ones such as reading comprehension and natural language inference. ...
    Downloads: 0 This Week
  • 13
    YouTokenToMe

    Unsupervised text tokenizer focused on computational efficiency

    YouTokenToMe is a fast and efficient unsupervised text tokenization library designed for training subword embeddings, particularly useful for NLP models.
    Downloads: 0 This Week
  • 14
    Weld

    High-performance runtime for data analytics applications

    ...Weld is particularly useful for workloads involving large-scale data processing in frameworks such as NumPy, Spark, and TensorFlow. The language includes built-in constructs for expressing data-parallel operations, enabling efficient execution on modern hardware architectures. By combining operations from multiple libraries into a single optimized execution plan, Weld can significantly improve performance in analytics and machine learning pipelines.
    Downloads: 1 This Week
  • 15
    PyTorch Natural Language Processing

    Basic Utilities for PyTorch Natural Language Processing (NLP)

    PyTorch-NLP is a library for Natural Language Processing (NLP) in Python. It’s built with the very latest research in mind, and was designed from day one to support rapid prototyping. PyTorch-NLP comes with pre-trained embeddings, samplers, dataset loaders, metrics, neural network modules and text encoders. It’s open-source software, released under the BSD3 license. With your batch in hand, you can use PyTorch to develop and train your model using gradient descent.
    Downloads: 0 This Week
  • 16
    cocoNLP

    A Chinese information extraction tool

    cocoNLP is a lightweight natural-language processing toolkit geared toward practical information extraction from raw text, especially for Chinese and mixed Chinese–English content. Instead of requiring a heavy pipeline, it focuses on quick wins such as extracting names, places, organizations, emails, phone numbers, and dates directly from unstructured sentences. The project blends pattern-based methods with NLP heuristics, giving developers dependable results for real-world texts like chats, comments, and user-generated content. ...
    Downloads: 0 This Week
  • 17
    PyResParser

    A simple resume parser used for extracting information from resumes

    PyResParser is a simple resume parser that extracts information from resumes, aiding in the automation of resume-processing tasks.
    Downloads: 1 This Week
  • 18
    Top deep learning Github repositories

    Top 200 deep learning Github repositories sorted by stars

    ...Instead of providing its own machine learning models or frameworks, the project functions as an organized index that helps users discover high-quality deep learning repositories across different application domains. The repository categorizes projects related to neural networks, computer vision, natural language processing, reinforcement learning, and other areas of artificial intelligence. By collecting popular open-source implementations in one place, the project simplifies the process of exploring cutting-edge tools and research implementations for deep learning practitioners. The curated lists are particularly helpful for developers who want to quickly identify well-maintained projects with strong community support.
    Downloads: 0 This Week
  • 19
    KSUCCA Corpus

    A 50-million-token corpus of Classical Arabic.

    ...
      • Arabic computational linguistics, which includes lexical, morphological, syntactic, semantic, and pragmatic research and its various applications.
      • Arabic language teaching for both Arabs and non-Arabs.
      • Artificial intelligence.
      • Natural language processing.
      • Information retrieval.
      • Question answering.
      • Machine translation.
    Downloads: 2 This Week
  • 20
    Texar

    Toolkit for Machine Learning, Natural Language Processing

    Texar is a toolkit that aims to support a broad set of machine learning tasks, especially natural language processing and text generation. Texar provides a library of easy-to-use ML modules and functionalities for composing arbitrary models and algorithms. The tool is designed for both researchers and practitioners for fast prototyping and experimentation. Texar was originally developed by, and is actively contributed to by, Petuum and CMU in collaboration with other institutes.
    Downloads: 0 This Week
  • 21
    Deep Learning Drizzle

    Drench yourself in Deep Learning, Reinforcement Learning

    Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures! Optimization courses which form the foundation for ML, DL, RL. Computer Vision courses which are DL & ML heavy. Speech recognition courses which are DL heavy. Structured Courses on Geometric, Graph Neural Networks. Section on Autonomous Vehicles. Section on Computer Graphics with ML/DL focus.
    Downloads: 0 This Week
  • 22
    Chatito

    Dataset generation for AI chatbots, NLP tasks

    Chatito is a tool that helps generate datasets for training and validating chatbot models using a simple domain-specific language (DSL).
    Downloads: 6 This Week
  • 23
    gpt2-client

    Easy-to-use TensorFlow Wrapper for GPT-2 117M, 345M, 774M, etc.

    GPT-2 is a natural language processing model developed by OpenAI for text generation. It is the successor to the GPT (Generative Pre-trained Transformer) model and was trained on 40GB of text from the internet. It is built on the Transformer architecture introduced in the 2017 paper "Attention Is All You Need". The model has 4 versions - 124M, 345M, 774M, and 1558M - that differ in the number of parameters they contain.
    Downloads: 0 This Week
  • 24
    Rasa-UI

    Rasa UI is a frontend for the Rasa Framework

    Rasa UI is a web application built on top of, and for, Rasa. It lets you quickly and easily create and manage bots, NLU components (Regex, Examples, Entities, Intents, etc.), and Core components (Stories, Actions, Responses, etc.) through a web interface. It also provides some convenience features for Rasa, such as training and loading your models, monitoring usage, and viewing logs.
    Downloads: 2 This Week
  • 25
    FalaBrasil

    Resources for speech processing in Brazilian Portuguese

    The FalaBrasil Group provides free tools and resources for speech and natural language processing in Brazilian Portuguese, most of them under the BSD license. The tools are mainly scripts that do all sorts of things with audio and text, whereas the resources include ready-to-use acoustic and language models, phonetic dictionaries, etc.
    Downloads: 0 This Week