Showing 276 open source projects for "text based"

View related business solutions
  • Build Securely on Azure with Proven Frameworks Icon
    Build Securely on Azure with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • Enterprise-grade ITSM, for every business Icon
    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

    Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.
    Try it Free
  • 1
    Hugging Face Transformer

    Hugging Face Transformer

    CPU/GPU inference server for Hugging Face transformer models

    Optimize and deploy in production Hugging Face Transformer models in a single command line. At Lefebvre Dalloz we run in-production semantic search engines in the legal domain, in the non-marketing language it's a re-ranker, and we based ours on Transformer. In that setup, latency is key to providing a good user experience, and relevancy inference is done online for hundreds of snippets per user query. Most tutorials on Transformer deployment in production are built over Pytorch and FastAPI....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Ecco

    Ecco

    Explain, analyze, and visualize NLP language models

    Ecco is an interpretability tool for transformers that helps visualize and analyze how language models generate text, making model behavior more transparent.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    KoGPT

    KoGPT

    KakaoBrain KoGPT (Korean Generative Pre-trained Transformer)

    KoGPT is a Korean language model based on OpenAI’s GPT architecture, designed for various natural language processing (NLP) tasks such as text generation, summarization, and dialogue systems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    VoiceFixer

    VoiceFixer

    General Speech Restoration

    VoiceFixer is a machine-learning framework for “speech restoration”: given a degraded or distorted audio recording — with noise, clipping, low sampling rate, reverberation, or other artifacts — it attempts to recover high-fidelity, clean speech. The architecture works in two stages: first an analysis stage that tries to extract “clean” intermediate features from the noisy audio (e.g. removing noise, denoising, dereverberation, upsampling), and then a neural vocoder-based synthesis stage that...
    Downloads: 5 This Week
    Last Update:
    See Project
  • Auth0 B2B Essentials: SSO, MFA, and RBAC Built In Icon
    Auth0 B2B Essentials: SSO, MFA, and RBAC Built In

    Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.

    Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.
    Sign Up Free
  • 5
    Mocking Bird

    Mocking Bird

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    MockingBird is an open-source voice cloning and real-time speech generation toolkit that lets you clone a speaker’s voice from a short audio sample (reportedly as little as 5 seconds) and then synthesize arbitrary speech in that voice. It builds on deep-learning based TTS / voice-cloning technology (in the lineage of projects such as Real-Time-Voice-Cloning), but extends it with support for Mandarin Chinese and multiple Chinese speech datasets — broadening its applicability beyond English....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Parakeet

    Parakeet

    PAddle PARAllel text-to-speech toolKIT

    PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN) Parakeet aims to provide a flexible, efficient and state-of-the-art text-to-speech toolkit for the open-source community. It is built on PaddlePaddle dynamic graph and includes many influential TTS models. In order to facilitate exploiting the existing TTS models directly and developing the new ones, Parakeet selects typical models and provides...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7

    MITRE Annotation Toolkit

    A toolkit for managing and manipulating text annotations

    The MITRE Annotation Toolkit (MAT) is a suite of tools which can be used for automated and human tagging of annotations. Annotation is a process, used mostly by researchers in natural language processing, of enhancing documents with information about the various phrase types the documents contain. MAT supports both UI interaction and command-line interaction, and provides various levels of control over the overall annotation process. It can be customized for specific tasks (e.g.,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    TensorFlowTTS

    TensorFlowTTS

    Real-Time State-of-the-art Speech Synthesis for Tensorflow 2

    TensorFlowTTS is a state-of-the-art, open-source speech synthesis library built on TensorFlow 2. It offers a variety of architectures for text-to-speech, including classic and modern models such as Tacotron‑2, FastSpeech / FastSpeech2, and neural vocoders like MelGAN and Multiband‑MelGAN. Because it’s based on TensorFlow 2, it can leverage optimizations such as fake-quantization aware training and pruning — which allow models to run faster than real time and to be deployable on mobile or embedded platforms. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    aseryla

    aseryla

    Aseryla code repositories

    This project describes a model of how the semantic human memory represents the information relevant to the objects of the world in text format. It provides a system and a GUI application capable of extracting and managing concepts and relations from English texts. https://aseryla2.sourceforge.io/
    Downloads: 0 This Week
    Last Update:
    See Project
  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 10
    Texthero

    Texthero

    Text preprocessing, representation and visualization from zero to hero

    Texthero is a python package to work with text data efficiently. It empowers NLP developers with a tool to quickly understand any text-based dataset and it provides a solid pipeline to clean and represent text data, from zero to hero.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    VITS

    VITS

    Conditional Variational Autoencoder with Adversarial Learning

    VITS is a foundational research implementation of “VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech,” a well-known neural TTS architecture. Unlike traditional two-stage systems that separately train an acoustic model and a vocoder, VITS trains an end-to-end model that maps text directly to waveform using a conditional variational autoencoder combined with normalizing flows and adversarial training. This architecture enables parallel generation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Transformer TTS

    Transformer TTS

    Implementation of a Transformer based neural network

    TransformerTTS is an implementation of a non-autoregressive Transformer-based neural network for text-to-speech, built with TensorFlow 2. It takes inspiration from architectures like FastSpeech, FastSpeech 2, FastPitch, and Transformer TTS, and extends them with its own aligner and forward models. The system separates alignment learning and acoustic modeling: an autoregressive Transformer is used as an aligner to extract phoneme-to-frame durations, while a non-autoregressive “ForwardTransformer” generates mel-spectrograms conditioned on text and durations. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    hebrew-gpt_neo

    hebrew-gpt_neo

    Hebrew text generation models based on EleutherAI's gpt-neo

    Hebrew text generation models based on EleutherAI's gpt-neo. Each was trained on a TPUv3-8 which was made available to me via the TPU Research Cloud Program. The Open Super-large Crawled ALMAnaCH coRpus is a huge multilingual corpus obtained by language classification and filtering of the Common Crawl corpus using the goclassy architecture.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    DeepSpeech

    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers. DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available for use and can be downloaded following the instructions in the usage docs. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    DrQA

    DrQA

    Reading Wikipedia to Answer Open-Domain Questions

    DrQA is an open-domain question answering system that reads large text corpora—famously Wikipedia—to answer natural language questions with extractive spans. It follows a two-stage pipeline: a fast document retriever first narrows down candidate articles, and a neural machine reader then predicts the exact answer span from those passages. The retriever relies on classic IR features (like TF-IDF and n-gram statistics) to remain lightweight and scalable to millions of documents. The reader is...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    CRSLab

    CRSLab

    CRSLab is an open-source toolkit

    CRSLab is an open-source toolkit for building Conversational Recommender System (CRS). It is developed based on Python and PyTorch. CRSLab has the following highlights. Comprehensive benchmark models and datasets: We have integrated commonly-used 6 datasets and 18 models, including graph neural network and pre-training models such as R-GCN, BERT and GPT-2. We have preprocessed these datasets to support these models, and release for downloading. Extensive and standard evaluation protocols: We...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Text2Video

    Text2Video

    Software tool that converts text to video for more engaging experience

    ...I created a prototype web application that takes text as an input and generates a video as an output. I plan to further work on the project targeting young college students who are aged between 18 to 23 because they tend to prefer learning through videos over books based on the survey I found. The technologies I used for the project are HTML, CSS, Javascript, Node.js, CCapture.js, ffmpegserver.js, Amazon Polly, Python, Flask, gevent, spaCy, and Pixabay API.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    HiFi-GAN

    HiFi-GAN

    Generative Adversarial Networks for Efficient and High Fidelity Speech

    HiFi-GAN is a GAN-based neural vocoder designed to generate high-fidelity speech waveforms from mel spectrograms with exceptional efficiency. It introduces a generator architecture tailored to model the periodic structure of speech and a set of discriminators that focus on different scales and periods of the waveform to better capture naturalness. The model targets a sweet spot between sample quality and generation speed, outperforming many previous GAN vocoders while being far faster than...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    commit-autosuggestions

    commit-autosuggestions

    A tool that AI automatically recommends commit messages

    This is implementation of CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model. CommitBERT is accepted in ACL workshop : NLP4Prog. Have you ever hesitated to write a commit message? Now get a commit message from Artificial Intelligence! CodeBERT: A Pre-Trained Model for Programming and Natural Languages introduces a pre-trained model in a combination of Program Language and Natural Language(PL-NL). It also introduces the problem of converting code into natural...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    CC-Net

    CC-Net

    Tools to download and cleanup Common Crawl data

    cc_net provides tools to download, segment, clean, and filter Common Crawl to build large-scale text corpora, including monolingual datasets and the multilingual CC-100 collection introduced in the associated paper. It includes pipelines to fetch snapshots, extract text, de-duplicate, identify language, and apply quality filtering based on heuristics and language models. The outputs are intended for pretraining language models and for creating standardized corpora that can be reproduced or updated with new crawls. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    DeText

    DeText

    A Deep Neural Text Understanding Framework

    DeText is a Deep Text understanding framework for NLP-related ranking, classification, and language generation tasks. It leverages semantic matching using deep neural networks to understand member intents in search and recommender systems. As a general NLP framework, DeText can be applied to many tasks, including search & recommendation ranking, multi-class classification and query understanding tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Delta ML

    Delta ML

    Deep learning based natural language and speech processing platform

    ...Use configuration files to easily tune parameters and network structures. What you see in training is what you get in serving: all data processing and features extraction are integrated into a model graph. Text classification, named entity recognition, question and answering, text summarization, etc. Uniform I/O interfaces and no changes for new models.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    GPT2 for Multiple Languages

    GPT2 for Multiple Languages

    GPT2 for Multiple Languages, including pretrained models

    With just 2 clicks (not including Colab auth process), the 1.5B pretrained Chinese model demo is ready to go. The contents in this repository are for academic research purpose, and we do not provide any conclusive remarks. Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC) Simplifed GPT2 train scripts(based on Grover, supporting TPUs). Ported bert tokenizer, multilingual corpus compatible. 1.5B GPT2 pretrained Chinese model (~15G corpus, 10w steps)....
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    End-to-End Negotiator

    End-to-End Negotiator

    Deal or No Deal? End-to-End Learning for Negotiation Dialogues

    End-to-End Negotiator is a PyTorch-based research framework developed by Facebook AI Research to train neural agents capable of conducting strategic negotiations in natural language. The project implements the models presented in two key papers: “Deal or No Deal? End-to-End Learning for Negotiation Dialogues” and “Hierarchical Text Generation and Planning for Strategic Dialogue”.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    NLP Best Practices

    NLP Best Practices

    Natural Language Processing Best Practices & Examples

    In recent years, natural language processing (NLP) has seen quick growth in quality and usability, and this has helped to drive business adoption of artificial intelligence (AI) solutions. In the last few years, researchers have been applying newer deep learning methods to NLP. Data scientists started moving from traditional methods to state-of-the-art (SOTA) deep neural network (DNN) algorithms which use language models pretrained on large text corpora. This repository contains examples and...
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB