Showing 524 open source projects for "python text parser"

View related business solutions
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    Build gen AI apps with an all-in-one modern database: MongoDB Atlas

    MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.
    Start Free
  • Cloud-based help desk software with ServoDesk Icon
    Cloud-based help desk software with ServoDesk

    Full access to Enterprise features. No credit card required.

    What if You Could Automate 90% of Your Repetitive Tasks in Under 30 Days? At ServoDesk, we help businesses like yours automate operations with AI, allowing you to cut service times in half and increase productivity by 25% - without hiring more staff.
    Try ServoDesk for free
  • 1
    Defox text to speech and downloader

    Defox text to speech and downloader

    Written or imported text offline read or online download.

    ...I'm worried about that. ! Note 2 : When you double click on the software maybe it will get some seconds to open. That's not my fault. I used Python language to make this software and Python was not supported speedy to modern computers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2

    Safe Harbor Deidentification

    Safe Harbor Deidentification for medical documents

    Phalanx - Deidentify Safe Harbor Deidentification Mode of Phalanx is an abridged pipeline of NLP annotators culminating in NER annotators which write output of text offsets. It uses the Safe Harbor deidentification method.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    easy12306

    easy12306

    Automatic recognition of 12306 verification code

    Automatic recognition of 12306 verification code using machine learning algorithm. Identify never-before-seen pictures.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    lazynlp

    lazynlp

    Library to scrape and clean web pages to create massive datasets

    LazyNLP is a lightweight tool for collecting and curating large-scale text datasets for machine learning and NLP applications with minimal manual effort.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 5
    NeuroNER

    NeuroNER

    Named-entity recognition using neural networks

    Named-entity recognition (NER) aims at identifying entities of interest in the text, such as location, organization and temporal expression. Identified entities can be used in various downstream applications such as patient note de-identification and information extraction systems. They can also be used as features for machine learning systems for other natural language processing tasks. Leverages the state-of-the-art prediction capabilities of neural networks (a.k.a. "deep learning") Is...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    TextRank

    TextRank

    TextRank implementation for Python 3

    TextRank is an implementation of the TextRank algorithm for extractive text summarization and keyword extraction, inspired by Google’s PageRank.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Tacotron-2

    Tacotron-2

    DeepMind's Tacotron-2 Tensorflow implementation

    Tacotron-2 is a TensorFlow implementation of DeepMind’s Tacotron-2 end-to-end text-to-speech architecture, which predicts mel spectrograms from raw text and then feeds them to a neural vocoder such as WaveNet. It reproduces the original paper’s hyperparameters exactly via paper_hparams.py, while also offering a tuned hparams.py with extra improvements that often yield better audio quality in practice. The repository is structured as a full training pipeline: dataset preparation,...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8

    Arabic Corpus

    Text categorization, arabic language processing, language modeling

    The Arabic Corpus {compiled by Dr. Mourad Abbas ( http://sites.google.com/site/mouradabbas9/corpora ) The corpus Khaleej-2004 contains 5690 documents. It is divided to 4 topics (categories). The corpus Watan-2004 contains 20291 documents organized in 6 topics (categories). Researchers who use these two corpora would mention the two main references: (1) For Watan-2004 corpus ---------------------- M. Abbas, K. Smaili, D. Berkani, (2011) Evaluation of Topic Identification Methods on...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 9
    Deepvoice3_pytorch

    Deepvoice3_pytorch

    PyTorch implementation of convolutional neural networks

    An open source implementation of Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Free and Open Source HR Software Icon
    Free and Open Source HR Software

    OrangeHRM provides a world-class HRIS experience and offers everything you and your team need to be that HR hero you know that you are.

    Give your HR team the tools they need to streamline administrative tasks, support employees, and make informed decisions with the OrangeHRM free and open source HR software.
    Learn More
  • 10
    Market Reporter

    Market Reporter

    Automatic Generation of Brief Summaries of Time-Series Data

    Market Reporter automatically generates short comments that describe time series data of stock prices, FX rates, etc. This is an implementation of Murakami et al. This tool stores data to Amazon S3. Ask the manager to give you AmazonS3FullAccess and issue a credential file. For details, please read AWS Identity and Access Management. Install Docker and Docker Compose. Edit envs/docker-compose.yaml according to your environment. Then, launch containers by docker-compose. We recommend to use...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    SG2Im

    SG2Im

    Code for "Image Generation from Scene Graphs", Johnson et al, CVPR 201

    sg2im is a research codebase that learns to synthesize images from scene graphs—structured descriptions of objects and their relationships. Instead of conditioning on free-form text alone, it leverages graph structure to control layout and interactions, generating scenes that respect constraints like “person left of dog” or “cup on table.” The pipeline typically predicts object layouts (bounding boxes and masks) from the graph, then renders a realistic image conditioned on those layouts....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    OpenSeq2Seq

    OpenSeq2Seq

    Toolkit for efficient experimentation with Speech Recognition

    OpenSeq2Seq is a TensorFlow-based toolkit for efficient experimentation with sequence-to-sequence models across speech and NLP tasks. Its core goal is to give researchers a flexible, modular framework for building and training encoder–decoder architectures while fully leveraging distributed and mixed-precision training. The toolkit includes ready-made models for neural machine translation, automatic speech recognition, speech synthesis, language modeling, and additional NLP tasks such as...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13

    Presage

    the intelligent predictive text entry platform

    Presage (formerly Soothsayer) is an intelligent predictive text entry system. Presage generates predictions by modelling natural language as a combination of redundant information sources. Presage computes probabilities for words which are most likely to be entered next by merging predictions generated by the different predictive algorithms. Presage's modular and extensible architecture allows its language model to be extended and customized to utilize statistical, syntactic, and semantic...
    Leader badge
    Downloads: 323 This Week
    Last Update:
    See Project
  • 14
    SLING

    SLING

    A natural language frame semantics parser

    The aim of the SLING project is to learn to read and understand Wikipedia articles in many languages for the purpose of knowledge base completion, e.g. adding facts mentioned in Wikipedia (and other sources) to the Wikidata knowledge base. We use frame semantics as a common representation for both knowledge representation and document annotation. The SLING parser can be trained to produce frame semantic representations of text directly without any explicit intervening linguistic representation. The SLING project is still work in progress. We do not yet have a full system that can extract facts from arbitrary text, but we have built a number of the subsystems needed for such a system. The SLING frame store is our basic framework for building and manipulating frame semantic graph structures. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    DC-TTS

    DC-TTS

    TensorFlow Implementation of DC-TTS: yet another text-to-speech model

    DC-TTS is a TensorFlow implementation of the DC-TTS architecture, a fully convolutional text-to-speech system designed to be efficiently trainable while producing natural speech. It follows the “Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention” paper, but the author adapts and extends the design to make it practical for real experiments. The model is split into two networks: Text2Mel, which maps text to mel-spectrograms, and SSRN...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    DeepLearn

    DeepLearn

    Implementation of research papers on Deep Learning+ NLP+ CV in Python

    Welcome to DeepLearn. This repository contains an implementation of the following research papers on NLP, CV, ML, and deep learning. The required dependencies are mentioned in requirement.txt. I will also use dl-text modules for preparing the datasets. If you haven't use it, please do have a quick look at it. CV, transfer learning, representation learning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    PyTorch Book

    PyTorch Book

    PyTorch tutorials and fun projects including neural talk

    This is the corresponding code for the book "The Deep Learning Framework PyTorch: Getting Started and Practical", but it can also be used as a standalone PyTorch Getting Started Guide and Tutorial. The current version of the code is based on pytorch 1.0.1, if you want to use an older version please git checkout v0.4or git checkout v0.3. Legacy code has better python2/python3 compatibility, CPU/GPU compatibility test. The new version of the code has not been fully tested, it has been tested...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    MDictate

    MDictate

    Speech to text using python, pocketsphinx, ready to deploy

    Automated speech recognition software is extremely cumbersome. This project's aim is to incrementally improve the quality of an open-source and ready to deploy speech to text recognition system. Runs on Windows using the mdictate.exe, but the core workings are found in the mdictate.py script which should work on Windows/Linux/OS X. In version 1.0, we use pocketsphinx' default setup with a basic graphic interface.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    Command Line Parser GetPot

    Command Line Parser GetPot

    Tool to parse the command line and configuration files.

    Powerful command line and configuration file parsing for C++, Python, Ruby and Java (others to come). This tool provides many features, such as separate treatment for options, variables, and flags, unrecognized object detection, prefixes and much more.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 20
    FM2TXT

    FM2TXT

    RtlSdr listen to radio, recognize audio, and writes text file log

    Just log your favorite FM station speech to a text file using rtl-sdr dongle and speech recognition. Cross-platform tool. Follow the README on the download page for Windows installation. https://sourceforge.net/projects/fm2txt-rtlsdr/files/ If you prefer GitHub source, not SF: https://github.com/randaller/fm2txt For those, who want to recognize from soundcard, not from rtl-sdr (this allows to transcribe NFM etc): https://github.com/randaller/souncard2txt
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    JAVT - Just Another Voice Transformer

    JAVT - Just Another Voice Transformer

    Just Another Speech Recognition and Text to Speech software.

    JAVT or Just Another Voice Transformer (formerly, it is called Just Another Video Transcriber) is a Speech Recognition software that also support text to Speech and simple media conversion. JAVT allows you to convert from video files to audio wav file using ffmpeg, and then transcribe the audio file to text using either Microsoft SAPI or CMU Sphinx. You can also open a text file and allow JAVT to read it out for you through text to speech conversion.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 22
    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit for All of Us

    DSTK - DataScience ToolKit is an opensource free software for statistical analysis, data visualization, text analysis, and predictive analytics. Newer version and smaller file size can be found at: https://sourceforge.net/projects/dstk3/ It is designed to be straight forward and easy to use, and familar to SPSS user. While JASP offers more statistical features, DSTK tends to be a broad solution workbench, including text analysis and predictive analytics features.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23

    TEES

    Turku Event Extraction System

    Turku Event Extraction System (TEES) is a free and open source natural language processing system developed for the extraction of events and relations from biomedical text. It is written mostly in Python, and should work in generic Unix/Linux environments. Currently, the TEES source code repository still remains on GitHub at http://jbjorne.github.com/TEES/ where there is also a wiki with more information.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    vinuxproject

    vinuxproject

    Vinux is an Ubuntu derived distribution for blind & visually impaired.

    Vinux supports software text to speech and Braille support from boot-up to shutdown. Users can use installation medium to install independently with no sighted assistance required. Vinux supports command line environment speech, Desktop environment speech and magnification features. Vinux comes with an accessible suite of software and has an excellent mailing list support group.
    Leader badge
    Downloads: 16 This Week
    Last Update:
    See Project
  • 25

    BioC

    We describe a simple XML format to share text documents and annotation

    A minimalist approach to share text documents and data annotations. Allows a large number of different annotations to be represented. Project files contain: - simple code to hold/read/write data and perform sample processing. - BioC-formatted corpora - BioC tools that work with BioC corpora BioC goals - simplicity - interoperability - broad use - reuse There should be little investment required to learn to use a format or a software module to process that format. We are...
    Leader badge
    Downloads: 20 This Week
    Last Update:
    See Project