Showing 2002 open source projects for "python text parser"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    The database for AI-powered applications.

    MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered apps—without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
    Start Free
  • 1
    tika-python

    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available using the Tika REST Server. This makes Apache Tika available as a Python library, installable via Setuptools, Pip and easy to install. To use this library, you need to have Java 7+ installed on your system as tika-python starts up the Tika REST server in the background. To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    py-pdf-parser

    py-pdf-parser

    A Python tool to help extracting information from structured PDFs

    py-pdf-parser is a Python tool designed to help extract information from structured PDFs. It provides a simple interface to define parsing rules and extract data from PDF documents. ​
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    ElevenLabs Python

    ElevenLabs Python

    The official Python SDK for the ElevenLabs API

    elevenlabs-python is the official Python SDK for the ElevenLabs API, giving developers a convenient way to access ElevenLabs’ high-quality, lifelike voices. The library wraps the HTTP API into a typed Python client, so you can perform text-to-speech, streaming, voice cloning, voice management, and agents-related operations with simple method calls.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    python-bibtexparser v2

    python-bibtexparser v2

    Bibtex parser for Python 3

    Welcome to python-bibtexparser, a parser for .bib files with a long history and wide adaption. Bibtexparser is available in two versions: V1 and V2. For new projects, we recommend using v2 which, in the long run, will provide an overall more robust and faster experience. For now, however, note that v2 is an early beta, and does not contain all features of v1.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Keep company data safe with Chrome Enterprise Icon
    Keep company data safe with Chrome Enterprise

    Protect your business with AI policies and data loss prevention in the browser

    Make AI work your way with Chrome Enterprise. Block unapproved sites and set custom data controls that align with your company's policies.
    Download Chrome
  • 5
    Recognizers-Text

    Recognizers-Text

    Recognition and resolution of numbers, units, date/time, etc.

    Recognizers-Text is a multilingual text recognition library that extracts structured information such as dates, numbers, and currency values from unstructured text.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    ANTLR

    ANTLR

    Parser generator to read, process, or translate structured text

    ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It's widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build and walk parse trees. It’s widely used in academia and industry to build all sorts of languages, tools, and frameworks. Twitter search uses ANTLR for query parsing, with over 2 billion queries a day. ...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 7
    Python Progressbar

    Python Progressbar

    Progressbar 2 - A progress bar for Python 2 and Python 3

    A text progress bar is typically used to display the progress of a long-running operation, providing a visual cue that processing is underway. The progressbar is based on the old Python progressbar package that was published on the now-defunct Google Code. Since that project was completely abandoned by its developer and the developer did not respond to my email, I decided to fork the package.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    MCP Text Editor

    MCP Text Editor

    Provides line-oriented text file editing capabilities

    The MCP Text Editor Server provides line-oriented text file editing capabilities through a standardized API, optimized for integration with Large Language Models (LLMs). It enables efficient partial file access, minimizing token usage while ensuring safe concurrent editing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Text Generation Inference

    Text Generation Inference

    Large Language Model Text Generation Inference

    Text Generation Inference is a high-performance inference server for text generation models, optimized for Hugging Face's Transformers. It is designed to serve large language models efficiently with optimizations for performance and scalability.
    Downloads: 0 This Week
    Last Update:
    See Project
  • AestheticsPro Medical Spa Software Icon
    AestheticsPro Medical Spa Software

    Our new software release will dramatically improve your medspa business performance while enhancing the customer experience

    AestheticsPro is the most complete Aesthetics Software on the market today. HIPAA Cloud Compliant with electronic charting, integrated POS, targeted marketing and results driven reporting; AestheticsPro delivers the tools you need to manage your medical spa business. It is our mission To Provide an All-in-One Cutting Edge Software to the Aesthetics Industry.
    Learn More
  • 10
    Text Generation Web UI

    Text Generation Web UI

    A gradio web UI for running Large Language Models like LLaMA

    A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA. Dropdown menu for switching between models. Notebook mode that resembles OpenAI's playground. Chat mode for conversation and role playing. Instruct mode compatible with Alpaca and Open Assistant formats. Nice HTML output for GPT-4chan. Markdown output for GALACTICA, including LaTeX rendering. Custom chat characters. Advanced chat features (send images, get audio responses with TTS)....
    Downloads: 10 This Week
    Last Update:
    See Project
  • 11
    Python Client For NLP Cloud

    Python Client For NLP Cloud

    NLP Cloud serves high performance pre-trained or custom models for NER

    NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, blog post generation, source code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    mistletoe

    mistletoe

    A fast, extensible and spec-compliant Markdown parser in pure Python

    mistletoe is a Markdown parser in pure Python, designed to be fast, spec-compliant and fully customizable. Apart from being the fastest CommonMark-compliant Markdown parser implementation in pure Python, mistletoe also supports easy definitions of custom tokens. Parsing Markdown into an abstract syntax tree also allows us to swap out renderers for different output formats, without touching any of the core components.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    html-to-markdown

    html-to-markdown

    Convert HTML to Markdown. Even works with entire websites

    Convert HTML into Markdown with Go. It is using an HTML Parser to avoid the use of regexp as much as possible. That should prevent some weird cases and allows it to be used for cases where the input is totally unknown.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Selectolax

    Selectolax

    Python binding to Modest and Lexbor engines

    A fast HTML5 parser with CSS selectors using Modest and Lexbor engines. Selectolax supports two backends: Modest and Lexbor. By default, all examples use the Modest backend. Most of the features between backends are almost identical, but there are still some differences. Currently, the Lexbor backend is in beta and missing some of the features. To use lexbor, just import the parser and use it in the similar way to the HTMLParser.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    CommonMark.jl

    CommonMark.jl

    A CommonMark-compliant parser for Julia

    A CommonMark-compliant parser for Julia.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    LlamaParse

    LlamaParse

    Parse files for optimal RAG

    LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured, and semi-structured, to structured data (API's, PDFs, documents, SQL, etc.) Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL db providers.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 17
    Think Python 2

    Think Python 2

    LaTeX source and supporting code for Think Python, 2nd edition

    ThinkPython2 is the repository for the second edition of Allen Downey’s Think Python textbook, which teaches programming fundamentals in Python to beginners. The code includes all of the example programs, exercises, and supplementary files referenced in the book, allowing learners to run the examples, experiment, and extend them. The repository contains clean, well-commented Python scripts that are easy to follow and map directly to chapters of the text, covering topics like variables, control flow, functions, recursion, data structures (lists, dictionaries), classes and objects, file I/O, and algorithmic thinking. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    mXparser

    mXparser

    Math Parser: Java, C#, C++, Kotlin, Android, and all .NET platforms

    Math Parser: Java, C#, C++, Kotlin, Android, and all .NET platforms (Nuget, Maven, CMake). Supports .NET Framework, .NET Core, .NET Standard, Xamarin, and more. Features: rich built-in library of math functions, operators, constants. Flexible in user-defined arguments, and functions. Expressions are provided as plain text. Easy to use. Well documented.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19
    Bot Framework SDK for Python

    Bot Framework SDK for Python

    Build and connect intelligent bots that interact naturally

    This repository contains code for the Python version of the Microsoft Bot Framework SDK, which is part of the Microsoft Bot Framework - a comprehensive framework for building enterprise-grade conversational AI experiences. This SDK enables developers to model conversation and build sophisticated bot applications using Python. SDKs for JavaScript and .NET are also available.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    MeloTTS

    MeloTTS

    High-quality multi-lingual text-to-speech library by MyShell.ai

    MeloTTS is an open-source text-to-speech (TTS) system that generates natural-sounding speech from text input. It utilizes advanced machine-learning models to produce high-quality audio outputs.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 21
    DeepSeek-OCR

    DeepSeek-OCR

    Contexts Optical Compression

    DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. ...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 22
    Neovim

    Neovim

    Hyperextensible Vim-based text editor

    Neovim is a hyperextensible text editor based on Vim. It seeks to maximize usability and extensibility, simplify maintenance and encourage contributions.
    Downloads: 77 This Week
    Last Update:
    See Project
  • 23
    Pix2Text

    Pix2Text

    Open-Source Python3 tool for recognizing layouts, tables, and math

    An Open-Source Python3 tool for recognizing layouts, tables, math formulas, and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported. Pix2Text (P2T) aims to be a free and open-source Python alternative to Mathpix, and it can already accomplish Mathpix's core functionality. Pix2Text (P2T) can recognize layouts, tables, images, text, and mathematical formulas, and integrate all of these contents into Markdown format. ...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 24
    novelWriter

    novelWriter

    Open source plain text editor designed for writing novels

    A markdown-like text editor designed for writing novels and larger projects of many smaller plain text documents. It is designed to be a simple text editor that allows for easy organization of text files and notes, with a metadata syntax for comments, synopsis, and cross-referencing between files, and built on plain text files for robustness. The project storage is suitable for version control software, and also well suited for file synchronisation tools. All text is saved as plain text...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 25
    GDScript Toolkit

    GDScript Toolkit

    Independent set of GDScript tools - parser, linter and formatter

    Independent set of GDScript tools, parser, linter and formatter. This project provides a set of tools for daily work with GDScript. At the moment it provides a parser that produces a parse tree for debugging and educational purposes. A linter that performs a static analysis according to some predefined configuration. A formatter that formats the code according to some predefined rules. A code metrics calculator which calculates the cyclomatic complexity of functions and classes. To install...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next