Showing 84 open source projects for "python text parser"

  • 1
    Python Progressbar

    Progressbar 2 - A progress bar for Python 2 and Python 3

    A text progress bar is typically used to display the progress of a long-running operation, providing a visual cue that processing is underway. The progressbar is based on the old Python progressbar package that was published on the now-defunct Google Code. Since that project was completely abandoned by its developer and the developer did not respond to my email, I decided to fork the package.
    Downloads: 0 This Week
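
    As a rough illustration of what a text progress bar does, here is a minimal standard-library sketch. It is not the Progressbar 2 API; the helper name `render_bar` and its parameters are hypothetical and chosen only for this example.

```python
import sys


def render_bar(done: int, total: int, width: int = 20) -> str:
    """Return a one-line text progress bar, e.g. '[#####-----]  50%'.

    Hypothetical helper for illustration; not part of any library.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    done = max(0, min(done, total))      # clamp progress into [0, total]
    filled = width * done // total       # number of filled cells
    percent = 100 * done // total        # whole-number percentage
    return f"[{'#' * filled}{'-' * (width - filled)}] {percent:3d}%"


# Redraw the bar in place by returning the cursor to the line start.
for step in (0, 5, 10):
    sys.stdout.write("\r" + render_bar(step, 10, width=10))
    sys.stdout.flush()
sys.stdout.write("\n")
```

    A real long-running task would call the renderer once per completed unit of work; the packaged library adds features this sketch leaves out.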
  • 2
    xAI Python SDK

    The official Python SDK for the xAI API

    xAI Python SDK is the official Python library for building applications with xAI’s APIs. It is a gRPC-based SDK designed for Python 3.10 and above, with both synchronous and asynchronous clients for different application styles. Developers can use it to generate text, images, videos, and structured outputs through xAI’s model services. The package is built for direct integration into Python projects, making it useful for backend apps, automation scripts, AI tools, research prototypes, and production workflows. ...
    Downloads: 0 This Week
  • 3
    Jupyter Notebook Tools for Sphinx

    Sphinx source parser for Jupyter notebooks

    nbsphinx is a Sphinx extension that provides a source parser for *.ipynb files. Custom Sphinx directives are used to show Jupyter Notebook code cells (and of course their results) in both HTML and LaTeX output. Un-evaluated notebooks – i.e. notebooks without stored output cells – will be automatically executed during the Sphinx build process.
    Downloads: 0 This Week
  • 4
    ART ASCII Library

    ASCII art library for Python

    ASCII art is also known as "computer text art". It involves the careful placement of typed letters or special characters to form a visual shape spread over multiple lines of text. ART is a Python library for converting text to ASCII art.
    Downloads: 0 This Week
  • 5
    rich

    Rich is a Python library for rich text and beautiful formatting

    The Rich API makes it easy to add color and style to terminal output. Rich can also render pretty tables, progress bars, Markdown, syntax-highlighted source code, tracebacks, and more, out of the box. Rich is a Python library for rich text and beautiful formatting in the terminal. Rich works with Linux, OSX, and Windows. True color and emoji work in the new Windows Terminal; the classic terminal is limited to 16 colors. Rich requires Python 3.7 or later. To add rich output to your application, import the rich print function, which has the same signature as the built-in Python function. ...
    Downloads: 4 This Week
  • 6
    JC

    CLI tool and Python library

    ...The JC parsers can also be used as Python modules. In this case, the output will be a Python dictionary, or a list of dictionaries, instead of JSON. Two representations of the data are available. The default representation uses a strict schema per parser and converts known numbers to int/float JSON values. Certain known values of None are converted to JSON null, known boolean values are converted, and, in some cases, additional semantic context fields are added.
    Downloads: 0 This Week
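
    The kind of type conversion described above can be sketched with the standard library alone. This is not JC's API or schema: `convert` and `parse_lines` are hypothetical helpers, and the `key: value` input format is an assumption made for the example.

```python
import json


def convert(value: str):
    """Map a raw string to None, bool, int, or float when it clearly is one."""
    text = value.strip()
    lowered = text.lower()
    if lowered in ("", "none", "null"):
        return None
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):            # try the stricter type first
        try:
            return cast(text)
        except ValueError:
            pass
    return text                          # leave everything else as a string


def parse_lines(output: str) -> dict:
    """Parse 'Key: value' lines (an assumed format) into a typed dictionary."""
    result = {}
    for line in output.splitlines():
        if ":" not in line:
            continue                     # skip lines that are not key/value pairs
        key, _, raw = line.partition(":")
        result[key.strip().lower().replace(" ", "_")] = convert(raw)
    return result


sample = "Name: eth0\nMTU: 1500\nLoad: 0.25\nUp: true\nGateway: none"
data = parse_lines(sample)
print(data)              # a Python dictionary with typed values
print(json.dumps(data))  # the same data serialized as JSON
```

    Used as a module, a parser like this hands back the dictionary directly; serializing to JSON is only needed at the command-line boundary.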
  • 7
    PyPDF

    A pure-python PDF library capable of splitting, merging, cropping

    pypdf is a pure Python library for working with PDF files, allowing developers to split, merge, rotate, encrypt, and extract content from PDFs. It’s an actively maintained fork of PyPDF2, improving performance, compatibility, and support for modern PDF standards. Suitable for both automation scripts and full-featured applications, pypdf handles PDFs without requiring external dependencies.
    Downloads: 4 This Week
  • 8
    DocTR

    Library for OCR-related tasks powered by Deep Learning

    DocTR provides an easy and powerful way to extract valuable information from your documents. Seamlessly process documents for Natural Language Understanding tasks: we provide OCR predictors to parse textual information (localize and identify each word) from your documents. Robust 2-stage (detection + recognition) OCR predictors with pretrained parameters. User-friendly: 3 lines of code to load a document and extract text with a predictor. State-of-the-art performance on public document...
    Downloads: 4 This Week
  • 9
    Rasa

    Open source machine learning framework to automate text conversations

    ...Rasa uses Poetry for packaging and dependency management. If you want to build it from source, you have to install Poetry first. By default, Poetry will try to use the currently activated Python version to create the virtual environment for the current project automatically.
    Downloads: 8 This Week
  • 10
    The Arcade Library

    Easy to use Python library for creating 2D arcade games

    Arcade is an easy-to-use Python library for creating 2D video games. It provides a modern and straightforward API, enabling developers to craft engaging games and graphical applications efficiently. Arcade supports rendering shapes, handling user input, and managing game physics, making it suitable for both beginners and experienced developers.
    Downloads: 6 This Week
  • 11
    zpdf

    Zero-copy PDF text extraction library written in Zig

    zpdf is a high-performance PDF text extraction library written in Zig that focuses on speed, low overhead, and modern parsing techniques. It leans heavily on memory-mapped file reading and zero-copy patterns where possible, so it can scan large PDFs without repeatedly copying data around in memory. The library supports streaming extraction using efficient arena allocation, making it well suited for workloads that need to process big documents quickly or in batches. It implements multiple PDF...
    Downloads: 2 This Week
  • 12
    Imagen - Pytorch

    Implementation of Imagen, Google's Text-to-Image Neural Network

    Implementation of Imagen, Google's Text-to-Image Neural Network that beats DALL-E2, in Pytorch. It is the new SOTA for text-to-image synthesis. Architecturally, it is actually much simpler than DALL-E2. It consists of a cascading DDPM conditioned on text embeddings from a large pre-trained T5 model (attention network). It also contains dynamic clipping for improved classifier-free guidance, noise level conditioning, and a memory-efficient unit design. It appears neither CLIP nor prior...
    Downloads: 0 This Week
  • 13
    Stanza

    Stanford NLP Python library for many human languages

    Stanza is a collection of accurate and efficient tools for the linguistic analysis of many human languages. Starting from raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of speech and morphological features, to give a syntactic structure dependency parse, and to recognize named entities. ...
    Downloads: 0 This Week
  • 14
    xhtml2pdf

    A library for converting HTML into PDFs using ReportLab

    xhtml2pdf enables users to generate PDF documents from HTML content easily, with automated flow control such as pagination and keeping text together. The Python module can be used in any Python environment, including Django. The command-line tool is a stand-alone program that can be executed from the command line.
    Downloads: 1 This Week
  • 15
    go1pylib

    go1pylib is a Python library designed to control the Go1 robot

    go1pylib is a Python library designed to control the Go1 robot by Unitree Robotics. It provides an easy-to-use interface for robot movement, state management, collision avoidance, battery monitoring, and MQTT communication. Ideal for research and robotics development.
    Downloads: 0 This Week
  • 16
    AudioCraft

    Audiocraft is a library for audio processing and generation

    AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. ...
    Downloads: 11 This Week
  • 17
    pyTermTk

    Python Terminal Toolkit - a Spiced Up TUI Library

    pyTermTk is a text-based user interface (TUI) library. It evolved from the discontinued project pyCuT and is inspired by a mix of the Qt5, GTK, and tkinter API definitions, with a touch of personal interpretation.
    Downloads: 0 This Week
  • 18
    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. Since its inception it has been designed for real-world applications: building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with an accuracy within 1% of the best available. ...
    Downloads: 1 This Week
  • 19
    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material.
    Downloads: 0 This Week
  • 20
    GenAI Processors

    GenAI Processors is a lightweight Python library

    GenAI Processors is a lightweight Python library for building modular, asynchronous, and composable AI pipelines around Gemini. Its central abstraction is the Processor, a unit of work that consumes an asynchronous stream of parts (text, images, audio, JSON) and produces another stream, making it natural to chain operations and keep everything streaming end-to-end. Processors can be composed sequentially (to build multi-step flows) or in parallel (to fan-out work and merge results), which makes sophisticated agent behaviors easy to express with simple operators. ...
    Downloads: 0 This Week
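
    The idea of chaining units that each consume one asynchronous stream and produce another can be sketched with plain `asyncio` async generators. This does not use the GenAI Processors library or its operators; `source`, `upper`, `exclaim`, and `run_pipeline` are hypothetical names, and only sequential composition is shown.

```python
import asyncio
from typing import AsyncIterator, Iterable, List


async def source(parts: Iterable[str]) -> AsyncIterator[str]:
    """Turn an ordinary iterable into an asynchronous stream of parts."""
    for part in parts:
        yield part


async def upper(stream: AsyncIterator[str]) -> AsyncIterator[str]:
    """First stage: upper-case each part as it arrives."""
    async for part in stream:
        yield part.upper()


async def exclaim(stream: AsyncIterator[str]) -> AsyncIterator[str]:
    """Second stage: append an exclamation mark to each part."""
    async for part in stream:
        yield part + "!"


async def collect(stream: AsyncIterator[str]) -> List[str]:
    """Drain a stream into a list so the result can be inspected."""
    return [part async for part in stream]


def run_pipeline(parts: Iterable[str]) -> List[str]:
    # Sequential composition: each stage wraps the stream of the previous one.
    return asyncio.run(collect(exclaim(upper(source(parts)))))


print(run_pipeline(["hello", "world"]))
```

    Because every stage is a generator, each part flows through the whole chain before the next one is read, which is what keeps such a pipeline streaming end to end.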
  • 21
    Doctrine Annotations

    Annotations docblock parser

    Doctrine Annotations lets you implement custom annotation functionality for PHP classes. Annotations aren't implemented in PHP itself, which is why this component offers a way to use PHP doc-blocks as a place for the well-known annotation syntax using the @ character. Annotations in Doctrine are used for the ORM configuration to build the class mapping, but they can be used in other projects for other purposes too. You can install the Annotation component with Composer. The access to the...
    Downloads: 0 This Week
  • 22
    sqlite-utils

    Python CLI utility and library for manipulating SQLite databases

    sqlite-utils is both a Python library and a command-line tool for creating, inspecting, and transforming SQLite databases with minimal boilerplate. It focuses on making common tasks like importing CSV/JSON, exploring tables, and running ad-hoc queries feel ergonomic and scriptable. As a CLI, it lets you build databases from structured data in one line, run queries against local files or in-memory databases, output results as JSON, CSV, or pretty tables, and configure full-text search. ...
    Downloads: 0 This Week
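
    The import-then-query workflow described above can be approximated with Python's built-in `sqlite3`, `csv`, and `json` modules. This sketch deliberately does not use sqlite-utils itself; `load_csv` and `query_json` are hypothetical helpers, and the table and column names are made up for the example.

```python
import csv
import io
import json
import sqlite3


def load_csv(conn: sqlite3.Connection, table: str, csv_text: str) -> int:
    """Create `table` from the CSV header and insert every row; return the row count."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return 0
    columns = list(rows[0].keys())
    col_sql = ", ".join(f'"{name}"' for name in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_sql})')
    conn.executemany(
        f'INSERT INTO "{table}" ({col_sql}) VALUES ({placeholders})',
        [tuple(row[name] for name in columns) for row in rows],
    )
    conn.commit()
    return len(rows)


def query_json(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Run a query and return the result rows as a JSON array of objects."""
    cursor = conn.execute(sql, params)
    names = [column[0] for column in cursor.description]
    return json.dumps([dict(zip(names, row)) for row in cursor.fetchall()])


conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
load_csv(conn, "people", "name,age\nAda,36\nLin,29\n")
# CSV values arrive as text, so cast before comparing numerically.
print(query_json(conn, "SELECT name FROM people WHERE CAST(age AS INTEGER) > ?", (30,)))
```

    The amount of plumbing here (quoting identifiers, building placeholders, casting text columns) is the boilerplate that a dedicated tool is meant to remove.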
  • 23
    Claude Cookbooks

    A collection of notebooks/recipes showcasing ways of using Claude

    ...The repository includes structured examples for integrating Claude with external tools, databases, and APIs, showcasing how to extend its functionality beyond basic text generation. It also covers advanced techniques like sub-agent orchestration, prompt optimization, and automated evaluation workflows. The content is organized into thematic sections, allowing users to explore specific capabilities or integration patterns systematically. Designed with accessibility in mind, the examples are primarily written in Python but can be adapted to other languages.
    Downloads: 4 This Week
  • 24
    borb

    borb is a library for reading, creating and manipulating PDF files

    borb is a pure Python library to read, write, and manipulate PDF documents. It represents a PDF document as a JSON-like data structure of nested lists, dictionaries and primitives (numbers, strings, booleans, etc.). This is currently a one-man project, so the focus will always be on supporting common use cases over rare ones.
    Downloads: 0 This Week
  • 25
    jsoup

    Java library for working with real-world HTML

    jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree. The parser will make...
    Downloads: 4 This Week