Showing 128 open source projects for "python text parser"

View related business solutions
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.
    Start Free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    tika-python

    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available using the Tika REST Server. This makes Apache Tika available as a Python library, installable via Setuptools, Pip and easy to install. To use this library, you need to have Java 7+ installed on your system as tika-python starts up the Tika REST server in the background. To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    CommonMark.jl

    CommonMark.jl

    A CommonMark-compliant parser for Julia

    A CommonMark-compliant parser for Julia.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    OpenMed

    OpenMed

    Open source healthcare AI

    ...OpenMed can be used in three main ways: as a simple Python API for scripts and notebooks, as a Docker-friendly FastAPI service for backend integration, and as a batch-processing system for multi-document workflows.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 4
    HyperTools

    HyperTools

    A Python toolbox for gaining geometric insights

    ...Support for lists of Numpy arrays, Pandas dataframes, text or (mixed) lists. Applying topic models and other text vectorization methods to text data. HyperTools is designed to facilitate dimensionality reduction-based visual explorations of high-dimensional data. The basic pipeline is to feed in a high-dimensional dataset (or a series of high-dimensional datasets) and, in a single function call, reduce the dimensionality of the dataset(s) and create a plot.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 5
    kb

    kb

    A minimalist command line knowledge base manager

    kb is a minimalist command-line knowledge base manager that gives users a fast, organized way to collect, store, search, and retrieve notes, documents, cheatsheets, procedures, and other artifacts directly from the terminal. It was created to solve the common problem of having scattered text files or reference materials on disk that are hard to search or categorize, and it surfaces a simple CLI interface with intuitive commands for adding, viewing, editing, and deleting knowledge items. Each...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Automa.jl

    Automa.jl

    A julia code generator for regular expressions

    Automa is a regex-to-Julia compiler. By compiling regex to Julia code in the form of Expr objects, Automa provides facilities to create efficient and robust regex-based lexers, tokenizers and parsers using Julia's metaprogramming capabilities. You can view Automa as a regex engine that can insert arbitrary Julia code into its input-matching process, which will be executed when certain parts of the regex match an input.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7

    Log Parser

    Tool for parsing all logs in present directory for search of phrases.

    Simple Tool allowing for parsing all text logs in the present directory for search for IP addresses, phrases or it's a combination. Put the compiled version of this tool into the directory with logs and run.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    Flet

    Flet

    Flet enables developers to easily build realtime web and mobile apps

    ...With Flet you just write a monolith stateful app in Python only and get a multi-user, real-time Single-Page Application (SPA). To start developing with Flet, you just need your favorite IDE or text editor. With no SDKs, no thousands of dependencies, no complex tooling, Flet has a built-in web server with assets hosting and desktop clients.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 9
    borb

    borb

    borb is a library for reading, creating and manipulating PDF files

    borb is a library for creating and manipulating PDF files in python. borb is a pure python library to read, write, and manipulate PDF documents. It represents a PDF document as a JSON-like data structure of nested lists, dictionaries and primitives (numbers, string, booleans, etc) This is currently a one-man project, so the focus will always be to support those use-cases that are more common in favor of those that are rare.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • 10
    AutoGluon

    AutoGluon

    AutoGluon: AutoML for Image, Text, and Tabular Data

    AutoGluon enables easy-to-use and easy-to-extend AutoML with a focus on automated stack ensembling, deep learning, and real-world applications spanning image, text, and tabular data. Intended for both ML beginners and experts, AutoGluon enables you to quickly prototype deep learning and classical ML solutions for your raw data with a few lines of code. Automatically utilize state-of-the-art techniques (where appropriate) without expert knowledge. Leverage automatic hyperparameter tuning,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Papermerge

    Papermerge

    Open Source Document Management System for Digital Archives

    Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats. Instantly find relevant information using full text, tags and metadata-based search. Papermerge is free and...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    Sweetviz

    Sweetviz

    Visualize and compare datasets, target values and associations

    Sweetviz is an open-source Python library that generates beautiful, high-density visualizations to kickstart EDA (Exploratory Data Analysis) with just two lines of code. Output is a fully self-contained HTML application. The system is built around quickly visualizing target values and comparing datasets. Its goal is to help quick analysis of target characteristics, training vs testing data, and other such data characterization tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    NBInclude.jl

    NBInclude.jl

    import code from IJulia Jupyter notebooks into Julia programs

    NBInclude is a package for the Julia language that allows you to include and execute IJulia (Julia-language Jupyter) notebook files just as you would include an ordinary Julia file. The goal of this package is to make notebook files just as easy to incorporate into Julia programs as ordinary Julia (.jl) files, giving you the advantages of a notebook (integrated code, formatted text, equations, graphics, and other results) while retaining the modularity and re-usability of .jl files.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Conversational Health Agents (CHA)

    Conversational Health Agents (CHA)

    A Personalized LLM-powered Agent Frameworks

    CHA, or Conversational Health Agents, is an open-source framework designed to build intelligent healthcare assistants powered by large language models and external data sources. The system enables developers to create personalized AI agents that can interact with users through natural language while performing multi-step reasoning and task execution. It integrates orchestration capabilities that allow the agent to gather information from APIs, knowledge bases, and external services in order...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Gretel Synthetics

    Gretel Synthetics

    Synthetic data generators for structured and unstructured text

    Unlock unlimited possibilities with synthetic data. Share, create, and augment data with cutting-edge generative AI. Generate unlimited data in minutes with synthetic data delivered as-a-service. Synthesize data that are as good or better than your original dataset, and maintain relationships and statistical insights. Customize privacy settings so that data is always safe while remaining useful for downstream workflows. Ensure data accuracy and privacy confidently with expert-grade reports....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Cleanlab

    Cleanlab

    The standard data-centric AI package for data quality and ML

    cleanlab helps you clean data and labels by automatically detecting issues in a ML dataset. To facilitate machine learning with messy, real-world data, this data-centric AI package uses your existing models to estimate dataset problems that can be fixed to train even better models. cleanlab cleans your data's labels via state-of-the-art confident learning algorithms, published in this paper and blog. See some of the datasets cleaned with cleanlab at labelerrors.com. This package helps you...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Obsidian Visual Skills Pack

    Obsidian Visual Skills Pack

    Generate Canvas, Excalidraw, and Mermaid diagrams from text

    LLM-TLDR is a Python-based tool designed to dramatically reduce the amount of code a large language model needs to read by extracting the essential structure and context from a codebase and presenting only the most relevant parts to the model. Traditional approaches often dump entire files into a model’s context, which quickly exceeds token limits; LLM-TLDR instead indexes project structure, traces dependencies, and summarizes code in a way that preserves semantic relevance while shrinking...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Scribus

    Scribus

    Powerful desktop publishing software

    Scribus is an Open Source program that brings professional page layout to Linux, BSD UNIX, Solaris, OpenIndiana, GNU/Hurd, Mac OS X, OS/2 Warp 4, eComStation, and Windows desktops with a combination of press-ready output and new approaches to page design. Underneath a modern and user-friendly interface, Scribus supports professional publishing features, such as color separations, CMYK and spot colors, ICC color management, and versatile PDF creation.
    Leader badge
    Downloads: 17,013 This Week
    Last Update:
    See Project
  • 19
    Apache OpenOffice

    Apache OpenOffice

    The free and Open Source productivity suite

    Free alternative for Office productivity tools: Apache OpenOffice - formerly known as OpenOffice.org - is an open-source office productivity software suite containing word processor, spreadsheet, presentation, graphics, formula editor, and database management applications. OpenOffice is available in many languages, works on all common computers, stores data in ODF - the international open standard format - and is able to read and write files in other formats, included the format used by the...
    Leader badge
    Downloads: 244,262 This Week
    Last Update:
    See Project
  • 20
    KeyParaStocX

    KeyParaStocX

    Set styles to words and create a Table of Contents in a click

    KeyParaStocX (Keyword-based Paragraph Styling and Table of Contents eXtension) is a LibreOffice/Apache OpenOffice/OpenOffice.org extension that searches for the configured keywords in a text, changes their style and builds a Table of Contents for them, up to 7 levels. The keywords and their target styles can be configured by the users and used for every document they open. The extension integrates into Writer options and is independent of the operating system (should work on all). See...
    Downloads: 191 This Week
    Last Update:
    See Project
  • 21
    Asymptote

    Asymptote

    2D & 3D TeX-Aware Vector Graphics Language

    Asymptote is a powerful descriptive vector graphics language for technical drawing, inspired by MetaPost but with an improved C++-like syntax. Asymptote provides for figures the same high-quality typesetting that LaTeX does for scientific text.
    Leader badge
    Downloads: 187 This Week
    Last Update:
    See Project
  • 22
    PII-Blackout

    PII-Blackout

    100% offline, AI-powered PDF redaction

    PII Blackout — 100% Offline, Automated PDF & Image Redaction Protecting sensitive data shouldn't mean risking your privacy by uploading files to online tools. PII Blackout is a powerful, local-first desktop application that automates the tedious process of finding and masking Personally Identifiable Information (PII) with military-grade security. Key Features: Smart Auto-Detection & Redaction Manually searching for emails, phone numbers, tax IDs, and names is a thing of the past. PII...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 23
    QUAST

    QUAST

    Quality Assessment Tool for Genome Assemblies

    QUAST performs fast and convenient quality evaluation and comparison of genome assemblies. It is maintained by the Gurevich lab at HIPS (https://helmholtz-hips.de/en/hmsb). For the most up-to-date description, please visit http://quast.sf.net. Below are just some highlights. QUAST computes several well-known metrics, including contig accuracy, the number of genes discovered, N50, and others, as well as introducing new ones, like NA50 (see details in the paper and manual). A...
    Leader badge
    Downloads: 12 This Week
    Last Update:
    See Project
  • 24
    BoolHub

    BoolHub

    A fully functional personal information management software.

    [Native application]: Developed with QT, pure native application, faster response; Cloud synchronization: Data is recorded in the cloud, and the data is no longer lost. Data synchronization is unlimited, everything is in your control; [Rich note types]: Support rich text, Markdown, code, tables, drawings, flowcharts and other note formats; [Manage your customers]: A customer relationship management system that supports team collaboration to grasp every lead and every customer; [Manage...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 25
    Financial Calculator (Linux)

    Financial Calculator (Linux)

    Professional financial tools: 16 sections with text and visual reports

    Take control of your finances with Financial Calculator 8.0, a professional yet easy-to-use desktop tool built for precise, everyday financial calculations. From planning loans and estimating taxes to generating invoices and creating QR codes, this all-in-one software offers 16 specialized sections that make your daily financial and business tasks faster, clearer, and more accurate. Whether you’re a student, small business owner, or finance professional, Financial Calculator brings...
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB