Open Source Python Text Processing Software

Browse free, open source Python text processing software and projects below.

  • 1
    Scribus

    Powerful desktop publishing software

    Scribus is an Open Source program that brings professional page layout to Linux, BSD UNIX, Solaris, OpenIndiana, GNU/Hurd, Mac OS X, OS/2 Warp 4, eComStation, and Windows desktops with a combination of press-ready output and new approaches to page design. Underneath a modern and user-friendly interface, Scribus supports professional publishing features, such as color separations, CMYK and spot colors, ICC color management, and versatile PDF creation.
    Downloads: 12,402 This Week
  • 2
    Notepad++ Python Script

    A Python Scripting plugin for Notepad++

    A Python scripting plugin for Notepad++. It gives scripts complete, easy access to all of the editor's features (including everything in Scintilla), with configurable menus and toolbar options and the ability to assign shortcuts to scripts.
    Downloads: 266 This Week
  • 3
    Diffuse
    Diffuse is a graphical tool for comparing and merging text files. It can retrieve files for comparison from Bazaar, CVS, Darcs, Git, Mercurial, Monotone, RCS, Subversion, and SVK repositories.
    Downloads: 206 This Week
  • 4
    Utilities for general- and special-purpose documentation. Includes reStructuredText, the easy to read, easy to use, what-you-see-is-what-you-get plaintext markup language.
    Downloads: 117 This Week
  • 5
    PDF-Shuffler
    PDF-Shuffler is a small python-gtk application that helps the user merge or split PDF documents and rotate, crop, and rearrange their pages using an interactive and intuitive graphical interface. It is a frontend for python-pyPdf.
    Downloads: 59 This Week
  • 6
    meld-installer

    Meld Installer for Windows

    Bundles Portable Python (with PyGTK) and Meld together in an easy-to-use installer, so you do not have to set up Python or PyGTK yourself and can keep Meld's Python separate from other Python installations on your machine. NOTE: Meld 3.11 and later have official installers, so this project is no longer supported. You can download the new installer here: https://download.gnome.org/binaries/win32/meld/. You should uninstall the old 1.8 version before upgrading.
    Downloads: 25 This Week
  • 7
    Tomoe is a handwriting character recognition engine.
    Downloads: 42 This Week
  • 8
    Alphabetizer

    Take a list of words or sentences and arrange them alphabetically.

    Alphabetizer is a tool that takes a list of words, phrases, or sentences and arranges them in alphabetical order. It is useful for organizing information, creating glossaries, sorting names, or any task where the items in a list need to be in alphabetical order. Overall, Alphabetizer can save time and effort by quickly organizing information and making it easier to read and comprehend.
    Downloads: 32 This Week
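
The kind of sorting described for Alphabetizer can be sketched in a few lines of standard-library Python. This is only an illustration of the general idea, not Alphabetizer's own code: the function name `alphabetize` and the choice to ignore letter case and blank entries are assumptions made for this example.

```python
def alphabetize(items):
    """Return the non-empty items in case-insensitive alphabetical order.

    Hypothetical helper for illustration; not part of Alphabetizer.
    """
    # Trim surrounding whitespace and drop entries that are empty afterwards.
    cleaned = [item.strip() for item in items if item.strip()]
    # str.casefold makes the comparison ignore letter case.
    return sorted(cleaned, key=str.casefold)


if __name__ == "__main__":
    for entry in alphabetize(["pear", "Apple", " banana ", ""]):
        print(entry)
```

Sorting with `key=str.casefold` keeps "Apple" ahead of "banana"; a plain `sorted()` would instead place every uppercase entry before every lowercase one.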
  • 9
    TextBlob

    TextBlob is a Python library for processing textual data

    Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more. TextBlob provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, and translation. It stands on the giant shoulders of NLTK and pattern, and plays nicely with both. It supports word inflection (pluralization and singularization) and lemmatization, as well as spelling correction, and comes with a WordNet integration. New models or languages can be added through extensions. If you only intend to use TextBlob's default models (no model overrides), you can pass the lite argument to the corpora download command, which downloads only the corpora needed for basic functionality. TextBlob is also available as a conda package.
    Downloads: 1 This Week
  • 10
    PyRtfLib is a Python library that provides a parser and a few translators, such as RTF to HTML and RTF to plain text.
    Downloads: 26 This Week
  • 11
    EpiDoc: Epigraphic Documents in TEI XML

    XML text markup for ancient documents

    The EpiDoc Collaborative is developing specifications and tools for standards-based digital publication and interchange of scholarly and educational editions of documentary and literary texts such as inscriptions and papyri.
    Downloads: 5 This Week
  • 12
    PyRTF is a pure Python module for the efficient creation of RTF documents.
    Downloads: 10 This Week
  • 13
    Queequeg is an English grammar checker for non-native English speakers.
    Downloads: 7 This Week
  • 14
    TeXML is an XML vocabulary for TeX. The processor transforms the TeXML markup into the TeX markup, escaping special and out-of-encoding characters. The intended audience is developers who automatically generate [La]TeX or ConTeXt files.
    Downloads: 6 This Week
  • 15
    DrPython is a highly customizable, cross-platform IDE to aid programming in Python. It was developed with teaching in mind and has a clean, simple interface. It is written in Python, using wxPython for the GUI.
    Downloads: 1 This Week
  • 16
    DocLite is a simple documentation authoring system. It produces multi-page HTML output in a style similar to that found in the Linux HOWTOs or other DocBook-created documents.
    Downloads: 4 This Week
  • 17
    Python scripts for converting Chinese Pinyin transcription (ISO 7098) to the International Phonetic Alphabet (IPA), comprising a core module for developers and a flexible GUI application for end users working on Modern Chinese phonetics.
    Downloads: 3 This Week
  • 18
    SE|PY is an ActionScript editor written in Python and wxPython, using Scintilla for text highlighting and code collapsing. Some features: snippets panel, functions panel, and much more. Also contains Flush
    Downloads: 3 This Week
  • 19
    Pyana is an extension module that allows Python programs to interface with the Apache Software Foundation's Xalan XSLT transformation engine.
    Downloads: 2 This Week
  • 20
    xml2txt is a text formatter for XML in the same way that FO is a PDF formatter. It uses Python to convert an XML document to well-formatted text, with borders, indents, and tables.
    Downloads: 2 This Week
  • 21
    Wyneken is a content-oriented text processor that makes your life as a student easier by allowing you to create and manage digital notebooks. Wyneken also allows you to create PDF presentations, letters, articles, and reports. As of 2015, Wyneken may or may not work with the latest Linux distributions, but you can use it for building PDFs by pulling our Docker repo indicated in the homepage field of this site. The Docker site has the information on how to invoke the container, which is based on Fedora 23 and has a texlive stack.
    Downloads: 2 This Week
  • 22
    The converter automatically performs the full process of converting the files of a C project into the equivalent C++ files. Classes are created, variables and functions become attributes and methods, and the changes are propagated into all files.
    Downloads: 1 This Week
  • 23
    DocScript is an approach to document preparation. It presents tools and utilities to edit and publish documents. The philosophy behind the DocScript project is to utilize the programming tools you're working with anyway in your daily work.
    Downloads: 1 This Week
  • 24
    A Python class to convert a PCL document to plain ASCII text. PCL is HP's Printer Control Language.
    Downloads: 1 This Week
  • 25

    arCHMage

    A reader and decompiler for files in the CHM format

    arCHMage is a reader and decompiler for files in the CHM format. This is the format used by Microsoft HTML Help, and is also known as Compiled HTML.
    Downloads: 1 This Week