Showing 143 open source projects for "python text parser"

  • 1
    Concurrence is a networked file editing program that enables multiple people to modify a document simultaneously. It is written entirely in Python, and uses the wxPython library for the GUI and the Twisted library for networking.
    Downloads: 0 This Week
  • 2
    Scratchy is an Apache log file parser and HTML report generator written in Python.
    Downloads: 0 This Week
  • 3
    Web documents that look similar often use different HTML tags to achieve their layout effect. These tags often make it difficult for a machine to find text or images of interest. Our goal is to implement a parser to overcome this.
    Downloads: 0 This Week
  • 4
    pdfplayground is a collection of tools written in Python to read and write PDF files.
    Downloads: 0 This Week
  • 5
    Pyana is an extension module that allows Python programs to interface with the Apache Software Foundation's Xalan XSLT transformation engine.
    Downloads: 1 This Week
  • 6
    BlogMatrix Jäger is an extensible, one-panel weblog and RSS aggregator and podcasting client. The project uses wxPython and runs on both Windows and Macintosh as a normal application. This code also includes the "Universal Search Parser".
    Downloads: 0 This Week
  • 7
    A multiplatform visual implementation of the Unix utility grep
    Downloads: 0 This Week
  • 8
    Process OpenOffice.org Writer files and transform them to PDF without installing OpenOffice.org. What is PyOpenOffice? It is a class library written in the Python language. It is also a platform-independent command-line utility (many abilitie
    Downloads: 0 This Week
  • 9
    Piccolo is the fastest SAX parser for Java, supporting SAX1, SAX2, and JAXP (SAX only). Piccolo is different from other parsers in that it was developed using parser generators. It weighs 160K including XML APIs. See http://piccolo.sf.net for more info.
    Downloads: 0 This Week
  • 10
    pso (Python Service Objects) is a package that simplifies HTTP handlers. It offers built-in sessions; write-once handlers that run on modpython, modsnake, NASAPY, fastcgi, and CGI; an easy interface to HTTP info; and a simple, fast, robust, and powerful extensible OO template parser.
    Downloads: 0 This Week
  • 11
    Object-oriented, PHP-based HTML parser. The HtmlParser class allows you to iterate through HTML nodes and get their attributes, names, and values. It also comes with an example class for converting HTML to formatted ASCII text.
    Downloads: 3 This Week
  • 12
    The tre project is simply the author's attempt at mimicking artificial intelligence, using a home-made HTTP daemon and simple text recognition.
    Downloads: 0 This Week
  • 13
    pyBlog is yet another blogging software written in Python for creating custom blogs. It works in offline mode and includes an FTP client for uploading and downloading files to and from any web server. It supports multiple users (bloggers) and blogging via mail (optional).
    Downloads: 0 This Week
  • 14
    Lupy is a full text indexer for Python. It is a port of Jakarta Lucene 1.2 to Python. Specifically, it reads and writes indexes in Lucene binary format. Like Lucene, it is sophisticated and scalable.
    Downloads: 1 This Week
  • 15
    pyLJclient is a cross-platform LiveJournal client written in Python with a wxPython-based GUI.
    Downloads: 0 This Week
  • 16
    MAD is an acronym for 'Monitor, Analyse and Delivery'. The project's goal is to create scripts that periodically check forums of interest for new messages and extract them into a portable text format, free of HTML junk, annoying advertisements, etc.
    Downloads: 0 This Week
  • 17
    Chaperon is a LALR(1) parser that parses structured text documents and generates XML documents as output. It includes a parser generator like yacc and a regex scanner like lex. As input, Chaperon uses a grammar written in XML.
    Downloads: 0 This Week
  • 18
    A LAN log parser for Counter-Strike that can be used to convert multiple names in log files to one name for every player. psychostats and HL stats don't work on a LAN because of "wrong" WON IDs, and the nametrack doesn't work either, because of nick fakers.
    Downloads: 0 This Week
  • 19
    Mamba is an extensible XML template preprocessor written in Python. Using it, you can rapidly develop powerful applications ready to integrate with the internet. It can work as a generic CGI program or be used to generate content.
    Downloads: 0 This Week
  • 20
    xSiteable is a fully relational website compiler written entirely in XSLT, using topic maps (using XTM directly) as the backbone information technology, bundled with the fast Sablotron XSLT parser, a GUI admin tool and other nifty features. Watch this sp
    Downloads: 0 This Week
  • 21
    A File System Folder behaves just like a normal Zope Folder, except it can export its contents to individual human-readable text files in the file system, and then re-import those same files, enabling (for example) proper version control.
    Downloads: 0 This Week
  • 22
    Pyndex is a simple and fast full-text indexer and Bayesian classifier implemented in Python. It uses Metakit as its storage back-end. It works well for quickly adding search to an application, and is also well suited to in-memory indexing and search.
    Downloads: 0 This Week
  • 23
    This project provides a Python ICAP (Internet Content Adaptation Protocol) server and IRML parser. Using this, a web proxy can do rule-based adaptation of content before delivery to clients. It also includes a Python proxylet API and squid-icap-client code.
    Downloads: 2 This Week
  • 24
    Java API to process or parse HTML documents. If your Java application needs or would like to be able to process some text in HTML format, you'd probably find this API interesting.
    Downloads: 0 This Week
  • 25
    POST (Python Obviously Simple Text) provides support for simple, flexible dynamic document generation in multiple output formats. Supports inputs in text or XML, outputs in HTML, PDF, RTF, LaTeX source, nroff source, postscript, and plain text.
    Downloads: 0 This Week