Showing 23 open source projects for "python text parser"

View related business solutions
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    The database for AI-powered applications.

    MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered apps—without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
    Start Free
  • Keep company data safe with Chrome Enterprise Icon
    Keep company data safe with Chrome Enterprise

    Protect your business with AI policies and data loss prevention in the browser

    Make AI work your way with Chrome Enterprise. Block unapproved sites and set custom data controls that align with your company's policies.
    Download Chrome
  • 1
    Froala Editor

    Froala Editor

    The next generation Javascript WYSIWYG HTML Editor

    Froala Editor is a lightweight WYSIWYG HTML Editor written in Javascript that enables rich text editing capabilities for your applications. Froala WYSIWYG HTML Editor is one of the most powerful JavaScript rich text editors ever. Froala Rich Text Editor has a vast range of both simple and complex features for all kind of use cases. Lots of features don't have to overwhelm the user with hundreds of buttons. The Froala's WYSIWYG editor smart toolbar can accommodate over 100 features in this...
    Downloads: 15 This Week
    Last Update:
    See Project
  • 2
    html-loader

    html-loader

    HTML Loader

    ...Allows to setup which tags and attributes to process and how, as well as the ability to filter some of them. Filter can also be used to extend the supported elements and attributes. By default, the parser in html-loader interprets content inside noscript tags as #text, so processing of content inside this tag will be ignored. A very common scenario is exporting the HTML into their own .html file, to serve them directly instead of injecting with javascript.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 3
    Atributika

    Atributika

    Convert text with HTML tags, links, hashtags, mentions, etc.

    Atributika is an easy and painless way to build NSAttributedString. It is able to detect HTML-like tags, links, phone numbers, hashtags, any regex or even standard ios data detectors and style them with various attributes like font, color, etc. Atributika comes with drop-in label replacement AttributedLabel which is able to make any detection clickable. NSAttributedString is really powerful but still a low-level API that requires a lot of work to set up things. It is especially painful if...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    jsoup

    jsoup

    Java library for working with real-world HTML

    jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree. The parser will make...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 5
    CssSelector Component

    CssSelector Component

    Converts CSS selectors to XPath expressions

    XPath expressions are incredibly flexible, so there is almost always an XPath expression that will find the element you need. Unfortunately, they can also become very complicated, and the learning curve is steep. Even common operations (such as finding an element with a particular class) can require long and unwieldy expressions. CSS selectors are less powerful than XPath, but far easier to write, read and understand. Since they are less powerful, almost all CSS selectors can be converted to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6

    HTML parser in Delphi

    A Delphi class with functions to read and dissect a HTML file

    THTMLdom is a (Delphi) class with functions to read a HTML source file and dissect it into a tree of THTMLelement. The attributes of the HTML tags are stored in the elements. Functions are provided to select elements on the basis of the attribute values or tag names. The structure of the tree can be shown and it can be rendered as plain text. The source is plain Delphi pascal, requiring a version that supports Tdictionary. There is no dependency on 3rd party units. The file to be parsed...
    Downloads: 17 This Week
    Last Update:
    See Project
  • 7
    HTMLMinifier

    HTMLMinifier

    Javascript-based HTML compressor/minifier (with Node.js support)

    HTMLMinifier is a highly configurable, well-tested, JavaScript-based HTML minifier. Minifier options like sortAttributes and sortClassName won't impact the plain-text size of the output. However, they form long repetitive chains of characters that should improve compression ratio of gzip used in HTTP compression. SVG tags are automatically recognized, and when they are minified, both case-sensitivity and closing slashes are preserved, regardless of the minification settings used for the rest...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8

    dpanalyzer

    postprocessing tool for Project Gutenberg Distributed Proofreaders

    Specialized tool for PostProcessors of books produced by Project Gutenberg Distributed Proofreaders. Parses the markup structure of a project file out of the formatting rounds; reports about the text structure found, and identifies markup errors. Planned future features: generation of normalized dp output by rejoining split paragraphs and moving around footnotes, renumbering of pages; conversion to basic LaTeX and basic HTML markup for further processing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    htmlarea

    htmlarea

    Small, powerful, full featured WYSIWYG editor

    HTMLArea 4 is a browser based WYSIWYG editor that easily replaces the TEXTAREA in your web pages. It is written in JavaScript, and suitable for use in any modern web browser, and any page on your web site. Current version is 4.0-2016-08-29
    Downloads: 1 This Week
    Last Update:
    See Project
  • Free and Open Source HR Software Icon
    Free and Open Source HR Software

    OrangeHRM provides a world-class HRIS experience and offers everything you and your team need to be that HR hero you know that you are.

    Give your HR team the tools they need to streamline administrative tasks, support employees, and make informed decisions with the OrangeHRM free and open source HR software.
    Learn More
  • 10
    Jericho HTML Parser is a java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 11
    Gumbo

    Gumbo

    An HTML5 parsing library in pure C99

    Gumbo is an implementation of the HTML5 parsing algorithm implemented as a pure C99 library with no outside dependencies. It's designed to serve as a building block for other tools and libraries such as linters, validators, templating languages, and refactoring and analysis tools. Gumbo gains some of this by virtue of being written in C, but it is not an important consideration for the intended use-case, and was not a major design factor. Gumbo is intentionally designed to turn an HTML...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12

    HTML XHTML Parser + XPath

    Delphi HTML XHTML Parser +XPath

    Delphi HTML Parser This module lets you work with HTML documents as DOM tree and use XPath for searching tags. It is very simple way to parse HTML. This tested with version Delphi XE5,6 Usage Add in Uses parser.pas; begin HtmlTxt:= ''; //here your html NodeList:= TNodeList.Create; ValueList:= TStringList.Create; DomTree:= TDomTree.Create; DomTreeNode:= DomTree.RootNode; If DomTreeNode.RunParse(HtmlTxt) then begin {your code example: DomTreeNode.FindXPath('//*[@id="TopBox"]/div[1]/div[@class="draw default"]'),NodeList,ValueList)} end; end; Xpath support: attributes - //*[@id="TopBox"]/div/@class comment - //*[@id="TopBox"]/div/comment()[3] text - //*[@id="TopBox"]/div/text()[2] previous level - /.....
    Downloads: 14 This Week
    Last Update:
    See Project
  • 13
    A java-based parser for parsing/grabbing web sites and other text or XML documents, based on a nondeterministic parser language, creating XML output. Also contains a few utility classes for HTML, CSV and text parsing, and additional character sets.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    Wiko, the wiki compiler, compiles wiki like files into html and LaTeX, combining easy wiki syntax, your preferred non-web text editor and svn/cvs control to write static webs, cientific articles or even blogs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15

    HTML DOM Parser

    HTML parser which can be used for screen-scraping applications

    htmldom parses the HTML file and provides methods for iterating and searching the parse tree in a similar way as Jquery. To report bugs please mail me at bhimsen.pes@gmail.com
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Html Assembler
    Html Assembler is a static site generator. It automatically integrates page content such as text and photos in a modifiable page template creating a complete set of html files ready for upload to your site.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Create or parse ANY Mark-up Language (HTML XML X3D VRML MathML XAML XDP CDA SCORM COLLADA XBRL) file or string into a simple and versatile MLDocument, MLElement, MLParameter hierarchical object model, written in VB 6 (Win32). Alternative to using DOM.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    A JavaScript library for parsing Creole 1.0 wiki markup.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    ZML, the Zeitung Markup Language, is a simple CMS for small newspapers. It was specifically designed to publish a student newspaper in print and on the Web. It uses LaTeX and XHTML. So far, it is documented in German only.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Markout is a pure-Java lightweight wiki markup parser based on John Gruber's Markdown.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    A Python tool for creating websites or project documentation. Pages can be stored as reST (text) or html. With a simple templating and macro system it can autogenerate index pages and navigation links. Facilities for multiple translations as well.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    PyBookmark manipulates bookmark files. It can sync files (no server required), merge, sort, remove duplicates, and check links. Its library pybookmarklib provides access to these operations, data structures, and parser for further extensibility.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    A .Net (C#) program to convert grammar based text (like code) to colorful, CSS based HTML. Based on the GOLD parser.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next