Showing 17 open source projects for "python text parser"

View related business solutions
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    Build gen AI apps with an all-in-one modern database: MongoDB Atlas

    MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.
    Start Free
  • Keep company data safe with Chrome Enterprise Icon
    Keep company data safe with Chrome Enterprise

    Protect your business with AI policies and data loss prevention in the browser

    Make AI work your way with Chrome Enterprise. Block unapproved sites and set custom data controls that align with your company's policies.
    Download Chrome
  • 1
    jsoup

    jsoup

    Java library for working with real-world HTML

    jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree. The parser will make every...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    The CSS Parser is implemented as a package of Java classes, that inputs Cascading Style Sheets source text and outputs a Document Object Model Level 2 Style tree. Alternatively, applications can use SAC: The Simple API for CSS. Its purpose is to allow developers working with Java to incorporate Cascading Style Sheet information, primarily in conjunction with XML application developments. The CSS Parser package includes parsers for: * Cascading Style Sheets Level 3, * Cascading...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    Web Spider, Web Crawler, Email Extractor

    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    In Files there is WebCrawlerMySQL.jar which supports MySql Connection Free Web Spider & Crawler. Extracts Information from Web by parsing millions of pages. Store data into Derby Database and data are not being lost after force closing the spider. - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export to Excel File - Data Saved into Derby and MySQL Database - Written in Java Cross Platform Also See Free email Sender...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 4
    Web Spider, Web Crawler, Email Extractor

    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    In Files there is WebCrawlerMySQL.jar which supports MySql Connection Please follow this link to get latest version https://sourceforge.net/projects/web-spider-web-crawler-extract/ Free Web Spider & Crawler. Extracts Information from Web by parsing millions of pages. Store data into Derby OR MySQL Database and data are not being lost after force closing the spider. - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export...
    Downloads: 2 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 5
    JDynamiTe, Dynamic Template in Java

    JDynamiTe, Dynamic Template in Java

    Dynamically generate documents from templates

    JDynamiTe is a tool which allows you to dynamically create documents in any format from "template" documents. And very few lines of code (or no line at all!) are needed to do that. Some typical usage domains of JDynamiTe are: - dynamic Web pages creation, - text document generation, - source code generation... In fact, it can be useful in any case where pre-defined documents (templates) have to be dynamically populated with data. The main benefit of JDynamiTe is to allow a true...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit for All of Us

    DSTK - DataScience ToolKit is an opensource free software for statistical analysis, data visualization, text analysis, and predictive analytics. Newer version and smaller file size can be found at: https://sourceforge.net/projects/dstk3/ It is designed to be straight forward and easy to use, and familar to SPSS user. While JASP offers more statistical features, DSTK tends to be a broad solution workbench, including text analysis and predictive analytics features. Of course you may specify...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Jericho HTML Parser is a java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8
    Enterprise
    Enterprise is an open source monitor and advanced log parser. Based on Enterprise ship state system controller of Star Trek, It is able to let you know about the state of your services in a given time.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    LiMa means Lightweight Markup Language. It is a parser for an easy to use ASCII/Text-based markup - comparable to Markdown or the Wikipedia-Markup language with special configurable extensions in defining Links and image-resources.
    Downloads: 0 This Week
    Last Update:
    See Project
  • No-Nonsense Code-to-Cloud Security for Devs | Aikido Icon
    No-Nonsense Code-to-Cloud Security for Devs | Aikido

    Connect your GitHub, GitLab, Bitbucket, or Azure DevOps account to start scanning your repos for free.

    Aikido provides a unified security platform for developers, combining 12 powerful scans like SAST, DAST, and CSPM. AI-driven AutoFix and AutoTriage streamline vulnerability management, while runtime protection blocks attacks.
    Start for Free
  • 10
    Markout is a pure-Java lightweight wiki markup parser based on John Gruber's Markdown.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Web documents that look similar often use different HTML tags to achieve their layout effect. These tags often make it difficult for a machine to find text or images of interest. Our goal is to implement a parser to overcome this.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Piccolo is the fastest SAX parser for Java, supporting SAX1, SAX2, and JAXP (SAX only). Piccolo is different from other parsers in that it was developed using parser generators. It weighs 160K including XML APIs. See http://piccolo.sf.net for more info.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Chaperon is a LALR(1) parser, which parse structured text documents and generate XML documents as output. It includes a parser generator like yacc and a regex scaner like lex. As input use Chaperon a grammar written in XML.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Java API to process or parse HTML documents. If your Java application needs or would like to be able to process some text in HTML format, you'd probably find this API interesting.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Suite of multi-language templating system classes, loosely based on PEAR/PHPlib template class. The classes may be used to parse any text based template (HTML, ASCII, text, etc). Already available: - JScript - Python - PHP Planned: - VBscript - Ja
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    A WebDAV browser and remote file editing framework written in Java, including an RFC-2518-compliant WEBDAV client library with optional SSL support, a low-level DAV command-line client (written in JPython), and a built-in text editing component.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 17
    Frosttie (FROnt-end SchemaTron Text Internet Engine) takes XHTML pages and processes them with various user-definable filters such a W3C's WAI, Section 508 (US) web usability compliance, ad removal, etc. It can be used with zKnowMan.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.