27 programs for "scrape text from html" with 2 filters applied:

  • 1
    Writer2LaTeX and Writer2xhtml are a collection of converters from OpenDocument Format (ODF) to LaTeX/BibTeX, HTML+MathML and EPUB. The collection is delivered as a standalone Java library, as a command-line application and as extensions for LibreOffice.
    Downloads: 43 This Week
  • 2
    metaf2xml

    Parse and decode METAR, TAF, SYNOP, BUOY, AMDAR and write data as XML

    metaf2xml can download, parse and decode aviation routine weather reports (METAR, SPECI, SAO), aerodrome forecasts (TAF), synoptic observations (SYNOP), observations from buoys (BUOY) and meteorological reports from aircraft (AMDAR). Data can also be taken from decoded BUFR messages. The extracted data can be written as XML or passed to a user-defined function (all done in Perl). It also provides XSLT style sheets to convert the XML to plain language (text, HTML), or XML with different...
    Downloads: 7 This Week
  • 3
    adx - addressbook.xml

    Minimalistic address book in web browser. No server or plugin needed.

    Minimalistic but full-featured address book in your web browser. adx is a standalone and portable web app (online and offline). FEATURES: contact management, portable, small (~350KB), lightweight, contact tagging, geo mapping, web accounts, trigger phone/Skype calls, etc. EXPORT FUNCTIONALITY: vCard (as file or QR code via offline generator). HOW IT WORKS: Your address book (XML file) is transformed in your web browser (via XSLT) to a full-featured web application (HTML...
    Downloads: 5 This Week
  • 4

    RecordEditor

    Editor for Fixed Width, Csv and Existing Xml files.

    The RecordEditor is a data file editor for flat files (delimited and fixed field position). It supports Unix / PC / Legacy (e.g. Mainframe) file formats, both text and binary files. The editor uses a Record-Layout description to format the files. This is ideal for fixed-width (text or binary) files, Cobol data files, Mainframe files and complicated CSV files. Cobol Copybooks can be used to format Cobol data files. As well as an editor, the following utilities are supplied * Formatted...
    Downloads: 43 This Week
  • 5
    XML Editor/Validator/Designer with CAMV

    CAM XML Editor for XML+JSON+Hibernate+SQL Open-XDX sponsored by Oracle

    ..., & OASIS modes) + JAXB bindings; Mindmap FreeMind or UML models(XMI); XML unit test & live SQL data; HTML docs + spreadsheets (NIEM IEPDs). Canonical component dictionaries from schema sets, SQL, JSON, ERwin XSD, or spreadsheets. The XML CAM templates (OASIS standard) store the exchange structure, content model, code lists, DBMappings, SQL lookups+business rules (XPath). Java CAMV XML/JSON validation engine is a complete exchange test framework [XMLUnit, TEAM(Schematron)]. Java/Eclipse +Saxon/XSL
    Downloads: 39 This Week
  • 6
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest versions...
    Downloads: 511 This Week
  • 7
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 12 This Week
  • 8
    dvi2bitmap is a utility to convert TeX DVI files directly to bitmaps, without going through the complicated (and slow!) route of conversion via PostScript and PNM. The prime motivation for this is to prepare mathematical equations for inclusion in HTML files, but there is a broad range of uses beyond that. dvi2bitmap... * is written in portable C++, and the program acts as a wrapper round the libdvi2bitmap library (both static and shareable), which abstracts DVI and PK files...
    Downloads: 0 This Week
  • 9

    JSONjuicer

    JSON parser and encoder

    A Java open-source library which makes encoding and decoding Java data-structures to and from JSON text easy and intuitive.
    Downloads: 0 This Week
  • 10
    LaTeX Web Publisher

    LaTeX Web Publisher is a Makefile based Web publishing system

    LaTeX Web Publisher is a Makefile based Web publishing system featuring content creation into HTML, non-split HTML, HTML Zip, PDF, DjVu, PostScript, DVI and Plain text formats. All LaTeX Web Publisher output formats are from a single LaTeX source and have indices. LaTeX Web Publisher can be used for website creation and has FTP deployment capabilities. A website created with LaTeX Web Publisher will have HTML, non-split HTML and PDF content formats. The website will have complete HTML...
    Downloads: 0 This Week
  • 11

    Z Notation E-Mail Mark-up Tools

    Tools to convert Z mark-up to HTML or text.

    A small library and two command-line tools to parse and convert Z notation from the "e-mail" mark-up into HTML code, or into UTF-8 text with box-drawing graphics, or into the Z Standard text format. See the project's Wiki Home Page for details.
    Downloads: 0 This Week
  • 12
    NAT Braille

    A free universal Braille Transcriber

    NAT is a free universal Braille translator. It supports French Braille grade 1, mathematical Braille, Braille layout and reverse transcription. French Braille grade 2, music and other languages are currently under development.
    Downloads: 0 This Week
  • 13
    Crème Fraiche

    eml2pdf converter

    I NO LONGER CLAIM PLATFORM-INDEPENDENCE FOR Crème Fraiche. THIS PROGRAM RUNS ON LINUX. Crème Fraiche transforms EML files, as created by email clients, to PDF. Please see the rubygems.org site for updates or use the gem tool right away to install Crème Fraiche: ~$ gem install cremefraiche
    Downloads: 0 This Week
  • 14
    HTML++

    Object-oriented generation of HTML code on .NET 4 Client Profile

    This library allows you to generate HTML pages directly from your code in a strongly typed, compositional, safe and concise manner. Requires the .NET Framework 4 Client Profile only. It is licensed under LGPL, which means that you may use it in commercial products. The project logo is from Mariano Real, who kindly provided it under a Creative-Commons license.
    Downloads: 6 This Week
  • 15

    htmlpicker

    Picks up text from a web page using an HTML template.

    A Java HTML picker and text extractor. It picks up text from a web page using an HTML template, which is useful if you regularly have data to extract from the same site. You may use the same URL or build URLs with parameters. These parameters are fetched from a text file.
    Downloads: 0 This Week
  • 16

    javawebutils

    web application utilities

    This library contains utility classes such as a converter from plain text to HTML (for safe inclusion of user-supplied text into web pages, avoiding XSS attacks, etc.), converters from binary to hex representation, and similar functions
    Downloads: 0 This Week
  • 17
    Search and export numerics from any text/ASCII file. Data sets (scalar, vector, matrix) are given unique names, based on file content. Results can be generated for Matlab, IDL, Scilab, Octave, XML and HTML. A wrapper exists for direct usage from Matlab.
    Downloads: 0 This Week
  • 18
    RTF2HTML is a cross-platform C++ library (DLL, OCX) and command-line utility for converting documents from Rich Text Format (e.g. Word, OO Writer) to HTML. Its features are tiny size, speed, low memory usage and compact output.
    Downloads: 3 This Week
  • 19
    Takes data from a text table and wraps the fields in HTML tags, building an HTML page ready to be served to your iPhone or iPod Touch.
    Downloads: 0 This Week
  • 20
    Use Xilize to create XHTML pages or entire websites with just a plain-text editor. The markup is similar to Textile and extensible via BeanShell. Run as a jEdit plugin, from the command line, or embed in a Java program. Small, fast, easy-to-use.
    Downloads: 0 This Week
  • 21
    This is a component toolkit for building a rich HTML editor. Developers can use the components to build a rich editor rapidly, or they can debug and develop other user interface components on this foundation.
    Downloads: 0 This Week
  • 22
    Includes tools for creating ebooks in XML format. xTrans helps in creating an XML ebook from plain-text sources such as RTF and TXT. XTrans converts XML ebooks into a final format such as PDF, HTML, RTF, PDB (various forms), ...
    Downloads: 0 This Week
  • 23
    Strip out useless tags and other junk from HTML files. Shrink files, enhance readability of HTML source, promote privacy, and clean HTML exported from Microsoft Word (MS-Word). Run HTMLStrip as-is or customize it with your own regular expressions.
    Downloads: 0 This Week
  • 24
    Mediawiki-PDF is a mediawiki extension to convert wiki articles into PDF Documents. The extension uses HTMLDOC to convert the wiki pages from plain HTML into PDF.
    Downloads: 0 This Week
  • 25
    eBookMagus is a tool for converting standard ASCII-based eBooks (such as those from Project Gutenberg) into multiple HTML files that are easier to read, especially on the iPhone.
    Downloads: 0 This Week