Showing 4 open source projects for "html source extractor"

  • 1
    Savvy DOCX Recovery

    Open corrupt Word DOCX files and possibly recover formatting too.

    XML was designed from the beginning to be intolerant of errors. This decision adversely affects MS Word's corruption recovery. With one error in the document.xml subfile where all the DOCX file's text is stored, instead of a partial recovery, Word will stop and throw an error. Savvy DOCX Recovery attempts to do precise surgery on corrupt Word documents to reorder or excise bad XML tags. If this doesn't work, it uses the command line app xmllint first to attempt to repair corrupt XML...
    Downloads: 123 This Week
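The failure mode described above (one bad tag in document.xml makes a strict XML parser give up on the whole file) can be illustrated with a minimal sketch. This is not Savvy DOCX Recovery's own method: it swaps in a simple regular-expression fallback, and it assumes the zip container itself is still readable. The function name `extract_docx_text` and the pattern `TEXT_RUN` are hypothetical names chosen for this illustration.

```python
import html
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# Namespace that WordprocessingML uses for <w:t> text runs.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Tolerant pattern for <w:t>...</w:t> runs; it does not match <w:tab/> or <w:tbl>.
TEXT_RUN = re.compile(r"<w:t(?:\s[^>]*)?>([^<]*)</w:t>")


def extract_docx_text(docx_bytes):
    """Return the text runs of a DOCX file, falling back to a regex scan
    when word/document.xml is not well-formed XML."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        raw = archive.read("word/document.xml")
    try:
        # Strict path: any well-formedness error raises ParseError.
        root = ET.fromstring(raw)
        return [node.text or "" for node in root.iter(W_NS + "t")]
    except ET.ParseError:
        # Lenient path: salvage whatever complete text runs survive.
        text = raw.decode("utf-8", errors="replace")
        return [html.unescape(run) for run in TEXT_RUN.findall(text)]
```

The lenient path recovers text only; formatting is lost, and a damaged zip structure would need to be repaired before this sketch could read the subfile at all.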
  • 2
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20, AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, enabling efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 6 This Week
  • 3
    pandas-datareader

    Extract data from a wide range of Internet sources

    Up-to-date remote data access for pandas. Works with multiple versions of pandas. Install using pip, then import and use one of the data readers. The project's introductory example reads 5 years of 10-year constant maturity yields on U.S. government bonds. Stable documentation is available on github.io, and a second copy is hosted on Read the Docs.
    Downloads: 0 This Week
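The listing mentions an example that reads 5 years of 10-year constant maturity yields but does not show it. A minimal sketch of what such a call might look like follows, assuming the package is installed with `pip install pandas-datareader`, that network access to FRED is available, and that the FRED series id `GS10` is the intended series. The helper names `five_year_window` and `read_ten_year_yields` are hypothetical and not part of the library.

```python
from datetime import date


def five_year_window(today):
    """Return (start, end) dates covering the five years up to `today`."""
    try:
        start = today.replace(year=today.year - 5)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year; fall back to Feb 28.
        start = today.replace(year=today.year - 5, day=28)
    return start, today


def read_ten_year_yields(today=None):
    """Fetch 10-year constant maturity Treasury yields from FRED."""
    # Imported lazily so the date helper works without the package installed.
    import pandas_datareader.data as web

    start, end = five_year_window(today or date.today())
    # Assumption: "GS10" is the FRED id for the 10-year constant maturity yield.
    return web.DataReader("GS10", "fred", start, end)


if __name__ == "__main__":
    # Requires network access; prints the most recent rows of the series.
    print(read_ten_year_yields().tail())
```

The date window is computed separately from the download so it can be checked offline; the network call only runs when the script is executed directly.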
  • 4
    Corrupt Extractor for Microsoft Office

    Extracts text/data from corrupt MS Office 2007-13 format files.

    Corrupt Office 2007 Extractor extracts the text/data from corrupt docx, xlsx, and pptx files when the respective MS Office applications error out and refuse to open them. In advanced mode the program can fix the zip structure of "Office Open XML" format files, a step which I now recommend despite the dissuasive blurb that appears when you start that function. Advanced mode also allows recovering images and includes a basic editor for editing the corrupt XML subfiles. Additionally I...
    Downloads: 0 This Week