Showing 46 open source projects for "scrape text from html"

  • 1
    Super-PDF-Editor-Lite

    World's most comprehensive, powerful, process-based PDF editor

    World's most comprehensive, powerful, process-based and lightning-fast PDF reader, editor and batch processor. Includes features like Create PDF from Images, HTML, Text files. Create a processing log file. Extract Page, Split Page, Rotate Page, Merge Page, Duplicate Page, Move Page, Printing, and Compress Page. Improve image enhancement before OCR operation for better OCR performance. PDF imposition, etc. Super PDF Editor is best for bulk PDF processing, especially for the printing industry...
    Downloads: 25 This Week
  • 2
    Fidus Writer

    Fidus Writer is an online collaborative editor for academics

    Fidus Writer is an online collaborative editor especially made for academics who need to use citations and/or formulas. The editor focuses on the content rather than the layout, so that with the same text, you can later on publish it in multiple ways: On a website, as a printed book, or as an ebook. In each case, you can choose from a number of layouts that are adequate for the medium of choice.
    Downloads: 0 This Week
  • 3
    Java Tablesaw

    Java dataframe and visualization library

    Tablesaw is a dataframe and visualization library that supports loading, cleaning, transforming, filtering, and summarizing data. If you work with data in Java, it may save you time and effort. Tablesaw also supports descriptive statistics and can be used to prepare data for working with machine learning libraries like Smile, Tribuo, H2O.ai, and DL4J. Import data from RDBMS, Excel, CSV, TSV, JSON, HTML, or fixed-width text files, whether they are local or remote (HTTP, S3, etc.). Tablesaw supports...
    Downloads: 0 This Week
  • 4
    OpenKM Document Management - DMS

    Document Management System and Content Management System

    ... technological architecture design, OpenKM meets the document management needs of businesses of all sizes (from SMEs to big corporations). Thanks to its elegant and intuitive interface, OpenKM transforms complex operations into easy tasks. The most relevant function of OpenKM is the indexing of the most common types of files: text, Office, Office 2007, OpenOffice, PDF, HTML, XML, MP3, JPEG, etc. For a complete feature list, take a look at http://goo.gl/au8cQy
    Downloads: 653 This Week
  • 5
    CopyQ

    Clipboard manager with advanced features

    CopyQ is an advanced clipboard manager with searchable and editable history, support for image formats, command line control and more.
    Downloads: 147 This Week
  • 6
    Writer2LaTeX and Writer2xhtml are a collection of converters from OpenDocument Format (ODF) to LaTeX/BibTeX, HTML+MathML and EPUB. It is delivered as a standalone Java library, as a command line application and as extensions for LibreOffice.
    Downloads: 39 This Week
  • 7
    Office Search

    Desktop Full-Text Search inside text and Microsoft Office files.

    Search inside Microsoft Office (Word, Excel, PowerPoint), LibreOffice (Writer, Calc, Impress), Visio and text/ASCII files (RTF/TXT/CSV/MD/HTML etc.). For all other files, it uses fuzzy logic to check whether the file is text or binary. If it is text, it searches the contents of the file for a match. Works on Windows 7 or above. Requires .NET Framework 4.7 or above. Open source software developed in VB.NET 2019.
    Downloads: 52 This Week
  • 8
    FastReport Open Source

    Free Open Source Reporting tool for .NET

    Free Open Source Reporting tool for .NET Core/.NET Framework that helps your application generate document-like reports.
    Downloads: 21 This Week
  • 9
    Super PDF Editor Lite

    Create, Edit, Delete, Organize, Convert, Export, Secure & Sign.

    Super PDF Editor Lite is a robust and versatile PDF management software designed to streamline your document handling needs. Whether you're an individual, student, or professional, this software offers a comprehensive suite of tools to create, edit, and manage your PDFs with ease. Key Features: Extract Page: Easily extract specific pages from a PDF document. Split Page: Divide a single PDF page into multiple smaller pages. Rotate Page: Rotate pages to adjust their orientation. Merge Page...
    Downloads: 16 This Week
  • 10
    adx - addressbook.xml

    Minimalistic address book in web browser. No server or plugin needed.

    Minimalistic but full-featured addressbook in your web browser. adx is a standalone and portable web app (online and offline). FEATURES Contact Management, portable, small (~350KB), lightweight, contact tagging, geo mapping, web accounts, trigger phone/Skype calls, etc. EXPORT FUNCTIONALITY vCard (as file or QR code via offline generator) HOW IT WORKS Your address-book (XML file) is transformed in your web browser (via XSLT) to a full-featured web application (HTML). REQUIREMENTS Web...
    Downloads: 6 This Week
  • 11
    DocWire SDK

    Award-winning modern data processing in C++17/20

    DocWire SDK, a standout C++17/20 data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. The upcoming integration of C++17 and C++20 will bring advanced functionalities, particularly in areas like HTTP capabilities and web data extraction. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document...
    Downloads: 3 This Week
  • 12
    Pinasi(win32bit)

    Array Data Processing Application

    Pinasi v1.15. Pinasi is a data processing application used to input, process and output data. Examples of input are text, numbers, files, dates and others. Examples of processing are mathematical operations. Examples of output are tables, graphs, pivots and others. Pinasi is licensed under CC BY-NC 4.0 and created with NW.js. NW.js is an app runtime based on Chromium and Node.js. You can write native apps in HTML and JavaScript with NW.js. It also lets you call Node.js...
    Downloads: 0 This Week
  • 13
    Pinasi(win64bit)

    Array Data Processing Application

    Pinasi v1.15. Pinasi is a data processing application used to input, process and output data. Examples of input are text, numbers, files, dates and others. Examples of processing are mathematical operations. Examples of output are tables, graphs, pivots and others. Pinasi is licensed under CC BY-NC 4.0 and created with NW.js. NW.js is an app runtime based on Chromium and Node.js. You can write native apps in HTML and JavaScript with NW.js. It also lets you call Node.js...
    Downloads: 0 This Week
  • 14
    Pinasi(linux64bit)

    Array Data Processing Application

    Pinasi v1.15. Pinasi is a data processing application used to input, process and output data. Examples of input are text, numbers, files, dates and others. Examples of processing are mathematical operations. Examples of output are tables, graphs, pivots and others. Pinasi is licensed under CC BY-NC 4.0 and created with NW.js. NW.js is an app runtime based on Chromium and Node.js. You can write native apps in HTML and JavaScript with NW.js. It also lets you call Node.js modules...
    Downloads: 0 This Week
  • 15
    qPDFconvert

    PDF converter

    Linux converter from PDF to HTML and text
    Downloads: 0 This Week
  • 16
    Tailwind Starter Kit

    Tailwind Starter Kit, a beautiful, free extension for TailwindCSS

    Tailwind Starter Kit is Free and Open Source. It does not change or add any CSS beyond what TailwindCSS already provides. It features multiple HTML elements and it comes with dynamic components for ReactJS, Vue and Angular. Tailwind Starter Kit comes with a huge number of Fully Coded CSS components. This extension also comes with 3 sample pages. They are fully coded so you can start working instantly. We also feature many dynamic components for React, Vue and Angular. Putting together a page has...
    Downloads: 0 This Week
  • 17
    HTML Article Generator

    Quickly create custom webpages from your content

    HTML Article Generator is a tool for quickly generating webpages based on content you enter, including both text and images. These webpages can be customised to give a unique appearance, with a selection of 5 different themes. Other features include the ability to save the current values you have entered and restore these values after future changes have been made. Images can have caption text added to them and given alt text to improve accessibility. Each webpage can also be given a favourite...
    Downloads: 0 This Week
  • 18
    IPyPublish

    Workflow for creating and editing publication ready scientific reports

    A program for creating and editing publication-ready scientific reports and presentations, from one or more Jupyter Notebooks. Dynamically (and reproducibly) explore data, run code, and output the results. Dynamically edit and visualize the basic components of the document (text, math, figures, tables, references, citations, etc.). Have precise control over what elements are output to the final document and how they are laid out and typeset. Also be able to output the same source document...
    Downloads: 0 This Week
  • 19
    Data Science at the Command Line

    Data science at the command line

    Command Line by Jeroen Janssens, published by O’Reilly Media in October 2021. Obtain, scrub, explore, and model data with Unix Power Tools. This repository contains the full text, data, and scripts used in the second edition of the book Data Science at the Command Line by Jeroen Janssens. This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You’ll learn how to combine small yet powerful command...
    Downloads: 2 This Week
  • 20
    Helpy

    A Modern Helpdesk Platform

    Helpy is a modern, self-hosted, on-premise customer support helpdesk platform designed from the ground up to give your customers a heroic customer service experience. Written in Ruby on Rails, Helpy seamlessly integrates support ticketing, Knowledgebase and a public community into one powerful solution. Helpy powers your helpcenter by providing a host of exceptional features, including multichannel ticketing, a full text searchable and SEO optimized Knowledgebase, community support forums...
    Downloads: 0 This Week
  • 21
    LimeReport

    Report generator for Qt Framework

    ... use an SQL database or data passed from the application via the QAbstractTableModel interface. In addition, one can initialize variables that are available as database request parameters. LimeReport's goal is to provide your application with a feature-rich yet simple-to-use report generation tool that even users inexperienced in IT can use.
    Downloads: 19 This Week
  • 22
    MyNotes

    Sticky notes/post-it application for Linux

    MyNotes is a sticky notes/post-it application. Notes are created using the system tray icon. They can be organized in categories and each category has a color. Images, checkboxes and a few predefined symbols can be inserted in the notes. The style of the text can be changed (alignment, style, color).
    Downloads: 3 This Week
  • 23
    OpenSearchServer Search Engine

    An open source search engine with RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you will be able to quickly and easily integrate advanced full-text search capabilities into your application: full-text with basic semantics, join queries, boolean queries, facet and filter, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on Windows...
    Downloads: 59 This Week
  • 24
    Cleaver

    30-second slideshows for hackers

    Cleaver is a one-stop shop for generating HTML presentations in record time. Using some spiced-up markdown, you can produce good-looking, interactive presentations with just a few lines of text. Cleaver supports several basic options that allow you to further customize the look and feel of your presentation, including author info, stylesheets, and custom templates. Cleaver has substantial theme support to give you more fine-grained control over your presentation, similar to options. Instead...
    Downloads: 0 This Week
  • 25
    JAWS - Just Another Web Scraper

    A simple web scraper using regular expressions or HTML Agility Pack

    JAWS, or Just Another Web Scraper, is part of the data scraping software developed by SVbook, alongside JATI (Image to Text) and JAVT (Video to Text). JAWS offers an easy interface to scrape data from a website using regular expressions, text preprocessing, or HTML Agility Pack.
    Downloads: 0 This Week
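
For readers who came here looking for the technique itself, the two approaches the JAWS entry mentions (parsing the HTML, then applying regular expressions to the extracted text) can be sketched with the Python standard library alone. This is a minimal illustration, not how any project listed here is implemented; `TextExtractor`, `html_to_text` and the sample markup are hypothetical names and data invented for the example.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []      # text nodes seen so far

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the text nodes and collapse runs of whitespace.
        return re.sub(r"\s+", " ", " ".join(self._chunks)).strip()


def html_to_text(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    # Made-up sample markup, loosely shaped like one listing entry.
    sample = (
        "<html><head><style>p{color:red}</style></head><body>"
        "<h1>Example Project</h1><p>Downloads: 25 This Week</p>"
        "<script>var x = 1;</script></body></html>"
    )
    text = html_to_text(sample)
    print(text)
    # Regular-expression step applied to the extracted text.
    print(re.findall(r"Downloads: (\d+)", text))
```

Parsing first and matching afterwards keeps the regular expressions simple, because they run over plain text rather than raw markup. For malformed or very large pages, a dedicated parser library is usually a better fit than this sketch.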