Showing 103 open source projects for "scrape text from html"

  • 1
    Jupyter Notebook

    Jupyter Interactive Notebook

    The notebook extends the console-based approach to interactive computing in a qualitatively new direction, providing a web-based application suitable for capturing the whole computation process: developing, documenting, and executing code, as well as communicating the results. The Jupyter notebook combines two components. A web application, which is a browser-based tool for interactive authoring of documents which combine explanatory text, mathematics, computations and their rich media output...
    Downloads: 764 This Week
  • 2
    Scrapy

    A fast, high-level web crawling and web scraping framework

    Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...
    Downloads: 32 This Week
  • 3
    jsoup

    Java library for working with real-world HTML

    jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree. The parser will make every...
    Downloads: 3 This Week
  • 4
    Typed.js

    A JavaScript typing animation library

    Typed.js is a library that types. Enter in any string, and watch it type at the speed you've set, backspace what it's typed, and begin a new sentence for however many strings you've set. Rather than using the strings array to insert strings, you can place an HTML div on the page and read from it. This allows bots and search engines, as well as users with JavaScript disabled, to see your text on the page. You can pause in the middle of a string for a given amount of time by including an escape...
    Downloads: 10 This Week
  • 5
    Parsera

    Lightweight library for scraping web-sites with LLMs

    Scrape data from any website with only a link and column descriptions. Parsera is a tool designed to scrape web content, specifically handling poorly structured or messy websites.
    Downloads: 0 This Week
  • 6
    Karate

    Test automation made simple

    Karate is the only open-source tool to combine API test-automation, mocks, performance-testing and even UI automation into a single, unified framework. The BDD syntax popularized by Cucumber is language-neutral, and easy for even non-programmers. Assertions and HTML reports are built-in, and you can run tests in parallel for speed. There’s also a cross-platform stand-alone executable for teams not comfortable with Java. You don’t have to compile code. Just write tests in a simple, readable...
    Downloads: 6 This Week
  • 7
    Sphinx

    Main repository for the Sphinx documentation builder

    Sphinx is a tool that makes it easy to create intelligent and beautiful documentation, written by Georg Brandl and licensed under the BSD license. It was originally created for the Python documentation, and it has excellent facilities for the documentation of software projects in a range of languages. Of course, this site is also created from reStructuredText sources using Sphinx! HTML (including Windows HTML Help), LaTeX (for printable PDF versions), ePub, Texinfo, manual pages, plain text...
    Downloads: 4 This Week
  • 8
    MudBlazor

    Do more with Blazor, utilizing CSS and keeping JavaScript to a minimum

    Trusted by thousands of users, from hobby developers to large enterprises. Use MudBlazor to rapidly build amazing web applications without leaving your loved C# language and toolchain. We bring together everything that's required to build amazing Blazor applications that scale from desktop to mobile. Apart from the library itself we also provide templates, a learning platform, theme manager, demo and example projects as well as an online code editor integrated with our documentation and issue...
    Downloads: 3 This Week
  • 9
    Markdig

    A fast, powerful, CommonMark compliant, extensible Markdown processor

    ... behavior. Parses trivia (whitespace, newlines and other characters) to support lossless parse ⭢ render roundtrip. This enables changing markdown documents without introducing undesired trivia changes. Special attributes or attached HTML attributes (inspired from PHP Markdown Extra - Special Attributes). Diagrams extension whenever a fenced code block contains a special keyword, it will be converted to a div block with the content as-is.
    Downloads: 1 This Week
  • 10
    Chroma

    A general purpose syntax highlighter in pure Go

    As Chroma has just been released, its API is still in flux. That said, the high-level interface should not change significantly. Chroma takes source code and other structured text and converts it into syntax-highlighted HTML, ANSI-coloured text, etc. Chroma is based heavily on Pygments and includes translators for Pygments lexers and styles. ABAP, ABNF, ActionScript, ActionScript 3, Ada, Angular2, ANTLR, ApacheConf, APL, AppleScript, Arduino, Awk. PacmanConf, Perl, PHP, PHTML, Pig, PkgConfig...
    Downloads: 1 This Week
  • 11
    Google Open Source Project Style Guide

    Chinese version of Google open source project style guide

    .... If the project you are modifying originates from Google, you may be directed to the English version of the project page to understand the style used by the project. The Chinese version of the project uses reStructuredText plain text markup syntax, and uses Sphinx to generate document formats such as HTML / CHM / PDF.
    Downloads: 1 This Week
  • 12
    Schema Spy

    SchemaSpy code home

    This is a new code repository for the SchemaSpy tool, initially created and maintained by John Currier. I personally believe that work on SchemaSpy should be continued, and that many still-existing issues should be resolved. The last released version of SchemaSpy was in 2010, and I plan to change this. Installation is very simple because SchemaSpy is a single Java .jar application. To learn more, read the installation doc. When your environment is ready, you can start...
    Downloads: 1 This Week
  • 13
    Angular DataTables

    DataTables with Angular

    An Angular2+ library for building complex HTML tables using the DataTables jQuery plug-in. Implementation of the example on custom filtering with range search. The HTML element provides a Promise that returns the instance of the DataTable. Implementation of the example on individual column searching (text inputs). Sometimes, your DataTable options are stored or computed server-side. All you need to do is to return the expected result as a promise. You can use Angular Pipe to transform data...
    Downloads: 0 This Week
  • 14
    Render

    Go package for easily rendering JSON, XML, binary data, and HTML

    Render is a package that provides functionality for easily rendering JSON, XML, text, binary data, and HTML templates. Render can be used with pretty much any web framework provided you can access the http.ResponseWriter from your handler. The rendering functions simply wrap Go's existing functionality for marshaling and rendering data. HTML: Uses the html/template package to render HTML templates. JSON: Uses the encoding/json package to marshal data into a JSON-encoded response. XML: Uses...
    Downloads: 0 This Week
  • 15
    xhtml2pdf

    A library for converting HTML into PDFs using ReportLab

    xhtml2pdf enables users to generate PDF documents from HTML content easily and with automated flow control such as pagination and keeping text together. The Python module can be used in any Python environment, including Django. The Command line tool is a stand-alone program that can be executed from the command line.
    Downloads: 0 This Week
  • 16
    Relative-Time Element

    Web component extensions to the standard <time> element

    Formats a timestamp as a localized string or as relative text that auto-updates in the user's browser. This allows the server to cache HTML fragments containing dates and lets the browser choose how to localize the displayed time according to the user's preferences. Every visitor is served the same markup from the server's cache. When it reaches the browser, the custom relative-time JavaScript localizes the element's text into the local timezone and formatting. Dates are displayed before months...
    Downloads: 0 This Week
  • 17
    tslab

    Interactive JavaScript and TypeScript programming with Jupyter

    tslab is an interactive programming environment and REPL with Jupyter for JavaScript and TypeScript users. You can write and execute JavaScript and TypeScript interactively on browsers and save results as Jupyter notebooks.
    Downloads: 0 This Week
  • 18
    react-use

    Component for React

    Tracks device battery state. Plays audio and exposes its controls. Tracks geo location state of user's device. Triggers callback when user clicks outside target area. Tracks mouse hover state of some element. Display an element or video full-screen. Tracks location hash value. Tracks whether user is being inactive. Tracks an HTML element's intersection. Synthesizes speech from a text string. Tracks page navigation bar location state. Re-renders component, while tweening a number from 0 to 1...
    Downloads: 0 This Week
  • 19
    Laravel Response Cache

    Speed up a Laravel app by caching the entire response

    This Laravel package can cache an entire response. By default, it will cache all successful GET requests that return text-based content (such as HTML and JSON) for a week. This could speed up the response considerably. The first time a request comes in, the package saves the response before sending it to the user. When the same request comes in again, it does not go through the entire application but simply responds with the saved response.
    Downloads: 0 This Week
  • 20
    Mithril.js

    A JavaScript framework for building brilliant applications

    ... be indented more naturally than HTML for complex tags, and since its syntax is just JavaScript, it's possible to leverage a lot of JavaScript tooling ecosystem. Mithril is all about getting meaningful work done efficiently. Doing file uploads? The docs show you how. Authentication? Documented too. Exit animations? You got it. No extra libraries, no magic.
    Downloads: 0 This Week
  • 21
    command-output-to-html-table

    A shell script to convert any file or command output into an HTML table

    Please watch the video below to convert any file or command output into a nice HTML table in less than 5 minutes. The output HTML file can then be browsed from any location, using a local web server or an internet domain. Usage examples (type them in a terminal):

        cd ~/Downloads/tabulate # location
        chmod +x *.sh
        cat "student_marks.csv" | { cat ; echo ; } | ./tabulate.sh -d "," -t "My School" -h "First Term" > "marks.html" # or > "/var/www/html/marks.html"

    -d specifies...
    Downloads: 17 This Week
  • 22
    HTML Quiz Application With Timer & Point

    An Easy to Edit HTML, CSS & JAVASCRIPT QUIZ - For Students & Teachers

    Just download and extract the zip file provided, edit the script.js file with a good text or code editor such as Sublime Text (check Google), save the changes, and then view the index.html file in your browser. That's all. This is a browser-based, cross-platform, easy-to-use application that supports all operating systems. If needed, you can change the value of the timer from 15 seconds to any other value by searching for and replacing all occurrences of 15 in index.html & script.js...
    Downloads: 8 This Week
  • 23
    mPDF

    PHP library generating PDF files from UTF-8 encoded HTML

    mPDF is a PHP library that generates PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files ‘on-the-fly’ from his website, handling different languages. It is slower than the original scripts, e.g. HTML2FPDF, and produces larger files when using Unicode fonts, but support for CSS styles etc. has been much enhanced. Supports almost all languages including RTL (Arabic and Hebrew), and CJK (Chinese-Japanese-Korean). Nested block-level elements (e.g. P...
    Downloads: 71 This Week
  • 24
    OmegaT - multiplatform CAT tool

    The free computer aided translation (CAT) tool for professionals

    OmegaT is a free and open source multiplatform Computer Assisted Translation tool with fuzzy matching, translation memory, keyword search, glossaries, and translation leveraging into updated projects.
    Downloads: 1,788 This Week
  • 25

    htmLawed

    PHP code to purify & filter HTML

    The htmLawed PHP script makes HTML more secure and standards- & policy-compliant. The customizable HTML filter/purifier can balance tags, ensure proper nestings, neutralize XSS, restrict HTML, beautify code like Tidy, implement anti-spam measures, etc.
    Downloads: 136 This Week