Search Results for "data scraper website" - Page 2

Showing 725 open source projects for "data scraper website"

  • 1
    LaTeX.CSS

    LaTeX.css is a library that makes your website look like a LaTeX doc

    This almost class-less CSS library turns your HTML document into a website that looks like a LaTeX document. Write semantic HTML, and you are good to go. The source code can be found on GitHub. Add any optional classes to elements with special styles (author subtitle, abstract, lemmas, theorems, etc.). The labels of theorems, definitions, lemmas and proofs can be changed to other...
    Downloads: 1 This Week
  • 2
    OSCAL

    Open Security Controls Assessment Language (OSCAL)

    ...Public contributions to this project are welcome. With this effort, we are stressing the agile development of a set of minimal formats that are generic enough to capture the breadth of data in scope (controls specifications), while also capable of ad-hoc tuning and extension to support peculiarities of both (industry or sector) standards and new control types. The OSCAL website provides an overview of the OSCAL project, including an XML and JSON schema reference, examples, and other resources.
    Downloads: 15 This Week
  • 3
    SkyCrypt

    A Hypixel SkyBlock stats website

    SkyCrypt is a web-based application that allows players of Hypixel SkyBlock to view and share detailed information about their in-game profiles through a visually rich interface. It aggregates data from the Hypixel API and presents it in an organized format, including player statistics, skills, equipment, and inventory details. The project is built with a Node.js-based stack and integrates additional technologies such as MongoDB and Redis to handle data storage and caching. SkyCrypt enhances...
    Downloads: 8 This Week
  • 4
    crwlr

    Library for Rapid (Web) Crawler and Scraper Development

    This library provides a framework and many ready-to-use building blocks, called steps, for building your own crawlers and scrapers. Before diving into the library, let's have a look at the terms crawling and scraping. For most real-world use cases, those two things go hand in hand, which is why this library helps with and combines both. A (web) crawler is a program that (down)loads documents and follows the links in them to load those documents as well. A crawler...
    Downloads: 0 This Week
  • 5
    Skyvern

    Automate browser-based workflows with LLMs and Computer Vision

    ...Skyvern understands how to solve CAPTCHAs to complete complicated workflows. Support for authenticating into user accounts, including support for 2FA/TOTP. Extract data from workflows in any schema of your choice including CSV or JSON. Automate procurement pipelines, breeze through government forms, and complete workflows in any language.
    Downloads: 5 This Week
  • 6
    DuckDuckGo Android App

    Privacy browser for Android

    DuckDuckGo is an app that gives you utmost privacy when browsing online. It stops you from getting tracked and protects your personal and private information, no matter where the internet may take you. Apart from providing standard browsing functionality, DuckDuckGo blocks all hidden third-party trackers, forces sites to use an encrypted connection where available, and provides a Privacy Grade rating for each website you visit.
    Downloads: 5 This Week
  • 7
    Parsera

    Lightweight library for scraping websites with LLMs

    Scrape data from any website with only a link and column descriptions. Parsera is a tool designed to scrape web content, specifically handling poorly structured or messy websites.
    Downloads: 2 This Week
  • 8
    Jikan REST

    The REST API for Jikan

    Jikan REST is an unofficial RESTful API for MyAnimeList.net, providing access to anime, manga, and user data by scraping the website. It allows developers to integrate MyAnimeList data into their applications without relying on the official API.
    Downloads: 1 This Week
  • 9
    Publii

    Publii is a desktop-based CMS for Windows, Mac and Linux

    Publii is a powerful blogging app perfect for anyone looking to create a privacy-focused website. Whether you're a beginner or a developer, it has all the tools you need to get started. Publii is a static site generator that makes it easy to create a personal blog, portfolio, or corporate website. With instant site switching and no databases or other credentials to remember, Publii is the perfect platform for anyone who wants a hassle-free way to build and manage an online presence. Websites...
    Downloads: 0 This Week
  • 10
    MDCx

    Movie metadata scraper and organizer for media libraries and NFO

    MDCx is an open source media metadata scraping and organization tool designed to automate the process of collecting detailed information for movie files. It retrieves metadata from multiple online sources and applies it to local media collections, helping users maintain structured and well-organized libraries. MDCx can download information such as titles, cast data, artwork, and other metadata, then generate standardized NFO files compatible with media management systems. It also supports...
    Downloads: 4 This Week
  • 11
    RuoYi

    A Spring Boot-based rights management system

    This Spring Boot-based rights management system is easy to read and understand, and its interface is simple and clean. The core technology uses Spring, MyBatis, and Shiro without any other heavy dependencies. I had always wanted to build a back-office management system, and although I had seen many excellent open source projects, I found none that suited me. So I started writing one in my spare time, and that is how RuoYi came about. It can be used for all web applications, such as...
    Downloads: 2 This Week
  • 12
    DuckDB

    DuckDB is an in-process SQL OLAP Database Management System

    ...DuckDB supports arbitrary and nested correlated subqueries, window functions, collations, complex types (arrays, structs), and more. For more information on the goals of DuckDB, please refer to the Why DuckDB page on our website. Processing and storing tabular datasets, e.g. from CSV or Parquet files. Interactive data analysis, e.g. joining and aggregating multiple large tables. Concurrent large changes to multiple large tables, e.g. appending rows or adding, removing, and updating columns. Large result set transfer to client. For development, DuckDB requires CMake, Python3 and a C++11 compliant compiler. ...
    Downloads: 6 This Week
  • 13
    101-0250-00

    ETH course - Solving PDEs in parallel on GPUs

    This course aims to cover state-of-the-art methods in modern parallel Graphics Processing Unit (GPU) computing, supercomputing and code development with applications to natural sciences and engineering.
    Downloads: 0 This Week
  • 14
    Email Scraper and Validator
    This is a simple desktop application built with Python and Tkinter that allows users to scrape email addresses from websites and validate them using an external API. It also provides features to save the scraped emails to a database, and export the data to various file formats. 1. Enter a list of website URLs or emails in the input field. 2. Click the Scrape button to scrape email addresses from the provided websites. 3. Click the Validate button to validate the scraped email addresses. 4. Use other buttons to clean the database, save data, or export data to different file formats. ...
    Downloads: 0 This Week
  • 15
    GeoServer

    GeoServer repository

    GeoServer is an open-source software server written in Java that allows users to share and edit geospatial data. Designed for interoperability, it publishes data from any major spatial data source using open standards. Being a community-driven project, GeoServer is developed, tested, and supported by a diverse group of individuals and organizations from around the world. GeoServer is the reference implementation of the Open Geospatial Consortium (OGC) Web Feature Service (WFS) and Web...
    Downloads: 12 This Week
  • 16
    Astro

    The web framework for content-driven websites

    Astro powers the world's fastest marketing sites, blogs, e-commerce websites, and more. Astro improves website performance by rendering components on the server, sending lightweight HTML to the browser with zero unnecessary JavaScript overhead. Astro was designed to work with your content, no matter where it lives. Load data from your file system, external API, or your favorite CMS. Extend Astro with your favorite tools. Bring your own JavaScript UI components, CSS libraries, themes, integrations, and more. ...
    Downloads: 9 This Week
  • 17
    PlutoSliderServer.jl

    Web server to run just the `@bind` parts of a Pluto.jl notebook

    Web server to run just the @bind parts of a Pluto.jl notebook. PlutoSliderServer can run a notebook and generate the export HTML file. This will give you the same file as the export button inside Pluto (top right), but automatically, without opening a browser. One use case is to automatically create a GitHub Pages site from a repository with notebooks. For this, take a look at our template repository that uses GitHub Actions and PlutoSliderServer to generate a website on every commit. Many...
    Downloads: 1 This Week
  • 18
    I Still Don't Care About Cookies

    Debloated fork of the extension "I don't care about cookies"

    Get rid of cookie warnings from almost all websites! The original extension, "I don't care about cookies", has been acquired by Avast, and I simply don't trust Avast with my data. Having the fork on GitHub also allows us to improve the code and add support for websites faster. EU regulations require that any website using tracking cookies get the user's permission before installing them. These warnings appear on most websites until the visitor agrees with the website's terms and conditions. Imagine how irritating that becomes when you surf anonymously or if you delete cookies automatically every time you close the browser. ...
    Downloads: 2 This Week
  • 19
    Datasette

    An open source multi-tool for exploring and publishing data

    Datasette is a tool for exploring and publishing data. It helps people take data of any shape or size, analyze and explore it, and publish it as an interactive website and accompanying API. Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of tools and plugins dedicated to making working with structured data as productive as possible. ...
    Downloads: 0 This Week
  • 20
    Surmon.me

    Personal website and blog

    Surmon.me is a full-featured personal website and blog platform built with Vue and designed as part of a larger ecosystem of interconnected applications and services. The project functions as a server-side rendered (SSR) web application that delivers content dynamically while maintaining performance and SEO optimization. It is powered by a dedicated backend service called NodePress, which provides RESTful APIs for content management, data retrieval, and system operations. ...
    Downloads: 7 This Week
  • 21
    finvizfinance

    Finviz analysis python library

    finvizfinance is a package that collects financial information from the FinViz website: stock charts, fundamental and technical information, insider information, and stock news; forex charts and performance; crypto charts and performance. Screener and Group provide data frames for comparing stocks according to different filters and trading signals. It can also retrieve information (fundament, description, outer rating, stock news, inside trader) for an individual stock.
    Downloads: 10 This Week
  • 22
    NVIDIA Merlin

    Library providing end-to-end GPU-accelerated recommender systems

    ...For more information, see NVIDIA Merlin on the NVIDIA developer website. Transform data (ETL) for preprocessing and engineering features. Accelerate your existing training pipelines in TensorFlow, PyTorch, or FastAI by leveraging optimized, custom-built data loaders. Scale large deep learning recommender models by distributing large embedding tables that exceed available GPU and CPU memory. Deploy data transformations and trained models to production with only a few lines of code.
    Downloads: 1 This Week
  • 23
    AdGuard Home

    Network-wide ads and trackers blocking DNS server

    ...With the rise of Internet-Of-Things and connected devices, it becomes more and more important to be able to control your whole network. Block throughout the whole system. This includes video ads and ads in your favorite apps, browsers, games, and on any website you can imagine. Dozens of ad filters are available to you and are updated on a regular basis, guaranteeing the best filtering quality. Protecting your personal data is our top priority. With AdGuard, you and your sensitive data will be safe from any online tracker and analytics system that may attempt to steal your data while surfing the web. ...
    Downloads: 44 This Week
  • 24
    Logseq

    A privacy-first, open-source platform for knowledge management

    ...Logseq is a platform for knowledge management and collaboration. It focuses on privacy, longevity, and user control. The server will never store or analyze your private notes. Your data is stored as plain text files, and we currently support both Markdown and Emacs Org-mode (more to be added soon). In the unlikely event that the website is down or cannot be maintained, your data is, and always will be, yours. No data lock-in, no proprietary formats; you can edit the same Markdown/Org-mode file with any tools at the same time. ...
    Downloads: 13 This Week
  • 25
    Webstudio

    Open source website builder and Webflow alternative

    Webstudio is an open source visual development platform that enables developers, designers, and cross-functional teams to build modern websites through a powerful visual builder while maintaining full ownership of their data and infrastructure. The project positions itself as a Webflow alternative but emphasizes openness, portability, and deep control over the generated frontend code. It connects to any headless CMS and exposes the full power of CSS within a visual interface, allowing users...
    Downloads: 6 This Week