Showing 105 open source projects for "web archive extractor"

  • 1
    Anna’s Archive

    Comprehensive search engine for books, papers, comics, magazines

    Anna’s Archive is a large-scale open-source search engine and data aggregation platform designed to index and provide access to a vast collection of books, academic papers, comics, magazines, and other digital texts through a unified interface. The project includes all the infrastructure required to run a full instance locally or in production, combining web servers, databases, and search indexing systems into a scalable architecture.
    Downloads: 93 This Week
  • 2
    Web Archives

    Browser extension for viewing archived and cached versions of websites

    Browser extension for viewing archived and cached versions of web pages, available for Chrome, Edge and Safari. Web Archives is a browser extension that enables you to find archived and cached versions of web pages, and comes with support for more than 10 search engines. Searches can be initiated from the context menu and the browser toolbar. A diverse set of archive and cache sources is supported, and these can be toggled and reordered from the extension's options. ...
    Downloads: 4 This Week
  • 3
    Web Spider, Web Crawler, Email Extractor

    Free tool that extracts emails, phone numbers, and custom text from the web using Java regex

    WebCrawlerMySQL.jar, available under Files, supports MySQL connections. This free web spider and crawler extracts information from the web by parsing millions of pages. Data is stored in a Derby database and is not lost if the spider is force-closed.
    - Free web spider, parser, extractor, and crawler
    - Extracts emails, phone numbers, and custom text from the web
    - Exports to Excel files
    - Saves data into Derby and MySQL databases
    - Written in Java, cross-platform
    See also the free email sender: https://sourceforge.net/projects/gitst-free-email-ender/ Please install Microsoft OpenJDK to start the application: https://www.microsoft.com/openjdk
    Downloads: 6 This Week
  • 4
    sperm

    Collection of reverse engineering articles curated for learning

    sperm is a curated repository that gathers a collection of notable articles related to reverse engineering and software analysis. It primarily acts as a knowledge archive where previously published technical posts are compiled and organized for easier access and long-term reference. These articles originate from multiple technical communities and platforms and are exported into Markdown format to maintain a consistent and readable structure. sperm focuses on educational material that...
    Downloads: 1 This Week
  • 5
    Trafilatura

    Python & command-line tool to gather text on the Web

    ...The extractor tries to strike a balance between limiting noise (precision) and including all valid parts (recall). It also has to be robust and reasonably fast; it runs in production on millions of documents.
    Downloads: 0 This Week
  • 6
    bilibili-manga-downloader

    Download and manage Bilibili Manga chapters with GUI downloader

    ...It also offers multiple output formats, allowing chapters to be saved as image folders or compressed comic archive formats suitable for local manga readers.
    Downloads: 8 This Week
  • 7
    scrawler

    Desktop tool for downloading media from many social platforms

    SCrawler is a desktop application designed to download media content from a wide range of online platforms and social media services. It allows users to add profiles, channels, or posts and automatically collect images, videos, and other media associated with them. It provides tools for organizing downloaded content locally, including feeds, profile folders, and customizable file naming rules. SCrawler includes advanced configuration options that allow users to control download behavior,...
    Downloads: 12 This Week
  • 8
    qrcp

    Transfer files over wifi from your computer to your mobile device

    ...When sending multiple files at once, qrcp creates a zip archive of the files or folders you want to transfer, and deletes the zip archive once the transfer is complete. When receiving files, qrcp serves an “upload page” through which you can choose files from your mobile device. The default configuration file is stored in $HOME/qrcp.json; however, you can specify the location of the config file by passing the --config flag.
    Downloads: 1 This Week
  • 9
    LinkChecker

    Check links in web documents or full websites

    LinkChecker is a free, GPL-licensed website validator. LinkChecker checks links in web documents or full websites. It runs on Python 3 systems, requiring Python 3.8 or later. The version in the pip repository may be old. To find out how to get the latest code, along with platform-specific information and other advice, see doc/install.txt in the source code archive. If you do not want to install any additional libraries or dependencies, you can use the Docker image, which is published on GitHub Packages.
    Downloads: 0 This Week
  • 10
    tumblr-crawler

    Python crawler to download photos and videos from Tumblr blogs

    tumblr-crawler is an open source Python-based utility designed to download media content from Tumblr blogs. It provides a script that automatically retrieves photos and videos from specified Tumblr sites and saves them locally for offline access. Users can specify one or multiple blogs to crawl by editing a configuration file or by passing parameters through the command line. Once executed, the script fetches media from the Tumblr API and stores the downloaded files in folders named after...
    Downloads: 1 This Week
  • 11
    TiddlyWiki

    A self-contained JavaScript wiki for the browser, Node.js, AWS Lambda

    TiddlyWiki5 is a mature, self-contained open-source personal wiki application and non-linear notebook implemented entirely in JavaScript that runs in the browser or a Node.js environment, letting users create, organize, and interlink small pieces of content called tiddlers without the need for a server backend or traditional hierarchical pages. Its entire application — including content, interface, and logic — can live in a single HTML file that users open and edit directly in a web browser, making it portable, offline-capable, and easy to share or archive without dependencies. Beyond its single-file form, TiddlyWiki5 can also run as a more traditional application via Node.js, enabling filesystem storage and plugin ecosystems to extend capabilities like rendering engines, themes, and automation tools.
    Downloads: 1 This Week
  • 12
    BlueSpice free (Support archive)

    Our support forum has moved: community.bluespice.com

    This freely available open-source software turns MediaWiki, the popular software engine behind Wikipedia, into a fully-fledged enterprise wiki solution. Companies can continue to benefit from MediaWiki’s numerous advantages and automation capabilities; with BlueSpice, they can now work even more comfortably, safely, and effectively. Compared with basic MediaWiki, BlueSpice provides, among others, the following enhancements: comfortable and sophisticated rights management capabilities, a visual editor...
    Downloads: 7 This Week
  • 13

    Media Dock

    Extract and download media from any website with ease and speed.

    ...It operates securely without tracking your activity and ensures smooth performance without slowing down your browsing experience. Perfect for students, content creators, and professionals, Media Extractor empowers users to access web media effortlessly.
    Downloads: 0 This Week
  • 14
    Kemono Downloader

    Kemono Downloader - A cross-platform Python app built with PyQt6

    Welcome to Kemono Downloader, a versatile Python-based desktop application built with PyQt6, designed to download content from Kemono.su. This tool enables users to archive individual posts or entire creator profiles from services like Patreon, Fanbox, and more, supporting a wide range of file types with customizable settings and advanced features.
    Downloads: 1,542 This Week
  • 15
    Anti-Spam SMTP Proxy Server

    Anti-Spam SMTP Proxy Server implements multiple spam filters

    The Anti-Spam SMTP Proxy (ASSP) Server project aims to create an open source, platform-independent SMTP proxy server which implements auto-whitelists, self-learning Hidden-Markov-Model and/or Bayesian filtering, Greylisting, DNSBL, DNSWL, URIBL, SPF, SRS, Backscatter, virus scanning, attachment blocking, Senderbase and multiple other filter methods. Click 'Files' to download the professional version 2.8.1 build 24261. A Linux (Ubuntu 20.04 LTS) and a FreeBSD 12.2 based ready-to-run OVA of ASSP V2 are...
    Downloads: 39,357 This Week
  • 16
    AeroFTP

    AeroFTP is a cross-platform desktop client for FTP, SFTP, WebDAV, S3

    AeroFTP is a cross-platform file transfer client that goes beyond traditional FTP. Connect over 25+ protocols and services, including FTP/FTPS, SFTP, WebDAV, S3, Google Drive, Dropbox, OneDrive, MEGA, Box, pCloud, Azure, Filen, and more, from a single interface. Security-first: AeroVault v2 encrypted containers (AES-256-GCM-SIV), Cryptomator support, and zero telemetry. Built-in AeroAgent AI assistant with 19 providers and 47 tools for file operations and workflow automation. Includes Monaco editor,...
    Downloads: 87 This Week
  • 17
    ResCarta

    Archive your personal history

    ResCarta Toolkit offers an open source solution for creating, storing, viewing, and searching digital collections. Applications in the toolkit let users create and edit metadata, convert data to the open standard ResCarta format, and index and host collections.
    Downloads: 10 This Week
  • 18
    Oscailt CMS
    Oscailt is an open publishing content management system written in PHP with a MySQL database. It is primarily intended for member sites of the Indymedia network, but can be used for any site with an open publishing model. It includes some templates of simple publishing sites that can be imported at install time. These templates are ideal for anyone wishing to set up their own grassroots-based news site. The latest version is 4.3.2 and this is the recommended version to use. It...
    Downloads: 2 This Week
  • 19
    ZnetDK 4 Mobile

    Responsive Web App development full-stack framework in PHP and JS

    Easily develop your business web app for mobile and tablet devices in PHP, MySQL and JavaScript, starting from a ready-to-customize starter app that is fully responsive and installable (PWA). No need to start from scratch. You just have to add the views that fulfill your business requirements and connect them to the user navigation menu. All the features expected of a mobile business application are already developed! Authenticate your users through the built-in login form. Show them only the pages...
    Downloads: 0 This Week
  • 20
    KN-Lite-Web-Browser

    Text based browser with limited search provided by DuckDuckGo.

    V2.0 - Browsing made simple, now with images. Back when I was only getting 5kbps down on a rainy day, I found it useful to build a browser that cut out images. It certainly isn't as advanced or as fun to browse on as even Netscape Navigator, but it's gotten to a place where it might be of some use to those who want to avoid JavaScript, ads, or just see a lotta improperly formatted text. It is a Godot HTTP Request tool, does not display the page, only digests the text and links and...
    Downloads: 0 This Week
  • 21

    Jamilsoft Blade Email Extractor

    A powerful and easy-to-use email extracting software

    Jamilsoft Blade is a powerful and easy-to-use email extracting software that can help you extract email addresses from a variety of sources, including websites, documents, and social media. With Jamilsoft Blade, you can quickly and easily find the email addresses you need, even if they are hidden or obscured.
    Downloads: 0 This Week
  • 22
    Web Spider, Web Crawler, Email Extractor

    Free tool that extracts emails, phone numbers, and custom text from the web using Java regex

    WebCrawlerMySQL.jar, available under Files, supports MySQL connections. Please follow this link to get the latest version: https://sourceforge.net/projects/web-spider-web-crawler-extract/ This free web spider and crawler extracts information from the web by parsing millions of pages. Data is stored in a Derby or MySQL database and is not lost if the spider is force-closed.
    - Free web spider, parser, extractor, and crawler
    - Extracts emails, phone numbers, and custom text from the web
    - Exports to Excel files
    - Saves data into a Derby database
    - Written in Java, cross-platform
    See also the free email sender: https://sourceforge.net/projects/gitst-free-email-ender/ Please install Microsoft OpenJDK to start the application: https://www.microsoft.com/openjdk
    Downloads: 0 This Week
  • 23
    grab-site

    Web crawler for archiving and backing up sites into WARC archives

    grab-site is an open source web crawling tool designed to archive and back up websites by recursively downloading their content. It works by taking a starting URL and systematically following links across the site, capturing pages and resources and saving them into WARC archive files for long-term preservation. Internally, the crawler uses a fork of the wpull engine to fetch and process web pages efficiently during large-scale crawls. grab-site includes a built-in dashboard that displays real-time crawl activity, including which URLs are currently being processed and how many remain in the queue. ...
    Downloads: 1 This Week
  • 24
    Password Extractor

    Transfer passwords to and from K-Meleon

    Transfer passwords between browsers. This extension for K-Meleon can also be installed on other browsers that use XUL including SeaMonkey, Pale Moon, Mypal, Roytam's New Moon, and Waterfox Classic. The Password Extractor XML export/import format is also used by Password Exporter (for Firefox and SeaMonkey) and Password Backup Tool (for Pale Moon and Basilisk). The CSV export format is compatible with popular browsers and password managers including Mozilla Firefox, Google Chrome, Microsoft...
    Downloads: 8 This Week
  • 25
    DigiOz .NET Portal

    ASP.NET MVC Based Portal CMS System to Create an Instant Website

    DigiOz .NET Portal is a free web-based portal CMS written in C# with ASP.NET MVC 5. It uses a Microsoft SQL database and allows webmasters to set up and customize an instant website for either business or personal use. List of technologies used: - ASP.NET MVC 5 - Microsoft SQL Server - Bootstrap - HTML 5 - jQuery Demo Site: http://digioznetportal.digioz.net/ Source Code: https://sourceforge.net/p/digioznetportal/codenew/ci/master/tree/ Installation Instructions:...
    Downloads: 0 This Week