Showing 18 open source projects for "python web crawler"

  • 1
    ArchiveBox

    Open source self-hosted web archiving

    ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view websites offline. Without active preservation effort, everything on the internet eventually disappears or degrades. Archive.org does a great job as a centralized service, but saved URLs have to be public, and they can't save every type of content. ArchiveBox is an open source tool that lets organizations & individuals archive both public & private web content while retaining control over their data....
    Downloads: 14 This Week
  • 2
    Rockstor

    BTRFS based NAS and private cloud storage solution

    ...These Rock-ons, combined with advanced NAS features, turn Rockstor into a private cloud storage solution accessible from anywhere, giving users complete control of cost, ownership, privacy and data security. Rockstor UI is written in JavaScript, making it simple to manage everything from your Web browser. The backend is written in Python and exposes RESTful APIs to easily extend functionality!
    Downloads: 35 This Week
  • 3
    Plum Cave

    A cloud backup solution that employs advanced cryptography

    A cloud backup solution that employs the "ChaCha20 + Serpent-256 CBC + HMAC-SHA3-512" authenticated encryption scheme for data encryption and ML-KEM-1024 for quantum-resistant key exchange. Check it out at https://plum-cave.netlify.app/ GitHub page: https://github.com/Northstrix/plum-cave
    Downloads: 0 This Week
  • 4
    bitfarm-Archiv Document Management - DMS
    bitfarm-Archiv is a powerful Document Management (DMS), Enterprise Content Management (ECM) and Knowledge Management System (KMS) with workflow components. Help us! In the internet age, the best way to help is to write a short statement about your scenario and your use of the DMS, along with your experiences, and post it on your own website or in a blog or forum. It would help us most if you also added a hyperlink to our site http://www.bitfarm-archiv.com. By this...
    Downloads: 13 This Week
  • 5
    Configuration Backup (ConfiBack)

    Project for backing up network device configuration

    With this project you can back up the configuration of network devices such as switches and routers, and track changes to it.
    Downloads: 0 This Week
  • 6
    Cloud Export is a tool that automatically extracts your data from web applications and saves it to your local file system for backup purposes, similar to Google Takeout but more extensive. Plans are based on http://www.dataliberation.org.
    Downloads: 0 This Week
  • 7
    The archive-crawler project is building Heritrix: a flexible, extensible, robust, and scalable web crawler capable of fetching, archiving, and analyzing the full diversity and breadth of internet-accessible content.
    Downloads: 6 This Week
  • 8
    Backup and restore of files to web mail systems, FTP, and SFTP. Uses the free storage of Gmail, Hotmail, etc. Archives files, splits large files, encrypts, and uploads. Requires Python (tested with Python 2.5).
    Downloads: 0 This Week
  • 9
    Mutualized distant storage space management tool (using a distributed system).
    Downloads: 0 This Week
  • 10
    Arrowbase is a collection of tools for backup purposes. Together they form a backup system that can be used on more than one operating system. This makes the project not only widely spread but portable as well.
    Downloads: 0 This Week
  • 11
    A configurable knowledge management framework. It works out of the box, but it's meant mainly as a framework to build complex information retrieval and analysis systems. The 3 major components: Crawler, Analyzer and Indexer can also be used separately.
    Downloads: 0 This Week
  • 12
    XSDB XML is to DATA as HTML is to DOCUMENT. Publish and combine data as easily as HTML format and web browsers publish and view documents. Implementations in Python, JavaScript, Java, and C#/.NET.
    Downloads: 0 This Week
  • 13
    CAIRN is a modular copy and restore program for the imaging of a computer. It copies every file on a computer and figures out how to recreate it from scratch. It is primarily network oriented but is also flexible enough to boot from any possible method.
    Downloads: 0 This Week
  • 14
    Agile Author is a framework for developing networked repositories of digital information such as digital libraries and content management systems.
    Downloads: 0 This Week
  • 15
    Cat-photo makes administration and web pages with photos easy.
    Downloads: 0 This Week
  • 16
    idyuts is "I Dare You to Use This Shell"; a pre-Hibernate approach to replacing an ORM written with Jython functors into a pure-Java language command pattern. The "pipeline codegen artifacts" are simple IoC templates, and trivial to adapt
    Downloads: 0 This Week
  • 17
    View, track, filter, archive, alert, group, rotate logs through a GUI, CLI, or WebUI.
    Downloads: 0 This Week
  • 18
    Rescuezilla

    The Swiss Army Knife of System Recovery

    Rescuezilla is an easy-to-use disk cloning and imaging application that's fully compatible with Clonezilla — the industry-standard trusted by tens of millions. Yes, Rescuezilla is the Clonezilla GUI (graphical user interface) that you might have been looking for. **See: https://rescuezilla.com/ for download links** **NEW** Weekly rolling release downloads: https://github.com/rescuezilla/rescuezilla/releases Rescuezilla is a fork of Redo Backup and Recovery (now called Redo...
    Downloads: 0 This Week