Showing 40 open source projects for "python web crawler"

  • 1
    ArchiveBox

    Open source self-hosted web archiving

    ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view websites offline. Without active preservation effort, everything on the internet eventually disappears or degrades. Archive.org does a great job as a centralized service, but saved URLs have to be public, and they can't save every type of content. ArchiveBox is an open source tool that lets organizations & individuals archive both public & private web content while retaining control over their data....
    Downloads: 13 This Week
  • 2
    Airstrike

    Imitate AirDrop on Windows

    Airstrike is a lightweight FastAPI application that allows you to upload files to your machine via a web interface. I created this to imitate Apple's AirDrop in a simpler and open source way.
    Downloads: 0 This Week
  • 3
    Rockstor

    BTRFS based NAS and private cloud storage solution

    ...These Rock-ons, combined with advanced NAS features, turn Rockstor into a private cloud storage solution accessible from anywhere, giving users complete control of cost, ownership, privacy and data security. The Rockstor UI is written in JavaScript, making it simple to manage everything from your web browser. The backend is written in Python and exposes RESTful APIs to easily extend functionality!
    Downloads: 31 This Week
  • 4
    Plum Cave

    A cloud backup solution that employs advanced cryptography

    A cloud backup solution that employs the "ChaCha20 + Serpent-256 CBC + HMAC-SHA3-512" authenticated encryption scheme for data encryption and ML-KEM-1024 for quantum-resistant key exchange. Check it out at https://plum-cave.netlify.app/ GitHub page: https://github.com/Northstrix/plum-cave
    Downloads: 0 This Week
  • 5
    migrid

    A grid middleware with minimal user and resource requirements

    [This project moved to GitHub and is no longer maintained here] Minimum intrusion Grid (MiG) is an attempt to design a new platform for Grid computing which is driven by a stand-alone approach to Grid, rather than integration with existing systems. The goal of the MiG project is to provide Grid infrastructure where the requirements on users and resources alike are as small as possible (minimum intrusion). MiG strives for minimum intrusion but will seek to provide a feature rich and...
    Downloads: 0 This Week
  • 6
    transmission_cleanup

    Clean up torrent files using the RPC protocol

    This application connects to the Transmission web client through its RPC interface. It lets the user set the initial download folder for torrents, which are then sorted into their own folders based on file type. It also allows scheduling the cleaning process either daily or weekly, at a time you set during installation. You supply your username and password for the RPC web interface, which are encrypted by the application and saved to disk. The application checks if the...
    Downloads: 0 This Week
  • 7
    bitfarm-Archiv Document Management - DMS
    bitfarm-Archiv is a powerful Document Management (DMS), Enterprise Content Management (ECM) and Knowledge Management System (KMS) with Workflow Components. Help us! As we live in the internet age, the best way you can help is to write a short statement about your scenario and your use of the DMS, along with your experiences, and post it on your own website or in a blog or forum. It would help us most if you also added a hyperlink to our site http://www.bitfarm-archiv.com. By this...
    Downloads: 13 This Week
  • 8

    Delayter

    Utility to queue files for deferred deletion, days/weeks/months later

    Full documentation: download delayterX.Y.html. Delayter is for users who have files that can probably be deleted later but do not feel comfortable deleting them right now. Simple commands specify the file names and the delay time, e.g.: Delayter -m 1 -w 2 -d 3 file1 file2 schedules file1 and file2 for deletion 1 month, 2 weeks and 3 days from now (roughly 47 days). Useful on projects with many temporary junk files that cannot be deleted until a later time at which you might...
    Downloads: 0 This Week
  • 9
    Configuration Backup (ConfiBack)

    Project for backing up network device configuration

    Use this project to back up and track changes to the configuration of network devices such as switches and routers.
    Downloads: 0 This Week
  • 10
    A set of tools (command line and GUI) to provide a complete digital photo workflow for Unixes. EXIF headers are used as the central information repository, so users may change their software at any time without losing any data.
    Downloads: 0 This Week
  • 11
    jQuery File Upload

    File Upload widget with multiple file selection

    jQuery-File-Upload is a mature, full-featured jQuery plugin (often paired with server-side handlers) for handling file uploads from the browser with advanced capabilities. It supports chunked uploads, drag and drop, multiple file selection, progress bars, client-side image resizing, and preview generation. On the server side, artifacts may be processed using compatible back-end scripts in languages like PHP, Ruby, Node.js, or Java, making the plugin cross-platform. Because uploads can be...
    Downloads: 0 This Week
  • 12
    diskover

    File system crawler and disk space usage software

    diskover is a file system crawler and disk space usage tool that uses Elasticsearch to index your file metadata. diskover crawls and indexes your files on a local computer or on a remote storage server over network mounts. diskover helps manage your storage by identifying old and unused files and gives better insights into data change ("hotfiles"), file duplication ("dupes") and wasted space. It is designed to help deal with managing large amounts of data growth and provide detailed storage...
    Downloads: 0 This Week
  • 13
    mediaTUM is free software written in Python for archiving and retrieval of images, documents and other research data. It was originally developed in the framework of the DFG project IntegraTUM and is continuously expanded with new functionalities as required.
    Downloads: 0 This Week
  • 14
    angular-filemanager

    JavaScript file manager Material Design folder explorer

    A very smart file manager for managing your files in the browser, developed in AngularJS following Material Design styles by Jonas Sciangula Street. This project provides a web file manager interface, allowing you to create your own backend connector following the connector API. Example backend connectors are provided in several languages (PHP-FTP, PHP-local, Python, etc.). Pick files callback for third-party apps. Directory tree navigation. Copy, Move, Rename (Interactive UX). ...
    Downloads: 0 This Week
  • 15
    Cloud Export is a tool to automatically extract your data from web applications and save it to your local file system for backup purposes, with more extensive coverage than Google Takeout. Plans are based on http://www.dataliberation.org.
    Downloads: 0 This Week
  • 16
    FreeNAS

    This project has moved to GitHub - see https://github.com/freenas

    FreeNAS is an Open Source Storage Platform and supports sharing across Windows, Apple, and UNIX-like systems. It includes ZFS, which offers high storage capacities and integrates file systems and volume management into a single piece of software. Note: This project is currently inactive on SourceForge as it has moved to GitHub (see https://github.com/freenas)
    Downloads: 0 This Week
  • 17
    GIIAF Microscopy Library

    The GIIAF Microscopy Library, that uses customised OMERO software

    This project incorporates a suite of tools that aim to allow researchers within Griffith's Imaging and Image Analysis Facility (GIIAF) to efficiently and effectively provide secure, centralised, web-accessible data storage, management and manipulation. The open-source Java-based OMERO software was customised to provide most of the features of this project.
    Downloads: 0 This Week
  • 18
    The archive-crawler project is building Heritrix: a flexible, extensible, robust, and scalable web crawler capable of fetching, archiving, and analyzing the full diversity and breadth of internet-accessible content.
    Downloads: 8 This Week
  • 19
    RiverGlass EssentialScanner is an open source web and file system crawler which indexes the text content of discovered files so they can be retrieved and analyzed. It provides simple scanner capabilities as part of larger enterprise search solutions.
    Downloads: 0 This Week
  • 20
    pyTarget
    Implements a powerful iSCSI target in Python that is easy to use on most popular systems. It also includes the following features: multi-target and multi-connection/session support; CHAP authentication support; header & data digest support; ERL=2; VTL; etc.
    Downloads: 1 This Week
  • 21
    Sushi, huh? is an application for downloading GNU/Linux packages from another OS or Linux distribution for later offline installation. It is intended for people who do not have an Internet connection.
    Downloads: 1 This Week
  • 22
    Backup and restore of files to webmail systems, FTP, and SFTP. Uses the free storage of Gmail, Hotmail, etc. Archives files, splits large files, encrypts and uploads. Requires Python (tested with Python 2.5).
    Downloads: 0 This Week
  • 23
    A small Python script that allows administrators to place quotas on *nix accounts without much technical knowledge or root access. It is ideal for those who share and/or resell web hosting or other resources.
    Downloads: 0 This Week
  • 24
    DiskAt is a disk/media catalogue app that supports multiple categories per item, good search, and features that allow it to be used as a movie/DVD/etc. database. Written with PHP/Python/SQLite.
    Downloads: 0 This Week
  • 25
    Universal information crawler is a fast, precise and reliable Internet crawler. Uicrawler is a program/automated script which browses the World Wide Web in a methodical, automated manner and creates an index of the documents that it accesses.
    Downloads: 0 This Week