Showing 39 open source projects for "python web crawler"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    The database for AI-powered applications.

    MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered apps—without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
    Start Free
  • 1
    ArchiveBox

    ArchiveBox

    Open source self-hosted web archiving

    ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view websites offline. Without active preservation effort, everything on the internet eventually disappears or degrades. Archive.org does a great job as a centralized service, but saved URLs have to be public, and they can't save every type of content. ArchiveBox is an open source tool that lets organizations & individuals archive both public & private web content while retaining control over their data...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 2
    Rockstor

    Rockstor

    BTRFS based NAS and private cloud storage solution

    ..." are powered by a Docker-based application hosting framework. And new ones can be simply added. These Rock-ons, combined with advanced NAS features, turn Rockstor into a private cloud storage solution accessible from anywhere, giving users complete control of cost, ownership, privacy and data security. Rockstor UI is written in Javascript, making it simple to manage everything from your Web browser. The backend is written in Python and exposes RESTful APIs to easily extend functionality!
    Downloads: 23 This Week
    Last Update:
    See Project
  • 3
    bitfarm-Archiv Document Management - DMS
    ... help the software to gain a better presence in the web which helps distribute it. This, however, will allow us to acquire more enterprise customers which gives us more resources, e.g. for further development of the GPL version.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    Plum Cave

    Plum Cave

    A cloud backup solution that employs advanced cryptography

    A cloud backup solution that employs the "ChaCha20 + Serpent-256 CBC + HMAC-SHA3-512" authenticated encryption scheme for data encryption and ML-KEM-1024 for quantum-resistant key exchange. Check it out at https://plum-cave.netlify.app/ GitHub page: https://github.com/Northstrix/plum-cave
    Downloads: 1 This Week
    Last Update:
    See Project
  • No-Nonsense Code-to-Cloud Security for Devs | Aikido Icon
    No-Nonsense Code-to-Cloud Security for Devs | Aikido

    Connect your GitHub, GitLab, Bitbucket, or Azure DevOps account to start scanning your repos for free.

    Aikido provides a unified security platform for developers, combining 12 powerful scans like SAST, DAST, and CSPM. AI-driven AutoFix and AutoTriage streamline vulnerability management, while runtime protection blocks attacks.
    Start for Free
  • 5
    migrid

    migrid

    A grid middleware with minimal user and resource requirements

    Minimum intrusion Grid (MiG) is an attempt to design a new platform for Grid computing which is driven by a stand-alone approach to Grid, rather than integration with existing systems. The goal of the MiG project is to provide Grid infrastructure where the requirements on users and resources alike is as small as possible (minimum intrusion). MiG strives for minimum intrusion but will seek to provide a feature rich and dependable Grid solution. In line with the minimum intrusion concept,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6

    Delayter

    Utility to queue files for deferred deletion, days/weeks/months later

    ... not remember whether they are important or not. There are safeguards to protect against accidental deletion. Scheduled deletions can be viewed and also retracted. Files will not be deleted if they have been modified since scheduled for deletion. The documentation at the Delayter web site is well-written and extensive. Please take a minute to read the "Overview" section.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    transmission_cleanup

    transmission_cleanup

    Clean up of torrent files using the RPC protocal

    This application connects to the tranmission web client using the RPC interface, it allows the user to set the inital download folder for the torrents for sorting into their own folders based on the type of file it is. it also allows scheduling of the cleaning process eithe daily or weekly at a time set by you in the install process. you supply your username and password for the RPC web interface whohc is encrypted by the application and saved to the disk, The application checks...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Configuration Backup (ConfiBack)

    Configuration Backup (ConfiBack)

    Project for backing up network device configuration

    Using this project you can make backup and track changes of configuration of network devices like switches, routers, etc.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    A set of tools (command line and GUI) to provide a complete digital photo workflow for Unixes. EXIF headers are used as the central information repository, so users may change their software at any time without loosing any data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Picsart Enterprise Background Removal API for Stunning eCommerce Visuals Icon
    Picsart Enterprise Background Removal API for Stunning eCommerce Visuals

    Instantly remove the background from your images in just one click.

    With our Remove Background API tool, you can access the transformative capabilities of automation , which will allow you to turn any photo asset into compelling product imagery. With elevated visuals quality on your digital platforms, you can captivate your audience, and therefore achieve higher engagement and sales.
    Learn More
  • 10
    diskover

    diskover

    File system crawler and disk space usage software

    diskover is a file system crawler and disk space usage software that uses Elasticsearch to index your file metadata. diskover crawls and indexes your files on a local computer or remote storage server over network mounts. diskover helps manage your storage by identifying old and unused files and give better insights into data change "hotfiles", file duplication "dupes" and wasted space. It is designed to help deal with managing large amounts of data growth and provide detailed storage...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    mediaTUM is free software written in Python for archiving and retrieval of images, documents and other research data. It was originally developed in the framework of the DFG project IntegraTUM and is continuously expanded with new functionalities as required.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    angular-filemanager

    angular-filemanager

    JavaScript file manager Material Design folder explorer

    A very smart filemanager to manage your files in the browser developed in AngularJS following Material Design styles by Jonas Sciangula Street. This project provides a web file manager interface, allowing you to create your own backend connector following the connector API. By the way, we provide some example backend connectors in many languages as an example (PHP-FTP, PHP-local, python, etc). Pick files callback for third parties apps. Directory tree navigation. Copy, Move, Rename (Interactive...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Cloud Export is a tool to automatically extract your data from web applications and save it to your local file system for backup purposes, but more extensive than Google Takeout. Plans are based on http://www.dataliberation.org.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    FreeNAS

    FreeNAS

    This project has moved to github - see https://github.com/freenas

    FreeNAS is an Open Source Storage Platform and supports sharing across Windows, Apple, and UNIX-like systems. It includes ZFS (high storage capacities and integrates file systems and volume management into a single piece of software). Note: This project is currently inactive on sourceforge as it has moved to github (see https://github.com/freenas)
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    GIIAF Microscopy Library

    GIIAF Microscopy Library

    The GIIAF Microscopy Library, that uses customised OMERO software

    This project incorporates a suite of tools that aim to allow researchers within Griffith's Imaging and Image Analysis Facility (GIIAF) to efficiently and effectively provide secure, centralised, web-accessible data storage, management and manipulation. The open-source Java-based OMERO software was customised to provide most of the features of this project.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    The archive-crawler project is building Heritrix: a flexible, extensible, robust, and scalable web crawler capable of fetching, archiving, and analyzing the full diversity and breadth of internet-accesible content.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 17
    RiverGlass EssentialScanner is an open source web and file system crawler which indexes the text content of discovered files so they can be retrieved and analyzed. It provides simple scanner capabilities as part of larger enterprise search solutions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Sushi, huh? is an aplication for download GNU/Linux packages from another OS or Linux distribution, for an posterior offline installation. Thinked for people that not have conexion to Internet.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    pyTarget
    Implement a powerful iSCSI target in python, easily use under most popular systems. It also includes the following features: multi-target, multi-connect/session support chap authentication support header & data digest support erl =2, VTL, etc...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    Backup and restore of files to web mail systems, ftp, sftp. Uses free storage of gmail/hotmail etc. Archives files, splits large files, encrypts and uploads. Requires python (tested with python 2.5)
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    A small Python script that allows administrators to place quotas on *nix accounts without much technical knowledge or root access. It is ideal for those who share and/or resell web hosting or other resources.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    DiskAt is disk/media catalogue app supporting multiple categories per item, good search and features which allow to use it as Movie/DVD/etc database. Written with PHP/Python/SQLite.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Universal information crawler is a fast precise and reliable Internet crawler. Uicrawler is a program/automated script which browses the World Wide Web in a methodical, automated manner and creates the index of documents that it accesses.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Mutualized distant storage space management tool (using a distributed system).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Intronet is a light weighted framework which allows user to work with remote Linux system and to do some administration tasks using web browser. It's fully usable in browsers at mobile devices such pda, modern cell phones, etc.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.