Showing 40 open source projects for "file crawler"

View related business solutions
  • Cybersecurity Management Software for MSPs Icon
    Cybersecurity Management Software for MSPs

    Secure your clients from cyber threats.

    Define and Deliver Comprehensive Cybersecurity Services. Security threats continue to grow, and your clients are most likely at risk. Small- to medium-sized businesses (SMBs) are targeted by 64% of all cyberattacks, and 62% of them admit lacking in-house expertise to deal with security issues. Now technology solution providers (TSPs) are a prime target. Enter ConnectWise Cybersecurity Management (formerly ConnectWise Fortify) — the advanced cybersecurity solution you need to deliver the managed detection and response protection your clients require. Whether you’re talking to prospects or clients, we provide you with the right insights and data to support your cybersecurity conversation. From client-facing reports to technical guidance, we reduce the noise by guiding you through what’s really needed to demonstrate the value of enhanced strategy.
  • All-in-One Payroll and HR Platform Icon
    All-in-One Payroll and HR Platform

    For small and mid-sized businesses that need a comprehensive payroll and HR solution with personalized support

    We design our technology to make workforce management easier. APS offers core HR, payroll, benefits administration, attendance, recruiting, employee onboarding, and more.
  • 1
    File System Crawler for Elasticsearch

    File System Crawler for Elasticsearch

    Elasticsearch File System Crawler (FS Crawler)

    This crawler helps to index binary documents such as PDF, Open Office, MS Office. Local file system (or a mounted drive) crawling and indexing new files, updating existing ones, and removing old ones. Remote file system over SSH/FTP crawling. REST interface to let you “upload” your binary documents to elastic search.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Web Spider, Web Crawler, Email Extractor

    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    In Files there is WebCrawlerMySQL.jar which supports MySql Connection Free Web Spider & Crawler. Extracts Information from Web by parsing millions of pages. Store data into Derby Database and data are not being lost after force closing the spider. - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export to Excel File - Data Saved into Derby and MySQL Database - Written in Java Cross Platform Also See Free email Sender...
    Leader badge
    Downloads: 102 This Week
    Last Update:
    See Project
  • 3
    Nebula libp2p DHT

    Nebula libp2p DHT

    A libp2p DHT crawler, monitor, and measurement tool

    A libp2p DHT crawler and monitor that tracks the liveness of peers. The crawler connects to DHT bootstrap peers and then recursively follows all entries in their k-buckets until all peers have been visited. The crawler supports the IPFS, Filecoin, Polkadot, Kusama, Rococo, Westend networks and more. The crawler can store its results as JSON documents or in a postgres database - the --dry-run flag prevents it from doing either. Nebula will print a summary of the crawl at the end instead. A crawl...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    crwlr

    crwlr

    Library for Rapid (Web) Crawler and Scraper Development

    ... just load actually all links it is finding (and is allowed to load according to the robots.txt file), then it would just load the whole internet (if the URL(s) it starts with are no dead end). Or it can be restricted to load only links matching certain criteria (on same domain/host, URL path starts with "/foo",...) or only to a certain depth. A depth of 3 means 3 levels deep. Links found on the initial URLs provided to the crawler are level 1 and so on.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Simply Powerful Maintenance Management Software Icon
    Simply Powerful Maintenance Management Software

    CMMS helps you Schedule and Track everything in your facility

    Maintenance Care is a full-featured CMMS (computerized maintenance management system) offers preventive maintenance, asset tracking, document storage, reporting dashboards, numerous integrations and even more features designed to help you maintain the health and standards of every facility under your umbrella. Anyone can learn and begin using our CMMS with ease — no tech experience is required.
  • 5
    Goutte

    Goutte

    Goutte, a simple PHP Web Scraper

    Goutte is a screen scraping and web crawling library for PHP. Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses. Goutte depends on PHP 7.1+. Add fabpot/goutte as a require dependency in your composer.json file. Create a Goutte Client instance (which extends Symfony\Component\BrowserKit\HttpBrowser). Make requests with the request() method. The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler). To use your own HTTP settings, you may...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    DNS Crawler

    DNS Crawler

    Take a domain list and fetch DNS query into tab-delimited file

    Very basic utility to assess various parameter from a domain. Useful for small-ISP and individual who own multiple domain. Currently the following values are fetched: SOA, NS, MX, TXT/SPF and A record.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Laravel Sitemap

    Laravel Sitemap

    Create and generate sitemaps with ease

    ... it in the callable you pass to hasCrawled. You can also instruct the underlying crawler to not crawl some pages by passing a callable to shouldCrawl. You can configure the crawler used by the sitemap generator. The sitemap generator can execute JavaScript on each page so it will discover links that are generated by your JS scripts. You can enable this feature by setting execute_javascript in the config file to true.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    miniblink49

    miniblink49

    Lighter, faster browser kernel of blink to integrate HTML UI in apps

    miniblink is an open source, one file, small browser widget based on chromium. By using C interface, you can create a browser with just some line code. miniblink is an open source, single-file, and currently the smallest known chromium-based browser control. Through its exported pure C interface, a browser control can be created in a few lines of code. C++, C#, Delphi and other language calls (support C++, C#, Delphi language to call). Embedded Nodejs, support electron (with Nodejs, can run...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    LambdaHack

    LambdaHack

    Haskell game engine library for roguelike dungeon crawlers

    Haskell game engine library for roguelike dungeon crawlers. LambdaHack is a Haskell game engine library for ASCII roguelike games of arbitrary theme, size and complexity, with optional tactical squad combat. It's packaged together with a sample dungeon crawler in a quirky fantasy setting. To use the engine, you need to specify the content to be procedurally generated. You declare what the game world is made of (entities, their relations, physics and lore) and the engine builds the world...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Contractor Foreman is the most affordable all-in-one construction management software for contractors and is trusted by contractors in more than 75 countries. Icon
    Starting at $49/m for the WHOLE company, Contractor Foreman is the most affordable all-in-one construction management system for contractors. Our customers in 75+ countries and industry awards back it up. And it's all backed by a 100 day guarantee.
  • 10
    Easyspider - Distributed Web Crawler

    Easyspider - Distributed Web Crawler

    Easy Spider is a distributed Perl Web Crawler Project from 2006

    Easy Spider is a distributed Perl Web Crawler Project from 2006. It features code from crawling webpages, distributing it to a server and generating xml files from it. The client site can be any computer (Windows or Linux) and the Server stores all data. Websites that use EasySpider Crawling for Article Writing Software: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiber.com/marketing/ https://www.paraphrasingtool1.com...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Web Spider, Web Crawler, Email Extractor

    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    In Files there is WebCrawlerMySQL.jar which supports MySql Connection Please follow this link to get latest version https://sourceforge.net/projects/web-spider-web-crawler-extract/ Free Web Spider & Crawler. Extracts Information from Web by parsing millions of pages. Store data into Derby OR MySQL Database and data are not being lost after force closing the spider. - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 12
    WFDownloader App

    WFDownloader App

    Free batch downloader for image, wallpaper, video, audio, document,

    Use as an image gallery, wallpaper, audio/music, video, document, and other media bulk downloader from supported websites. Also use to download sequential website urls that have a certain pattern (e.g. image01.png to image100.png). Also use app's built-in site crawler for advanced link search or extraction. There is also special support for forum media and open directory downloading. It's a programmable downloader and also works with password protected sites. Say goodbye to downloading one...
    Leader badge
    Downloads: 126 This Week
    Last Update:
    See Project
  • 13
    crawler

    crawler

    A toolchain for bringing web2 to web3

    A toolchain for bringing web2 to web3. IPFS daemon should be launched. Download enwiki-latest-all-titles to crawler root dir. Basically, there are two main functions provided by crawler tool. The first one is to parse wiki titles and submit links between keywords and wiki pages. Also, crawler has separate command upload-duras-to-ipfs to upload files to local IPFS node. All DURAs are collected under single root unixfs directory.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Pholcus

    Pholcus

    Distributed high-concurrency crawler software written in pure golang

    Pholcus is a high-concurrency crawler software written in pure Go language that supports distributed, only used for programming learning and research. It supports three operating modes of stand-alone, server and client, and has three operating interfaces, Web, GUI, and command line; simple and flexible rules, concurrent batch tasks, and rich output methods (mysql/mongodb/kafka/csv/excel, etc.); In addition, it also supports horizontal and vertical grabbing modes, and a series of advanced...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    WebSploit Framework

    WebSploit Framework

    WebSploit is a high level MITM Framework

    WebSploit Advanced MITM Framework [+]Autopwn - Used From Metasploit For Scan and Exploit Target Service [+]wmap - Scan,Crawler Target Used From Metasploit wmap plugin [+]format infector - inject reverse & bind payload into file format [+]phpmyadmin Scanner [+]CloudFlare resolver [+]LFI Bypasser [+]Apache Users Scanner [+]Dir Bruter [+]admin finder [+]MLITM Attack - Man Left In The Middle, XSS Phishing Attacks [+]MITM - Man In The Middle Attack [+]Java Applet Attack [+]MFOD...
    Downloads: 54 This Week
    Last Update:
    See Project
  • 16
    X-RAY

    X-RAY

    The next web scraper, see through the <html> noise

    Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing. The API is entirely composable, giving you great flexibility in how you scrape each page. Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't lose...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    OpenSearchServer Search Engine

    OpenSearchServer Search Engine

    An open source search engine with RESTFul API and crawlers

    OpenSearchServer is a powerful, enterprise-class, search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API , Ruby, Rails, Node.js, PHP, Perl) you will be able to integrate quickly and easily advanced full-text search capabilities in your application: Full-text with basic semantic, join queries, boolean queries, facet and filter, document (PDF, Office, etc.) indexation, web scrapping,etc. OpenSearchServer runs on Windows...
    Downloads: 16 This Week
    Last Update:
    See Project
  • 18
    diskover

    diskover

    File system crawler and disk space usage software

    diskover is a file system crawler and disk space usage software that uses Elasticsearch to index your file metadata. diskover crawls and indexes your files on a local computer or remote storage server over network mounts. diskover helps manage your storage by identifying old and unused files and give better insights into data change "hotfiles", file duplication "dupes" and wasted space. It is designed to help deal with managing large amounts of data growth and provide detailed storage...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    YouSeer is an open source search engine framework, which was built on top of other open source components. It’s part of the general SeerSuite framework. YouSeer utilizes Hereitrix as a crawler and solr as an indexing system.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    phoneutria
    A Java Web crawler: multi-threaded, scalable, with high performance, extensible and polite. It can be used to crawl and index any web or enterprise domain and is configurable through a XML configuration file.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Pathfinder Wiki-fr Crawler

    Pathfinder Wiki-fr Crawler

    Tous les sorts, les monstres, les dons et les objets magiques en VF

    Toutes les infos viennent du http://www.pathfinder-fr.org/Wiki/Pathfinder-RPG.MainPage.ashx Le logiciel permet aussi la création de liste de sorts détaillé, d'exportation de de chaque type de données.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22

    WebCrawler

    get web page. include html、css and js files

    This tool is for the people who want to learn from a web site or web page,especially Web Developer.It can help get a web page's source code.Input the web page's address and press start button and this tool will find the page and according the page's quote,download all files that used in the page ,include css file and javascript files. The html file's name will be 'index.html' and other file's will use it's source name. Note:only support windows platform and http protocol.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Addons for IOSEC - DoS HTTP Security

    Addons for IOSEC - DoS HTTP Security

    IOSec Addons are enhancements for web security and crawler detection

    IOSEC PHP HTTP FLOOD PROTECTION ADDONS IOSEC is a php component that allows you to simply block unwanted access to your webpage. if a bad crawler uses to much of your servers resources iosec can block that. IOSec Enhanced Websites: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiber.com/marketing/ https://www.paraphrasingtool1.com/ https://www.artikelschreiben.com/ https://buzzerstar.com/ https
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24

    A simple Crawler

    We can make a simple crawler with using Java Servlet & JSP . A crawl

    This project consists of 1 main directory, 4 sub directories , 1 main page(html) , 1 servlet DD , 4 java files , 7 classes , and 1 document file as the list below: - [hw5] --- index.html - readme.txt - [WEB-INF] --- web.xml - [classes] --- [mvc] --- HelloController.java - HelloController.class - HelloModel.java - HelloMOdel.class - HelloView.java - HelloView.class - HelloResult.java
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25

    SauceWalk Proxy Helper

    Enumeration and automation of file discovery for your sec tools.

    ... via a PHP script on the target server(ASP/JSP coming soon). The advantage of this tool is that it allows access to files and folders (for example include or plugin folders) which are not usually seen via a spider or crawler to be security tested with traditional tools. The Py version is on its way soon.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next