43 projects for "gist web crawler" with 2 filters applied:

  • 1
    Easyspider - Distributed Web Crawler

    Easy Spider is a distributed Perl Web Crawler Project from 2006

    Easy Spider features code for crawling webpages, distributing them to a server and generating XML files from them. The client side can be any computer (Windows or Linux), and the server stores all data. Websites that use EasySpider Crawling for Article Writing Software: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiben.com/ https://www.buzzerstar.com/ https://easyperlspider.sourceforge.io/ https://www.sebastianenger.com/ https://www.artikelschreiber.com/opensource/ It is fun to look at code written a few years ago and see how one has improved. ...
    Downloads: 0 This Week
  • 2

    PHP mini vulnerability suite

    Multiple server/webapp vulnerability scanner

    github: https://github.com/samedog/phpmvs
    Downloads: 0 This Week
  • 3
    OpenSearchServer Search Engine

    An open source search engine with RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on...
    Downloads: 5 This Week
  • 4
    phoneutria
    A Java web crawler: multi-threaded, scalable, high-performance, extensible and polite. It can be used to crawl and index any web or enterprise domain and is configurable through an XML configuration file.
    Downloads: 0 This Week
  • 5
    OpenWebSpider
    OpenWebSpider is an Open Source multi-threaded Web Spider (robot, crawler) and search engine with a lot of interesting features!
    Downloads: 3 This Week
  • 6
    Addons for IOSEC - DoS HTTP Security

    IOSec Addons are enhancements for web security and crawler detection

    IOSEC PHP HTTP FLOOD PROTECTION ADDONS. IOSEC is a PHP component that allows you to simply block unwanted access to your webpage. If a bad crawler uses too much of your server's resources, IOSEC can block it. IOSec Enhanced...
    Downloads: 0 This Week
  • 7

    sitecheck

    Modular web site spider for web developers.

    More than just a link checker, sitecheck is a website spider (also known as a crawler) which can assist with SEO by testing an entire site plus both inbound links from search engines and outbound links to other sites for the following issues: looping redirects (HTTP 301/302), broken links (HTTP 404), server errors (HTTP 500), spelling mistakes, low readability scores (using the Flesch Reading Ease test), missing/empty/duplicate meta tags, duplicate content, slow page speed, W3C validation...
    Downloads: 0 This Week
  • 8
    Constellio Enterprise Search engine

    Open source Search Engine and Enterprise Search

    Constellio is an enterprise search engine that allows companies to search all their organization's information through a single interface (Web, CRM, ERP, ECM, Mail, etc.). Constellio is based on Apache Solr and Google Search Appliance's connector. Constellio has a powerful web crawler.
    Downloads: 0 This Week
  • 9
    Yet another web crawler? Yes, but this one uses the full power of regular expressions to accept or reject, examine or ignore, save or refuse pages. You can also use MIME types to do all this. Powerful and flexible.
    Downloads: 0 This Week
  • 10
    PRO-Search is a crawler of FTP servers, SMB shares, HTTP, DC++ networks, ... with a powerful web search and navigation interface.
    Downloads: 0 This Week
  • 11

    Python Crawler Library

    Python Web Crawler Library

    A simple library for crawling the web. This library lets you create macros for crawling web sites and performing simple actions on them, such as logging in.
    Downloads: 0 This Week
  • 12
    The archive-crawler project is building Heritrix: a flexible, extensible, robust, and scalable web crawler capable of fetching, archiving, and analyzing the full diversity and breadth of internet-accessible content.
    Downloads: 7 This Week
  • 13
    Ex-Crawler
    Ex-Crawler is divided into 3 subprojects (crawler daemon, distributed GUI client, (web) search engine) which together provide a flexible and powerful search engine supporting distributed computing. More information: http://ex-crawler.sourceforge.net
    Downloads: 1 This Week
  • 14
    A school project consisting of a crawler, a server and a search page.
    Downloads: 0 This Week
  • 15
    ItSucks
    This project is a Java web spider (web crawler) with the ability to download (and resume) files. It is also highly customizable with regular expressions and download templates. All backend functionality is also available in a separate library.
    Downloads: 5 This Week
  • 16
    MuSE-CIR is a Multigram-based Search Engine and Collaborative Information Retrieval system. Written in Java/JSP; supports any JDBC-connectable database (thoroughly tested only with OracleXE, and somewhat with MySQL); JSP on Apache Tomcat 5.5.
    Downloads: 0 This Week
  • 17
    ** Guys, I have built a much more powerful, fully featured CMS at: https://github.com/MacdonaldRobinson/FlexDotnetCMS Macs CMS is a flat-file (XML and SQLite) based AJAX content management system. It focuses mainly on the edit-in-place editing concept. It comes with a built-in blog with moderation support, user manager section, roles manager section, SEO / SEF URL
    Downloads: 3 This Week
  • 18
    Discontinued lightweight Desktop-Files/SMB/FTP crawler and search engine.
    Downloads: 0 This Week
  • 19
    jSEO -- Pluggable SEO (Search Engine Optimization) for dynamic JEE web applications
    Downloads: 0 This Week
  • 20
    APC Anti Crawler is a PHP5 class based on APC that can be used to limit the number of HTTP requests per IP. It stops web crawlers from downloading your entire website.
    Downloads: 0 This Week
  • 21
    elk is a powerful open-source, Python-based command-line web crawler that can recursively search for files and text on websites.
    Downloads: 0 This Week
  • 22
    Retriever is a simple crawler packed as a Java library that allows developers to collect and manipulate documents reachable by a variety of protocols (e.g. http, smb). You'll easily crawl documents shared in a LAN, on the Web, and many other sources.
    Downloads: 0 This Week
  • 23
    The DeDuplicator is an add-on module (plug-in) for the web crawler Heritrix. It offers a means to reduce the amount of duplicate data collected in a series of snapshot crawls.
    Downloads: 0 This Week
  • 24
    PHP Crawler is a simple website search script for small-to-medium websites. The only requirements are PHP and MySQL; no shell access is required.
    Downloads: 0 This Week
  • 25
    A web service that allows users to summarize and tag published research in a manner that is meaningful to them, letting them specify the "gist" of an article.
    Downloads: 0 This Week