Showing 42 open source projects for "java crawler"

  • 1
    LogCrawler is an Ant task for automated testing of web applications. Using an HTTP crawler, it visits all pages of a website and checks the server log files for errors. Use it as a smoke test with a CI system such as CruiseControl.
    Downloads: 0 This Week
  • 2
    WebNews Crawler is a specialized web crawler (spider, fetcher) designed to acquire and clean news articles from RSS and HTML pages. It can perform site-specific extraction to pull out only the actual news content, filtering out advertising and other cruft.
    Downloads: 0 This Week
  • 3
    Course Crawler is an application that compiles term-definition pairs from multiple web glossaries into a centralized, stable, and searchable location.
    Downloads: 0 This Week
  • 4
    Crawl-By-Example runs a crawl that classifies the processed pages by subject and finds the best pages according to examples provided by the operator. Crawl-By-Example is a plugin for the Heritrix crawler and was developed as part of the GSoC06 program.
    Downloads: 0 This Week
  • 5
    GronoSpy is a WWW crawler that tries to extract knowledge from the data on grono.net, a community portal.
    Downloads: 0 This Week
  • 6
    J-Obey is a Java library/package that gives people writing their own crawlers a stable robots.txt parser. If you are writing a web crawler of some sort, you can use J-Obey to take the hassle out of writing a robots.txt parser/interpreter.
    Downloads: 0 This Week
  • 7
    A configurable knowledge management framework. It works out of the box, but it is meant mainly as a framework for building complex information retrieval and analysis systems. The three major components (Crawler, Analyzer, and Indexer) can also be used separately.
    Downloads: 0 This Week
  • 8
    SmartCrawler is a Java-based, fully configurable, multi-threaded, and extensible crawler that can fetch and analyze the contents of a website using dynamically pluggable filters.
    Downloads: 0 This Week
  • 9
    WebLoupe is a Java-based tool for analysis, interactive visualization (sitemap), and exploration of the information architecture and specific properties of local or publicly accessible websites. It is based on web spider (or web crawler) technology.
    Downloads: 0 This Week
  • 10
    A new web crawler with a sophisticated search process specialized by language.
    Downloads: 0 This Week
  • 11
    A crawler to index and search the XML web.
    Downloads: 0 This Week
  • 12
    WebSPHINX is a Java class library for web crawlers (robots, spiders), originally developed by Robert Miller of Carnegie Mellon University. Features include multithreading, tolerant HTML parsing, URL filtering and page classification, pattern matching, mirroring, and more.
    Downloads: 0 This Week
  • 13
    An application that crawls public profiles on www.myspace.com.
    Downloads: 0 This Week
  • 14
    This project aims to be a base for specialized image crawlers. It can download images from a specific website and can be extended to crawl any website. All processes are multithreaded, and filters are supported.
    Downloads: 0 This Week
  • 15

    RedditCrawler

    Crawls the Reddit website to pull statistical info.

    Reddit Crawler is made to crawl a list of subreddits and get the number of online users. The project will be updated to collect more statistical info.
    Downloads: 0 This Week
  • 16
    Spider is a web crawler written in Java. Based on a regular expression string, the spider searches the internet for web pages matching this string and stores them in a MySQL database.
    Downloads: 0 This Week
  • 17
    studiMaps is a web-based application for visualization and analysis of social networks. It consists of two software components: a web crawler for gathering data and the web-based application for visualization.
    Downloads: 0 This Week
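
Several of the projects listed here (J-Obey in particular) deal with robots.txt handling. For readers unfamiliar with what such a parser does, the following is a minimal sketch in Java. It is not J-Obey's API: the class name `RobotsTxtSketch` and its methods are hypothetical, and the logic is deliberately simplified. It assumes only `User-agent` and `Disallow` lines matter, merges rules from every group that matches the agent or `*`, and ignores `Allow` lines and wildcard patterns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper (not J-Obey's API): collects Disallow prefixes that apply to one user agent. */
final class RobotsTxtSketch {
    private final List<String> disallowed = new ArrayList<>();

    /** Parses robots.txt text, keeping Disallow prefixes from groups that name the agent or "*". */
    static RobotsTxtSketch parse(String robotsTxt, String userAgent) {
        RobotsTxtSketch rules = new RobotsTxtSketch();
        String agent = userAgent.toLowerCase(Locale.ROOT);
        boolean groupApplies = false;   // does the current group apply to our agent?
        boolean readingAgents = false;  // are we inside a run of consecutive User-agent lines?

        for (String rawLine : robotsTxt.split("\\r?\\n")) {
            // Strip comments and surrounding whitespace.
            int hash = rawLine.indexOf('#');
            String line = (hash >= 0 ? rawLine.substring(0, hash) : rawLine).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            if (field.equals("user-agent")) {
                if (!readingAgents) {
                    // First User-agent line after rule lines starts a new group.
                    groupApplies = false;
                    readingAgents = true;
                }
                String name = value.toLowerCase(Locale.ROOT);
                if (name.equals("*") || (!name.isEmpty() && agent.contains(name))) {
                    groupApplies = true;
                }
            } else {
                readingAgents = false;
                // An empty Disallow value means "nothing is disallowed", so it is skipped.
                if (field.equals("disallow") && groupApplies && !value.isEmpty()) {
                    rules.disallowed.add(value);
                }
            }
        }
        return rules;
    }

    /** Returns true if no collected Disallow prefix matches the start of the path. */
    boolean isAllowed(String path) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```

A crawler would typically fetch `/robots.txt` once per host, call `RobotsTxtSketch.parse(text, "mybot")`, and check `isAllowed(path)` before each request. A production parser would also need to handle `Allow` rules, wildcards, and group precedence, which this sketch leaves out.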