Showing 75 open source projects for "java crawler"

  • 1
    WebNews Crawler is a specialized web crawler (spider, fetcher) designed to acquire and clean news articles from RSS and HTML pages. It can perform site-specific extraction to keep only the actual news content, filtering out advertising and other cruft.
    Downloads: 0 This Week
  • 2
    Course Crawler is an application that compiles term-definition pairs from multiple web glossaries into a centralized, stable, and searchable location.
    Downloads: 0 This Week
  • 3
    Crawl-By-Example runs a crawl that classifies the processed pages by subject and finds the best pages according to examples provided by the operator. Crawl-By-Example is a plugin for the Heritrix crawler and was developed as part of the GSoC06 program.
    Downloads: 0 This Week
  • 4
    GronoSpy is a WWW crawler that tries to extract knowledge from the data on grono.net, a community portal.
    Downloads: 0 This Week
  • 5
    J-Obey is a Java library/package that gives people writing their own crawlers a stable robots.txt parser. If you are writing a web crawler of some sort, you can use J-Obey to take the hassle out of writing a robots.txt parser/interpreter.
    Downloads: 0 This Week
  • 6
    A configurable knowledge management framework. It works out of the box, but it is meant mainly as a framework for building complex information retrieval and analysis systems. The three major components (Crawler, Analyzer, and Indexer) can also be used separately.
    Downloads: 0 This Week
  • 7
    JCrawler is a perfect crawling/load-testing tool that is cookie-enabled and follows a human crawling pattern (hits/second).
    Downloads: 1 This Week
  • 8
    SmartCrawler is a Java-based, fully configurable, multi-threaded, and extensible crawler that can fetch and analyze the contents of a web site using dynamically pluggable filters.
    Downloads: 0 This Week
  • 9
    Web Crawler Engine: jsrCRAW is an intelligent Java crawler engine for Internet content monitoring. It periodically reads the content of a URL, retrieves links, applies rules (Crawlet), and alerts the user to changes.
    Downloads: 0 This Week
  • 10
    WebLoupe is a Java-based tool for analysis, interactive visualization (sitemap), and exploration of the information architecture and specific properties of local or publicly accessible websites. It is based on web spider (or web crawler) technology.
    Downloads: 0 This Week
  • 11
    Pödznsatch is an open and distributed hypergoogle of love. It is a semantic web application for social networking, word-of-mouth analysis, and profiling. The Pödznsatch architecture includes a bot crawler, an inference engine, and a query interface.
    Downloads: 0 This Week
  • 12
    WWW Universal Tester is a Java application designed to gather information about the WWW. It works as a spider (robot, crawler) and collects information about the size of files used on the web, the structure of connections between pages, and so on.
    Downloads: 0 This Week
  • 13
    A new web crawler with a sophisticated search process specialized by language.
    Downloads: 0 This Week
  • 14
    LARM is a 100% Java search solution for end-users of the Jakarta Lucene search engine framework. It contains methods for indexing files, database tables, and a crawler for indexing web sites.
    Downloads: 0 This Week
  • 15
    A crawler to index and search the XML web.
    Downloads: 0 This Week
  • 16
    WebSPHINX is a web crawler (robot, spider) Java class library, originally developed by Robert Miller of Carnegie Mellon University. Features include multithreading, tolerant HTML parsing, URL filtering and page classification, pattern matching, mirroring, and more.
    Downloads: 0 This Week
  • 17
    Content Engineering Tools, including an XSLT-based site rendering system, an XSLT Documentation Generator, and a Swing-based Site Crawler. The tools may be downloaded and used separately since there are no dependencies between them.
    Downloads: 0 This Week
  • 18
    An application to crawl public profiles on www.myspace.com.
    Downloads: 0 This Week
  • 19
    This project aims to be a base for specialized image crawlers. It can download images from a specific website and can be extended to crawl any website. All processes are multithreaded, and filters are supported.
    Downloads: 0 This Week
  • 20
    Java Twitter Crawler
    Downloads: 0 This Week
  • 21

    RedditCrawler

    Crawls the Reddit website to pull statistical info.

    Reddit Crawler is made to crawl a list of subreddits and get the number of online users. The project will be updated to gather more statistical info.
    Downloads: 0 This Week
  • 22

    Stegcrawler

    A web crawler to search the Internet for use of steganography

    A web crawler to search the Internet for use of steganography. Includes a MySQL database and a Java-based application to search for, test, and attempt to crack images that (may) use steganography. Created by the CIST 1450: Object Orientated Programming class at the University of Pittsburgh at Bradford. Class participants were: Josiah Bennett, Dan Connor, Lincoln Dorward, Samuel Ficorilli, Samuel Kleiner, Bryan Nelson, Rachel Rybicki, Mark Saccucci, Adam Schrot, Daniel Taylor, Steven Trumbull, and Aaron Weise. Learn more here: http://coursecast.upb.pitt.edu/Panopto/Pages/Viewer/Default.aspx?...
    Downloads: 0 This Week
  • 23

    Luanium

    A Lua-based crawl scripting language that leverages Selenium

    ...I would put commands in a file or DB to use Selenium to interpret the HTML and JavaScript. Ideally, there would be a complete language with conditionals and looping. I'm a Java developer and I needed the crawler to run in a Spring Boot application. So I decided to use a Lua interpreter in Java to build a crawling tool based on Selenium. The trick here is to add the crawling commands to the Lua interpreter.
    Downloads: 0 This Week
  • 24
    Spider is a web crawler written in Java. Based on a regular expression string, the spider searches the internet for web pages matching this string and stores them in a MySQL database.
    Downloads: 0 This Week
  • 25
    studiMaps is a web-based application for the visualization and analysis of social networks. It consists of two software components: a web crawler for gathering data and the web-based application for visualization.
    Downloads: 0 This Week