Search Results for "java crawler" - Page 3

Showing 75 open source projects for "java crawler"

1

WebNews Crawler

WebNews Crawler is a specialized web crawler (spider, fetcher) designed to acquire and clean news articles from RSS and HTML pages. It can perform site-specific extraction to keep only the actual news content, filtering out advertising and other cruft.

Downloads: 0 This Week

Last Update: 2013-04-23
See Project
2

Course Crawler

Course Crawler is an application that compiles term-definition pairs from multiple web glossaries into a centralized, stable, and searchable location.

Downloads: 0 This Week

Last Update: 2013-03-11
See Project
3

Crawl-By-Example (Heritrix plugin)

Crawl-By-Example runs a crawl that classifies the processed pages by subject and finds the best pages according to examples provided by the operator. Crawl-By-Example is a plugin for the Heritrix crawler and was developed as part of the GSoC06 program.

Downloads: 0 This Week

Last Update: 2014-12-14
See Project
4

GronoSpy

GronoSpy is a WWW crawler which tries to extract knowledge based on the data from grono.net - a community portal.

Downloads: 0 This Week

Last Update: 2013-03-08
See Project
5

J-Obey (Robots.txt Crawler Module)

J-Obey is a Java library/package that gives people writing their own crawlers a stable Robots.txt parser. If you are writing a web crawler of some sort, you can use J-Obey to take the hassle out of writing a Robots.txt parser/interpreter.

Downloads: 0 This Week

Last Update: 2015-08-05
See Project
6

isobel

A configurable knowledge management framework. It works out of the box, but it is meant mainly as a framework for building complex information retrieval and analysis systems. The three major components (Crawler, Analyzer, and Indexer) can also be used separately.

Downloads: 0 This Week

Last Update: 2013-03-22
See Project
7

Crawler/Load Tester in Java

JCrawler is a perfect crawling/load-testing tool that is cookie-enabled and follows a human crawling pattern (hits/second).

Downloads: 1 This Week

Last Update: 2013-04-25
See Project
8

SmartCrawler

SmartCrawler is a Java-based, fully configurable, multi-threaded, and extensible crawler that can fetch and analyze the contents of a web site using dynamically pluggable filters.

Downloads: 0 This Week

Last Update: 2013-03-22
See Project
9

Crawlet engine.

Web Crawler Engine: jsrCRAW is an intelligent Java crawler engine for Internet content monitoring. It periodically reads the content of a URL, retrieves links, applies rules (Crawlet), and alerts the user to changes.

Downloads: 0 This Week

Last Update: 2015-11-28
See Project
10

webloupe

WebLoupe is a Java-based tool for analysis, interactive visualization (sitemap), and exploration of the information architecture and specific properties of local or publicly accessible websites. It is based on web spider (or web crawler) technology.

Downloads: 0 This Week

Last Update: 2015-01-06
See Project
11

Pödznsnatch

Pödznsatch is an open and distributed hypergoogle of love. It is a semantic web application for social networking, word-of-mouth analysis, and profiling. The Pödznsatch architecture includes a bot crawler, an inference engine, and a query interface.

Downloads: 0 This Week

Last Update: 2013-03-07
See Project
12

Universal WWW Tester

WWW Universal Tester is a Java application designed to gather information about the WWW. It works as a spider (robot, crawler) and collects information about the size of files used on the web, the structure of connections between pages, and so on.

Downloads: 0 This Week

Last Update: 2013-02-27
See Project
13

Arn0lD

A new web crawler with a sophisticated search process specialized by language!

Downloads: 0 This Week

Last Update: 2013-03-07
See Project
14

Lucene Advanced Retrieval Machine (LARM)

LARM is a 100% Java search solution for end-users of the Jakarta Lucene search engine framework. It contains methods for indexing files, database tables, and a crawler for indexing web sites.

Downloads: 0 This Week

Last Update: 2013-04-11
See Project
15

XMLCrawler

A crawler to index and search the XML web.

Downloads: 0 This Week

Last Update: 2013-02-25
See Project
16

WebSPHINX

WebSPHINX is a web crawler (robot, spider) Java class library, originally developed by Robert Miller of Carnegie Mellon University. Features include multithreading, tolerant HTML parsing, URL filtering and page classification, pattern matching, mirroring, and more.

2 Reviews

Downloads: 0 This Week

Last Update: 2015-11-12
See Project
17

CETools

Content Engineering Tools, including an XSLT-based site rendering system, an XSLT Documentation Generator, and a Swing-based Site Crawler. The tools may be downloaded and used separately since there are no dependencies between them.

Downloads: 0 This Week

Last Update: 2014-07-07
See Project
18

MS Crawler

An application to crawl public profiles of www.myspace.com

Downloads: 0 This Week

Last Update: 2016-08-03
See Project
19

Roy Image Crawler

This project aims to be a base for specialized image crawlers. It can download images from a specific website and can be extended to crawl any website. All the processes are multithreaded. It accepts filters.

Downloads: 0 This Week

Last Update: 2013-03-22
See Project
20

JavaTwitterCrawler

Java Twitter Crawler

Downloads: 0 This Week

Last Update: 2013-04-25
See Project
21

RedditCrawler

Crawls reddit website to pull statistical info.

Reddit Crawler is made to crawl a list of subreddits and get the number of online users. The project will be updated to gather more statistical info.

Downloads: 0 This Week

Last Update: 2014-11-22
See Project
22

Stegcrawler

A web crawler to search the Internet for use of steganography

A web crawler to search the Internet for use of steganography. Includes a MySQL database and a Java-based application to search for, test, and attempt to crack images that (may) use steganography. Created by the CIST 1450: Object Orientated Programming class at the University of Pittsburgh at Bradford. Class participants were: Josiah Bennett, Dan Connor, Lincoln Dorward, Samuel Ficorilli, Samuel Kleiner, Bryan Nelson, Rachel Rybicki, Mark Saccucci, Adam Schrot, Daniel Taylor, Steven Trumbull, Aaron Weise. Learn more here: http://coursecast.upb.pitt.edu/Panopto/Pages/Viewer/Default.aspx?...

Downloads: 0 This Week

Last Update: 2012-12-05
See Project
23

Luanium

A Lua-based crawl scripting language leveraging Selenium

...I would put commands in a file or DB to use Selenium to interpret the HTML and JavaScript. The best would be to have a complete language with conditionals and looping. I'm a Java developer and I needed the crawler to run in a Spring Boot application. So I decided to use a Lua interpreter in Java to build a crawling tool based on Selenium. The trick here is to add the crawling commands into the Lua interpreter.

Downloads: 0 This Week

Last Update: 2018-12-06
See Project
24

Spider

Spider is a web crawler written in Java. Based on a regular expression string, the spider searches the internet for web pages matching this string and stores them in a MySQL database.

Downloads: 0 This Week

Last Update: 2014-08-09
See Project
25

studiMaps

studiMaps is a web-based application for visualization and analysis of social networks. It consists of two software components: a web crawler for gathering data and the web-based application for visualization.

Downloads: 0 This Week

Last Update: 2014-08-03
See Project




© 2025 Slashdot Media. All Rights Reserved.
