void based linux free download

Showing 17 open source projects for "void based linux"

Filters: Web Scrapers, Java

1

WebMagic

A scalable web crawler framework for Java

WebMagic is a simple but scalable crawler framework. It covers the whole crawler lifecycle: downloading, URL management, content extraction, and persistence. It has a simple core with high flexibility and a simple API for HTML extraction, which simplifies the development of a specific crawler. It also supports annotated POJOs for customizing a crawler, with no configuration needed. Some other...

Downloads: 2 This Week

Last Update: 2025-02-10
2

WebHarvest - web data extraction tool

Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.

14 Reviews

Downloads: 5 This Week

Last Update: 2025-10-25
3

Crawlab

Distributed web crawler admin platform for spiders management

Golang-based distributed web crawler management platform, supporting various languages including Python, NodeJS, Go, Java, and PHP, and various web crawler frameworks including Scrapy, Puppeteer, and Selenium. Use docker-compose for one-click startup; with it, you don't even have to configure a MongoDB database. The frontend app interacts with the master node, which communicates with other components such as MongoDB, SeaweedFS, and worker nodes. Master node and worker nodes communicate...

Downloads: 0 This Week

Last Update: 2023-07-26
4

ACHE Focused Crawler

ACHE is a web crawler for domain-specific search

ACHE is a focused web crawler. It collects web pages that satisfy some specific criteria, e.g., pages that belong to a given domain or that contain a user-specified pattern. ACHE differs from generic crawlers in the sense that it uses page classifiers to distinguish between relevant and irrelevant pages in a given domain. A page classifier can be defined as a simple regular expression (e.g., one that matches every page that contains a specific word) or a machine-learning-based classification model....

Downloads: 0 This Week

Last Update: 2023-04-12
5

WebCollector

WebCollector is an open source web crawler framework based on Java.

WebCollector is an open source web crawler framework based on Java. It provides some simple interfaces for crawling the Web, and you can set up a multi-threaded web crawler in less than 5 minutes. GitHub: https://github.com/CrawlScript/WebCollector Demo: https://github.com/CrawlScript/WebCollector/blob/master/YahooCrawler.java

Downloads: 0 This Week

Last Update: 2015-06-04
6

webStraktor

webStraktor is a programmable World Wide Web data extraction client. Its purpose is to scrape HTML-based content via HTTP and extract relevant information. webStraktor features a scripting language to facilitate the collection, extraction, and storage of information available on the web, including images. The scripting language uses elements of regular expression and XPath syntax. The webStraktor scripting language has a small instruction set and its syntax is easy...

Downloads: 0 This Week

Last Update: 2014-04-25
7

Constellio Enterprise Search engine

Open source Search Engine and Enterprise Search

Constellio is an enterprise search engine that allows companies to search all their organization's information through a single interface (Web, CRM, ERP, ECM, Mail, etc.). Constellio is based on Apache Solr and Google Search Appliance's connector. Constellio has a powerful web crawler.

Downloads: 0 This Week

Last Update: 2015-03-31
8

Heritrix: Internet Archive Web Crawler

The archive-crawler project is building Heritrix: a flexible, extensible, robust, and scalable web crawler capable of fetching, archiving, and analyzing the full diversity and breadth of internet-accessible content.

21 Reviews

Downloads: 3 This Week

Last Update: 2013-06-05
9

Java Web Spider

A web spider written in Java that can be used either as a standalone application or as the core of other applications that make use of its functionality.

Downloads: 0 This Week

Last Update: 2013-04-19
10

Aracnis

Aracnis is a Java-based framework for building distributed web spiders. These spiders can be used to accomplish a variety of tasks, such as screen scraping and link integrity checking.

Downloads: 0 This Week

Last Update: 2015-07-13
11

HtmlClient

HtmlClient provides an SGML/HTML/XHTML parser and connection client making web-spidering as easy for developers as actually surfing the web with a premade browser. Based on Apache's HttpClient.

Downloads: 0 This Week

Last Update: 2013-03-08
12

ASpider

Robust, feature-rich, multi-threaded CLI web spider written in Java, using Apache Commons HttpClient v3.0. ASpider downloads any files matching your given MIME types from a website. By default it tries to match email addresses with a regular expression, logging all results using log4j.

Downloads: 0 This Week

Last Update: 2013-03-08
13

webloupe

WebLoupe is a Java-based tool for analysis, interactive visualization (sitemap), and exploration of the information architecture and specific properties of local or publicly accessible websites. It is based on web spider (web crawler) technology.

Downloads: 0 This Week

Last Update: 2015-01-06
14

Arachnid Web Spider Framework

Arachnid is a Java-based web spider framework. It includes a simple HTML parser object that parses an input stream containing HTML content. Simple Web spiders can be created by sub-classing Arachnid and adding a few lines of code called after each page

Downloads: 0 This Week

Last Update: 2013-03-08
15

WebSPHINX

WebSPHINX is a web crawler (robot, spider) Java class library, originally developed by Robert Miller of Carnegie Mellon University. Multithreaded, tolerant HTML parsing, URL filtering and page classification, pattern matching, mirroring, and more.

2 Reviews

Downloads: 1 This Week

Last Update: 2015-11-12
16

studiMaps

studiMaps is a web-based application for visualization and analysis of social networks. It consists of two software components: a web crawler for collecting data and the web-based application for visualization.

Downloads: 0 This Week

Last Update: 2014-08-03
17

Spider

Spider is a web crawler written in Java. Based on a regular expression string, the spider crawls the internet for web pages matching this string and stores them in a MySQL database.

Downloads: 0 This Week

Last Update: 2014-08-09