Page 6 | gnu/linux free download

Showing 226 open source projects for "gnu/linux"

View related business solutions

Web Scrapers Linux Clear Filters & Widen Search

MongoDB Atlas runs apps anywhere
Deploy in 115+ regions with the modern database for every enterprise.

MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.

Start Free
Go From AI Idea to AI App Fast
One platform to build, fine-tune, and deploy ML models. No MLOps team required.

Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.

Try Free
1

Mowglee

Mowglee - The Geo Crawler!

Mowglee is a distributed, multi-threaded, asynchronous task execution based web crawler in Java.It is designed for geographic affinity and is highly modular.

Downloads: 0 This Week

Last Update: 2021-02-18
See Project
2

serverless-chrome

Run headless Chrome/Chromium on AWS Lambda

Serverless Chrome contains everything you need to get started running headless Chrome on AWS Lambda (possibly Azure and GCP Functions soon). The aim of this project is to provide the scaffolding for using Headless Chrome during a serverless function invocation. Serverless Chrome takes care of building and bundling the Chrome binaries and making sure Chrome is running when your serverless function executes. In addition, this project also provides a few example services for common patterns...

Downloads: 0 This Week

Last Update: 2022-04-01
See Project
3

bt-btt

Guide and resources for accessing and using the U3C3 BitTorrent site

BT-btt is a repository that provides information and guidance related to the U3C3 BitTorrent resource site and its magnet-based content search ecosystem. It primarily serves as documentation describing how the site works, how users can access it, and how magnet link resources are organized and retrieved. It explains how BitTorrent and magnet link downloads operate, including the role of trackers and distributed hash table (DHT) networks in locating peers and downloading files. BT-btt also...

Downloads: 4 This Week

Last Update: 2026-03-11
See Project
4

RED HAWK

All-in-one reconnaissance and vulnerability scanning toolkit for sites

RED HAWK is an open source command-line security tool designed for information gathering, vulnerability scanning, and web reconnaissance tasks. It combines multiple scanning and analysis capabilities into a single toolkit to help security researchers and penetration testers quickly analyze a target website. It can collect a wide range of information about domains, servers, and web applications, including network details, hosting configuration, and content management system detection. It also...

Downloads: 6 This Week

Last Update: 4 days ago
See Project
Full-stack observability with actually useful AI | Grafana Cloud
Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.

Create free account
5

proxypool

Proxy crawler that aggregates, tests, and serves usable proxy nodes

proxypool is an open source proxy aggregation tool that automatically collects proxy node information from publicly available sources on the internet. It crawls different sources such as Telegram channels, subscription links, and publicly accessible web pages to gather proxy configurations. After collecting these nodes, proxypool processes them by removing duplicates and verifying whether each node is functional. proxypool then provides a usable list of proxy nodes that have passed...

Downloads: 2 This Week

Last Update: 2026-03-10
See Project
6

GoogleScraper

Python tool for scraping search engine results from many providers

GoogleScraper is a Python-based tool designed to automatically collect and process search engine results from multiple providers. It enables developers and researchers to programmatically query search engines and extract useful information such as links, titles, and result descriptions. GoogleScraper supports several major search engines and can be used to gather structured datasets from search result pages for further analysis. It provides two different scraping approaches: sending direct...

Downloads: 6 This Week

Last Update: 4 days ago
See Project
7

xsrfprobe

Advanced toolkit for detecting and exploiting CSRF vulnerabilities

XSRFProbe is an advanced security auditing toolkit designed to detect and analyze Cross Site Request Forgery (CSRF/XSRF) vulnerabilities in web applications. It uses an automated crawling engine that continuously scans a target application, collects forms and endpoints, and evaluates them for potential CSRF weaknesses. XSRFProbe performs numerous systematic checks to determine whether a web endpoint is vulnerable, including inspection of anti-CSRF tokens, cookie validation behavior, and...

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
8

X-RAY

The next web scraper, see through the <html> noise

Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing. The API is entirely composable, giving you great flexibility in how you scrape each page. Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't...

Downloads: 2 This Week

Last Update: 2021-10-05
See Project
9

WebScrapa - Web Scraping Program

Pulls information from the web undetected.

This program will pull information from the web and display it into a textbox and will do all of it with a smart bot. You can connect bots to this program to make it more practical.

Downloads: 0 This Week

Last Update: 2019-10-01
See Project
AI-powered service management for IT and enterprise teams
Enterprise-grade ITSM, for every business

Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity. Maximize operational efficiency with refreshingly simple, AI-powered Freshservice.

Try it Free
10

ECommerceCrawlers

Collection of Python ecommerce and website crawler examples projects

ECommerceCrawlers is a collection of practical Python web crawler projects designed to gather data from a variety of ecommerce platforms, websites, and online services. It aggregates many independent crawler examples created by contributors and organized into separate subprojects that target specific sites or data sources. These examples demonstrate how to build and operate web scrapers capable of collecting structured information such as product listings, news content, job postings, social...

Downloads: 2 This Week

Last Update: 4 hours ago
See Project
11

YouTube Video Downloader

Allows you to download youtube videos into a video/audio format.

YouTube Video Downloader By Chase, This is a tool developed in python, by web scraping I can get the videos from YouTube and download it on my machine in a video/audio format, easy-to-use GUI for your needs, dark theme.

1 Review

Downloads: 1 This Week

Last Update: 2019-07-10
See Project
12

ProxyBroker

Asynchronous tool for finding and checking public proxy servers

ProxyBroker is an open source Python tool designed to automatically discover and verify public proxy servers from many online sources. It operates asynchronously, allowing it to gather and test large numbers of proxies efficiently while performing multiple checks concurrently. It collects proxy addresses from dozens of providers and evaluates whether they are functional and suitable for use. It supports several proxy protocols, including HTTP, HTTPS, SOCKS4, and SOCKS5, making it flexible...

Downloads: 0 This Week

Last Update: 2 days ago
See Project
13

SEO MACROSCOPE

SEO Macroscope is a website scanning tool, to check your website

The website broken link scanner and technical SEO toolbox. SEO Macroscope for Microsoft Windows is a free and open-source website broken link checking and scanning tool, with some technical SEO functionality for common website problems. Find broken links on your website, both internal and external. Report robots.txt statuses. Check and report canonical, hreflang, and other metadata problems. Perform simple, configurable Technical SEO checks on titles and descriptions. Report fastest/slowest...

Downloads: 0 This Week

Last Update: 2023-04-12
See Project
14

django-dynamic-scraper

Creating Scrapy scrapers via the Django admin interface

Django Dynamic Scraper (DDS) is an app for Django build on top of the scraping framework Scrapy. While preserving many of the features of Scrapy it lets you dynamically create and manage spiders via the Django admin interface. With Django Dynamic Scraper (DDS) you can define your Scrapy scrapers dynamically via the Django admin interface and save your scraped items in the database you defined for your Django project. Since it simplifies things DDS is not usable for all kinds of scrapers, but...

Downloads: 0 This Week

Last Update: 2022-09-05
See Project
15

WebExtractServer

WebExtractServer use with WebExtractLte for use with web browsers

Browse data, fetched by WebExtractLte directly in your browser. Designed to be used with Webscraper (webscraper.io) - third party web scraper tool, available as plugin for Chrome and Firefox.

Downloads: 0 This Week

Last Update: 2019-04-29
See Project
16

pxer

Pixiv crawler userscript for downloading artwork and galleries easily

Pxer is an open source tool designed to help users collect and download artwork from the Pixiv illustration platform. It is implemented primarily in client-side JavaScript and runs directly in the browser through a userscript environment, allowing it to integrate seamlessly with Pixiv pages. Pxer provides functionality to crawl and gather images, artwork metadata, and other related content from supported Pixiv pages. It is designed to be accessible even for users who are not developers,...

Downloads: 1 This Week

Last Update: 2026-03-11
See Project
17

GOPA

GOPA, a spider written in Golang, for Elasticsearch

GOPA, a spider written in Golang, for Elasticsearch. Lightweight, low footprint, memory requirement should, be 100MB. Easy to deploy, no runtime or dependency required. Easy to use, no programming or script ability needed, out-of-box features. First of all, get it, two opinions: download the pre-built package or compile it yourself. Besides Elasticsearch, Gopa doesn't require any other dependencies, just simply run ./gopa to start the program. It's safety to press ctrl+c to stop the current...

Downloads: 0 This Week

Last Update: 2023-04-12
See Project
18

mzitu

Python crawler that downloads image galleries and analyzes titles

mzitu is a Python-based web crawling project designed to automatically download and organize image galleries from a specific photography site. It demonstrates how to build a scraper that navigates gallery pages, retrieves image links, and saves the images locally in a structured directory layout. It focuses on automating the collection of large sets of images by programmatically parsing page content and iterating through gallery entries. mzitu also includes a simple analysis script that...

Downloads: 6 This Week

Last Update: 4 days ago
See Project
19

GitGet

Ever wanted to download only a part of a Git repository.

Ever wanted to download only a part of a Git repository. Just paste the URL of the repo you want to download and sit back and enjoy. This simple java application makes use of Web Scraping and downloads only those files you need, thus helping you save your precious bandwidth and space.

1 Review

Downloads: 1 This Week

Last Update: 2018-09-03
See Project
20

Twitter Intelligence

Twitter Intelligence OSINT project performs tracking and analysis

A project written in Python for Twitter tracking and analysis without using Twitter API. This project is a Python 3.x application. The package dependencies are in the file requirements.txt. Run that command to install the dependencies. SQLite is used as the database. Tweet data is stored on the Tweet, User, Location, Hashtag, HashtagTweet tables. The database is created automatically. analysis.py performs analysis processing. User, hashtag, and location analyzes are performed. You must write...

Downloads: 0 This Week

Last Update: 2023-04-12
See Project
21

WeChatSogou

Python library to crawl and retrieve data from WeChat accounts

WechatSogou is an open source Python library designed to retrieve data from WeChat official accounts by using the Sogou WeChat search service as its data source. It provides developers with a programmatic way to search for public accounts and collect article information without manually browsing the search interface. It functions as a crawler interface that sends requests to the search engine, retrieves results, and converts the returned pages into structured data that can be used in...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
22

pyspider

A powerful Spider(Web Crawler) system in Python

pyspider is a powerful Spider(Web Crawler) system in Python. Components are connected by message queue. Every component, including message queue, is running in their own process/thread, and replaceable. That means, when process is slow, you can have many instances of processor and make full use of multiple CPUs, or deploy to multiple machines. This architecture makes pyspider really fast. benchmarking. Since pyspider has various components, you can just run pyspider to start a standalone and...

Downloads: 0 This Week

Last Update: 2021-03-31
See Project
23

crawler4j

Open source web crawler for Java

crawler4j is an open source web crawler for Java which provides a simple interface for crawling the Web. Using it, you can setup a multi-threaded web crawler in few minutes. You need to create a crawler class that extends WebCrawler. This class decides which URLs should be crawled and handles the downloaded page. shouldVisit function decides whether the given URL should be crawled or not. In the above example, this example is not allowing .css, .js and media files and only allows pages...

Downloads: 1 This Week

Last Update: 2022-01-12
See Project
24

haipproxy

Distributed proxy IP pool for web crawlers using Scrapy and Redis

HAipproxy is a distributed proxy IP pool system designed to collect, manage, and provide large numbers of proxy addresses for web crawling tasks. It automatically crawls proxy resources from the internet and aggregates them into a centralized pool that can be accessed by distributed spiders and scraping systems. It is built using Python and relies on Scrapy for high-performance crawling while Redis is used for data storage, communication, and task coordination between components. It includes...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
25

jd-autobuy

Python tool that automates JD.com login and product purchase tasks

jd-autobuy is an open source Python-based automation tool designed to simulate the purchasing process on the JD e-commerce platform. It uses web scraping and HTTP request techniques to log into an account, check product availability, and attempt to purchase specified items automatically. It supports login through methods such as QR code authentication, allowing users to sign in through the platform’s mobile application. Once authenticated, the script can retrieve product details including...

Downloads: 5 This Week

Last Update: 4 days ago
See Project