Showing 80 open source projects for "art-linux"

View related business solutions
  • Ship Agents Faster Icon
    Ship Agents Faster

    Transform your applications and workflows into powerful agentic systems at global scale.

    Gemini Enterprise Agent Platform lets you rapidly build, scale, govern and optimize production-ready agents grounded in your organization's data. The platform enables developers to build custom or pre-built agents for virtually any use case. New customers get $300 in free credits.
    Get Started Free
  • Compliant and Reliable File Transfers Backed by Top Security Certifications Icon
    Compliant and Reliable File Transfers Backed by Top Security Certifications

    Cerberus FTP Server delivers SOC 2 Type II certified security and FIPS 140-2 validated encryption.

    Stop relying on non-certified, legacy file transfer tools that creak under the weight of modern security demands. Get full audit trails, advanced access controls and more supported by an award-winning team of experts. Start your free 25-day trial today.
    Start Free Trial
  • 1
    diskover-community

    diskover-community

    Open source file indexing & storage analytics powered by Elasticsearch

    Diskover Community Edition is an open source file system indexing and storage analytics platform designed to help organizations understand and manage large volumes of file data. It crawls file systems and indexes metadata using Elasticsearch, enabling fast search, analysis, and organization of files stored across different storage systems. It allows administrators and users to explore file structures, monitor storage usage, and gain insights into how data is distributed across...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    Letterboxd Recommendations

    Letterboxd Recommendations

    Scraping publicly-accessible Letterboxd data for movie recommendations

    Scraping publicly-accessible Letterboxd data and creating a movie recommendation model with it that can generate recommendations when provided with a Letterboxd username. A user's "star" ratings are scraped from their Letterboxd profile and assigned numerical ratings from 1 to 10 (accounting for half stars). Their ratings are then combined with a sample of ratings from the top 4000 most active users on the site to create a collaborative filtering recommender model using singular value...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    changedetection.io

    changedetection.io

    The best free open source website change detection and restock service

    Loved by smart shoppers, data journalists, research engineers, data scientists, security researchers, and more. From simply monitoring website pages that have a change (such as watching prices, and restocking notifications), to deep inspection such as PDF text support, JSON and XML monitoring, and extensive text triggers. Monitor out-of-stock products and get alerts when those products are back in stock, get restock alerts via Discord, Slack, email, and many other platforms. Using the...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    ScrapeGraphAI

    ScrapeGraphAI

    Python scraper based on AI

    Extracting content from websites and local documents using LLM. ScrapeGraphAI is a web scraping python library that uses LLM and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.). Just say which information you want to extract and the library will do it for you.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 5
    Selectolax

    Selectolax

    Python binding to Modest and Lexbor engines

    A fast HTML5 parser with CSS selectors using Modest and Lexbor engines. Selectolax supports two backends: Modest and Lexbor. By default, all examples use the Modest backend. Most of the features between backends are almost identical, but there are still some differences. Currently, the Lexbor backend is in beta and missing some of the features. To use lexbor, just import the parser and use it in the similar way to the HTMLParser.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    MechanicalSoup

    MechanicalSoup

    A Python library for automating interaction with websites

    A Python library for automating interaction with websites. MechanicalSoup automatically stores and sends cookies, follows redirects, and can follow links and submit forms. It doesn't do JavaScript. MechanicalSoup was created by M Hickford, who was a fond user of the Mechanize library. Unfortunately, Mechanize was incompatible with Python 3 until 2019 and its development stalled for several years. MechanicalSoup provides a similar API, built on Python giants Requests (for HTTP sessions) and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    furl

    furl

    The easiest way to parse and modify URLs in Python

    furl is a small Python library that makes it easier to parse, inspect, build, and modify URLs without relying on tedious low-level URL handling code. It wraps URL components into convenient objects, so developers can work directly with schemes, usernames, passwords, hosts, ports, paths, queries, and fragments. The library supports simple path editing, query argument changes, fragment manipulation, inline method chaining, and URL joining. It also handles encoding automatically, including...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    ScrapydWeb

    ScrapydWeb

    Web app for Scrapyd cluster management

    Web app for Scrapyd cluster management, with support for Scrapy log analysis & visualization. Make sure that Scrapyd has been installed and started on all of your hosts. Start ScrapydWeb via command scrapydweb. (a config file would be generated for customizing settings on the first startup.) Add your Scrapyd servers, both formats of string and tuple are supported, you can attach basic auth for accessing the Scrapyd server, as well as a string for grouping or labeling. You can select any...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    crawler

    crawler

    Collection of JS reverse engineering examples for web scraping study

    crawler is a collection of web scraping and JavaScript reverse engineering examples designed for learning how modern websites protect their data and how those protections can be analyzed. It contains many case studies that demonstrate how to analyze and replicate request parameters, cookies, and encryption logic used by real websites. Each directory in the project focuses on a specific target service or scenario, showing how browser network requests and JavaScript code can be studied to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 10
    Grab Framework Project

    Grab Framework Project

    Web Scraping Framework

    Grab is a python framework for building web scrapers. With Grab you can build web scrapers of various complexity, from simple 5-line scripts to complex asynchronous website crawlers processing millions of web pages. Grab provides an API for performing network requests and for handling the received content e.g. interacting with DOM tree of the HTML document. The single request/response API that allows you to build network request, perform it and work with the received content. The API is...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Scweet

    Scweet

    Scrape tweets, profiles, followers and following from Twitter/X

    Scweet is a Python-based Twitter/X scraping library and CLI designed to collect tweets, profile timelines, followers, following lists, and user profile data without requiring the official Twitter/X API or a developer account. Instead of depending on deprecated unauthenticated scraping methods, it works by using X’s web GraphQL API together with authenticated browser cookies, which gives it a more current and practical approach for data extraction. The project supports a broad set of...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    newspaper4k

    newspaper4k

    Python library for scraping and analyzing online news articles easily

    Newspaper4k is a Python library designed for extracting, processing, and analyzing news articles from websites. It is a continuation and active fork of the original newspaper3k library, which had stopped receiving updates, with the goal of keeping the ecosystem maintained while adding improvements and bug fixes. It provides developers with tools to automatically download web pages, extract the main article content, and collect associated metadata such as titles, authors, images, and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    tumblr-crawler

    tumblr-crawler

    Python crawler to download photos and videos from Tumblr blogs

    tumblr-crawler is an open source Python-based utility designed to download media content from Tumblr blogs. It provides a script that automatically retrieves photos and videos from specified Tumblr sites and saves them locally for offline access. Users can specify one or multiple blogs to crawl by editing a configuration file or by passing parameters through the command line. Once executed, the script fetches media from the Tumblr API and stores the downloaded files in folders named after...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    JobFunnel

    JobFunnel

    Scrape job websites into a single spreadsheet with no duplicates.

    Scrape job websites into a single spreadsheet with no duplicates. Automated tool for scraping job postings into a .csv file. You can search for jobs with YAML configuration files or by passing command arguments. By performing regular scraping and reviewing, you can cut through the noise of even the busiest job markets. Run funnel with your settings YAML to populate your master CSV file with jobs from available providers. JobFunnel can be easily automated to run nightly with crontab. If you...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    dxy-covid-19-crawler

    dxy-covid-19-crawler

    Realtime crawler for COVID-19 outbreak statistics from DXY data

    DXY-COVID-19-Crawler is a Python-based project designed to collect real-time COVID-19 infection data from the public dataset provided by Ding Xiang Yuan (DXY). The crawler periodically retrieves pandemic statistics and stores them in a database so that historical changes in the outbreak can be preserved and analyzed later. It was created to make up-to-date infection data more accessible for developers, researchers, and analysts who wanted to build visualizations or conduct data analysis...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    news-please

    news-please

    Python tool for crawling and extracting structured data from news site

    news-please is an open source news crawler and information extraction tool designed to collect and structure articles from online news websites. It provides an integrated pipeline that crawls news sites, retrieves article pages, and extracts structured information such as headlines, authors, publication dates, and article text. news-please can recursively follow internal links and read RSS feeds to gather both recent and archived articles from a news outlet when given only the root URL of a...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Kemono Downloader

    Kemono Downloader

    Kemono Downloader - A cross-platform Python app built with PyQt6

    Welcome to Kemono Downloader, a versatile Python-based desktop application built with PyQt6, designed to download content from Kemono.su. This tool enables users to archive individual posts or entire creator profiles from services like Patreon, Fanbox, and more, supporting a wide range of file types with customizable settings and advanced features.
    Leader badge
    Downloads: 1,046 This Week
    Last Update:
    See Project
  • 18
    bilibili-manga-downloader

    bilibili-manga-downloader

    Download and manage Bilibili Manga chapters with GUI downloader

    BiliBili-Manga-Downloader is an open source desktop application designed to download manga chapters from the Bilibili Manga platform for offline reading and local management. It was created to address limitations of the web reading experience, such as intrusive advertisements, inconvenient image zooming, and inconsistent navigation during reading sessions. It provides a graphical user interface that allows users to search for manga titles using keywords, view detailed information about...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 19
    autocrawler

    autocrawler

    Multiprocess Selenium crawler for downloading images by keywords

    AutoCrawler is a Python-based image crawling tool designed to automatically download large numbers of images from search engines using automated browser interaction. It uses Selenium and a Chrome browser driver to navigate image search pages and collect image sources based on keywords provided by the user. AutoCrawler supports multiprocess and multithreaded downloading, which allows it to retrieve images faster by running several tasks simultaneously. Users provide search terms through a...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 20
    spider_collection

    spider_collection

    Collection of Python web scraping scripts for data extraction tasks

    spider_collection is a collection of Python web crawler scripts created primarily for experimentation, learning, and practical scraping tasks. spider_collection gathers multiple independent spiders designed to collect data from different platforms and services, demonstrating a variety of scraping techniques and workflows. These crawlers make use of common Python scraping tools such as requests, parsel, BeautifulSoup, and the Scrapy framework to extract structured information from web pages....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    dude uncomplicated data extraction

    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, was to easily build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha. Please expect breaking changes. You can run your scraper from terminal/shell/command-line by supplying URLs, the output filename of your choice and the paths to your python scripts to dude scrape command.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    bilili

    bilili

    Command-line Bilibili video and danmaku downloader with batch support

    bilili is a command-line tool designed to download videos and related content from the Bilibili video platform. It focuses on enabling users to retrieve user-uploaded videos as well as serialized content such as bangumi episodes directly from the terminal environment. It provides automated downloading capabilities that handle video streams and associated data efficiently while minimizing manual interaction. bilili supports retrieving both the video files and danmaku comments, which are the...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 23
    dirhunt

    dirhunt

    Web crawler that finds hidden web directories without brute force

    Dirhunt is an open source security tool designed to discover web directories and analyze website structures without relying on brute-force techniques. Instead of sending large numbers of guess-based requests, it operates as a specialized crawler that intelligently explores websites to identify accessible or hidden directories. Dirhunt can detect directories that expose “Index Of” listings, which may reveal files and other resources that were not intended to be publicly visible. It can also...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    Crawlab

    Crawlab

    Distributed web crawler admin platform for spiders management

    Golang-based distributed web crawler management platform, supporting various languages including Python, NodeJS, Go, Java, PHP and various web crawler frameworks including Scrapy, Puppeteer, Selenium. Please use docker-compose to one-click to start up. By doing so, you don't even have to configure MongoDB database. The frontend app interacts with the master node, which communicates with other components such as MongoDB, SeaweedFS and worker nodes. Master node and worker nodes communicate...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Gerapy

    Gerapy

    Distributed Crawler Management Framework Based on Scrapy

    Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Scrapyd-Client, Scrapyd-API, Django and Vue.js. Someone who has worked as a crawler with Python may use Scrapy. Scrapy is indeed a very powerful crawler framework. It has high crawling efficiency and good scalability. It is basically a necessary tool for developing crawlers using Python. If you use Scrapy as a crawler, then of course we can use our own host to crawl when crawling, but when the crawl is very large, we can’t...
    Downloads: 0 This Week
    Last Update:
    See Project
Auth0 Logo