Showing 20 open source projects for "command-line"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Automate contact and company data extraction Icon
    Automate contact and company data extraction

    Build lead generation pipelines that pull emails, phone numbers, and company details from directories, maps, social platforms. Full API access.

    Generate leads at scale without building or maintaining scrapers. Use 10,000+ ready-made tools that handle authentication, pagination, and anti-bot protection. Pull data from business directories, social profiles, and public sources, then export to your CRM or database via API. Schedule recurring extractions, enrich existing datasets, and integrate with your workflows.
    Explore Apify Store
  • 1
    Trafilatura

    Trafilatura

    Python & command-line tool to gather text on the Web

    Trafilatura is a Python package and command-line tool designed to gather text on the Web. It includes discovery, extraction and text-processing components. Its main applications are web crawling, downloads, scraping, and extraction of main texts, metadata and comments. It aims at staying handy and modular: no database is required, the output can be converted to various commonly used formats.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    dude uncomplicated data extraction

    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    ...The design, inspired by Flask, was to easily build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha. Please expect breaking changes. You can run your scraper from terminal/shell/command-line by supplying URLs, the output filename of your choice and the paths to your python scripts to dude scrape command.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    ScrapydWeb

    ScrapydWeb

    Web app for Scrapyd cluster management

    Web app for Scrapyd cluster management, with support for Scrapy log analysis & visualization. Make sure that Scrapyd has been installed and started on all of your hosts. Start ScrapydWeb via command scrapydweb. (a config file would be generated for customizing settings on the first startup.) Add your Scrapyd servers, both formats of string and tuple are supported, you can attach basic auth for accessing the Scrapyd server, as well as a string for grouping or labeling. You can select any number of Scrapyd servers by grouping and filtering, and then invoke the HTTP JSON API of Scrapyd on the cluster with just a few clicks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    JobFunnel

    JobFunnel

    Scrape job websites into a single spreadsheet with no duplicates.

    Scrape job websites into a single spreadsheet with no duplicates. Automated tool for scraping job postings into a .csv file. You can search for jobs with YAML configuration files or by passing command arguments. By performing regular scraping and reviewing, you can cut through the noise of even the busiest job markets. Run funnel with your settings YAML to populate your master CSV file with jobs from available providers. JobFunnel can be easily automated to run nightly with crontab. If you have a job website you'd like to write a scraper for, you are welcome to implement it, Review the Base Scraper for implementation details. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Atera all-in-one platform IT management software with AI agents Icon
    Atera all-in-one platform IT management software with AI agents

    Ideal for internal IT departments or managed service providers (MSPs)

    Atera’s AI agents don’t just assist, they act. From detection to resolution, they handle incidents and requests instantly, taking your IT management from automated to autonomous.
    Learn More
  • 5
    Web Scraping for Laravel

    Web Scraping for Laravel

    Laravel adapter for Roach, the complete web scraping toolkit for PHP

    ...Roach ships with an interactive shell (often called Read-Evaluate-Print-Loop, or Repl for short) which makes prototyping our spiders a breeze. We can use the provided roach:shell command to launch a new Repl session.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    miniblink49

    miniblink49

    Lighter, faster browser kernel of blink to integrate HTML UI in apps

    miniblink is an open source, one file, small browser widget based on chromium. By using C interface, you can create a browser with just some line code. miniblink is an open source, single-file, and currently the smallest known chromium-based browser control. Through its exported pure C interface, a browser control can be created in a few lines of code. C++, C#, Delphi and other language calls (support C++, C#, Delphi language to call). Embedded Nodejs, support electron (with Nodejs, can run electron). ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7

    twitch-batch-downloader

    Automate the download of entire Twitch.tv channels

    ...Save each Twitch video into its own folder, with date and time values, video ID, stream metadata, frame screenshot, .ts parts list and sha256 hash. Keep the original ts files and generate mp4 files from them. It requires a shell and some command line utilities. See README.md for details in the Code/git section.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 8
    Easyspider - Distributed Web Crawler

    Easyspider - Distributed Web Crawler

    Easy Spider is a distributed Perl Web Crawler Project from 2006

    Easy Spider is a distributed Perl Web Crawler Project from 2006. It features code from crawling webpages, distributing it to a server and generating xml files from it. The client site can be any computer (Windows or Linux) and the Server stores all data. Websites that use EasySpider Crawling for Article Writing...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    GitGet

    GitGet

    Ever wanted to download only a part of a Git repository.

    Ever wanted to download only a part of a Git repository. Just paste the URL of the repo you want to download and sit back and enjoy. This simple java application makes use of Web Scraping and downloads only those files you need, thus helping you save your precious bandwidth and space.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Grafana: The open and composable observability platform Icon
    Grafana: The open and composable observability platform

    Faster answers, predictable costs, and no lock-in built by the team helping to make observability accessible to anyone.

    Grafana is the open source analytics & monitoring solution for every database.
    Learn More
  • 10
    Twitter Intelligence

    Twitter Intelligence

    Twitter Intelligence OSINT project performs tracking and analysis

    A project written in Python for Twitter tracking and analysis without using Twitter API. This project is a Python 3.x application. The package dependencies are in the file requirements.txt. Run that command to install the dependencies. SQLite is used as the database. Tweet data is stored on the Tweet, User, Location, Hashtag, HashtagTweet tables. The database is created automatically. analysis.py performs analysis processing. User, hashtag, and location analyzes are performed. You must write Google Map API Key in setting.py to display Google Maps.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    phoneutria
    A Java Web crawler: multi-threaded, scalable, with high performance, extensible and polite. It can be used to crawl and index any web or enterprise domain and is configurable through a XML configuration file.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Yet another web crawler? Yes, but this ones uses the full power of regular expressions to accept or reject, examine or ignore, save or refuse pages. You also use MIME types to do all this. Powerful and flexible.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13

    PGBuild

    Compile your mobile web pages into mobile aps via build.phonegap.com

    PGbuild is a Phonegap development system that automates the development process by connecting your CMS/web server with the online service [Phonegap Build](http://build.phonegap.com). PGBuild is essentially a web spider that make off-line versions of web pages. The off-line version is zippped and send to the Phonegap Build service. The spider is controlled by a project file that sets the rules for the spider and the options for the phonebap build service. You may create and manage your phonegap project source files manually on your webserver or use PGBuild to connect to a CMS system to extract content. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    ItSucks
    This project is a java web spider (web crawler) with the ability to download (and resume) files. It is also highly customizable with regular expressions and download templates. All backend functionalities are also available in a separate library.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 15
    ...It uses modified regEx to quickly write complex patterns to parse data out of a website. It contains a GUI tool for testing your configuration scripts and is fully automated through the command line
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    Other spiders has a limited link depth, follows links not randomized or are combined with heavy indexing machines. This spider will has not link depth limits, randomize next url, that will be checked for new urls.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    elk is a powerful open-source python based command-line web crawler that can recursively search for files and text on websites.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    WebNews Crawler is a specific web crawler (spider, fetcher) designed to acquire and clean news articles from RSS and HTML pages. It can do a site specific extraction to extract the actual news content only, filtering out the advertising and other cruft.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Robust featureful multi-threaded CLI web spider using apache commons httpclient v3.0 written in java. ASpider downloads any files matching your given mime-types from a website. Tries to reg.exp. match emails by default, logging all results using log4j.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Webhunter is a distributed, multi-threaded web crawler designed for both general indexing and crawling the web for focused content.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next