Showing 24 open source projects for "gitst web crawler"

View related business solutions
  • Auth0 B2B Essentials: SSO, MFA, and RBAC Built In Icon
    Auth0 B2B Essentials: SSO, MFA, and RBAC Built In

    Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.

    Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.
    Sign Up Free
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 1
    crawler

    crawler

    Collection of JS reverse engineering examples for web scraping study

    crawler is a collection of web scraping and JavaScript reverse engineering examples designed for learning how modern websites protect their data and how those protections can be analyzed. It contains many case studies that demonstrate how to analyze and replicate request parameters, cookies, and encryption logic used by real websites. Each directory in the project focuses on a specific target service or scenario, showing how browser network requests and JavaScript code can be studied to reproduce API calls programmatically. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    EasySpider

    EasySpider

    A visual no-code/code-free web crawler/spider

    A visual code-free/no-code web crawler/spider, supporting both Chinese and English.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 3
    katana

    katana

    Fast CLI web crawler for discovering endpoints in modern web apps

    Katana is an open source command-line web crawling and spidering framework developed by ProjectDiscovery. It is designed to efficiently crawl websites and web applications in order to discover endpoints, resources, and other useful information that may not be easily visible through manual browsing. Katana focuses on speed and automation, making it suitable for use in security reconnaissance workflows and automated pipelines. Katana supports both standard HTTP crawling and headless browser...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    fess

    fess

    Open source enterprise search server for websites, files, and data

    ...Fess is built on top of OpenSearch and offers an integrated solution for crawling, indexing, and searching documents from websites, file systems, and various data stores. Fess includes a built-in crawler that can collect content from sources such as databases, CSV files, and shared storage, making it suitable for centralized knowledge discovery. It supports indexing and searching across many document formats including office documents, PDFs, and compressed archives. It also provides a web-based administrative interface that allows administrators to configure crawling targets, manage indexing tasks, and adjust search settings from a graphical dashboard.
    Downloads: 3 This Week
    Last Update:
    See Project
  • Powerful App Monitoring Without Surprise Bills Icon
    Powerful App Monitoring Without Surprise Bills

    AppSignal starts at $23/month with all features included. No overages, no hidden fees. 30-day free trial.

    Tired of monitoring tools that punish you for scaling? AppSignal offers transparent, predictable pricing with every feature unlocked on every plan. Track errors, monitor performance, detect anomalies, and manage logs across Ruby, Python, Node.js, and more. Trusted by developers since 2012 with free dev-to-dev support. No credit card required to start your 30-day trial.
    Try AppSignal Free
  • 5
    spider_collection

    spider_collection

    Collection of Python web scraping scripts for data extraction tasks

    spider_collection is a collection of Python web crawler scripts created primarily for experimentation, learning, and practical scraping tasks. spider_collection gathers multiple independent spiders designed to collect data from different platforms and services, demonstrating a variety of scraping techniques and workflows. These crawlers make use of common Python scraping tools such as requests, parsel, BeautifulSoup, and the Scrapy framework to extract structured information from web pages. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    douyin

    douyin

    Open source Douyin crawler for collecting and downloading public data

    DouyinCrawler is an open source data collection tool designed to gather publicly available information from the Douyin platform. It demonstrates how to build a Python-based web crawler combined with a graphical interface and command line functionality. It allows users to collect data from various types of Douyin content, including user profiles, videos, hashtags, and music pages. DouyinCrawler supports both automated scraping and batch operations to process multiple targets efficiently. It also integrates with the Aria2 download utility to enable large-scale downloading of videos and images associated with collected content. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 7
    diskover-community

    diskover-community

    Open source file indexing & storage analytics powered by Elasticsearch

    Diskover Community Edition is an open source file system indexing and storage analytics platform designed to help organizations understand and manage large volumes of file data. It crawls file systems and indexes metadata using Elasticsearch, enabling fast search, analysis, and organization of files stored across different storage systems. It allows administrators and users to explore file structures, monitor storage usage, and gain insights into how data is distributed across...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    ScrapBot 1.40 64bits

    ScrapBot 1.40 64bits

    Task automation software for accessing and manipulating website data.

    ScrapBot is a task automation software that allows you to access, authenticate, extract, and insert data on any website. The software utilizes JavaScript to execute tasks, eliminating the need for server or additional software installations. The system can control the accessed webpage through JavaScript, and the entire navigation can be viewed in the program window. The main.js script runs in a separate frame from the navigation frame but can access all page content without any restrictions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    magnetW

    magnetW

    Magnet link aggregation search

    magnetW is based on the rule principle of magnetX , the search results of each magnetic station are uniformly formatted. There is no group in this project, only Github for code hosting and related technical exchanges, and other addresses may be risky, please distinguish carefully. This project is open source and free. There are no collection channels of any kind, such as donations, and no advertising of any kind. If you encounter anything similar to the above situation, please don't believe...
    Downloads: 4 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 10
    X-RAY

    X-RAY

    The next web scraper, see through the <html> noise

    Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing. The API is entirely composable, giving you great flexibility in how you scrape each page. Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    pxer

    pxer

    Pixiv crawler userscript for downloading artwork and galleries easily

    Pxer is an open source tool designed to help users collect and download artwork from the Pixiv illustration platform. It is implemented primarily in client-side JavaScript and runs directly in the browser through a userscript environment, allowing it to integrate seamlessly with Pixiv pages. Pxer provides functionality to crawl and gather images, artwork metadata, and other related content from supported Pixiv pages. It is designed to be accessible even for users who are not developers,...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 12
    diskover

    diskover

    File system crawler and disk space usage software

    diskover is a file system crawler and disk space usage software that uses Elasticsearch to index your file metadata. diskover crawls and indexes your files on a local computer or remote storage server over network mounts. diskover helps manage your storage by identifying old and unused files and give better insights into data change "hotfiles", file duplication "dupes" and wasted space. It is designed to help deal with managing large amounts of data growth and provide detailed storage...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    lightcrawler

    lightcrawler

    Website crawler that audits site pages automatically with Lighthouse

    ...It works by starting from a given URL and recursively exploring linked pages to collect a set of pages that should be analyzed. Each discovered page is then evaluated using Lighthouse, which performs checks related to performance, accessibility, and web development best practices. This allows developers to audit multiple pages of a site automatically instead of manually running Lighthouse on each individual page. Lightcrawler supports configuration through a JSON configuration file, enabling users to customize how the crawler operates and which Lighthouse audits should be executed. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    Catberry

    Catberry

    Catberry is an isomorphic framework

    Catberry is an isomorphic framework for building universal front-end apps using components, Flux architecture and progressive rendering. Catberry builds a bundle for running the application in a browser as a Single Page Application. Cat-Components – similar to web-components but organized as directories, can be rendered on the server and published/installed as NPM packages. The entire architecture of the framework is built using the Service Locator pattern, which helps to manage module dependencies and create plugins, and Flux, for the data layer. Search crawler receives a full page from the server. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    OpenWebSpider
    OpenWebSpider is an Open Source multi-threaded Web Spider (robot, crawler) and search engine with a lot of interesting features!
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16
    Node Crawler

    Node Crawler

    Web Crawler/Spider for NodeJS + server-side jQuery

    Most powerful, popular and production crawling/scraping package for Node, happy hacking.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Zoozle Search & Download Suchmaschine

    Zoozle Search & Download Suchmaschine

    Zoozle 2008 - 2010 Webpage, Tools and SQL Files

    Download search engine and directory with Rapidshare and Torrent - zoozle Download Suchmaschine All The files that run the World Leading German Download Search Engine in 2010 with 500 000 unique visitors a day - all the tools you need to set up a clone. Code Contains: - PHP Files for zoozle - Perl Crawler for gathering new content to database and all other cool tools i have...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    ** Guys I have built a much more powerful Fully Featured CMS system at: https://github.com/MacdonaldRobinson/FlexDotnetCMS Macs CMS is a Flat File ( XML and SQLite ) based AJAX Content Management System. It focuses mainly on the Edit In Place editing concept. It comes with a built in blog with moderation support, user manager section, roles manager section, SEO / SEF URL
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Methanol is a scriptable multi-purpose web crawling system with an extensible configuration system and speed-optimized architectural design. Methabot is the web crawler of Methanol.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Lan-Crawler is a crawler and indexer of public network files shared via SMB shares (Windows shares and UNIX systems running Samba). Meta data is downloaded for films and music. A dynamic Web UI is provided for searching files.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    MyNewsGroups :) is a Web based USENET news crawler, news reader and news poster. With the use of a DB backend, the crawler fetch the newsgroups messages ONCE only. Web based environment, SPAM Filters, Search Engine, Subscriptions and much more.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22

    mornex

    a first-person dungeon crawler

    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Spider is web crawler written in the Java.Based on an Regular expression string the spider parses the internet for web pages matching this string and stores it in an MYSQL database.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    studiMaps is a web based application for visualization and analysis of social networks. It consists of two software components: a web-crawler for getting data and the web based application for visualization.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB