Showing 27 open source projects for "data analysis"

View related business solutions
  • $300 Free Credits for Your Google Cloud Projects Icon
    $300 Free Credits for Your Google Cloud Projects

    Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

    Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    dxy-covid-19-crawler

    dxy-covid-19-crawler

    Realtime crawler for COVID-19 outbreak statistics from DXY data

    DXY-COVID-19-Crawler is a Python-based project designed to collect real-time COVID-19 infection data from the public dataset provided by Ding Xiang Yuan (DXY). The crawler periodically retrieves pandemic statistics and stores them in a database so that historical changes in the outbreak can be preserved and analyzed later. It was created to make up-to-date infection data more accessible for developers, researchers, and analysts who wanted to build visualizations or conduct data analysis during the early stages of the pandemic. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    FinalRecon

    FinalRecon

    All-in-one Python web reconnaissance tool for fast target analysis

    FinalRecon is an all-in-one web reconnaissance tool written in Python that helps security professionals gather information about a target website quickly and efficiently. It combines multiple reconnaissance techniques into a single command-line utility so users do not need to run several separate tools to collect similar data. FinalRecon focuses on providing a fast overview of a web target while maintaining accuracy in the collected results. It includes modules for gathering server...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 3
    diskover-community

    diskover-community

    Open source file indexing & storage analytics powered by Elasticsearch

    Diskover Community Edition is an open source file system indexing and storage analytics platform designed to help organizations understand and manage large volumes of file data. It crawls file systems and indexes metadata using Elasticsearch, enabling fast search, analysis, and organization of files stored across different storage systems. It allows administrators and users to explore file structures, monitor storage usage, and gain insights into how data is distributed across infrastructure. By indexing file metadata from sources such as local file systems, network shares like NFS and SMB, and cloud storage, the tool provides a centralized way to analyze heterogeneous storage environments. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    Weibo Crawler

    Weibo Crawler

    Python crawler for collecting and downloading Sina Weibo user data

    ...It also captures detailed data about each post, including the content, publishing time, topics, mentions, likes, reposts, and comments. In addition to textual data, the project can download original media from posts, such as images, videos, and Live Photo content. Collected data can be exported to structured formats such as CSV or JSON or stored in databases for further analysis and research.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Your monitoring isn't a stack. It's a pile. Fix that. Icon
    Your monitoring isn't a stack. It's a pile. Fix that.

    Errors, performance, logs, uptime. One install, one invoice, one UI.

    Replace Datadog, New Relic, and Sentry without adding three more dashboards.
    Free 30 days.
  • 5
    watercrawl

    watercrawl

    AI-ready web crawler that extracts and structures website content

    WaterCrawl is an open source web crawling and data extraction platform designed to transform website content into structured data suitable for machine learning and AI workflows. It enables developers and researchers to crawl web pages, extract meaningful information, and convert it into formats that are easier to process and analyze. It provides a modern crawling system that can automatically navigate links, control crawl depth, and collect content from targeted sections of a website....
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    finvizfinance

    finvizfinance

    Finviz analysis python library

    finvizfinance is a package that collects financial information from FinViz website. Stock charts, fundamental & technical information, insider information and stock news. Forex charts and performance. Crypto charts and performance. Screener and Group provide data frames for comparing stocks according to different filters and trading signals. Getting information (fundament, description, outer rating, stock news, inside trader) of an individual stock.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 7
    Spider

    Spider

    High-performance Rust web crawler and scraper for large-scale data

    ...Spider can operate concurrently across many pages, allowing it to gather large datasets in a short period of time. Spider also provides mechanisms for subscribing to crawl events so developers can process page data such as URLs, status codes, or HTML content as it is discovered. It supports advanced capabilities such as headless browser rendering, background crawling tasks, and configurable rules that control crawl depth or ignored paths. These capabilities make the project suitable for building search indexers, data extraction pipelines, & SEO analysis tools.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    crawler

    crawler

    Collection of JS reverse engineering examples for web scraping study

    crawler is a collection of web scraping and JavaScript reverse engineering examples designed for learning how modern websites protect their data and how those protections can be analyzed. It contains many case studies that demonstrate how to analyze and replicate request parameters, cookies, and encryption logic used by real websites. Each directory in the project focuses on a specific target service or scenario, showing how browser network requests and JavaScript code can be studied to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    spider_collection

    spider_collection

    Collection of Python web scraping scripts for data extraction tasks

    ...In addition to raw data collection, some spiders include basic data processing and analysis using tools such as pandas and simple visualization with matplotlib. It also contains examples of proxy pool integration and encapsulation to support more reliable crawling when working with sites that enforce request limits.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Stop vibe-debugging. Icon
    Stop vibe-debugging.

    Plug Claude into your app's actual errors.

    AppSignal's MCP server hands Claude, Cursor, or Zed your real errors, traces, and the deploy that shipped them. AI writes the fix; you review the diff.
    Free 30 days.
  • 10
    NuzeBot

    NuzeBot

    Finds interesting news headlines.

    This is a bot to finds the news you want to see. It can be made to find the news that interests you and reject everything else. View on one page the most interesting headlines from many websites.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Catbird Linux

    Catbird Linux

    Linux for content creation, web scraping, coding, and data analysis.

    Catbird Linux is a USB pluggable Live Linux operating system built for media creation, web scraping, and software coding. It is the daily driver you want for retrieving data, making videos or podcasts, and making software tools to automate the repetitive tasks. It is ready for work in Python, Lua, and Go languages, with numerous packages for web scraping or downloading data via API calls. Using Catbird Linux, it is possible to accomplish in depth stock market analysis, track weather trends, follow social media sentiment, or do other tasks in data science. ...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 12
    mlscraper

    mlscraper

    ML-based HTML scraper that learns extraction rules from examples

    ...Once trained, the generated scraper can process new pages and return the extracted data in structured formats such as dictionaries or lists. This approach simplifies web scraping tasks by shifting the focus from rule-writing to example-based training. Internally, the project processes HTML documents, identifies relevant elements in the DOM, and builds extraction logic based on statistical or heuristic analysis of the training samples.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    lxspider

    lxspider

    Educational Python web scraping case collection for many sites

    lxSpider is a collection of web scraping examples designed primarily for learning and experimentation with data extraction techniques. It gathers numerous crawler implementations that demonstrate how to collect data from a wide range of websites and online services. It focuses heavily on practical cases that illustrate how different platforms handle requests, authentication parameters, and anti-scraping protections. lxSpider includes examples targeting areas such as e-commerce platforms, social media services, content sites, research databases, and information portals. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    ast-hook-forjs-re

    ast-hook-forjs-re

    AST-based JavaScript reverse engineering and variable tracing toolkit

    ast-hook-for-js-RE is an open source JavaScript reverse engineering toolkit designed to help analysts locate and understand client-side encryption logic used by web applications. It works by intercepting browser traffic through a local proxy server and modifying JavaScript code before it executes in the browser. Using Abstract Syntax Tree (AST) transformations, it injects hook functions into the code to monitor variable assignments and other runtime changes during execution. This allows...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    GoogleScraper

    GoogleScraper

    Python tool for scraping search engine results from many providers

    ...GoogleScraper also includes capabilities for running multiple scraping tasks concurrently to improve performance and increase the amount of collected data.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 16
    ECommerceCrawlers

    ECommerceCrawlers

    Collection of Python ecommerce and website crawler examples projects

    ...It aims to help developers understand the full workflow of web scraping, including request simulation, data extraction, storage, and handling anti-scraping techniques. It includes crawlers for platforms such as ecommerce marketplaces, blogging platforms, recruitment sites, and social networks, providing real-world practice scenarios. Developers can study the individual project documentation to understand the analysis process.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 17
    mzitu

    mzitu

    Python crawler that downloads image galleries and analyzes titles

    ...It focuses on automating the collection of large sets of images by programmatically parsing page content and iterating through gallery entries. mzitu also includes a simple analysis script that processes downloaded folder names to generate statistics and visualizations. Using text segmentation and frequency analysis, the project can create a word cloud representing common keywords found in the dataset. This makes the repository both a scraping example and a small data analysis experiment built around the collected content. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    Twitter Intelligence

    Twitter Intelligence

    Twitter Intelligence OSINT project performs tracking and analysis

    A project written in Python for Twitter tracking and analysis without using Twitter API. This project is a Python 3.x application. The package dependencies are in the file requirements.txt. Run that command to install the dependencies. SQLite is used as the database. Tweet data is stored on the Tweet, User, Location, Hashtag, HashtagTweet tables. The database is created automatically. analysis.py performs analysis processing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    WeChatSogou

    WeChatSogou

    Python library to crawl and retrieve data from WeChat accounts

    WechatSogou is an open source Python library designed to retrieve data from WeChat official accounts by using the Sogou WeChat search service as its data source. It provides developers with a programmatic way to search for public accounts and collect article information without manually browsing the search interface. It functions as a crawler interface that sends requests to the search engine, retrieves results, and converts the returned pages into structured data that can be used in applications or analysis pipelines. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Perl Web Scraping Project

    Perl Web Scraping Project

    Perl Web Scraping Project

    ...It is a form of copying, in which specific data is gathered and copied from the web, typically into a central local database or spreadsheet, for later retrieval or analysis. Web scraping a web page involves fetching it and extracting from it.[1][2] Fetching is the downloading of a page (which a browser does when you view the page). Therefore, web crawling is a main component of web scraping, to fetch pages for later processing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit for All of Us

    DSTK - DataScience ToolKit is an opensource free software for statistical analysis, data visualization, text analysis, and predictive analytics. Newer version and smaller file size can be found at: https://sourceforge.net/projects/dstk3/ It is designed to be straight forward and easy to use, and familar to SPSS user. While JASP offers more statistical features, DSTK tends to be a broad solution workbench, including text analysis and predictive analytics features. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    JAWS - Just Another Web Scraper

    JAWS - Just Another Web Scraper

    A simple Web Scraper using Regular Expression or Html Agility

    JAWS or Just Another Web Scraper, is part of the Data Scraping Softwares developed by SVbook, alongside JATI (Image to Text) and JAVT (Video to Text). JAWS offer easy interface to scrape data from the website using regular expression, text preprocessing, or HTML Agility Pack.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    Simple-Scrape is a simple web-scraping library that allows for programmatic access to HTML code. No further techniques are needed and the library is very compact and thus easy to use.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24

    newsscrape

    news headline collecting for analysis in determining the category

    newsscrape is web scraping for news headline to analyse on how it relates to a news category. - It extracts RSS feed from Google News. - Each news headline is matched against Google News category like Entertainment, Sports, etc. - Called from scheduler to collect this data at 5 minutes interval and be accumulated in a database. - It contains R statistical computing scripts to learn the pattern on words in the headline resulting a particular category. - To test its accuracy in...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25

    Job Crawler

    Job Data Collection - Web Crawler

    ...Moreover, program is going to reply on these figures, and performs a detailed analysis for the employment situation of the states of the USA. What is the hot job in your state? This report is going to explain how to design and implement solution for Job data collection system. It also includes some links for source code, class diagram, algorithm
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next
Auth0 Logo