Showing 352 open source projects for "data analysis"

View related business solutions
  • $300 Free Credits for Your Google Cloud Projects Icon
    $300 Free Credits for Your Google Cloud Projects

    Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

    Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    a-stock-data

    a-stock-data

    Full-stack China A-Share data toolkit for AI coding assistants

    ...It is designed to work with tools such as Claude Code, Codex, and OpenClaw through context-injected Markdown and embedded Python. The repository emphasizes tested endpoints and practical usability for agent-driven financial analysis workflows. Its main value is turning fragmented A-share data sources into a unified toolset for AI-assisted research and coding.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    Profile Data

    Profile Data

    Analyze computation-communication overlap in V3/R1

    profile-data is a repository that publishes profiling traces and metrics from DeepSeek’s training and inference infrastructure (especially during DeepSeek-V3 / R1 experiments). The profiling data targets insights into computation-communication overlap, pipeline scheduling (e.g. DualPipe), and how MoE / EP / parallelism strategies interact in real systems. The repository contains JSON trace files like train.json, prefill.json, decode.json, and associated assets. Users can load them into tools...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Book6_First-Course-in-Data-Science

    Book6_First-Course-in-Data-Science

    From Addition, Subtraction, Multiplication, and Division to ML

    ...The goal of the project is to make complex topics such as statistics, algorithms, and data analysis more accessible to learners by breaking concepts into clear explanations supported by code examples and diagrams. The material emphasizes a learning approach that combines theoretical knowledge with hands-on experimentation, often recommending interactive tools such as Jupyter notebooks to explore the ideas presented in the book.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    FinMind

    FinMind

    Open Data, more than 50 financial data

    In the era of big data, data is the foundation of everything. We collect more than 50 kinds of Taiwan stock related information and provide download, online analysis, and backtesting. Regardless of the program, you can download data through the api provided by FinMind, or you can download data directly from the website. After data is available, statistical analysis, regression analysis, time series analysis, machine learning, and deep learning can be performed. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 5
    ROOT

    ROOT

    Analyzing, storing and visualizing big data, scientifically

    ...ROOT provides a very efficient storage system for data models, that demonstrated to scale at the Large Hadron Collider experiments: Exabytes of scientific data are written in columnar ROOT format. ROOT comes with histogramming capabilities in an arbitrary number of dimensions, curve fitting, statistical modeling, and minimization, to allow the easy setup of a data analysis system that can query and process the data interactively or in batch mode, as well as a general parallel processing framework, RDataFrame, that can considerably speed up an analysis.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 6
    ExtractThinker

    ExtractThinker

    ExtractThinker is a Document Intelligence library for LLMs

    ExtractThinker is a tool designed to facilitate the extraction and analysis of information from various data sources, aiding in data processing and knowledge discovery.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    DeepAnalyze

    DeepAnalyze

    Autonomous LLM agent for end-to-end data science workflows

    DeepAnalyze is an open source project that introduces an agentic large language model designed to perform autonomous data science tasks from start to finish. It is built to handle the entire data science pipeline, including data preparation, analysis, modeling, visualization, and report generation without requiring continuous human guidance. DeepAnalyze is capable of conducting open-ended data research across multiple data formats such as structured tables, semi-structured files, and unstructured text, enabling flexible and comprehensive analysis workflows. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    WeChatMsg

    WeChatMsg

    Project aimed at extracting, exporting, and analyzing chat records

    WeChatMsg repository hosts an open-source project aimed at extracting, exporting, and analyzing chat records from the WeChat messaging platform. It provides tools that read local WeChat database files and allow users to convert chat data into readable formats such as HTML, Word, and CSV, making it possible to inspect conversations outside the mobile app environment. Beyond simple export, the project includes mechanisms for analyzing chat histories and generating annual reports or visual...
    Downloads: 302 This Week
    Last Update:
    See Project
  • 9
    DeerFlow

    DeerFlow

    Deep Research framework, combining language models with tools

    DeerFlow is an open-source, community-driven “deep research” framework / multi-agent orchestration platform developed by ByteDance. It aims to combine the reasoning power of large language models (LLMs) with automated tool-use — such as web search, web crawling, Python execution, and data processing — to enable complex, end-to-end research workflows. Instead of a monolithic AI assistant, DeerFlow defines multiple specialized agents (e.g. “planner,” “searcher,” “coder,” “report generator”) that collaborate in a structured workflow, allowing tasks like literature reviews, data gathering, data analysis, code execution, and final report generation to be largely automated. ...
    Downloads: 91 This Week
    Last Update:
    See Project
  • Stop Storing Third-Party Tokens in Your Database Icon
    Stop Storing Third-Party Tokens in Your Database

    Auth0 Token Vault handles secure token storage, exchange, and refresh for external providers so you don't have to build it yourself.

    Rolling your own OAuth token storage can be a security liability. Token Vault securely stores access and refresh tokens from federated providers and handles exchange and renewal automatically. Connected accounts, refresh exchange, and privileged worker flows included.
    Try Auth0 for Free
  • 10
    DeepBI

    DeepBI

    LLM based data scientist, AI native data application

    DeepBI is an AI-native data analysis platform. DeepBI leverages the power of large language models to explore, query, visualize, and share data from any data source. Users can use DeepBI to gain data insight and make data-driven decisions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    AI-Trader

    AI-Trader

    100% Fully-Automated Agent-Native Trading

    AI-Trader is an open-source AI-powered quantitative trading framework designed to combine financial analysis, machine learning, and autonomous trading workflows into a unified research platform. The project integrates large language models, financial indicators, market analysis pipelines, and automated decision-making systems to support strategy generation and market prediction tasks. It is built to help researchers and developers experiment with AI-assisted trading strategies using historical and real-time financial data. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 12
    scikit-learn

    scikit-learn

    Machine learning in Python

    scikit-learn is an open source Python module for machine learning built on NumPy, SciPy and matplotlib. It offers simple and efficient tools for predictive data analysis and is reusable in various contexts.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 13
    Matter AI

    Matter AI

    Matter AI is open-source AI Code Reviewer Agent

    Matter AI is an AI-powered platform designed to enhance productivity through automated content generation, data analysis, and decision support. It leverages machine learning models to process text, analyze patterns, and generate insights, making it suitable for businesses looking to optimize data-driven decision-making. Matter AI integrates with various data sources and provides customizable AI workflows tailored to different industries.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Quantitative Trading System

    Quantitative Trading System

    A comprehensive quantitative trading system with AI-powered analysis

    Quantitative Trading System is a comprehensive quantitative trading platform that integrates artificial intelligence, financial data analysis, and automated strategy execution within a unified software system. The project is designed to provide an end-to-end infrastructure for building and operating algorithmic trading strategies in financial markets. It includes tools for collecting and processing market data from multiple sources, performing statistical and machine learning analysis, and generating trading signals based on quantitative models. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    Discourse Network Analyzer (DNA)

    Discourse Network Analyzer (DNA)

    Discourse Network Analyzer (DNA)

    The Java software Discourse Network Analyzer (DNA) is a qualitative content analysis tool with network export facilities. You import text files and annotate statements that persons or organizations make, and the program will return network matrices of actors connected by shared concepts.
    Downloads: 25 This Week
    Last Update:
    See Project
  • 16
    MCP BigQuery Server

    MCP BigQuery Server

    A Model Context Protocol (MCP) server that provides secure

    This MCP server provides secure, read-only access to BigQuery datasets, enabling large language models (LLMs) to safely query and analyze data through a standardized interface. ​
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    BambooAI

    BambooAI

    A Python library powered by Language Models (LLMs)

    BambooAI is a Python library powered by large language models (LLMs) for conversational data discovery and analysis, allowing users to interact with data through natural language.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Mlxtend

    Mlxtend

    A library of extension and helper modules for Python's data analysis

    Mlxtend (machine learning extensions) is a Python library of useful tools for day-to-day data science tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    tidytext

    tidytext

    Text mining using tidy tools

    tidytext brings tidy data principles to text mining by converting text into a tidy data frame format. It provides tools for tokenization, sentiment analysis, n‑gram creation, and term‑document matrices, enabling interoperability with dplyr, ggplot2, and other tidyverse workflows.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    MEDIUM_NoteBook

    MEDIUM_NoteBook

    Repository containing notebooks of my posts on Medium

    MEDIUM_NoteBook is an open-source repository that contains a collection of Jupyter notebooks and code examples originally developed to accompany technical articles published on Medium. The project provides practical demonstrations of machine learning algorithms, data analysis workflows, and visualization techniques. Each notebook typically focuses on explaining a specific concept through step-by-step examples that combine explanatory text, code, and visual outputs. The repository covers a wide variety of data science topics such as predictive modeling, data preprocessing, statistical analysis, and feature engineering. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Quadratic

    Quadratic

    Data science spreadsheet with Python & SQL

    Quadratic enables your team to work together on data analysis to deliver better results, faster. You already know how to use a spreadsheet, but you’ve never had this much power before. Quadratic is a Web-based spreadsheet application that runs in the browser and as a native app (via Electron). Our goal is to build a spreadsheet that enables you to pull your data from its source (SaaS, Database, CSV, API, etc) and then work with that data using the most popular data science tools today (Python, Pandas, SQL, JS, Excel Formulas, etc). ...
    Downloads: 21 This Week
    Last Update:
    See Project
  • 22
    tslearn

    tslearn

    The machine learning toolkit for time series analysis in Python

    The machine learning toolkit for time series analysis in Python. tslearn expects a time series dataset to be formatted as a 3D numpy array. The three dimensions correspond to the number of time series, the number of measurements per time series and the number of dimensions respectively (n_ts, max_sz, d). In order to get the data in the right format.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    DataDreamer

    DataDreamer

    DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models

    DataDreamer is a tool designed to assist in the generation and manipulation of synthetic data for various applications, including testing and machine learning.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    DataProfiler

    DataProfiler

    Extract schema, statistics and entities from datasets

    DataProfiler is an AI-powered tool for automatic data analysis and profiling, designed to detect patterns, anomalies, and schema inconsistencies in structured and unstructured datasets. The DataProfiler is a Python library designed to make data analysis, monitoring, and sensitive data detection easy. Loading Data with a single command, the library automatically formats & loads files into a DataFrame.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    FinGPT

    FinGPT

    Open-Source Financial Large Language Models

    FinGPT is an open-source, finance-specialized large language model framework that blends the capabilities of general LLMs with real-time financial data feeds, domain-specific knowledge bases, and task-oriented agents to support market analysis, research automation, and decision support. It extends traditional GPT-style models by connecting them to live or historical financial datasets, news APIs, and economic indicators so that outputs are grounded in relevant and recent market conditions rather than generic knowledge alone. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
Auth0 Logo