Showing 420 open source projects for "data analysis and visualizing"

  • 1
    MNE-Python

    Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python

    MNE-Python is an open-source Python package for exploring, visualizing, and analyzing human neurophysiological data such as MEG, EEG, sEEG, ECoG, and more. It includes modules for data input/output, preprocessing, visualization, source estimation, time-frequency analysis, connectivity analysis, machine learning, statistics, and more.
    Downloads: 2 This Week
  • 2
    ROOT

    Analyzing, storing and visualizing big data, scientifically

    ...ROOT provides a very efficient storage system for data models that has been demonstrated to scale at the Large Hadron Collider experiments: exabytes of scientific data are written in columnar ROOT format. ROOT comes with histogramming capabilities in an arbitrary number of dimensions, curve fitting, statistical modeling, and minimization, allowing the easy setup of a data analysis system that can query and process the data interactively or in batch mode. It also provides a general parallel processing framework, RDataFrame, that can considerably speed up an analysis.
    Downloads: 14 This Week
  • 3
    VisualDL

    Deep Learning Visualization Toolkit

    VisualDL, the visualization and analysis tool for PaddlePaddle, provides a variety of charts to show parameter trends and visualizes model structures, data samples, tensor histograms, PR curves, ROC curves, and high-dimensional data distributions. It helps users understand the training process and the model structure more clearly and intuitively so that they can optimize models efficiently.
    Downloads: 3 This Week
  • 4
    Profile Data

    Analyze computation-communication overlap in V3/R1

    profile-data is a repository that publishes profiling traces and metrics from DeepSeek’s training and inference infrastructure (especially during DeepSeek-V3 / R1 experiments). The profiling data targets insights into computation-communication overlap, pipeline scheduling (e.g. DualPipe), and how MoE / EP / parallelism strategies interact in real systems. The repository contains JSON trace files like train.json, prefill.json, decode.json, and associated assets. Users can load them into tools...
    Downloads: 2 This Week
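    As a rough illustration of what "computation-communication overlap" means for trace data like this, the sketch below measures how much communication time is hidden behind compute. It assumes, purely for illustration, that events follow the Chrome Trace Event layout ("ph": "X" complete events with "ts" and "dur" in microseconds) and that communication kernels can be recognized by name. The keyword list and the sample events are hypothetical and are not taken from the repository.

```python
from typing import Dict, Iterable, List, Tuple

Interval = Tuple[float, float]  # (start, end) in microseconds


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlap_time(a: Iterable[Interval], b: Iterable[Interval]) -> float:
    """Total time during which an interval from a and one from b are both active."""
    a_m, b_m = merge_intervals(a), merge_intervals(b)
    i = j = 0
    total = 0.0
    while i < len(a_m) and j < len(b_m):
        lo = max(a_m[i][0], b_m[j][0])
        hi = min(a_m[i][1], b_m[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever interval finishes first.
        if a_m[i][1] < b_m[j][1]:
            i += 1
        else:
            j += 1
    return total


def split_events(events: Iterable[Dict], comm_keywords=("nccl", "all_to_all")):
    """Split complete ("X") events into compute and communication spans.

    The name-based classification is an assumption made for this sketch.
    """
    compute: List[Interval] = []
    comm: List[Interval] = []
    for ev in events:
        if ev.get("ph") != "X":
            continue  # skip metadata and other non-duration events
        span = (ev["ts"], ev["ts"] + ev["dur"])
        name = ev.get("name", "").lower()
        target = comm if any(k in name for k in comm_keywords) else compute
        target.append(span)
    return compute, comm


# Hypothetical events standing in for the contents of a trace file.
events = [
    {"ph": "X", "name": "gemm_kernel", "ts": 0, "dur": 100},
    {"ph": "X", "name": "nccl_all_to_all", "ts": 60, "dur": 80},
    {"ph": "X", "name": "attention_kernel", "ts": 120, "dur": 50},
    {"ph": "M", "name": "process_name"},
]

compute, comm = split_events(events)
hidden = overlap_time(compute, comm)
comm_total = sum(end - start for start, end in merge_intervals(comm))
overlap_ratio = hidden / comm_total
print(f"hidden communication: {hidden} us ({overlap_ratio:.0%} of communication time)")
```

    Merging intervals first avoids double counting when several kernels run concurrently on different streams.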
  • 5
    FiftyOne

    The open-source tool for building high-quality datasets

    ...FiftyOne provides the building blocks for optimizing your dataset analysis pipeline. Use it to get hands-on with your data, including visualizing complex labels, evaluating your models, exploring scenarios of interest, identifying failure modes, finding annotation mistakes, and much more! Surveys show that machine learning engineers spend over half of their time wrangling data, but it doesn't have to be that way.
    Downloads: 2 This Week
  • 6
    FinMind

    Open data: more than 50 financial datasets

    In the era of big data, data is the foundation of everything. We collect more than 50 kinds of Taiwan stock-related information and provide download, online analysis, and backtesting. From any program, you can download data through the API provided by FinMind, or you can download it directly from the website. Once the data is available, statistical analysis, regression analysis, time series analysis, machine learning, and deep learning can be performed. ...
    Downloads: 8 This Week
  • 7
    KaTrain

    Improve your Baduk skills by training with KataGo

    ...One of its key strengths is its ability to generate detailed post-game analyses, highlighting the moves that resulted in the greatest loss of points and suggesting improvements. KaTrain also includes interactive learning features such as retrying moves, exploring variations, and visualizing territory control probabilities.
    Downloads: 62 This Week
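    The idea of flagging the moves with the greatest loss of points can be sketched as follows. This is not KaTrain's implementation. It assumes a hypothetical list of score estimates, expressed as Black's lead in points, with one entry before the first move and one after each move. The function name and the sample numbers are made up for illustration.

```python
from typing import List, Tuple


def point_losses(black_lead: List[float], first_player: str = "B") -> List[Tuple[int, str, float]]:
    """Return (move_number, player, points_lost) for every move.

    black_lead[0] is the estimate before any move; black_lead[i] is the
    estimate after move i. A move loses points when the estimate shifts
    against the player who made it; gains are clamped to zero.
    """
    losses = []
    for i in range(1, len(black_lead)):
        black_moved = (i % 2 == 1) == (first_player == "B")
        delta = black_lead[i] - black_lead[i - 1]
        loss = -delta if black_moved else delta
        losses.append((i, "B" if black_moved else "W", max(0.0, loss)))
    return losses


# Hypothetical score estimates for a five-move sequence.
estimates = [0.0, 0.5, -1.0, -4.0, -3.5, -3.5]
losses = point_losses(estimates)

# The two costliest moves, largest loss first.
worst = sorted(losses, key=lambda item: item[2], reverse=True)[:2]
for move, player, lost in worst:
    print(f"move {move} ({player}) lost {lost} points")
```

    Clamping gains to zero keeps the ranking focused on mistakes rather than on moves that merely exploited an opponent's error.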
  • 8
    airda

    airda (Air Data Agent)

    airda (Air Data Agent) is a multi-agent system for data analysis. It can understand data development and data analysis requirements, understand data, and generate SQL and Python code for data-oriented queries, data visualization, machine learning, and other tasks.
    Downloads: 1 This Week
  • 9
    ExtractThinker

    ExtractThinker is a Document Intelligence library for LLMs

    ExtractThinker is a tool designed to facilitate the extraction and analysis of information from various data sources, aiding in data processing and knowledge discovery.
    Downloads: 5 This Week
  • 10
    Book6_First-Course-in-Data-Science

    From Addition, Subtraction, Multiplication, and Division to ML

    ...The goal of the project is to make complex topics such as statistics, algorithms, and data analysis more accessible to learners by breaking concepts into clear explanations supported by code examples and diagrams. The material emphasizes a learning approach that combines theoretical knowledge with hands-on experimentation, often recommending interactive tools such as Jupyter notebooks to explore the ideas presented in the book.
    Downloads: 0 This Week
  • 11
    pyAudioAnalysis

    Python Audio Analysis Library: Feature Extraction, Classification

    ...It also includes utilities for visualizing audio features and analyzing patterns within sound recordings, which can be useful in applications such as speech recognition, music classification, and acoustic event detection. Because the library integrates machine learning algorithms with signal processing tools, it enables researchers to develop complete audio analysis pipelines using a single framework.
    Downloads: 0 This Week
  • 12
    Materials Discovery: GNoME

    AI discovers 520,000 stable inorganic crystal structures for research

    Materials Discovery (GNoME) is a large-scale research initiative by Google DeepMind focused on applying graph neural networks to accelerate the discovery of stable inorganic crystal materials. The project centers on Graph Networks for Materials Exploration (GNoME), a message-passing neural network architecture trained on density functional theory (DFT) data to predict material stability and energy formation. Using GNoME, DeepMind identified 381,000 new stable materials, later expanding the dataset to include over 520,000 materials within 1 meV/atom of the convex hull as of August 2024. The repository provides datasets, model definitions, and interactive Colabs for exploring these materials, computing decomposition energies, and visualizing chemical families. ...
    Downloads: 3 This Week
  • 13
    openTSNE

    Extensible, parallel implementations of t-SNE

    openTSNE is a modular Python implementation of t-Distributed Stochastic Neighbor Embedding (t-SNE) [1], a popular dimensionality-reduction algorithm for visualizing high-dimensional data sets. openTSNE incorporates the latest improvements to the t-SNE algorithm, including the ability to add new data points to existing embeddings [2], massive speed improvements [3] [4] [5] that enable t-SNE to scale to millions of data points, and various tricks to improve the global alignment of the resulting visualizations.
    Downloads: 2 This Week
  • 14
    DeerFlow

    Deep Research framework, combining language models with tools

    DeerFlow is an open-source, community-driven “deep research” framework / multi-agent orchestration platform developed by ByteDance. It aims to combine the reasoning power of large language models (LLMs) with automated tool-use — such as web search, web crawling, Python execution, and data processing — to enable complex, end-to-end research workflows. Instead of a monolithic AI assistant, DeerFlow defines multiple specialized agents (e.g. “planner,” “searcher,” “coder,” “report generator”) that collaborate in a structured workflow, allowing tasks like literature reviews, data gathering, data analysis, code execution, and final report generation to be largely automated. ...
    Downloads: 247 This Week
  • 15
    DeepBI

    LLM based data scientist, AI native data application

    DeepBI is an AI-native data analysis platform. DeepBI leverages the power of large language models to explore, query, visualize, and share data from any data source. Users can use DeepBI to gain data insight and make data-driven decisions.
    Downloads: 2 This Week
  • 16
    DeepAnalyze

    Autonomous LLM agent for end-to-end data science workflows

    DeepAnalyze is an open source project that introduces an agentic large language model designed to perform autonomous data science tasks from start to finish. It is built to handle the entire data science pipeline, including data preparation, analysis, modeling, visualization, and report generation without requiring continuous human guidance. DeepAnalyze is capable of conducting open-ended data research across multiple data formats such as structured tables, semi-structured files, and unstructured text, enabling flexible and comprehensive analysis workflows. ...
    Downloads: 2 This Week
  • 17
    scikit-learn

    Machine learning in Python

    scikit-learn is an open source Python module for machine learning built on NumPy, SciPy and matplotlib. It offers simple and efficient tools for predictive data analysis and is reusable in various contexts.
    Downloads: 17 This Week
  • 18
    SparkyFitness

    Track food, fitness, water, and health

    SparkyFitness is a comprehensive self-hosted fitness and wellness tracker designed to help individuals and families monitor nutrition, exercise, hydration, and body measurements in one unified platform. It provides tools for logging daily meals with nutritional breakdowns, tracking workouts with an extensive exercise database, and visualizing long-term progress using interactive charts and reports. The system also supports water intake goals, body metric logging (such as weight and measurements for different muscle groups), and customizable goals to help users stay motivated and accountable. An AI-powered nutrition coach is included, allowing users to log food, exercise, and steps through natural language chat and even upload food images for automatic analysis.
    Downloads: 16 This Week
  • 19
    DATAGEN

    AI-driven multi-agent research assistant automating hypothesis

    DATAGEN is an AI-driven multi-agent research and data analysis platform designed to automate complex analytical workflows. The system coordinates multiple specialized AI agents that collaborate to perform tasks such as hypothesis generation, data collection, analysis, visualization, and report creation. Instead of requiring users to manually orchestrate each stage of a research process, the platform allows these agents to coordinate automatically and handle the workflow end-to-end. ...
    Downloads: 0 This Week
  • 20
    DataFrame

    C++ DataFrame for statistical, financial, and ML analysis

    This is a C++ analytical library designed for data analysis similar to libraries in Python and R. For example, you would compare this to Pandas, R data.frame, or Polars. You can slice the data in many different ways. You can join, merge, and group-by the data. You can run various statistical, summarization, financial, and ML algorithms on the data. You can add your custom algorithms easily.
    Downloads: 5 This Week
  • 21
    WeChatMsg

    Project aimed at extracting, exporting, and analyzing chat records

    WeChatMsg repository hosts an open-source project aimed at extracting, exporting, and analyzing chat records from the WeChat messaging platform. It provides tools that read local WeChat database files and allow users to convert chat data into readable formats such as HTML, Word, and CSV, making it possible to inspect conversations outside the mobile app environment. Beyond simple export, the project includes mechanisms for analyzing chat histories and generating annual reports or visual...
    Downloads: 224 This Week
  • 22
    AutoViz

    Automatically Visualize any dataset, any size

    AutoViz is a Python data visualization library designed to automate exploratory data analysis by generating multiple visualizations with minimal code. The primary goal of the project is to help data scientists and analysts quickly understand patterns, relationships, and anomalies within datasets without manually writing complex plotting code. With a single command, the library can automatically generate dozens of charts and graphs that reveal insights into the structure and quality of the data.
    Downloads: 0 This Week
  • 23
    Clay Foundation Model

    The Clay Foundation Model - An open source AI model and interface

    The Clay Foundation Model is an open-source AI model and interface designed to provide comprehensive data and insights about Earth. It aims to serve as a foundational tool for environmental monitoring, research, and decision-making by integrating various data sources and offering an accessible platform for analysis.
    Downloads: 1 This Week
  • 24
    BambooAI

    A Python library powered by large language models (LLMs)

    BambooAI is a Python library powered by large language models (LLMs) for conversational data discovery and analysis, allowing users to interact with data through natural language.
    Downloads: 1 This Week
  • 25
    Mlxtend

    A library of extension and helper modules for Python's data analysis

    Mlxtend (machine learning extensions) is a Python library of useful tools for day-to-day data science tasks.
    Downloads: 1 This Week