Showing 324 open source projects for "data"

View related business solutions
  • $300 in Free Credit for Your Google Cloud Projects Icon
    $300 in Free Credit for Your Google Cloud Projects

    Build, test, and explore on Google Cloud with $300 in free credit. No hidden charges. No surprise bills.

    Launch your next project with $300 in free Google Cloud credit—no hidden charges. Test, build, and deploy without risk. Use your credit across the Google Cloud platform to find what works best for your needs. After your credits are used, continue building with free monthly usage products. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • Easily Host LLMs and Web Apps on Cloud Run Icon
    Easily Host LLMs and Web Apps on Cloud Run

    Run everything from popular models with on-demand NVIDIA L4 GPUs to web apps without infrastructure management.

    Run frontend and backend services, batch jobs, host LLMs, and queue processing workloads without the need to manage infrastructure. Cloud Run gives you on-demand GPU access for hosting LLMs and running real-time AI—with 5-second cold starts and automatic scale-to-zero so you only pay for actual usage. New customers get $300 in free credit to start.
    Try Cloud Run Free
  • 1
    Orange Data Mining

    Orange Data Mining

    Orange: Interactive data analysis

    Open source machine learning and data visualization. Build data analysis workflows visually, with a large, diverse toolbox. Perform simple data analysis with clever data visualization. Explore statistical distributions, box plots and scatter plots, or dive deeper with decision trees, hierarchical clustering, heatmaps, MDS and linear projections. Even your multidimensional data can become sensible in 2D, especially with clever attribute ranking and selections. ...
    Downloads: 44 This Week
    Last Update:
    See Project
  • 2
    Cookiecutter Data Science

    Cookiecutter Data Science

    Project structure for doing and sharing data science work

    A logical, reasonably standardized, but flexible project structure for doing and sharing data science work. When we think about data analysis, we often think just about the resulting reports, insights, or visualizations. While these end products are generally the main event, it's easy to focus on making the products look nice and ignore the quality of the code that generates them. Because these end products are created programmatically, code quality is still important! ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    pandas

    pandas

    Fast, flexible and powerful Python data analysis toolkit

    pandas is a Python data analysis library that provides high-performance, user friendly data structures and data analysis tools for the Python programming language. It enables you to carry out entire data analysis workflows in Python without having to switch to a more domain specific language. With pandas, performance, productivity and collaboration in doing data analysis in Python can significantly increase.
    Downloads: 89 This Week
    Last Update:
    See Project
  • 4
    CyberChef

    CyberChef

    A web app for encryption, encoding, compression and data analysis

    CyberChef, developed by GCHQ, is a versatile web application dubbed the "Cyber Swiss Army Knife." It enables users to perform a wide array of operations on data, including encryption, encoding, compression, and analysis, all within a browser interface.​
    Downloads: 62 This Week
    Last Update:
    See Project
  • Cut Data Warehouse Costs up to 54% with BigQuery Icon
    Cut Data Warehouse Costs up to 54% with BigQuery

    Migrate from Snowflake, Databricks, or Redshift with free migration tools. Exabyte scale without the Exabyte price.

    BigQuery delivers up to 54% lower TCO than cloud alternatives. Migrate from legacy or competing warehouses using free BigQuery Migration Service with automated SQL translation. Get serverless scale with no infrastructure to manage, compressed storage, and flexible pricing—pay per query or commit for deeper discounts. New customers get $300 in free credit.
    Try BigQuery Free
  • 5
    Metabase

    Metabase

    The simplest, fastest way to share business intelligence and analytics

    Metabase is the easiest way to let everyone in your company access business data and analytics, learn from it and ask questions. Even if you or your colleagues have no experience in SQL, you can easily summarize and visualize your data, share it and let your team ask questions about it. Metabase creates beautiful graphs and charts, with an easy-to-use dashboard where everyone can create, organize and share exceptionally visualized data.
    Downloads: 24 This Week
    Last Update:
    See Project
  • 6
    scikit-learn

    scikit-learn

    Machine learning in Python

    scikit-learn is an open source Python module for machine learning built on NumPy, SciPy and matplotlib. It offers simple and efficient tools for predictive data analysis and is reusable in various contexts.
    Downloads: 17 This Week
    Last Update:
    See Project
  • 7
    ECharts

    ECharts

    A powerful, interactive charting and visualization library for browser

    ECharts is a free and open source charting and visualization library that gives you an easy way to add interactive, intuitive, custom charts to your commercial products, projects, presentations and more. It offers a rich set of features that includes rendering ability for ten-million-level data, Wechart and Powerpoint support, multi-dimension data analysis, and more. It also has a number of extensions for various applications. ECharts is written in pure JavaScript, and is based on zrender, a new and lightweight canvas library.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 8
    Elasticsearch

    Elasticsearch

    A Distributed RESTful Search Engine

    ...It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 9
    StarRocks

    StarRocks

    StarRocks is a next-gen sub-second MPP database for full analytics

    ...From streaming data to change data capture, StarRocks meets the data ingestion demands of real-time analytics. Scale storage and computing power horizontally and support tens of thousands of concurrent users. All of your BI tools work with StarRocks through standard SQL. StarRocks provides superior performance. It is also a unified OLAP covering most data analytics scenarios.
    Downloads: 6 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 10
    EEGLAB

    EEGLAB

    EEGLAB is an open source signal processing environment

    EEGLAB is an open source, MATLAB-based interactive environment for analyzing electrophysiological signals such as EEG and MEG. It incorporates powerful tools for data import, preprocessing, independent component analysis (ICA), time-frequency analysis, artifact rejection, and visualization—all within a GUI framework that also supports scripting and plugin extensions. EEGLAB is an open source signal processing environment for electrophysiological signals running on Matlab and Octave (command line only for Octave). ...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 11
    .NET for Apache Spark

    .NET for Apache Spark

    A free, open-source, and cross-platform big data analytics framework

    .NET for Apache Spark provides high-performance APIs for using Apache Spark from C# and F#. With these .NET APIs, you can access the most popular Dataframe and SparkSQL aspects of Apache Spark, for working with structured data, and Spark Structured Streaming, for working with streaming data. .NET for Apache Spark is compliant with .NET Standard - a formal specification of .NET APIs that are common across .NET implementations. This means you can use .NET for Apache Spark anywhere you write .NET code allowing you to reuse all the knowledge, skills, code, and libraries you already have as a .NET developer. .NET for Apache Spark runs on Windows, Linux, and macOS using .NET Core, or Windows using .NET Framework. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 12
    Vince

    Vince

    Self Hosted Alternative To Google Analytics

    vince is a versatile tool that assists in data analysis and visualization, providing users with insights through interactive charts and graphs.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    hctsa

    hctsa

    Highly comparative time-series analysis

    hctsa is a Matlab software package for running highly comparative time-series analysis. It extracts thousands of time-series features from a collection of univariate time series and includes a range of tools for visualizing and analyzing the resulting time-series feature matrix.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 14
    DuckDB

    DuckDB

    DuckDB is an in-process SQL OLAP Database Management System

    ...For more information on the goals of DuckDB, please refer to the Why DuckDB page on our website. Processing and storing tabular datasets, e.g. from CSV or Parquet files. Interactive data analysis, e.g. Joining & aggregate multiple large tables. Concurrent large changes, to multiple large tables, e.g. appending rows, adding/removing/updating columns. Large result set transfer to client. For development, DuckDB requires CMake, Python3 and a C++11 compliant compiler. Run make in the root directory to compile the sources. ...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 15
    airda

    airda

    airda(Air Data Agent

    airda(Air Data Agent) is a multi-smart body for data analysis, capable of understanding data development and data analysis needs, understanding data, generating data-oriented queries, data visualization, machine learning and other tasks of SQL and Python codes.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    teslamate

    teslamate

    A self-hosted data logger for your Tesla

    TeslaMate is an open-source self-hosted data logger that collects and visualizes data from Tesla vehicles in real time. It provides detailed insights into driving, charging, efficiency, and battery health through intuitive dashboards powered by Grafana. TeslaMate is ideal for Tesla owners who want full control of their vehicle data, avoid cloud reliance, and access rich analytics for personal tracking or troubleshooting.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 17
    Pandas Profiling

    Pandas Profiling

    Create HTML profiling reports from pandas DataFrame objects

    pandas-profiling generates profile reports from a pandas DataFrame. The pandas df.describe() function is handy yet a little basic for exploratory data analysis. pandas-profiling extends pandas DataFrame with df.profile_report(), which automatically generates a standardized univariate and multivariate report for data understanding. High correlation warnings, based on different correlation metrics (Spearman, Pearson, Kendall, Cramér’s V, Phik). Most common categories (uppercase, lowercase, separator), scripts (Latin, Cyrillic) and blocks (ASCII, Cyrilic). ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    Bdash

    Bdash

    Simple SQL Client for lightweight data analysis

    Simple SQL Client for lightweight data analysis. You can share the result with gist. Supports MySQL, PostgreSQL (Amazon Redshift), SQLite3, Google BigQuery, Treasure Data, Amazon Athena. You can download and install from Web Site or Releases.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    LinDB

    LinDB

    LinDB is a scalable, high performance, high availability database

    ...A single server could easily support more than one million write TPS; With fundamental techniques like efficient compression storage and parallel computing, LinDB delivers highly optimized query performance. The multi-channel replication protocol supports any amount of nodes, and ensures the system's availability. Schema-free multi-dimensional data model with Metric, Tags, and Fields; The LinQL is flexible yet handy for real-time data analytics. Horizontal scalable is made simple by adding more new broker and storage nodes without too much thinking and manual operations. And the tags-based sharding strategy resolves the hotspot problem. LinDB is designed to work under a Multi-Active IDCs cloud architecture. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20
    XMLTV 1.4.0

    XMLTV 1.4.0

    Utilities to obtain, generate, and post-process TV listings data

    XMLTV is a suite of utilities to obtain, generate, and post-process TV listings data in XMLTV format. It provides tools to gather television listings, process listings data, and help organize TV viewing.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Apache InLong

    Apache InLong

    Apache InLong - a one-stop integration framework for massive data

    ...InLong was originally built at Tencent, which has served online businesses for more than 8 years, to support massive data (data scale of more than 80 trillion pieces of data per day) reporting services in big data scenarios. The entire platform has integrated 5 modules: Ingestion, Convergence, Caching, Sorting, and Management, so that the business only needs to provide data sources, data service quality, data landing clusters and data landing formats.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Apache Doris

    Apache Doris

    MPP-based interactive SQL data warehousing for reporting and analysis

    Apache Doris is a modern MPP analytical database product. It can provide sub-second queries and efficient real-time data analysis. With it's distributed architecture, up to 10PB level datasets will be well supported and easy to operate. Apache Doris can meet various data analysis demands, including history data reports, real-time data analysis, interactive data analysis, and exploratory data analysis. Make your data analysis easier! Support standard SQL language, compatible with MySQL protocol. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    CloudQuery

    CloudQuery

    The open-source cloud asset inventory powered by SQL

    ...Integrate CloudQuery with your current visualization, monitoring, and alerting such as Grafana. CloudQuery supports the TimescaleDB PostgreSQL extension, giving you full historical snapshots of your cloud asset inventory. Data analysis, security, auditing, and compliance. Leverage SQL to get visibility into your cloud infrastructure and SaaS applications. Build a cloud-asset inventory across any of our supported official or community providers.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 24
    SageMaker Spark Container

    SageMaker Spark Container

    Docker image used to run data processing workloads

    Apache Spark™ is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    Kapacitor

    Kapacitor

    Open source framework for processing, monitoring, and alerting

    Open source framework for processing, monitoring, and alerting on time series data. Kapacitor is a real-time data processing engine for monitoring and alerting, specifically designed to work with time-series data from InfluxDB.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
MongoDB Logo MongoDB
Gen AI apps are built with MongoDB Atlas
Atlas offers built-in vector search and global availability across 125+ regions. Start building AI apps faster, all in one place.
Try Free →