Showing 254 open source projects for "python data analysis"

  • 1
    data-diff

    Efficiently diff rows across two different databases

    We're excited to announce the launch of a new open-source product, data-diff, that makes comparing datasets across databases fast at any scale. data-diff automates data quality checks for data replication and migration. In modern data platforms, data is constantly moving between systems, and at modern data volumes and complexity, systems go out of sync all the time. Until now, there has been no tooling to ensure that the data is copied correctly. Replicating data at scale, across...
    Downloads: 0 This Week
  • 2
    HDF5

    Official HDF5® Library Repository

    HDF5 (Hierarchical Data Format v5) is a widely-used data management library and file format for storing large and complex scientific data sets efficiently.
    Downloads: 42 This Week
  • 3
    DuckDB

    DuckDB is an in-process SQL OLAP Database Management System

    ... data analysis, e.g. joining and aggregating multiple large tables. Concurrent large changes to multiple large tables, e.g. appending rows or adding/removing/updating columns. Large result set transfer to client. For development, DuckDB requires CMake, Python 3 and a C++11-compliant compiler. Run make in the root directory to compile the sources. For development, use make debug to build a non-optimized debug version.
    Downloads: 23 This Week
  • 4
    Grafana

    The open observability and monitoring platform

    Grafana is an open source analytics and monitoring platform designed for every database. It allows you to visualize and understand your metrics through dynamic and reusable data-driven dashboards that you can create, explore and share with others. Grafana offers a multitude of visualization options and lets you explore your metrics and logs like never before. It can also be set to alert you on your most important metrics. Thousands of companies have been using Grafana to monitor everything...
    Downloads: 20 This Week
  • 5
    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using pip...
    Downloads: 16 This Week
  • 6
    Jailer Database Tool

    Database subsetting and relational data browsing tool

    Jailer is a tool for database subsetting, schema and data browsing. It creates small slices from your database and lets you navigate through your database following the relationships. Ideal for creating small samples of test data or for local problem analysis with relevant production data. Creates small slices from your production database and imports the data into your development and test environment (consistent and referentially intact). Improves database performance by removing...
    Downloads: 9 This Week
  • 7
    SQLAlchemy

    The Database Toolkit for Python

    SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL. SQLAlchemy provides a full suite of well known enterprise-level persistence patterns, designed for efficient and high-performing database access, adapted into a simple and Pythonic domain language. An industrial strength ORM, built from the core on the identity map, unit of work, and data mapper patterns. These patterns allow the transparent persistence...
    Downloads: 8 This Week
  • 8
    EventStoreDB

    The stream database optimised for event sourcing

    ... Sourcing offers some great benefits over state-oriented systems; the key ones are explained below. An event-sourced system stores your data as a series of immutable events over time, providing one of the strongest audit log options available. All state changes are kept, so it is possible to move systems backward and forwards in time which is extremely valuable for debugging and “what if” analysis.
    Downloads: 5 This Week
  • 9
    Finance Database

    This is a database of 300.000+ symbols containing Equities, ETFs, etc.

    ... unknown. This database tries to solve that. It features 300.000+ symbols containing Equities, ETFs, Funds, Indices, Currencies, Cryptocurrencies and Money Markets. It, therefore, allows you to obtain a broad overview of sectors, industries, types of investments and much more. The aim of this database is explicitly not to provide up-to-date fundamentals or stock data as those can be obtained with ease (with the help of this database) by using yfinance, FundamentalAnalysis or ThePassiveInvestor.
    Downloads: 4 This Week
  • 10
    osm2pgsql

    Import OpenStreetMap data into a PostgreSQL/PostGIS database

    osm2pgsql is a tool for importing OpenStreetMap (OSM) data into a PostgreSQL/PostGIS database, enabling geographic data analysis and map rendering. Its "flex" output is configured with Lua scripts, which lets you customize how data is loaded and indexed. Designed for performance and scalability, osm2pgsql is widely used in map tile generation pipelines and by GIS professionals handling large-scale spatial datasets.
    Downloads: 5 This Week
  • 11
    CodeChecker

    CodeChecker is an analyzer tooling, defect database

    CodeChecker is a static analysis infrastructure built on the LLVM/Clang Static Analyzer toolchain, replacing scan-build in a Linux or macOS (OS X) development environment. Executes Clang-Tidy and Clang Static Analyzer with Cross-Translation Unit analysis, Statistical Analysis (when checkers are available). Creates the JSON compilation database by wiretapping any build process (e.g., CodeChecker log -b "make"). Automatically analyzes GCC cross-compiled projects: detecting GCC or Clang compiler...
    Downloads: 3 This Week
  • 12
    CogDB

    Micro Graph Database for Python Applications

    Cog is a lightweight, embedded graph database for Python that provides a simple interface for storing and querying graph-based data structures, making it useful for knowledge representation and graph analytics.
    Downloads: 4 This Week
  • 13
    django-pgtrigger

    Write Postgres triggers for your Django models

    django-pgtrigger is a Django library for defining and managing PostgreSQL triggers directly in Python code. It allows developers to create database-level logic like automatic field updates, auditing, or validation without writing raw SQL. It’s ideal for teams that want stronger data integrity while keeping logic version-controlled.
    Downloads: 4 This Week
  • 14
    Bracket

    Selfhosted tournament system

    Bracket is an open-source, self-hosted tournament system for creating and managing tournaments. Its backend is written in async Python with FastAPI, and its frontend is built with Next.js.
    Downloads: 4 This Week
  • 15
    manticoresearch

    Easy to use open source fast database for search

    ... over HTTP and uses the MySQL protocol (you can use your preferred MySQL client). JSON over HTTP: to provide a more programmatic way to manage your data and schemas, Manticore provides a HTTP JSON protocol. Written fully in C++: starts fast, doesn't take much RAM, and low-level optimizations provide good performance. Can sync from MySQL/PostgreSQL/ODBC/xml/csv out of the box. Not fully ACID-compliant, but supports transactions and binlog for safe writes.
    Downloads: 5 This Week
  • 16
    Bdash

    Simple SQL Client for lightweight data analysis

    Simple SQL client for lightweight data analysis. You can share query results via Gist. Supports MySQL, PostgreSQL (Amazon Redshift), SQLite3, Google BigQuery, Treasure Data, and Amazon Athena. You can download and install it from the website or the Releases page.
    Downloads: 2 This Week
  • 17
    Apache Impala

    Apache Impala

    ..., and resource management frameworks as your Hadoop deployment, with no redundant infrastructure or data conversion/duplication. For Apache Hive users, Impala utilizes the same metadata and ODBC driver. Like Hive, Impala supports SQL, so you don't have to worry about reinventing the implementation wheel. With Impala, more users, whether using SQL queries or BI applications, can interact with more data through a single repository and metadata stored from source through analysis.
    Downloads: 3 This Week
  • 18
    IoTDB

    Apache IoTDB

    Apache IoTDB (Database for Internet of Things) is an IoT native database with high performance for data management and analysis, deployable on the edge and the cloud. Due to its light-weight architecture, high performance and rich feature set together with its deep integration with Apache Hadoop, Spark and Flink, Apache IoTDB can meet the requirements of massive data storage, high-speed data ingestion and complex data analysis in the IoT industrial fields. In the scene of factories...
    Downloads: 2 This Week
  • 19
    pgai

    A suite of tools to develop RAG, semantic search, and other AI apps

    pgai is a suite of PostgreSQL extensions developed by Timescale to empower developers in building AI applications directly within their databases. It integrates tools for vector storage, advanced indexing, and AI model interactions, facilitating the development of applications like semantic search and Retrieval-Augmented Generation (RAG) without leaving the SQL environment.
    Downloads: 3 This Week
  • 20
    pgsync

    Postgres to Elasticsearch/OpenSearch sync

    PGSync is a middleware for syncing data from Postgres to Elasticsearch or OpenSearch. It keeps Postgres as the source of truth and exposes structured, denormalized documents in the search index, based on a schema you define.
    Downloads: 2 This Week
  • 21
    chDB

    chDB is an in-process OLAP SQL Engine

    chDB is an in-process SQL OLAP Engine powered by ClickHouse. It is developed by ClickHouse, Inc and open-source contributors.
    Downloads: 2 This Week
  • 22
    SQL Notebook

    SQL Notebook — Casual data exploration in SQL

    SQL Notebook is a free Windows application for querying and analyzing data across multiple sources, including SQLite, PostgreSQL, Excel, and CSV files. It combines a SQL editor with a notebook interface, allowing for data exploration, transformation, and visualization in one place. SQL Notebook is ideal for analysts and data enthusiasts.
    Downloads: 1 This Week
  • 23
    Dokploy

    Open Source Alternative to Vercel, Netlify and Heroku

    Streamline your operations with our all-in-one platform, perfect for managing projects, data, and system health with simplicity and efficiency. Simplify your project and data management, ensure robust monitoring, and secure your backups—all without the fuss over minute details. Elevate your infrastructure with tools that offer precise control, detailed monitoring, and enhanced security, ensuring seamless management and robust performance. Streamline your deployments with our PaaS. Effortlessly...
    Downloads: 2 This Week
  • 24
    Cloudberry

    One advanced and mature open-source MPP

    Apache Cloudberry is an open-source massively parallel processing (MPP) database for analytics and data warehousing. It is derived from the open-source version of Greenplum Database and is built on a newer PostgreSQL kernel.
    Downloads: 1 This Week
  • 25
    SurrealDB

    A scalable, distributed, collaborative, document-graph database

    With an SQL-style query language, real-time queries with highly-efficient related data retrieval, advanced security permissions for multi-tenant access, and support for performant analytical workloads, SurrealDB is the next generation serverless database. SurrealDB is the ultimate cloud database for tomorrow's applications. SurrealDB is an innovative NewSQL cloud database, suitable for serverless applications, jamstack applications, single-page applications, and traditional applications...
    Downloads: 1 This Week