Showing 1794 open source projects for "data"

  • 1
    AWS Data Wrangler

    Pandas on AWS, easy integration with Athena, Glue, Redshift, etc.

    An AWS Professional Services open-source Python initiative that extends the power of the Pandas library to AWS, connecting DataFrames and AWS data-related services. Easy integration with Athena, Glue, Redshift, Timestream, OpenSearch, Neptune, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON, and Excel). Built on top of other open-source projects like Pandas, Apache Arrow and Boto3, it offers abstracted functions to execute usual ETL tasks like loading and unloading data from Data Lakes, Data Warehouses, and Databases. ...
    Downloads: 2 This Week
  • 2
    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, aims to make it easy to build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha, so expect breaking changes. You can run your scraper from the command line by supplying URLs, the output filename of your choice, and the paths to your Python scripts to the dude scrape command.
    Downloads: 0 This Week
  • 3
    Elasticsearch

    A Distributed RESTful Search Engine

    ...It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 12 This Week
  • 4
    theHarvester

    E-mails, subdomains and names

    ...Use it for open source intelligence (OSINT) gathering to help determine a company's external threat landscape on the internet. The tool gathers emails, names, subdomains, IPs and URLs using multiple public data sources.
    Downloads: 79 This Week
  • 5
    Mihomo

    A simple Python Pydantic model for Honkai

    Mihomo is a Python client library leveraging Pydantic to model parsed Honkai: Star Rail user data from the Mihomo public API. It provides structured types, type hints, and convenience methods to fetch and transform player profiles, daily stats, and character details efficiently.
    Downloads: 71 This Week
  • 6
    geckodriver

    WebDriver for Firefox

    geckodriver is an implementation of WebDriver, and WebDriver can be used for widely different purposes. How you invoke geckodriver largely depends on your use case. If you are using geckodriver through Selenium, you must ensure that you have version 3.11 or greater. Because geckodriver implements the W3C WebDriver standard and not the same Selenium wire protocol older drivers are using, you may experience incompatibilities and migration problems when making the switch from FirefoxDriver to...
    Downloads: 75 This Week
  • 7
    Scrapy

    A fast, high-level web crawling and web scraping framework

    ...It can be used for data mining, monitoring and automated testing.
    Downloads: 23 This Week
  • 8
    Helium Browser

    Private, fast, and honest web browser

    ...Helium blocks ads and trackers by default through an integrated, unbiased uBlock Origin extension prepackaged as a native browser component. Its UI and feature set emphasize minimalism, with no “smart” recommendations, account sync, or background data collection, resulting in a distraction-free browsing experience that respects user autonomy. The browser is available across macOS, Linux, and Windows, each version built from a fully open source pipeline for reproducibility and trust. Development focuses on maintaining compatibility with modern web standards while decoupling Chromium from its Google dependencies and services.
    Downloads: 141 This Week
  • 9
    syslog-ng

    Log management solution that improves the performance of SIEM

    ...Instead of deploying multiple agents on hosts, organizations can unify their log data collection and management. syslog-ng Store Box provides automated archiving, tamper-proof encrypted storage, and granular access controls to protect log data. The largest appliance can store up to 10TB of raw logs.
    Downloads: 10 This Week
  • 10
    PostgREST

    REST API for any Postgres database

    ...Writing business logic often duplicates, ignores or hobbles database structure. Object-relational mapping is a leaky abstraction leading to slow imperative code. The PostgREST philosophy establishes a single declarative source of truth: the data itself. It’s easier to ask PostgreSQL to join data for you and let its query planner figure out the details than to loop through rows yourself. It’s easier to assign permissions to db objects than to add guards in controllers. (This is especially true for cascading permissions in data dependencies.) It’s easier to set constraints than to litter code with sanity checks.
    Downloads: 14 This Week
  • 11
    V2rayU

    A tool to manage v2ray config json

    V2ray multi-user management script with wizard-style management (new, delete, modify) of transmission protocols. Enjoy the fun of V2ray. Quickly view server connection information and modify the general configuration. Freely change the transmission configuration. The upgrade command keeps the configuration file; if the upgrade fails, perform a complete reinstall. Calls the official v2ray API for traffic statistics. Multi-user, multi-port management and mixed transmission protocol management are no longer a dream....
    Downloads: 30 This Week
  • 12
    Firecrawl

    Turn entire websites into LLM-ready markdown or structured data

    Crawl and convert any website into LLM-ready markdown or structured data. Built by Mendable.ai and the Firecrawl community. Includes powerful scraping, crawling, and data extraction capabilities. Firecrawl is an API service that takes a URL, crawls it, and converts it into clean markdown or structured data. We crawl all accessible subpages and give you clean data for each. No sitemap is required.
    Downloads: 4 This Week
  • 13
    Automa

    A chrome extension for automating your browser by connecting blocks

    ...Dozens of workflows have been shared by Automa users, which you can add and customize. Auto-fill forms, do a repetitive task, take a screenshot, or scrape website data; the choice is yours. You can even schedule when the automation will execute! Browse the Automa marketplace, where you can share workflows with others and download theirs.
    Downloads: 18 This Week
  • 14
    Alluxio

    Open Source Data Orchestration for the Cloud

    Alluxio is the world’s first open source data orchestration technology for analytics and AI in the cloud. It bridges the gap between computation frameworks and storage systems, bringing data from the storage tier closer to data-driven applications. This enables applications to connect to numerous storage systems through a common interface. It makes data local, more accessible, and as elastic as compute.
    Downloads: 1 This Week
  • 15
    Matomo

    Alternative to Google Analytics that gives you full control over data

    Google Analytics alternative that protects your data and your customers' privacy. Take back control with Matomo – a powerful web analytics platform that gives you 100% data ownership. You could lose your customers’ trust and risk damaging your reputation if people learn their data is used for Google’s “own purposes”. By choosing the ethical alternative, Matomo, you won’t make privacy sacrifices or compromise your site.
    Downloads: 4 This Week
  • 16
    Curl

    Command line tool and library for transferring data with URLs

    Curl is a command line tool and library for transferring data specified with URL syntax. It supports HTTP, HTTPS, FTP, FTPS, GOPHER, TFTP, SCP, SFTP, SMB, TELNET, DICT, SSL certificates, cookies, user+password authentication, and so much more! Curl is used for many different things. It's used in command lines or scripts for transferring data. It's also used in just about every device you can think of: mobile phones and tablets, television sets, printers, routers, media players and other audio equipment. ...
    Downloads: 35 This Week
  • 17
    fluentbit

    Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX

    ...No more OOM errors! Integration with all your technology, cloud-native services, containers, streaming processors, and data backends. Fully event-driven design leverages the operating system API for performance and reliability. All operations to collect and deliver data are asynchronous.
    Downloads: 2 This Week
  • 18
    Immutable.js

    Immutable collections for JavaScript

    Immutable.js offers a collection of persistent immutable data structures for JavaScript. Immutable data is unchangeable once created, which makes application development much simpler. There’s no defensive copying, and you get advanced memoization and change detection techniques with simple logic. Persistent data gives you a mutative API, one that doesn’t update data in place but always produces new, updated data.
    Downloads: 1 This Week
  • 19
    Qualitis

    Qualitis is a one-stop data quality management platform

    Qualitis is a data quality management platform that supports quality verification, notification, and management for various data sources. It is used to solve data quality problems caused by data processing. Based on Spring Boot, Qualitis submits quality model tasks to the Linkis platform. It provides functions such as data quality model construction, data quality model execution, data quality verification, data quality report generation, and so on. ...
    Downloads: 0 This Week
  • 20
    Duplicati

    Store securely encrypted backups in the cloud!

    Duplicati is a free and open source backup client for securely storing your data. Duplicati stores encrypted, incremental, compressed backups on cloud storage services and remote file servers using AES-256 encryption, keeping your data safe and always updated. It works with most storage services, including Google Cloud and Drive, Amazon S3, Microsoft Azure and OneDrive, Dropbox, FTP, OpenStack Storage (Swift), SSH (SFTP), WebDAV, Tencent Cloud Object Storage (COS), and more! ...
    Downloads: 19 This Week
  • 21
    CyberScraper 2077

    A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

    CyberScraper 2077 is not just another web scraping tool – it's a glimpse into the future of data extraction. Born from the neon-lit streets of a cyberpunk world, this AI-powered scraper uses OpenAI, Gemini and LocalLLM Models to slice through the web's defenses, extracting the data you need with unparalleled precision and style.
    Downloads: 12 This Week
  • 22
    CloudQuery

    The open-source cloud asset inventory powered by SQL

    ...Integrate CloudQuery with your current visualization, monitoring, and alerting tools such as Grafana. CloudQuery supports the TimescaleDB PostgreSQL extension, giving you full historical snapshots of your cloud asset inventory. Use it for data analysis, security, auditing, and compliance. Leverage SQL to get visibility into your cloud infrastructure and SaaS applications. Build a cloud-asset inventory across any of our supported official or community providers.
    Downloads: 3 This Week
  • 23
    SPX

    A simple & straight-to-the-point PHP profiling extension

    ...Multi-metric capable: 22 metrics are currently supported (various time & memory metrics, included files, objects in use, I/O...). Able to collect data without losing context. For example, Xhprof (and potentially its forks) aggregates data per caller/callee pair, which implies the loss of the full call stack and forbids timeline or Flamegraph based analysis.
    Downloads: 2 This Week
  • 24
    PostHog

    PostHog provides open-source web & product analytics

    PostHog is an all‑in‑one open‑source platform for product and web analytics—offering event-based analytics, session recording, feature flagging, A/B testing, cohorts, and more—that you can self‑host, with full support for data privacy and enterprise compliance. Sync data from external tools like Stripe, Hubspot, your data warehouse, and more. Query it alongside your product data. Run custom filters and transformations on your incoming data. Send it to 25+ tools or any webhook in real time or batch export large amounts to your warehouse. Capture traces, generations, latency, and cost for your LLM-powered app.
    Downloads: 5 This Week
  • 25
    SQLPad

    Web-based SQL editor run in your own private cloud

    A web app for writing and running SQL queries and visualizing the results. Supports Postgres, MySQL, SQL Server, ClickHouse, Crate, Vertica, Trino, Presto, SAP HANA, Cassandra, Snowflake, Google BigQuery, SQLite, TiDB, and many more via ODBC. The Docker image runs on port 3000 and uses /var/lib/sqlpad for the embedded database directory. The latest tag is continuously built from the latest commit in the repo. Only use that if you want to live on the edge, otherwise use specific version tags to ensure...
    Downloads: 4 This Week