Showing 3125 open source projects for "data"

  • 1
    odd-collector

    Open-source metadata collector based on ODD Specification

    ODD Collector is a lightweight service that gathers metadata from all your data sources. Push-client is a provider which sends information directly to the central repository of the Platform. ODDRN (Open Data Discovery Resource Name) is a unique resource name that identifies entities such as data sources, data entities, dataset fields etc. It is used to build lineage and update metadata.
    Downloads: 0 This Week
  • 2
    CrowdAnki

    Plugin for Anki SRS designed to facilitate cooperation

    CrowdAnki is a plugin for http://ankisrs.net/ that allows users to import and export decks/notes and all relevant information in a JSON format. The main purpose is to facilitate crowd-sourcing for Anki decks and notes. Starting with version 0.6, it also features close integration with Git, providing the ability to automatically maintain a history of edits for your decks. My goal here is to provide a user-friendly description of the collaboration workflow. In order to do that, I looked...
    Downloads: 1 This Week
  • 3
    SageMaker Inference Toolkit

    Serve machine learning models within a Docker container

    Serve machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. Once you have a trained model, you can include it in a Docker container that runs your inference code. A container provides an effectively isolated environment, ensuring a consistent runtime regardless of where the container is deployed. ...
    Downloads: 0 This Week
  • 4
    eCxx

    A C++ library for AVR and NodeMCU

    NOTE: This project is marked with 'Status: Abandoned' on SourceForge because not enough time can be dedicated to it. However, it may still get sporadic commits to the repository. eCxx is a library for AVR and NodeMCU tailored for micro LED displays and lighting effects. eCxx uses a Makefile build system. Java- and Python-based applications/tools are also included to ease development and debugging on the host PC. On one side, eCxx supports the original...
    Downloads: 0 This Week
  • 5
    blockfrost-python

    Python 3 SDK for the Blockfrost.io API

    API for the Cardano decentralized blockchain. Accessing and processing information stored on the blockchain is not trivial. We provide an abstraction between you and blockchain data, taking away the burden of complexity, so you can focus on what really matters - developing your applications. Our basic tier is and always will be free of charge. We nurture development and the Cardano ecosystem. However, if you want to support us, please consider upgrading.
    Downloads: 0 This Week
  • 6
    Knobjex Info Manager

    Tool for PIM, mind-mapping, quality-management, knowledge-base.

    Also available for knowledge-enthusiasts: https://github.com/some-avail/freekwensie Knobjex 5.01 released partially; Windows setup forthcoming; go to tab "Files" to download. Added shortcut keys (chapter 5.3), dark themes better supported, removed bugs. No database changes. Knobjex (short for Knowledge Objects) is an information manager. It has many potential uses, such as calendar, task list and sticky notes. Knobjex can also handle more advanced use cases such as...
    Downloads: 0 This Week
  • 7
    Example Streamlit

    Example Streamlit app that you can fork to test out share.streamlit.io

    streamlit-example is an open source sample app created by the Streamlit team to demonstrate how to quickly build and deploy applications with Streamlit. The repository contains a minimal Python app (streamlit_app.py) that can be customized by editing the source file. It is designed for use with share.streamlit.io, allowing developers to fork the repo and instantly deploy their own interactive app. The project includes basic dependencies defined in requirements.txt and supports containerized...
    Downloads: 0 This Week
  • 8
    Non-Formatting USB Repair

    Repair Any Corrupt USB Drive Without Formatting. Download Now!

    Revive Your USB Drive Without Losing Data! Unleash the power of our cutting-edge USB Drive Repair Tool – the ultimate solution for salvaging your USB drives without sacrificing precious data. No more formatting headaches or data loss worries. 🛠 Key Features: Non-Destructive Repair: Fix and recover your USB drive without losing files or formatting the drive. Comprehensive Recovery: Rescue files from corruption, bad sectors, and other issues.
    Downloads: 0 This Week
  • 9
    TextGen

    textgen, Text Generation models

    Implementation of text generation models. textgen implements a variety of text generation models out of the box, including UDA, GPT2, Seq2Seq, BART, T5, SongNet and others. UDA performs non-core word replacement. EDA is a simple data augmentation technique: similar-word and synonym replacement, plus random word insertion, deletion and replacement. This project draws on Google's UDA (non-core word replacement) algorithm and the EDA algorithm: it uses TF-IDF to replace unimportant words in a sentence with synonyms and applies random word insertion, deletion and replacement to generate new text for augmentation. The project also implements back translation based on the Baidu translation API: it first translates Chinese sentences into English, then translates the English back into new Chinese. ...
    Downloads: 0 This Week
  • 10
    MaxFEM

    Software for electromagnetic simulation

    MaxFEM is an open software package for electromagnetic simulation using finite element methods. The package can solve problems in electrostatics, direct current, magnetostatics and eddy currents. Since version 0.4.0, MaxFEM requires Python 3. We have moved the installers to the MaxFEM website (see below). In order to improve MaxFEM, we will require you to fill out a simple form before downloading them.
    Downloads: 4 This Week
  • 11
    State of Open Source AI

    Clarity in the current fast-paced mess of Open Source innovation

    This repository is the source for a book (or large written work) titled “The State of Open Source AI”. The goal of the project is to bring clarity to the rapidly evolving open-source AI ecosystem by documenting trends, models, tools, standards, deployment practices, and challenges. It acts as both a snapshot and a guide: readers can see what’s “hot now” in open AI infrastructure, what open licensing or governance issues are emerging, how deployment options compare, and what gaps remain....
    Downloads: 4 This Week
  • 12
    s3cmd

    Command line tool for managing Amazon S3 and CloudFront services

    Open-source tool to access Amazon S3 file storage. S3cmd is a free command line tool and client for uploading, retrieving and managing data in Amazon S3 and other cloud storage service providers that use the S3 protocol, such as Google Cloud Storage. Lots of features and options have been added to s3cmd since its very first release in 2008. We recently counted more than 60 command line options, including multipart uploads, encryption, incremental backup, s3 sync, ACL and metadata management, S3 bucket size, bucket policies, and more!
    Downloads: 1,034 This Week
  • 13
    Autolabel

    Label, clean and enrich text datasets with LLMs

    Autolabel is a Python library to label, clean and enrich datasets with Large Language Models (LLMs). Autolabel data for NLP tasks such as classification, question answering, named entity recognition, entity matching and more. Seamlessly use commercial and open-source LLMs from providers such as OpenAI, Anthropic, HuggingFace, Google and more.
    Downloads: 0 This Week
  • 14
    Asteroid

    The PyTorch-based audio source separation toolkit for researchers

    ...Extending the toolkit with new features is simple. Add a new filterbank, separator architecture, dataset or even recipe very easily. Recipes provide an easy way to reproduce results with data preparation, system design, training and evaluation in a single script. This is an essential tool for the community! The default logger is TensorBoard in all the recipes. From the recipe folder, you can run the following to visualize the logs of all your runs.
    Downloads: 0 This Week
  • 15
    odd-collector-gcp

    Open-source GCP metadata collector based on ODD Specification

    ODD Collector GCP is a lightweight service which gathers metadata from all your Google Cloud Platform data sources.
    Downloads: 0 This Week
  • 16
    databooks

    A CLI tool to reduce the friction between data scientists

    databooks is a package that eases collaboration between data scientists using Jupyter notebooks by reducing the number of Git conflicts between notebooks and helping resolve conflicts when they occur. Specify the paths of notebook files to strip their metadata; this alone avoids many conflicts. To fix conflicts, specify the paths of the notebook files that have them.
    Downloads: 1 This Week
  • 17
    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible deduplication strategies, making it ideal for cleaning web-scraped data, language model training datasets, or document archives.
    Downloads: 1 This Week
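As a rough illustration of the Jaccard similarity thresholding and MinHash ideas mentioned in the text-dedup entry, here is a minimal pure-Python sketch. It does not use text-dedup's own API; the function names (`shingles`, `jaccard`, `minhash_signature`, `estimate_jaccard`, `drop_near_duplicates`), the 3-word shingle size and the 0.7 threshold are all assumptions chosen for the example.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (hypothetical helper)."""
    words = text.lower().split()
    if len(words) <= k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(item, seed):
    """64-bit hash of a string, salted so each seed acts like a different hash function."""
    digest = hashlib.blake2b(
        item.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_perm=64):
    """For each of num_perm salted hash functions, keep the minimum hash over the set."""
    return [min(_hash(item, seed) for item in items) for seed in range(num_perm)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions approximates the Jaccard similarity."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


def drop_near_duplicates(docs, threshold=0.7):
    """Return indices of documents kept after exact-Jaccard thresholding.

    A document is dropped if it is at least `threshold` similar to one already kept.
    The exact comparison keeps the example deterministic; at scale one would compare
    MinHash signatures instead of full shingle sets.
    """
    kept, kept_sets = [], []
    for index, doc in enumerate(docs):
        current = shingles(doc)
        if all(jaccard(current, other) < threshold for other in kept_sets):
            kept.append(index)
            kept_sets.append(current)
    return kept
```

The pairwise loop in `drop_near_duplicates` is quadratic and only suits small inputs; the point of MinHash signatures is that they are fixed-size, so they can be compared (or bucketed) far more cheaply than the original shingle sets.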
  • 18
    nmeasim

    Simulate GPS+AIS sentences and send them to navigation software

    NMEASIM sends NMEA 0183 sentences to navigation software such as OpenCPN via a serial COM port or a UDP channel. These NMEA sentences simulate data coming from a GPS (sentence $GPRMC...) and an AIS receiver (sentence !AIVDM...). All the parameters (name, MMSI, initial position (lat/long), COG and SOG for my boat as well as for AIS targets, and I/O configuration (COM port...)) can be edited in the 'datasimul.json' file, which the app creates with default values in V1.0. Just download the nmeasim.exe file (green button above) and launch it. ...
    Downloads: 5 This Week
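To make the NMEA 0183 sentence framing in the nmeasim entry concrete, here is a minimal Python sketch of the standard checksum: the XOR of every character between the leading `$` (or `!` for AIS) and the `*`, written as two uppercase hex digits. This is not nmeasim's code; the function names `nmea_checksum`, `build_sentence` and `is_valid` are hypothetical, and the sentence bodies used with them are placeholders rather than real GPS or AIS data.

```python
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR all characters of the sentence body and format as two uppercase hex digits."""
    value = reduce(lambda acc, ch: acc ^ ord(ch), body, 0)
    return f"{value:02X}"


def build_sentence(body: str, start: str = "$") -> str:
    """Frame a body as an NMEA 0183 sentence: start char, body, '*', checksum, CR LF."""
    return f"{start}{body}*{nmea_checksum(body)}\r\n"


def is_valid(sentence: str) -> bool:
    """Check that a framed sentence carries the checksum matching its body."""
    stripped = sentence.strip()
    if len(stripped) < 4 or stripped[0] not in "$!" or "*" not in stripped:
        return False
    body, _, given = stripped[1:].rpartition("*")
    return given.upper() == nmea_checksum(body)
```

A receiver would typically run a check like `is_valid` on each incoming line and discard sentences whose checksum does not match.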
  • 19
    ADB Sync

    Synchronize files between a PC and Android device using ADB

    ...By comparing file states between the host and the device, adb-sync efficiently updates only changed files, reducing transfer time and bandwidth usage. The tool also supports reverse synchronization, allowing users to copy data from an Android device back to their PC. While this project has been deprecated in favor of better-adb-sync, it remains a lightweight and effective option for managing file transfers and backups over USB debugging connections.
    Downloads: 8 This Week
  • 20
    CloudI: A Cloud at the lowest level
    CloudI is an open-source private cloud computing framework for efficient, secure, and internal data processing. CloudI provides scaling for previously unscalable source code with efficient fault-tolerant execution of ATS, C/C++, Erlang/Elixir, Go, Haskell, Java, JavaScript/node.js, OCaml, Perl, PHP, Python, Ruby, or Rust services. The bare essentials for efficient fault-tolerant processing on a cloud!
    Downloads: 9 This Week
  • 21
    CostPal

    CostPal is your personal finance manager

    ...Its purpose is to act as a scratchpad, not a standalone application for managing personal finance - for that you have CostPal. Currently, only Dropbox is supported as cloud storage for data exchange.
    Downloads: 0 This Week
  • 22
    ChatFred

    Alfred workflow using ChatGPT, DALL·E 2 and other models for chatting

    ...⤓ Install from the Alfred Gallery or download it from GitHub and add your OpenAI API key. If you have used ChatGPT or DALL·E 2, you already have an OpenAI account. Otherwise, you can sign up here - you will receive $5 in free credit, and no payment data is required. Afterward, you can create your API key. To start a conversation with ChatGPT, either use the keyword cf, set up the workflow as a fallback search in Alfred, or create a custom hotkey to send the clipboard content directly to ChatGPT.
    Downloads: 0 This Week
  • 23
    DirectXDiagnostic Tool Opener

    Open DirectXDiagnostic Tool without any commands

    Comprehensive Hardware Analysis, Monitoring and Reporting for Windows. Exhaustive information about hardware components, displayed in a hierarchy that unfolds into deep detail. Useful for obtaining a detailed hardware inventory report or checking various hardware-related parameters. Real-time monitoring of a variety of system and hardware parameters covering CPUs, GPUs, mainboards, drives, peripherals, etc. Useful for detection of overheating, overload, performance loss or failure...
    Downloads: 4 This Week
  • 24
    PyTorch Implementation of SDE Solvers

    Differentiable SDE solvers with GPU support and efficient sensitivity

    This library provides stochastic differential equation (SDE) solvers with GPU support and efficient backpropagation. examples/demo.ipynb gives a short guide on how to solve SDEs, including subtle points such as fixing the randomness in the solver and the choice of noise types. examples/latent_sde.py learns a latent stochastic differential equation, as in Section 5 of [1]. The example fits an SDE to data, whilst regularizing it to be like an Ornstein-Uhlenbeck prior process. The model can be loosely viewed as a variational autoencoder with its prior and approximate posterior being SDEs. The program outputs figures to the path specified by <TRAIN_DIR>. Training should stabilize after 500 iterations with the default hyperparameters. examples/sde_gan.py learns an SDE as a GAN, as in [2], [3]. ...
    Downloads: 0 This Week
  • 25
    xTuring

    Easily build, customize and control your own LLMs

    xTuring is an open-source AI personalization software. xTuring makes it easy to build and control LLMs by providing a simple interface to personalize LLMs to your own data and application. xTuring provides fast, efficient and simple fine-tuning of LLMs, such as LLaMA, GPT-J, Galactica, and more. By providing an easy-to-use interface for fine-tuning LLMs to your own data and application, xTuring makes it simple to build, customize and control LLMs. The entire process can be done inside your computer or in your private cloud, ensuring data privacy and security.
    Downloads: 0 This Week