Showing 3125 open source projects for "data"

View related business solutions
  • Easily Host LLMs and Web Apps on Cloud Run Icon
    Easily Host LLMs and Web Apps on Cloud Run

    Run everything from popular models with on-demand NVIDIA L4 GPUs to web apps without infrastructure management.

    Run frontend and backend services, batch jobs, host LLMs, and queue processing workloads without the need to manage infrastructure. Cloud Run gives you on-demand GPU access for hosting LLMs and running real-time AIβ€”with 5-second cold starts and automatic scale-to-zero so you only pay for actual usage. New customers get $300 in free credit to start.
    Try Cloud Run Free
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhereβ€”across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1

    Bagels administration toolkit

    The system are composed by one mobile application and others resources

    The bagels system is one complete simple and easy integration for mobile tickets. Using this application you can enable administration task easy to your restaurant. The system permit integration with one web database structure. Also the projet use a simple mobile customizableapplication.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    footswitch2

    footswitch2

    Audio Transcription software for Linux (Vlc) with a foot pedal

    Footswitch 2 is a media player for transcribers on Linux. Written in python and using the python bindings for VLC it allows a transcriber to control the audio or video with a USB footpedal, and includes a set of macros that integrate into LibreOffice. This allows the transcriber to control the media player from within Libreoffice as well, making it useful for those who do not yet own a footpedal/footswitch. Control of the media player from LibreOffice can be via Hotkeys or an integrated...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 3
    SBEVSL is a collaborative project between Dowling and RIT on the development of a Structural Biology Extensible Visualization Scripting Language, so that users can move freely among various molecular graphics tools, such as rasmol, pymol, raster3d, etc.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    footswitch3

    footswitch3

    Audio Transcription software for Linux (Gstreamer) with a foot pedal

    Footswitch 3 is a media player for transcribers on Linux. Written in python using the python bindings for Gstreamer it allows a transcriber to control the audio or video with a foot pedal, and includes a set of macros that integrate into LibreOffice. This allows the transcriber to control the media player from within Libreoffice as well, making it useful for those who do not yet own a foot pedal/foot switch. Control of the media player from LibreOffice can be via Hotkeys or an integrated...
    Downloads: 10 This Week
    Last Update:
    See Project
  • Run Any Workload on Compute Engine VMs Icon
    Run Any Workload on Compute Engine VMs

    From dev environments to AI training, choose preset or custom VMs with 1–96 vCPUs and industry-leading 99.95% uptime SLA.

    Compute Engine delivers high-performance virtual machines for web apps, databases, containers, and AI workloads. Choose from general-purpose, compute-optimized, or GPU/TPU-accelerated machine typesβ€”or build custom VMs to match your exact specs. With live migration and automatic failover, your workloads stay online. New customers get $300 in free credits.
    Try Compute Engine
  • 5
    HealthFusion

    HealthFusion

    AI Disease Detections System

    HealthFusion has identified a critical business problem, which is the lack of accessibility and timely detection of multiple diseases. The traditional approach of detecting diseases is time-consuming, expensive, and not accessible to everyone, especially in remote areas. This problem can lead to delayed diagnosis and treatment, which can have serious consequences for patients. The proposed solution, HealthFusion, is novel and practical as it offers a comprehensive solution to detect...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Neural Network Visualization

    Neural Network Visualization

    Project for processing neural networks and rendering to gain insights

    nn_vis is a minimalist visualization tool for neural networks written in Python using OpenGL and Pygame. It provides an interactive, graphical representation of how data flows through neural network layers, offering a unique educational experience for those new to deep learning or looking to explain it visually. By animating input, weights, activations, and outputs, the tool demystifies neural network operations and helps users intuitively grasp complex concepts. Its lightweight codebase is great for customization and teaching purposes.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    UnionML

    UnionML

    Build and deploy machine learning microservices

    ...UnionML is an open-source Python framework built on top of Flyteβ„’, unifying the complex ecosystem of ML tools into a single interface. Combine the tools that you love using a simple, standardized API so you can stop writing so much boilerplate and focus on what matters: the data and the models that learn from them. Fit the rich ecosystem of tools and frameworks into a common protocol for machine learning. Using industry-standard machine learning methods, implement endpoints for fetching data, training models, serving predictions (and much more) to write a complete ML stack in one place. Data science, ML engineering, and MLOps practitioners can all gather around UnionML apps as a way of defining a single source of truth about your ML system’s behavior. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    SPyQL

    SPyQL

    Query data on the command line with SQL-like SELECTs powered by Python

    ...SPyQL is a query language that combines the simplicity and structure of SQL with the power and readability of Python. SPyQL offers a command-line interface that allows running SPyQL queries on top of text data (e.g. CSV, JSON). Data can come from files but also from data streams, such as as Kafka, or from databases such as PostgreSQL. Basically, data can come from any command that outputs text :-). More, data can be generated by a Python expression! And since SPyQL also writes to different formats, it allows to easily convert between data formats.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    CPT

    CPT

    CPT: A Pre-Trained Unbalanced Transformer

    A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation. We replace the old BERT vocabulary with a larger one of size 51271 built from the training data, in which we 1) add missing 6800+ Chinese characters (most of them are traditional Chinese characters); 2) remove redundant tokens (e.g. Chinese character tokens with ## prefix); 3) add some English tokens to reduce OOV. Position Embeddings We extend the max_position_embeddings from 512 to 1024. We initialize the new version of models with the old version of checkpoints with vocabulary alignment. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 99.99% Uptime for MySQL and PostgreSQL on Google Cloud Icon
    99.99% Uptime for MySQL and PostgreSQL on Google Cloud

    Enterprise Plus edition delivers sub-second maintenance downtime and 2x read/write performance. Built for critical apps.

    Cloud SQL Enterprise Plus gives you a 99.99% availability SLA with near-zero downtime maintenanceβ€”typically under 10 seconds. Get 2x better read/write performance, intelligent data caching, and 35 days of point-in-time recovery. Supports MySQL, PostgreSQL, and SQL Server with built-in vector search for gen AI apps. New customers get $300 in free credit.
    Try Cloud SQL Free
  • 10
    Pentest-Tools

    Pentest-Tools

    A collection of custom security tools for quick needs.

    Pentest-Tools is a collection of penetration testing scripts and utilities designed to help security professionals and ethical hackers perform vulnerability assessments. It includes a wide range of tools for tasks like web scraping, reconnaissance, data extraction, and network analysis. The suite is modular, allowing users to choose the tools that best fit their specific pentesting needs, from web application analysis to network penetration testing.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 11
    Twinify

    Twinify

    Privacy-preserving generation of a synthetic twin to a data set

    ...For the latter, twinify also offers automatic modeling for easy building of models fitting the data. If you have existing experience with NumPyro you can also implement your own model directly. Often data that would be very useful for the scientific community is subject to privacy regulations and concerns and cannot be shared. Differentially private data sharing allows generating of synthetic data that is statistically similar to the original data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    TextBox

    TextBox

    A text generation library with pre-trained language models github.com

    ...From a model perspective, we incorporate 47 pre-trained language models/modules covering the categories of general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight models (modules). From a training perspective, we support 4 pre-training objectives and 4 efficient and robust training strategies, such as distributed data parallel and efficient generation. Compared with the previous version of TextBox, this extension mainly focuses on building a unified, flexible, and standardized framework for better supporting PLM-based text generation models.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    SQLBucket

    SQLBucket

    Lightweight library to write, orchestrate and test your SQL ETL

    SQLBucket is a lightweight framework to help write, orchestrate and validate SQL data pipelines. It gives the possibility to set variables and introduces some control flow using the fantastic Jinja2 library. It also implements a very simplistic unit and integration test framework where you can validate the results of your ETL in the form of SQL checks. With SQLBucket, you can apply TDD principles when writing data pipelines.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    VAERity

    VAERity

    Uncovering truth in data

    VAERity is a free, open source tool to graphically explore the VAERS data set. It aims to eventually expand in scope to allow fast querying of arbitrary large datasets. It utilizes vaex and pandas as required to provide a balance of speed and query flexibility.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    FairScale

    FairScale

    PyTorch extensions for high performance and large scale training

    FairScale is a collection of PyTorch performance and scaling primitives that pioneered many of the ideas now used for large-model training. It introduced Fully Sharded Data Parallel (FSDP) style techniques that shard model parameters, gradients, and optimizer states across ranks to fit bigger models into the same memory budget. The library also provides pipeline parallelism, activation checkpointing, mixed precision, optimizer state sharding (OSS), and auto-wrapping policies that reduce boilerplate in complex distributed setups. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Karate Club

    Karate Club

    An API Oriented Open-source Python Framework for Unsupervised Learning

    Karate Club is an unsupervised machine learning extension library for NetworkX. Karate Club consists of state-of-the-art methods to do unsupervised learning on graph-structured data. To put it simply it is a Swiss Army knife for small-scale graph mining research. First, it provides network embedding techniques at the node and graph level. Second, it includes a variety of overlapping and non-overlapping community detection methods. Implemented methods cover a wide range of network science (NetSci, Complenet), data mining (ICDM, CIKM, KDD), artificial intelligence (AAAI, IJCAI) and machine learning (NeurIPS, ICML, ICLR) conferences, workshops, and pieces from prominent journals.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    quantitative

    quantitative

    Quantized transactions python3

    ...The repo is evidently tied to a popular video series (on Bilibili) that reportedly drew substantial attention, suggesting the material is meant to be both educational and hands-on. The README and associated lessons walk the user through implementing algorithms, likely covering data handling, backtesting, and maybe simple trading logic. As an open-source educational resource, it’s designed for Python users interested in automatic trading, algorithmic strategies, and financial data analysis.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    DDPM-CD

    DDPM-CD

    Remote sensing change detection using denoising diffusion models

    This is the Pytorch implementation of Remote Sensing Change Detection using Denoising Diffusion Probabilistic Models. The generated images contain objects that we commonly see in real remote sensing images, such as buildings, trees, roads, vegetation, water surfaces, etc., demonstrating the powerful ability of the diffusion models to extract key semantics that can be further used in remote sensing change detection. We fine-tune a light-weight change detection head which takes multi-level...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    PyNanoLab

    PyNanoLab

    data analysis and Visualization with matplotlib

    PyNanoLab contains a variety of tools to complete the data analysis, statistics, curve fitting, and basic machine learning application. Visualization in pynanolab is based on matplotlib. The setup tools is desinged to control and set-up all the details of the figure with a GUI.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    ArkID

    ArkID

    Enterprise IDaaS/IAM platform system

    ...Achieve centralized and safe storage of corporate organizational structure and identity information of massive personnel. Establish a correspondence in multiple dimensions and securely integrate enterprise identity data sources. To achieve efficient and unified management of enterprise personnel, organizational structure, and application of information on a platform.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    DeepCTR

    DeepCTR

    Package of deep-learning based CTR models

    ...You can use any complex model with model.fit(), and model.predict(). Provide tf.keras.Model like interface for quick experiment. Provide tensorflow estimator interface for large scale data and distributed training. It is compatible with both tf 1.x and tf 2.x. With the great success of deep learning,DNN-based techniques have been widely used in CTR prediction task. The data in CTR estimation task usually includes high sparse,high cardinality categorical features and some dense numerical features. Since DNN are good at handling dense numerical features,we usually map the sparse categorical features to dense numerical through embedding technique.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    hui

    hui

    hewies user interface - 3D scientific visualisation tool

    Python project with goal to provide FOSS library to extract, analyse and visualise data in a 3D fashion. The instance will connect to a data source, ods sheet, csv, sql DB, pyodbc the instance will analyse and/or transform the data to be presented to the visualisation functionality the instance will visualise the data in a 3D fashion, likely using third party FOSS
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Shennina

    Shennina

    Automating Host Exploitation with AI

    ...Shennina is integrated with Metasploit and Nmap for performing the attacks, as well as being integrated with an in-house Command-and-Control Server for exfiltrating data from compromised machines automatically. Shennina scans a set of input targets for available network services, uses its AI engine to identify recommended exploits for the attacks, and then attempts to test and attack the targets. If the attack succeeds, Shennina proceeds with the post-exploitation phase. The AI engine is initially trained against live targets to learn reliable exploits against remote services. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    EnCodec

    EnCodec

    State-of-the-art deep learning based audio codec

    ...The model can operate in real time and supports variable bandwidths, bitrates, and multi-band audio. Encodec has applications in speech and music compression, generative modeling, and efficient data transmission for communication systems. The repository includes pretrained checkpoints, PyTorch inference code, and examples for integrating Encodec as a module in downstream generative or streaming systems.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    BCI

    BCI

    BCI: Breast Cancer Immunohistochemical Image Generation

    ...The routine evaluation of HER2 is conducted with immunohistochemical techniques (IHC), which is very expensive. Therefore, for the first time, we propose a breast cancer immunohistochemical (BCI) benchmark attempting to synthesize IHC data directly with the paired hematoxylin and eosin (HE) stained images.
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB