Search Results for "data leakage detection python"

31 projects for "data leakage detection python" with 1 filter applied:

  • 1
    Deequ

    Deequ is a library built on top of Apache Spark

    ...It also includes a little domain-specific language called DQDL (Data Quality Definition Language) which allows declarative specification of quality rules. Users typically run Deequ before feeding data downstream (to ML pipelines, analytics, or production systems), enabling early detection and isolation of data errors. There is also a Python wrapper, PyDeequ, for users who prefer working from Python environments.
    Downloads: 0 This Week
  • 2
    Scrapy

    A fast, high-level web crawling and web scraping framework

    Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...
    Downloads: 15 This Week
  • 3
    Anomalib

    An anomaly detection library comprising state-of-the-art algorithms

    Anomalib is an open-source deep learning library focused on anomaly detection and localization tasks, collecting state-of-the-art algorithms and tools under one modular framework. It provides implementations of leading anomaly detection methods drawn from current research, as well as a full set of utilities for training, evaluating, benchmarking, and deploying these models on both public and private datasets. Anomalib emphasizes flexibility and reproducibility: you can use its simple APIs to...
    Downloads: 0 This Week
  • 4
    Unredact

    A simple tool for reading in poorly redacted documents

    Unredact is a specialized tool that attempts to reconstruct redacted or obscured text in images, PDFs, or screenshots using a combination of image processing and generative AI inference to suggest plausible completions of blurred, black-boxed, or jumbled content. Unlike traditional optical character recognition (OCR), which only reads visible text, Unredact focuses on inferring missing content where redaction has been applied by analyzing surrounding context, font characteristics, and...
    Downloads: 24 This Week
  • 5
    Granite TSFM

    Foundation Models for Time Series

    granite-tsfm collects public notebooks, utilities, and serving components for IBM’s Time Series Foundation Models (TSFM), giving practitioners a practical path from data prep to inference for forecasting and anomaly-detection use cases. The repository focuses on end-to-end workflows: loading data, building datasets, fine-tuning forecasters, running evaluations, and serving models. It documents the currently supported Python versions and points users to where the core TSFM models are hosted and how to wire up service components. ...
    Downloads: 0 This Week
  • 6
    Earth Engine API

    Python and JavaScript bindings for calling the Earth Engine API

    ...Developers authenticate once, work interactively in notebooks or the Code Editor, and export results to Cloud Storage, Drive, or asset collections. Visualization helpers render tiled layers and charts so analysts can iterate quickly on workflows like land-cover mapping, change detection, or time-series analysis. By combining petabyte-scale data with concise functional transforms, the API turns complex remote-sensing pipelines into reproducible scripts that are easy to share.
    Downloads: 5 This Week
  • 7
    ChatGPT Retrieval Plugin

    The ChatGPT Retrieval Plugin lets you easily find personal documents

    The chatgpt-retrieval-plugin repository implements a semantic retrieval backend that lets ChatGPT (or GPT-powered tools) access private or organizational documents in natural language by combining vector search, embedding models, and plugin infrastructure. It can serve as a custom GPT plugin or function-calling backend so that a chat session can “look up” relevant documents based on user queries, inject those results into context, and respond more knowledgeably about a private knowledge...
    Downloads: 0 This Week
  • 8
    DINOv2

    PyTorch code and models for DINOv2 self-supervised learning

    DINOv2 is a self-supervised vision learning framework that produces strong, general-purpose image representations without using human labels. It builds on the DINO idea of student–teacher distillation and adapts it to modern Vision Transformer backbones with a carefully tuned recipe for data augmentation, optimization, and multi-crop training. The core promise is that a single pretrained backbone can transfer well to many downstream tasks—from linear probing on classification to retrieval,...
    Downloads: 0 This Week
  • 9
    Gwyddion

    Scanning probe microscopy data visualisation and analysis

    A data visualization and processing tool for scanning probe microscopy (SPM, i.e. AFM, STM, MFM, SNOM/NSOM, ...) and profilometry data, useful also for general image and 2D data analysis.
    Downloads: 1,197 This Week
  • 10
    Detic

    Code release for "Detecting Twenty-thousand Classes using Image-level Supervision"

    Detic (“Detecting Twenty-thousand Classes using Image-level Supervision”) is a large-vocabulary object detector that scales beyond fully annotated datasets by leveraging image-level labels. It decouples localization from classification, training a strong box localizer on standard detection data while learning classifiers from weak supervision and large image-tag corpora. A shared region proposal backbone feeds a flexible classification head that can expand to tens of thousands of categories...
    Downloads: 0 This Week
  • 11
    wxMEdit

    wxMEdit, Cross-platform Text/Hex Editor, Improved Version of MadEdit

    • Added automatically checking for updates
    • Added bookmark support
    • Added right-click context menu for each tab
    • Added purging histories support
    • Added selecting a line by triple click
    • Added FreeBASIC syntax file
    • Added an option to place configuration files into %APPDATA% directory under Windows
    • Improved support for Find/Replace
    • Improved Mac OS X support
    • Improved system integration under Windows
    • Improved encoding detection result
    • Improved Hex editing support
    • Added more...
    Downloads: 110 This Week
  • 12
    Shennina

    Automating Host Exploitation with AI

    Shennina is an automated host exploitation framework. The mission of the project is to fully automate the scanning, vulnerability scanning/analysis, and exploitation using Artificial Intelligence. Shennina is integrated with Metasploit and Nmap for performing the attacks, as well as being integrated with an in-house Command-and-Control Server for exfiltrating data from compromised machines automatically. Shennina scans a set of input targets for available network services, uses its AI engine...
    Downloads: 1 This Week
  • 13
    MAE (Masked Autoencoders)

    PyTorch implementation of MAE

    MAE (Masked Autoencoders) is a self-supervised learning framework for visual representation learning using masked image modeling. It trains a Vision Transformer (ViT) by randomly masking a high percentage of image patches (typically 75%) and reconstructing the missing content from the remaining visible patches. This forces the model to learn semantic structure and global context without supervision. The encoder processes only the visible patches, while a lightweight decoder reconstructs the...
    Downloads: 1 This Week
  • 14
    Zenoss Community Edition

    Zenoss - Intelligent IT Operations Management

    Zenoss provides software-defined IT operations for the world’s largest organizations. We deliver the ultimate level of IT service health with simplicity by providing the most granular and intelligent IT service modeling possible, at any scale, and sharing these unique insights with other IT operations management (ITOM) tools to make them more efficient. Zenoss Community Edition is not a “demo” or trial version of Zenoss Enterprise or Zenoss Cloud! Before you install Zenoss Community...
    Downloads: 3 This Week
  • 15
    PyTorchVideo

    A deep learning library for video understanding research

    PyTorchVideo is a deep learning library for video understanding, providing modular components and pretrained models for tasks like action recognition, video classification, detection, and self-supervised learning. It is tightly integrated with PyTorch and PyTorch Lightning, offering flexible APIs for building and training spatiotemporal networks. The library includes efficient implementations of state-of-the-art architectures such as SlowFast, X3D, and MViT, optimized for both research...
    Downloads: 0 This Week
  • 16
    PyExfil

    A Python Package for Data Exfiltration

    PyExfil was born as a PoC and kind of a playground, and grew to be something a bit more. In my eyes it’s still a messy PoC that needs a lot more work and testing to become stable. The purpose of PyExfil is to collect as many exfiltration, and now also communication, techniques as possible that CAN be used by various threat actors and malware to bypass various detection and mitigation tools and techniques. You can track changes at the official GitHub page. Putting it simply, it’s meant to be used as a...
    Downloads: 0 This Week
  • 17

    FusionCatcher

    Somatic fusion-genes finder for RNA-seq data

    FusionCatcher searches for novel/known somatic fusion genes, translocations, and chimeras in RNA-seq data (paired-end reads from Illumina NGS platforms like Solexa and HiSeq) from diseased samples. The aims of FusionCatcher are: - very good detection rate for finding candidate fusion genes, - very easy to use (i.e. no a priori knowledge of databases and bioinformatics is needed in order to run FusionCatcher), - very good detection of challenging fusion genes, like for example IGH...
    Downloads: 32 This Week
  • 18
    VoteNet

    Deep Hough Voting for 3D Object Detection in Point Clouds

    VoteNet is a 3D object detection framework for point clouds that combines deep point set networks with a Hough voting mechanism to localize and classify objects in 3D space. It tackles the challenge that object centroids in 3D scenes often don’t lie on any input surface point by having each point “vote” for potential object centers; these votes are then clustered to propose object hypotheses. Once cluster centers are formed, the network regresses bounding boxes around them and classifies...
    Downloads: 0 This Week
  • 19
    maskrcnn-benchmark

    Fast, modular reference implementation of Instance Segmentation

    Mask R-CNN Benchmark is a PyTorch-based framework that provides high-performance implementations of object detection, instance segmentation, and keypoint detection models. Originally built to benchmark Mask R-CNN and related models, it offers a clean, modular design to train and evaluate detection systems efficiently on standard datasets like COCO. The framework integrates critical components—region proposal networks (RPNs), RoIAlign layers, mask heads, and backbone architectures such as...
    Downloads: 0 This Week
  • 20

    reditools

    RNA editing detection by NGS data

    REDItools are Python scripts developed to study RNA editing at genomic scale using next-generation sequencing data. RNA editing is a post-transcriptional phenomenon involving the insertion/deletion or substitution of specific bases at precise RNA locations. In humans, RNA editing occurs by deamination of cytosine to uridine (C-to-U) or, mostly, by adenosine-to-inosine (A-to-I) conversion through ADAR enzymes. A-to-I substitutions may have profound functional consequences and...
    Downloads: 3 This Week
  • 21
    AlienVault OSSIM

    Open Source SIEM

    OSSIM, AlienVault’s Open Source Security Information and Event Management (SIEM) product, provides event collection, normalization and correlation. For more advanced functionality, AlienVault Unified Security Management (USM) builds on OSSIM with these additional capabilities: * Log management * Advanced threat detection with a continuously updated library of pre-built correlation rules * Actionable threat intelligence updates from AlienVault Labs Security Research Team * Rich...
    Downloads: 64 This Week
  • 22
    Creates interactive plots of data lines over a time scale. Automatic detection of input format and scale will result in instantaneous results without any need for configuration.
    Downloads: 0 This Week
  • 23
    This is a sophisticated & integrated simulation and analysis environment for dynamical systems models of physical systems (ODEs, DAEs, maps, and hybrid systems). It supports symbolic math, optimization, continuation, data analysis, biological apps...
    Downloads: 1 This Week
  • 24
    QuickNXS

    Polarized ToF reflectivity raw data analysis tool

    Data evaluation tool for the magnetism reflectometer at the spallation neutron source (BL-4A@SNS). Reads raw nexus files (HDF5) of histogrammed or event mode data to create reflectivity curves and 2D Q-maps.
    Downloads: 1 This Week
  • 25

    sitecheck

    Modular web site spider for web developers.

    More than just a link checker, sitecheck is a website spider (also known as a crawler) which can assist with SEO by testing an entire site plus both inbound links from search engines and outbound links to other sites for the following issues: looping redirects (HTTP 301/302), broken links (HTTP 404), server errors (HTTP 500), spelling mistakes, low readability scores (using the Flesch Reading Ease test), missing/empty/duplicate meta tags, duplicate content, slow page speed, W3C validation...
    Downloads: 0 This Week