Showing 230 open source projects for "extract"

  • 1
    Compose

    A machine learning tool for automated prediction engineering

    ...It allows you to structure prediction problems and generate labels for supervised learning. An end user defines an outcome of interest by writing a labeling function, then runs a search to automatically extract training examples from historical data. Its result is then provided to Featuretools for automated feature engineering and subsequently to EvalML for automated machine learning. Prediction problems are structured by using a label maker and a labeling function. The label maker automatically extracts data along the time index to generate labels. ...
    Downloads: 0 This Week
  • 2
    DDPM-CD

    Remote sensing change detection using denoising diffusion models

    ...The generated images contain objects that we commonly see in real remote sensing images, such as buildings, trees, roads, vegetation, water surfaces, etc., demonstrating the powerful ability of the diffusion models to extract key semantics that can be further used in remote sensing change detection. We fine-tune a lightweight change detection head that takes multi-level feature representations from the pre-trained diffusion model as input and outputs a change prediction map.
    Downloads: 5 This Week
  • 3
    Emb-GAM

    An interpretable and efficient predictor using pre-trained models

    ...In contrast, generalized additive models (GAMs) can maintain interpretability but often suffer from poor prediction performance due to their inability to effectively capture feature interactions. In this work, we aim to bridge this gap by using pre-trained neural language models to extract embeddings for each input before learning a linear model in the embedding space. The final model (which we call Emb-GAM) is a transparent, linear function of its input features and feature interactions. Leveraging the language model allows Emb-GAM to learn far fewer linear coefficients, model larger interactions, and generalize well to novel inputs. ...
    Downloads: 0 This Week
  • 4
    hui

    hewies user interface - 3D scientific visualisation tool

    A Python project whose goal is to provide a FOSS library to extract, analyse and visualise data in 3D. The instance will connect to a data source (ods sheet, csv, SQL DB, pyodbc), analyse and/or transform the data to be presented to the visualisation functionality, and visualise the data in 3D, likely using third-party FOSS.
    Downloads: 0 This Week
  • 5
    ediViewer

    View, edit and extract transactions from PESC standard EDI files.

    View, edit and extract EDI transactions from PESC-approved standard EDI files. ediViewer has been tested to work with the following PESC standards (https://www.pesc.org/pesc-approved-standards-1.html): (1) TS 189 Application for Admission to Educational Institutions and (2) TS 130 Educational Record (Transcript).
    Downloads: 2 This Week
  • 6
    AnimeGAN

    A simple PyTorch Implementation of Generative Adversarial Networks

    A simple PyTorch implementation of Generative Adversarial Networks, focusing on anime face drawing. The images are generated from a DCGAN model trained on 143,000 anime character faces for 100 epochs. Manipulating latent codes enables the transition from images in the first row to the last row. The images are not clean; some outliers can be observed, which degrade the quality of the generated images. Anime-style images of 126 tags are collected from danbooru.donmai.us using the crawler tool...
    Downloads: 3 This Week
  • 7
    mlscraper

    ML-based HTML scraper that learns extraction rules from examples

    mlscraper is a Python library designed to automatically extract structured data from HTML pages without requiring developers to manually write CSS selectors or XPath rules. Instead of defining extraction logic by hand, users provide a few examples of the data they want to retrieve from a webpage. It analyzes those examples within the HTML document and determines patterns or rules that can be used to extract the same type of information from similar pages.
    Downloads: 0 This Week
  • 8

    SysBioTK

    A protein database management toolkit

    ...Several search functions are also included in order to filter the proteins in the library. Support for different screenings, such as control groups, is also included. The tool can also extract the GeneOntology of the protein/gene list and perform several statistical tests on the data. This is a renamed and improved version of ProtDB - https://protdb.sourceforge.io
    Downloads: 0 This Week
  • 9
    Smart Contract Sanctuary

    A home for Ethereum smart contracts

    ...Contains smart contract sources for various networks, grouped by the first two characters of the contract address. A scriptable semantic grep utility for Solidity (crunch numbers, find specific contracts, extract data). Semgrep is a fast, open-source static analysis tool for finding bugs and enforcing code standards at editor, commit, and CI time, and it now supports Solidity. A powerful online code search service that can be used to search the sanctuary without cloning.
    Downloads: 0 This Week
  • 10
    instagram-profilecrawl

    Instagram profile crawler that extracts posts, tags, and stats

    instagram-profilecrawl is a Python-based automation script designed to collect publicly available information from Instagram profiles. It crawls profile data such as follower counts, post information, hashtags, and other engagement-related metadata. It operates by automating a web browser using Selenium and performing requests to gather structured information from the platform. instagram-profilecrawl can analyze multiple usernames in a single run and store the extracted information locally...
    Downloads: 6 This Week
  • 11
    pyWhat

    Identify emails, IP addresses, and more

    ...Given inputs such as hex strings, URLs, email addresses, IP addresses, credit card numbers, cryptocurrency wallets, or entire .pcap capture files, it scans for structured patterns and tells you what it finds. The tool is recursive: it can traverse files and directories to extract meaningful entities, which is useful when analyzing malware samples, network captures, or code repositories at scale. It offers powerful filters called “tags” and distributions that let you narrow results to specific categories like bug bounties, cryptocurrencies, or AWS-related artifacts. For automation and integration, pyWhat provides a CLI with options for rarity filtering, sorting, and JSON export, as well as an API that can be imported into other Python programs.
    Downloads: 0 This Week
  • 12
    VoiceFixer

    General Speech Restoration

    VoiceFixer is a machine-learning framework for “speech restoration”: given a degraded or distorted audio recording — with noise, clipping, low sampling rate, reverberation, or other artifacts — it attempts to recover high-fidelity, clean speech. The architecture works in two stages: first an analysis stage that tries to extract “clean” intermediate features from the noisy audio (e.g. removing noise, denoising, dereverberation, upsampling), and then a neural vocoder-based synthesis stage that reconstructs a high-quality waveform from those features. Unlike many single-purpose noise reduction tools, VoiceFixer targets a “general speech restoration” problem (GSR), capable of handling multiple types of distortions at once, which makes it suitable for old recordings, phone-call audio, amateur voice recordings, or archival media. ...
    Downloads: 15 This Week
  • 13
    BlooketHack

    One of the first Blooket hacks online

    First, download Python 3.7. During installation, check the "Add to PATH" option when prompted. Second, download the code from GitHub. Third, extract the files from the .zip file you downloaded. Finally, double-click the main.py file. Original code by: kgsensei. Works on most game modes: Gold Quest, Tower Defense, Café, Factory, Racing, Classic, and Crypto (all tested and working).
    Downloads: 4 This Week
  • 14
    pytube

    A lightweight, dependency-free Python library

    Pytube is a lightweight, dependency-free Python library that enables downloading YouTube videos and audio streams with minimal setup. It supports video resolution selection, progressive or adaptive streams, and caption downloads. Pytube is ideal for automation scripts, archiving tools, and media applications that need to interface with YouTube content programmatically.
    Downloads: 4 This Week
  • 15
    HostHunter

    OSINT reconnaissance tool for discovering hostnames from IP addresses

    HostHunter is an open source reconnaissance tool designed to discover and extract hostnames associated with a large set of IPv4 or IPv6 addresses. It helps security professionals map IP addresses to virtual hostnames using a combination of OSINT data sources and active reconnaissance techniques. This approach enables users to identify hidden or additional services that may be hosted behind a single IP address.
    Downloads: 3 This Week
  • 16
    pandas-datareader

    Extract data from a wide range of Internet sources

    Up-to-date remote data access for pandas. Works with multiple versions of pandas. Install using pip, then import and use one of the data readers. This example reads 5 years of 10-year constant maturity yields on U.S. government bonds. Stable documentation is available on github.io. A second copy of the stable documentation is hosted on Read the Docs.
    Downloads: 0 This Week
  • 17
    Osintgram

    Osintgram is an OSINT tool on Instagram

    Osintgram is an OSINT (Open Source Intelligence) tool designed to extract, analyze, and store information from public Instagram profiles. It allows users to retrieve data like followers, hashtags, stories, tagged posts, and locations. The tool is often used by researchers and security analysts for data gathering, footprinting, and investigative purposes related to social media profiling.
    Downloads: 39 This Week
  • 18
    Transformer TTS

    Implementation of a Transformer based neural network

    ...It takes inspiration from architectures like FastSpeech, FastSpeech 2, FastPitch, and Transformer TTS, and extends them with its own aligner and forward models. The system separates alignment learning and acoustic modeling: an autoregressive Transformer is used as an aligner to extract phoneme-to-frame durations, while a non-autoregressive “ForwardTransformer” generates mel-spectrograms conditioned on text and durations. This design addresses common autoregressive issues such as repetition, skipped words, and unstable attention, and results in robust, fast synthesis where all frames are predicted in parallel. ...
    Downloads: 0 This Week
  • 19
    Jupyter Notebooks as PDF

    Save Jupyter Notebooks as PDF

    ...Unfortunately not all PDF viewers know how to deal with attachments. PDF viewers known to support downloading of file attachments are: Acrobat Reader, pdf.js and evince. The pdftk CLI program can also extract attached files from a PDF. Preview for OSX does not know how to display/give you access to attachments of PDF files.
    Downloads: 0 This Week
  • 20
    ruia

    Async Python framework for fast and flexible web scraping spiders

    ...It provides a structured approach to building scraping projects through components such as data items, spiders, middleware, and plugins. Developers can define structured fields to extract information from HTML content and process responses asynchronously to improve crawling performance. It also supports middleware and plugin systems that allow customization of request handling, response processing, and additional functionality.
    Downloads: 6 This Week
  • 21
    CNN for Image Retrieval
    cnn-for-image-retrieval is a research-oriented project that demonstrates the use of convolutional neural networks (CNNs) for image retrieval tasks. The repository provides implementations of CNN-based methods to extract feature representations from images and use them for similarity-based retrieval. It focuses on applying deep learning techniques to improve upon traditional handcrafted descriptors by learning features directly from data. The code includes training and evaluation scripts that can be adapted for custom datasets, making it useful for experimenting with retrieval systems in computer vision. ...
    Downloads: 1 This Week
  • 22
    Deep Exemplar-based Video Colorization

    The source code of the CVPR 2019 paper "Deep Exemplar-based Video Colorization"

    ...Experiments show our result is superior to the state-of-the-art methods both quantitatively and qualitatively. To colorize your own video, you need to extract the video frames and provide a reference image as an example.
    Downloads: 0 This Week
  • 23
    BeaEngine 5

    BeaEngine disasm project

    BeaEngine is a C library designed to decode instructions from 16-bit, 32-bit and 64-bit Intel architectures. It includes the standard instruction set and the instruction sets from the FPU, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, VMX, CLMUL, AES, MPX, AVX, AVX2, AVX512 (VEX & EVEX prefixes), CET, BMI1, BMI2, SGX, UINTR, KL, TDX and AMX extensions. If you want to analyze malicious code and, more generally, obfuscated code, BeaEngine sends back a complex structure that describes precisely the...
    Downloads: 0 This Week
  • 24
    ...Motu is a highly efficient and robust web server that bridges the gap between heterogeneous data providers and end users. Motu handles, extracts and transforms huge volumes of oceanographic data without performance collapse. This client lets you extract and download data through a Python command line. Indeso project sample: http://www.indeso.web.id/indeso_wp/index.php/faq/30-6-how-to-write-and-run-the-script-to-download-indeso-met-ocean-data
    Downloads: 0 This Week
  • 25
    CC-Net

    Tools to download and clean up Common Crawl data

    cc_net provides tools to download, segment, clean, and filter Common Crawl to build large-scale text corpora, including monolingual datasets and the multilingual CC-100 collection introduced in the associated paper. It includes pipelines to fetch snapshots, extract text, de-duplicate, identify language, and apply quality filtering based on heuristics and language models. The outputs are intended for pretraining language models and for creating standardized corpora that can be reproduced or updated with new crawls. The repository documents practical concerns like HTTP failures, snapshot differences, and stats JSONs, reflecting community use across many languages. ...
    Downloads: 0 This Week