Showing 41 open source projects for "extract"

  • 1
    Volatility

    An advanced memory forensics framework

    Volatility is a widely used open-source framework for analyzing memory captures (RAM dumps) from Windows, Linux, and macOS systems. It enables investigators and malware analysts to extract process lists, network connections, DLLs, strings, artifacts, and more. Volatility supports many plugins for detecting hidden processes, malware, rootkits, and event tracing. It’s essential in digital forensics and incident response workflows.
    Downloads: 114 This Week
  • 2
    PyPDF

    A pure-python PDF library capable of splitting, merging, cropping

    pypdf is a pure Python library for working with PDF files, allowing developers to split, merge, rotate, encrypt, and extract content from PDFs. It’s an actively maintained fork of PyPDF2, improving performance, compatibility, and support for modern PDF standards. Suitable for both automation scripts and full-featured applications, pypdf handles PDFs without requiring external dependencies.
    Downloads: 6 This Week
  • 3
    DocTR

    Library for OCR-related tasks powered by Deep Learning

    DocTR provides an easy and powerful way to extract valuable information from your documents. Seamlessly process documents for Natural Language Understanding tasks: we provide OCR predictors to parse textual information (localize and identify each word) from your documents. It offers robust 2-stage (detection + recognition) OCR predictors with pretrained parameters and is user-friendly: 3 lines of code load a document and extract text with a predictor.
    Downloads: 5 This Week
  • 4
    Ethereum ETL

    Python scripts for ETL (extract, transform and load) jobs for Ethereum

    Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery. Ethereum ETL lets you convert blockchain data into convenient formats like CSVs and relational databases.
    Downloads: 0 This Week
  • 5
    AUTOMATIC1111 Stable Diffusion web UI
    AUTOMATIC1111's stable-diffusion-webui is a powerful, user-friendly web interface built on the Gradio library that allows users to easily interact with Stable Diffusion models for AI-powered image generation. Supporting both text-to-image (txt2img) and image-to-image (img2img) generation, this open-source UI offers a rich feature set including inpainting, outpainting, attention control, and multiple advanced upscaling options. With a flexible installation process across Windows, Linux, and...
    Downloads: 275 This Week
  • 6
    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitles (hardsub) from videos

    Extracts hard-coded subtitles from videos and generates srt files. Text recognition runs locally, so there is no need to apply for a third-party API. It is a deep learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction. Because it uses local OCR recognition, there is no need to set up or call any API or to access online OCR services such as Baidu...
    Downloads: 58 This Week
  • 7
    Scrapy

    A fast, high-level web crawling and web scraping framework

    ...Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring and automated testing.
    Downloads: 37 This Week
  • 8
    pikepdf

    A Python library for reading and writing PDF, powered by QPDF

    pikepdf is a Python library allowing the creation, manipulation, and repair of PDFs. It provides a Pythonic wrapper around the C++ PDF content transformation library, QPDF. Python + QPDF = “py” + “qpdf” = “pyqpdf”, which looks like a dyslexia test and is no fun to type. But say “pyqpdf” out loud, and it sounds like “pikepdf”. pikepdf is a library intended for developers who want to create, manipulate, parse, repair, and abuse the PDF format. It supports reading and writing PDFs, including...
    Downloads: 6 This Week
  • 9
    pep484 stubs for Django

    PEP-484 stubs for Django

    ...You can show your support by liking the PR. This project does not affect your runtime at all. It only affects mypy type checking process. The current implementation uses Django's runtime to extract information about models, so it might crash if your installed apps or models.py are broken.
    Downloads: 1 This Week
  • 10
    Translate Toolkit

    Useful localization tools with Python API for building localization

    ...Allowing you and your translators to work on industry-standard translation formats. Search for pattern matches. Run tests that adapt to languages and source projects. Extract terminology. A large toolset helps you increase localization quality. The code is available for you to add new formats, project types, localization tests and language modules, adapting the toolkit to your project and needs.
    Downloads: 0 This Week
  • 11
    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material. Each extracted entity is precisely grounded in its original context, allowing visual inspection and validation via automatically generated interactive HTML visualizations. ...
    Downloads: 1 This Week
  • 12
    Python-Spider

    Python3 web crawler practice

    Python-Spider is a repository intended to teach or provide examples for writing web spiders / crawlers in Python — part of a broader learning and resource collection by its author. The code and documentation are oriented toward beginners or intermediate learners who want to learn how to fetch, parse, and extract data from websites programmatically. As part of the author’s public learning-path repositories, python-spider likely includes examples of HTTP requests, HTML parsing, maybe concurrency or scheduling to crawl multiple pages, and techniques to handle common web-scraping issues. For people wanting to get hands-on with building scrapers, collecting data, or learning how to navigate web programming in Python, this repository acts as a didactic reference or starting point. ...
    Downloads: 0 This Week
  • 13
    Wapiti

    Wapiti is a web-application vulnerability scanner

    Wapiti is a vulnerability scanner for web applications. It currently searches for vulnerabilities such as XSS, SQL and XPath injections, file inclusions, command execution, XXE injections, CRLF injections, Server Side Request Forgery, Open Redirects... It is written in the Python 3 programming language.
    Downloads: 23 This Week
  • 14
    PyAppExec

    Launcher that prepares Python/deps and runs your app like OS-native

    ...It locates or installs the required Python runtime, provisions an isolated virtual environment, installs your project’s pip requirements, and handles any external tool requirements or dependencies (e.g., FFmpeg) with version checks and auto-download/extract on Windows/macOS/Linux. The Qt-based installer can scaffold pyappexec.ini, copy/rename the launcher, and (on macOS) bundle a self-contained .app with icons. The optional run-time GUI captures logs, while CLI mode stays lean for automation. Config is driven by a simple INI per OS, so app IDs, entry points, requirements, and paths are declarative. ...
    Downloads: 0 This Week
  • 15

    FloPy

    A python application to extract rows/columns from .csv and .xls files.

    FloPy is a utility application that simplifies the process of loading, filtering, and saving specific data from various file formats, including ".csv", ".xls", and ".xlsx". Its user-friendly graphical interface (GUI) ensures that anyone can use the application with ease. Author: This project was created by SUVANKAR BANERJEE. You can find more about the author and contribute to the project on GitHub. License: FloPy is released under the MIT License, promoting open...
    Downloads: 1 This Week
  • 16
    PoJamas aims to provide a Python library and tools for loading, processing, and producing .cr2, pz3 (crz, pzz) files compatible with the SmithMicro (e-frontier) Poser character animation application. PoJamas is composed of a Python library, a Python Wavefront (.obj) 3D viewer based on GLFW, and a LibreOffice/Python application (to ease use of the library and the viewer). As of 2020, the project has been ported to Python 3. As of 2021, the project offers a 3D viewer for Wavefront files...
    Downloads: 0 This Week
  • 17
    Smart Contract Sanctuary

    A home for ethereum smart contracts

    ...Contains smart contract sources for various networks, grouped by the first two chars of the contract address. A scriptable semantic grep utility for Solidity (crunch numbers, find specific contracts, extract data). Semgrep is a fast, open-source, static analysis tool for finding bugs and enforcing code standards at editor, commit, and CI time, and now supports Solidity! A powerful online code search service that can be used to search the sanctuary without cloning.
    Downloads: 0 This Week
  • 18
    CNN for Image Retrieval
    cnn-for-image-retrieval is a research-oriented project that demonstrates the use of convolutional neural networks (CNNs) for image retrieval tasks. The repository provides implementations of CNN-based methods to extract feature representations from images and use them for similarity-based retrieval. It focuses on applying deep learning techniques to improve upon traditional handcrafted descriptors by learning features directly from data. The code includes training and evaluation scripts that can be adapted for custom datasets, making it useful for experimenting with retrieval systems in computer vision. ...
    Downloads: 1 This Week
  • 19
    BeaEngine 5

    BeaEngine disasm project

    BeaEngine is a C library designed to decode instructions from 16-bit, 32-bit and 64-bit Intel architectures. It includes the standard instruction set and the instruction sets from the FPU, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, VMX, CLMUL, AES, MPX, AVX, AVX2, AVX512 (VEX & EVEX prefixes), CET, BMI1, BMI2, SGX, UINTR, KL, TDX and AMX extensions. If you want to analyze malicious code and, more generally, obfuscated code, BeaEngine sends back a complex structure that describes precisely the...
    Downloads: 0 This Week
  • 20
    Flasgger

    Easy OpenAPI specs and Swagger UI for your Flask API

    Flasgger is a Flask extension that extracts an OpenAPI specification from all Flask views registered in your API. Flasgger also comes with SwaggerUI embedded so you can visualize and interact with your API resources. Flasgger also validates incoming data: using the same specification, it can check whether the data received in a POST, PUT, or PATCH request is valid against the schema defined using YAML, Python dictionaries or Marshmallow Schemas.
    Downloads: 0 This Week
  • 21
    gditools

    A Python program/library aimed at GD-ROM image files.

    This Python program/library is designed to handle GD-ROM image (GDI) files. It can be used to list files, extract data, generate a sorttxt file, extract the bootstrap (IP.BIN) file, and more. This project can be used in standalone mode, in interactive mode, or as a library in another Python program (check the 'addons' folder to learn how). For your convenience, you can use the optional gditools.py GUI program supplied in the Files section.
    Downloads: 17 This Week
  • 22
    cnn-text-classification-tf

    Convolutional Neural Network for Text Classification in Tensorflow

    ...Based loosely on Kim’s influential paper on CNNs for sentence classification, this codebase demonstrates how to preprocess text data, convert words into learned embeddings, and apply multiple convolution filters to extract n-gram features that are then pooled and fed into a classifier. The project includes scripts for training, evaluation, and data handling, making it easy to run experiments on datasets such as movie reviews or other labeled text collections. By breaking down the model into understandable components, it serves as a practical reference for students and practitioners learning how deep learning models handle text beyond traditional bag-of-words approaches.
    Downloads: 0 This Week
  • 23
    Image classification models for Keras

    Keras code and weights files for popular deep learning models

    All architectures are compatible with both TensorFlow and Theano, and upon instantiation the models will be built according to the image dimension ordering set in your Keras configuration file at ~/.keras/keras.json. For instance, if you have set image_dim_ordering=tf, then any model loaded from this repository will get built according to the TensorFlow dimension ordering convention, "Width-Height-Depth". Pre-trained weights can be automatically loaded upon instantiation (weights='imagenet'...
    Downloads: 0 This Week
  • 24

    PyInstaller Extractor

    Extract contents of a Windows executable file created by pyinstaller

    MIGRATED TO GITHUB: https://github.com/extremecoders-re/pyinstxtractor This is a Python script to extract the contents of a PyInstaller-generated Windows executable file. The contents of the pyz file (usually pyc files) present inside the executable are also extracted. The pyc files are made valid so that a Python bytecode decompiler will recognise them. The script can run on both Python 2.x and 3.x. PyInstaller versions 2.0, 2.1, 3.0, 3.1 and 3.2 are supported.
    Downloads: 189 This Week
  • 25
    LightProfiler

    Profiler for Oracle extended SQL trace files

    ...It generates a detailed resource profile for extended SQL trace files (10046 event), containing information about response time consumption (by events, by cursors, etc.), data file usage, error analysis (SQL, PL/SQL) and much more. It also contains tools for additional processing of trace files (extracting session data, splitting files) and for managing database sessions (disconnecting, tracing, monitoring parameters, blocking locks, events, etc.).
    Downloads: 2 This Week
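
The cnn-text-classification-tf entry in this listing describes converting words into embeddings, applying convolution filters of several widths to extract n-gram features, and pooling the results before classification. A minimal pure-Python sketch of that feature-extraction step may help make the idea concrete. It is not the project's TensorFlow code: the toy embeddings, the filter weights, and the function names `conv1d_valid` and `max_over_time` are all assumptions made up for illustration.

```python
from typing import List

Vector = List[float]


def conv1d_valid(embeddings: List[Vector], kernel: List[Vector]) -> List[float]:
    """Slide a filter spanning len(kernel) words over the sentence.

    Each output value scores one n-gram window (n = len(kernel)) and is
    passed through a ReLU non-linearity.
    """
    width = len(kernel)
    scores = []
    for start in range(len(embeddings) - width + 1):
        total = 0.0
        for offset in range(width):
            word_vec = embeddings[start + offset]
            weights = kernel[offset]
            total += sum(w * x for w, x in zip(weights, word_vec))
        scores.append(max(0.0, total))  # ReLU
    return scores


def max_over_time(scores: List[float]) -> float:
    """Keep only the strongest n-gram response for this filter."""
    return max(scores)


# Hypothetical 2-dimensional embeddings for a 4-word sentence.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

# One bigram filter and one trigram filter (made-up weights).
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]
trigram_filter = [[1.0, 1.0], [-1.0, 0.0], [0.0, 1.0]]

# One pooled feature per filter; a classifier would consume this vector.
features = [
    max_over_time(conv1d_valid(sentence, bigram_filter)),
    max_over_time(conv1d_valid(sentence, trigram_filter)),
]
print(features)
```

In a trained model the embeddings and filter weights are learned, and many filters per width are used; the sketch only shows how one filter turns a variable-length sentence into a single pooled feature.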