Showing 48 open source projects for "extract"

  • 1
    EMV NFC Paycard Enrollment

    A Java library used to read and extract data from NFC EMV credit cards

    Java library used to read and extract public data from NFC EMV credit cards.
    Downloads: 39 This Week
  • 2
    PdfPig

    Read and extract text and other content from PDFs in C#

    This project allows users to read and extract text and other content from PDF files. In addition the library can be used to create simple PDF documents containing text and geometrical shapes.
    Downloads: 4 This Week
  • 3
    Autopsy

    Autopsy® is a digital forensics platform and graphical interface

    Autopsy® is a digital forensics platform and graphical interface to The Sleuth Kit® and other digital forensics tools. It can be used by law enforcement, military, and corporate examiners to investigate what happened on a computer. You can even use it to recover photos from your camera's memory card. Autopsy was designed to be intuitive out of the box. Installation is easy and wizards guide you through every step. All results are found in a single tree. See the intuitive page for more...
    Downloads: 84 This Week
  • 4
    PyPDF

    A pure-python PDF library capable of splitting, merging, cropping

    pypdf is a pure Python library for working with PDF files, allowing developers to split, merge, rotate, encrypt, and extract content from PDFs. It’s an actively maintained fork of PyPDF2, improving performance, compatibility, and support for modern PDF standards. Suitable for both automation scripts and full-featured applications, pypdf handles PDFs without requiring external dependencies.
    Downloads: 10 This Week
  • 5
    Spatie Crawler

    An easy to use, powerful crawler implemented in PHP

    Spatie Crawler is a PHP library that allows developers to crawl websites and extract information efficiently. It can be used for web scraping, link checking, or automated testing of web pages. The library is simple to use and supports customizable crawling strategies, including controlling crawl depth and handling redirects. It’s suitable for building crawlers that navigate large or dynamically generated websites.
    Downloads: 1 This Week
  • 6
    refactoring.nvim

    The Refactoring library based off the Refactoring book

    refactoring.nvim is a Neovim plugin developed to bring powerful automated code refactoring capabilities to one of the most popular text editors among programmers, giving developers a suite of refactoring operations that streamline repetitive restructuring tasks inside the editor. Built around an intuitive set of commands and a Lua API, the plugin allows users to extract and inline variables or functions, pull blocks of code into new files, and modify code structure without leaving the comfort of Neovim’s modal interface. It integrates with built-in Neovim selection modes and can work with third-party tools like Telescope to present refactoring options quickly, enabling rapid transformation of code patterns. ...
    Downloads: 0 This Week
  • 7
    unipdf

    Golang PDF library for creating and processing PDF files (pure go)

    UniDoc UniPDF is a PDF library for Go (golang) with capabilities for creating, reading, and processing PDF files. The library is written and supported by FoxyUtils.com, where the library is used to power many of its services. Every release of our libraries is automatically tested against known vulnerabilities and does not pass unless everything is remediated. All changes are carefully reviewed by our team.
    Downloads: 1 This Week
  • 8
    Ksoup

    Ksoup is a lightweight Kotlin Multiplatform library for parsing HTML

    Ksoup is a lightweight Kotlin Multiplatform library for parsing HTML, extracting HTML tags, attributes, and text, and encoding and decoding HTML entities.
    Downloads: 0 This Week
  • 9
    pikepdf

    A Python library for reading and writing PDF, powered by QPDF

    pikepdf is a Python library allowing the creation, manipulation, and repair of PDFs. It provides a Pythonic wrapper around the C++ PDF content transformation library, QPDF. Python + QPDF = “py” + “qpdf” = “pyqpdf”, which looks like a dyslexia test and is no fun to type. But say “pyqpdf” out loud, and it sounds like “pikepdf”. pikepdf is a library intended for developers who want to create, manipulate, parse, repair, and abuse the PDF format. It supports reading and writing PDFs, including...
    Downloads: 5 This Week
  • 10
    Article Extractor

    Extract the main article from a given URL with Node.js

    A Node.js library for extracting main content from web articles, removing unnecessary clutter like ads and navigation elements.
    Downloads: 0 This Week
  • 11
    dategrep

    Print lines matching a time range

    dategrep is a command-line utility designed to extract lines from log files that fall within a specified time range. It efficiently processes large log files by performing a binary search to locate the relevant entries, making it a valuable tool for system administrators and developers analyzing time-specific events.
    Downloads: 0 This Week
  • 12
    Contour

    Modern C++ Terminal Emulator

    contour is a modern and actually fast, modal, virtual terminal emulator, for everyday use. It is aimed at power users with a modern feature mindset. Available on all 4 major platforms, Linux, OS/X, FreeBSD, Windows. GPU-accelerated rendering. Font ligatures support (such as in Fira Code). Unicode: Emoji support (-: 🌈 💝 😛 👪 - including ZWJ, VS15, VS16 emoji :-) Unicode: Grapheme cluster support. Bold and italic fonts. High-DPI support. Vertical Line Markers (quickly jump to markers in your...
    Downloads: 3 This Week
  • 13
    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material. Each extracted entity is precisely grounded in its original context, allowing visual inspection and validation via automatically generated interactive HTML visualizations. ...
    Downloads: 2 This Week
  • 14
    jsoup

    Java library for working with real-world HTML

    jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree. The parser will make...
    Downloads: 1 This Week
  • 15
    NGX-Translate

    The internationalization (i18n) library for Angular

    ...The main part of the library is named core. You can use it on its own, but it is usually a good idea to add a loader to load your translations into your application. You can also extract the strings from your code with the extractor. This makes it really easy to start and maintain your translations. By default, there is no loader available. You can add translations manually using setTranslation but it is better to use a loader. You can write your own loader, or import an existing one.
    Downloads: 0 This Week
  • 16
    linaria

    Zero-runtime CSS in JS library

    ...Optionally use any CSS preprocessor such as Sass or PostCSS. Easily find where the style was defined with CSS source maps. Linaria currently supports webpack and Rollup to extract the CSS at build time. Optionally, add the @linaria preset to your Babel configuration at the end of the presets list to avoid errors when importing the components in your server code or tests. Linaria can be used with any framework, with additional helpers for React.
    Downloads: 0 This Week
  • 17
    Python-Spider

    Python3 web crawler practice

    Python-Spider is a repository intended to teach or provide examples for writing web spiders / crawlers in Python — part of a broader learning and resource collection by its author. The code and documentation are oriented toward beginners or intermediate learners who want to learn how to fetch, parse, and extract data from websites programmatically. As part of the author’s public learning-path repositories, python-spider likely includes examples of HTTP requests, HTML parsing, maybe concurrency or scheduling to crawl multiple pages, and techniques to handle common web-scraping issues. For people wanting to get hands-on with building scrapers, collecting data, or learning how to navigate web programming in Python, this repository acts as a didactic reference or starting point. ...
    Downloads: 0 This Week
  • 18
    Doctrine Lexer

    Base library for a lexer that can be used in Recursive Descent Parsers

    PHP Doctrine Lexer parser library that can be used in Top-Down, Recursive Descent Parsers. This lexer is used in Doctrine Annotations and in Doctrine ORM (DQL). To write your own parser you just need to extend Doctrine\Common\Lexer\AbstractLexer and implement three abstract methods. These methods define the lexical catchable and non-catchable patterns and a method for returning the type of a token and filtering the value if necessary. The Lexer is responsible for giving you an API to walk...
    Downloads: 0 This Week
  • 19
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    ...It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to expand its capabilities, focusing on versatile data extraction, platform support, and seamless integration with various systems. DocWire SDK is dedicated to streamlining data processing, reducing development time and costs, and harnessing the potential of AI. ...
    Downloads: 4 This Week
  • 20
    Portable OpenAL Sound

    Concurrent, asynchronous sounds package for Ada apps.

    ...It provides sound-playing capabilities that let Ada apps asynchronously start and stop music/sound loops, initiate transient sounds, and play an unlimited number of sounds concurrently. It is suitable for any Ada application that needs music, sound loops, or transient sound effects, e.g., games. The proper command to extract the archive and maintain the directory structure is "7z x filename".
    Downloads: 4 This Week
  • 21
    Goutte

    Goutte, a simple PHP Web Scraper

    Goutte is a screen scraping and web crawling library for PHP. Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses. Goutte depends on PHP 7.1+. Add fabpot/goutte as a require dependency in your composer.json file. Create a Goutte Client instance (which extends Symfony\Component\BrowserKit\HttpBrowser). Make requests with the request() method. The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler).
    Downloads: 0 This Week
  • 22
    AssetStudio

    AssetStudio is a tool for exploring, extracting and exporting assets

    AssetStudio is a cross-platform tool for exploring, extracting, and exporting assets from Unity games, supporting assetbundles and built-in assets. It handles textures, sprites, audio, meshes, shaders, and more, exporting to formats like png, bmp, mp3, wav. The original is archived (supports Unity ≤2022.1); forks like AssetStudio2024 add support for newer Unity versions and Lua asset decompiling.
    Downloads: 1,282 This Week
  • 23
    7-Zip-JBinding

    Java wrapper for 7z archiver engine

    Native (JNI) cross-platform library to extract (password protected, multi-part) 7z Zip Rar Tar Split Lzma Iso HFS GZip Cpio BZip2 Z Arj Chm Lhz Cab Nsis Deb Rpm Wim Udf archives and create 7z, Zip, Tar, GZip & BZip2 from Java.
    Downloads: 30 This Week
  • 24
    PHP Simple HTML DOM Parser

    A PHP-based DOM parser.

    A simple HTML DOM parser written in PHP 5+ that supports invalid HTML and provides a very easy way to find, extract, and modify the HTML elements of the DOM. jQuery-like syntax allows sophisticated finding methods for locating the elements you care about.
    Downloads: 1,794 This Week
  • 25
    JavaScript Load Image

    Load images provided as File or Blob objects or via URL

    ...It is often used in upload flows where images need to be previewed, resized, rotated, or cropped before being sent to a server, reducing bandwidth and improving user experience. The library can interpret image metadata, including Exif and IPTC tags, and can extract embedded thumbnails when present, which is handy for building fast image pickers or galleries. One of its key capabilities is correcting Exif Orientation, so photos taken on mobile devices appear upright without additional server-side processing. It also offers the option to preserve or restore image headers when resizing, which helps retain metadata or orientation information in the re-encoded image.
    Downloads: 0 This Week
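The dategrep entry above mentions using a binary search to find log lines inside a time range. As a rough illustration of that idea only (this is not dategrep's own code), here is a minimal Python sketch. It assumes the log is already sorted by time, is held in memory as a list of lines, and that every line starts with a `YYYY-MM-DD HH:MM:SS` timestamp. The names `parse_ts`, `first_index_at_or_after`, `lines_in_range`, and the sample `LOG` data are hypothetical.

```python
from datetime import datetime

# Assumed timestamp layout at the start of every line (hypothetical format).
TS_FORMAT = "%Y-%m-%d %H:%M:%S"
TS_LENGTH = 19  # len("2024-01-01 09:00:00")


def parse_ts(line: str) -> datetime:
    """Parse the leading timestamp of a log line."""
    return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)


def first_index_at_or_after(lines: list[str], moment: datetime) -> int:
    """Binary search: index of the first line whose timestamp is >= moment.

    Only the probed lines are parsed, so about log2(len(lines)) timestamps
    are examined instead of every line.
    """
    lo, hi = 0, len(lines)
    while lo < hi:
        mid = (lo + hi) // 2
        if parse_ts(lines[mid]) < moment:
            lo = mid + 1
        else:
            hi = mid
    return lo


def lines_in_range(lines: list[str], start: datetime, end: datetime) -> list[str]:
    """Return the lines with start <= timestamp < end (lines must be sorted)."""
    begin = first_index_at_or_after(lines, start)
    stop = first_index_at_or_after(lines, end)
    return lines[begin:stop]


# Made-up sample data for illustration.
LOG = [
    "2024-01-01 09:00:00 boot",
    "2024-01-01 09:30:00 login alice",
    "2024-01-01 10:00:00 job started",
    "2024-01-01 10:45:00 job finished",
    "2024-01-01 11:15:00 logout alice",
]

selected = lines_in_range(
    LOG,
    datetime(2024, 1, 1, 9, 30),
    datetime(2024, 1, 1, 11, 0),
)

if __name__ == "__main__":
    for entry in selected:
        print(entry)
```

A tool working on large files would presumably search over byte offsets in the file rather than an in-memory list. The half-open range (`start` inclusive, `end` exclusive) is a choice made for this sketch.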