Showing 62 open source projects for "extract"

  • 1
    EMV NFC Paycard Enrollment

    A Java library used to read and extract data from NFC EMV credit cards

    Java library used to read and extract public data from NFC EMV credit cards.
    Downloads: 28 This Week
  • 2
    PdfPig

    Read and extract text and other content from PDFs in C#

    This project allows users to read and extract text and other content from PDF files. In addition, the library can be used to create simple PDF documents containing text and geometrical shapes.
    Downloads: 6 This Week
  • 3
    Autopsy

    Autopsy® is a digital forensics platform and graphical interface

    Autopsy® is a digital forensics platform and graphical interface to The Sleuth Kit® and other digital forensics tools. It can be used by law enforcement, military, and corporate examiners to investigate what happened on a computer. You can even use it to recover photos from your camera's memory card. Autopsy was designed to be intuitive out of the box. Installation is easy and wizards guide you through every step. All results are found in a single tree. See the intuitive page for more...
    Downloads: 75 This Week
  • 4
    PyPDF

    A pure-Python PDF library capable of splitting, merging, cropping

    pypdf is a pure Python library for working with PDF files, allowing developers to split, merge, rotate, encrypt, and extract content from PDFs. It’s an actively maintained fork of PyPDF2, improving performance, compatibility, and support for modern PDF standards. Suitable for both automation scripts and full-featured applications, pypdf handles PDFs without requiring external dependencies.
    Downloads: 6 This Week
  • 5
    Spatie Crawler

    An easy to use, powerful crawler implemented in PHP

    Spatie Crawler is a PHP library that allows developers to crawl websites and extract information efficiently. It can be used for web scraping, link checking, or automated testing of web pages. The library is simple to use and supports customizable crawling strategies, including controlling crawl depth and handling redirects. It’s suitable for building crawlers that navigate large or dynamically generated websites.
    Downloads: 7 This Week
  • 6
    refactoring.nvim

    The Refactoring library based off the Refactoring book

    refactoring.nvim is a Neovim plugin developed to bring powerful automated code refactoring capabilities to one of the most popular text editors among programmers, giving developers a suite of refactoring operations that streamline repetitive restructuring tasks inside the editor. Built around an intuitive set of commands and a Lua API, the plugin allows users to extract and inline variables or functions, pull blocks of code into new files, and modify code structure without leaving the comfort of Neovim’s modal interface. It integrates with built-in Neovim selection modes and can work with third-party tools like Telescope to present refactoring options quickly, enabling rapid transformation of code patterns. ...
    Downloads: 0 This Week
  • 7
    DocTR

    Library for OCR-related tasks powered by Deep Learning

    DocTR provides an easy and powerful way to extract valuable information from your documents. Seamlessly process documents for Natural Language Understanding tasks: we provide OCR predictors to parse textual information (localize and identify each word) from your documents. Robust 2-stage (detection + recognition) OCR predictors with pretrained parameters. User-friendly: 3 lines of code to load a document and extract text with a predictor.
    Downloads: 5 This Week
  • 8
    unipdf

    Golang PDF library for creating and processing PDF files (pure go)

    UniDoc UniPDF is a PDF library for Go (golang) with capabilities for creating, reading, and processing PDF files. The library is written and supported by FoxyUtils.com, where the library is used to power many of its services. Every release of our libraries is automatically tested against known vulnerabilities and does not pass unless everything is remediated. All changes are carefully reviewed by our team.
    Downloads: 1 This Week
  • 9
    Ksoup

    Ksoup is a lightweight Kotlin Multiplatform library for parsing HTML

    Ksoup is a lightweight Kotlin Multiplatform library for parsing HTML, extracting HTML tags, attributes, and text, and encoding and decoding HTML entities.
    Downloads: 0 This Week
  • 10
    pikepdf

    A Python library for reading and writing PDF, powered by QPDF

    pikepdf is a Python library allowing the creation, manipulation, and repair of PDFs. It provides a Pythonic wrapper around the C++ PDF content transformation library, QPDF. Python + QPDF = “py” + “qpdf” = “pyqpdf”, which looks like a dyslexia test and is no fun to type. But say “pyqpdf” out loud, and it sounds like “pikepdf”. pikepdf is a library intended for developers who want to create, manipulate, parse, repair, and abuse the PDF format. It supports reading and writing PDFs, including...
    Downloads: 6 This Week
  • 11
    Article Extractor

    Extract the main article from a given URL with Node.js

    A Node.js library for extracting main content from web articles, removing unnecessary clutter like ads and navigation elements.
    Downloads: 0 This Week
  • 12
    Point Cloud Library

    A standalone, large scale, open project for 2D/3D image processing

    The Point Cloud Library (PCL) is a standalone, large scale, open project for 2D/3D image and point cloud processing. PCL is released under the terms of the BSD license, and thus free for commercial and research use. Whether you’ve just discovered PCL or you’re a long time veteran, this page contains links to a set of resources that will help consolidate your knowledge on PCL and 3D processing. An additional Wiki resource for developers is available too. To simplify both usage and...
    Downloads: 11 This Week
  • 13
    dategrep

    Print lines matching a time range

    dategrep is a command-line utility designed to extract lines from log files that fall within a specified time range. It efficiently processes large log files by performing a binary search to locate the relevant entries, making it a valuable tool for system administrators and developers analyzing time-specific events.
    Downloads: 0 This Week
  • 14
    Contour

    Modern C++ Terminal Emulator

    contour is a modern and actually fast, modal, virtual terminal emulator, for everyday use. It is aimed at power users with a modern feature mindset. Available on all 4 major platforms, Linux, OS/X, FreeBSD, Windows. GPU-accelerated rendering. Font ligatures support (such as in Fira Code). Unicode: Emoji support (-: 🌈 💝 😛 👪 - including ZWJ, VS15, VS16 emoji :-) Unicode: Grapheme cluster support. Bold and italic fonts. High-DPI support. Vertical Line Markers (quickly jump to markers in your...
    Downloads: 2 This Week
  • 15
    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material. Each extracted entity is precisely grounded in its original context, allowing visual inspection and validation via automatically generated interactive HTML visualizations. ...
    Downloads: 1 This Week
  • 16
    jsoup

    Java library for working with real-world HTML

    jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree. The parser will make...
    Downloads: 1 This Week
  • 17
    NGX-Translate

    The internationalization (i18n) library for Angular

    ...The main part of the library is named core. You can use it on its own, but it is usually a good idea to add a loader to load your translations into your application. You can also extract the strings from your code with the extractor. This makes it really easy to start and maintain your translations. By default, there is no loader available. You can add translations manually using setTranslation but it is better to use a loader. You can write your own loader, or import an existing one.
    Downloads: 0 This Week
  • 18
    linaria

    Zero-runtime CSS in JS library

    ...Optionally use any CSS preprocessor such as Sass or PostCSS. Easily find where the style was defined with CSS source maps. Linaria currently supports webpack and Rollup to extract the CSS at build time. Optionally, add the @linaria preset to your Babel configuration at the end of the presets list to avoid errors when importing the components in your server code or tests. Linaria can be used with any framework, with additional helpers for React.
    Downloads: 0 This Week
  • 19
    Python-Spider

    Python3 web crawler practice

    Python-Spider is a repository of teaching examples for writing web spiders and crawlers in Python, part of a broader learning and resource collection by its author. The code and documentation are aimed at beginner and intermediate learners who want to fetch, parse, and extract data from websites programmatically. As one of the author’s public learning-path repositories, Python-Spider likely includes examples of HTTP requests and HTML parsing, possibly concurrency or scheduling for crawling multiple pages, and techniques for handling common web-scraping issues. For people who want hands-on practice building scrapers, collecting data, or learning web programming in Python, this repository serves as a didactic reference or starting point. ...
    Downloads: 0 This Week
  • 20
    GitHub Workflows Kt

    Authoring GitHub Actions workflows in Kotlin

    ...to configure complicated scenarios which lead to complicated files that are difficult to write and maintain. Who among us hasn't accidentally used the wrong indentation, missed a chance to extract a reusable piece of code, or been confused by ambiguous types? The power of a general-purpose programming language would come in handy in these cases. We're developing GitHub-workflows-kt to solve these and other problems, so you can create GitHub Workflows with confidence.
    Downloads: 0 This Week
  • 21
    DeepLinkDispatch

    Annotation-based library for making deep link handling better

    ...You can’t easily indicate the parameters that you would expect in the URI that you are filtering for. For complex deep links, you are likely to have to write a parsing mechanism to extract out the parameters, or worse, have such similar code distributed amongst many Activities.
    Downloads: 0 This Week
  • 22
    libtar

    libtar is a lightweight C# library for extracting TAR archives.

    It provides a simple API to extract all files from TAR archives. NuGet: https://www.nuget.org/packages?q=libtar
    Downloads: 0 This Week
  • 23
    Doctrine Lexer

    Base library for a lexer that can be used in Recursive Descent Parsers

    PHP Doctrine Lexer parser library that can be used in Top-Down, Recursive Descent Parsers. This lexer is used in Doctrine Annotations and in Doctrine ORM (DQL). To write your own parser you just need to extend Doctrine\Common\Lexer\AbstractLexer and implement three abstract methods. These methods define the lexical catchable and non-catchable patterns and a method for returning the type of a token and filtering the value if necessary. The Lexer is responsible for giving you an API to walk...
    Downloads: 0 This Week
  • 24
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    ...It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to expand its capabilities, focusing on versatile data extraction, platform support, and seamless integration with various systems. DocWire SDK is dedicated to streamlining data processing, reducing development time and costs, and harnessing the potential of AI. ...
    Downloads: 3 This Week
  • 25
    Portable OpenAL Sound

    Concurrent, asynchronous sounds package for Ada apps.

    ...It provides sound-playing capabilities that let Ada apps asynchronously start and stop music/sound loops, initiate transient sounds, and play an unlimited number of sounds concurrently. It is suitable for any Ada application that needs music, sound loops, or transient sound effects, e.g. games. The proper command to extract the archive and maintain the directory structure is "7z x filename".
    Downloads: 5 This Week