Showing 102 open source projects for "extract"

  • 1
    Allure Report

    Flexible, lightweight multi-language test reporting tool

    Allure Report is a flexible, lightweight multi-language test reporting tool. It provides clear graphical reports that give a detailed representation of what has been tested, so everyone involved in the development process can extract the maximum of information from everyday test runs. Allure Report can build unified reports for dozens of testing tools across eleven programming languages on several CI/CD systems.
    Downloads: 27 This Week
  • 2
    LosslessCut

    The Swiss Army knife of lossless video/audio editing

    ...The main feature is lossless trimming and cutting of video and audio files, which is great for saving space by rough-cutting your large video files taken from a video camera, GoPro, drone, etc. It lets you quickly extract the good parts from your videos and discard many gigabytes of data without doing a slow re-encode and thereby losing quality. Or you can add a music or subtitle track to your video without needing to encode. Everything is extremely fast because it does an almost direct data copy, fueled by the awesome FFmpeg which does all the grunt work. ...
    Downloads: 590 This Week
  • 3
    TikTok MCP

    Model Context Protocol (MCP) with TikTok integration

    The TikTok MCP integrates TikTok access into AI applications like Claude AI via TikNeuron. It enables analysis and interaction with TikTok content to determine virality factors and extract video content.
    Downloads: 9 This Week
  • 4
    AUTOMATIC1111 Stable Diffusion web UI
    AUTOMATIC1111's stable-diffusion-webui is a powerful, user-friendly web interface built on the Gradio library that allows users to easily interact with Stable Diffusion models for AI-powered image generation. Supporting both text-to-image (txt2img) and image-to-image (img2img) generation, this open-source UI offers a rich feature set including inpainting, outpainting, attention control, and multiple advanced upscaling options. With a flexible installation process across Windows, Linux, and...
    Downloads: 298 This Week
  • 5
    Tailwind CSS

    A utility-first CSS framework for rapid UI development

    Rapidly build modern websites without ever leaving your HTML. A utility-first CSS framework packed with classes like flex, pt-4, text-center and rotate-90 that can be composed to build any design, directly in your markup. Utility classes help you work within the constraints of a system instead of littering your stylesheets with arbitrary values. They make it easy to be consistent with color choices, spacing, typography, shadows, and everything else that makes up a well-engineered design...
    Downloads: 92 This Week
  • 6
    Gitingest

    Create prompt-friendly codebase digests from any Git repository URL

    ...The generated output is optimized for prompt usage, helping AI models understand codebases more effectively without requiring manual file aggregation. In addition to producing the code digest, Gitingest also calculates statistics about the extracted content such as repository structure, total size of the extract, and token count. Gitingest can be used as a command line utility or integrated directly into Python applications.
    Downloads: 0 This Week
  • 7
    Scribe.js

    JavaScript OCR and text extraction for images and PDFs

    Scribe.js is a JavaScript library that provides Optical Character Recognition (OCR) and text extraction capabilities for both images and PDF documents, aimed at developers who want to build OCR features directly into their applications. The library can take image files (such as PNG or JPEG) and recognize the text they contain, and it can also extract text from PDF files that either already contain text or are image-based scans, using modern web standards and WebAssembly under the hood. In addition to simple text extraction, Scribe.js supports writing or injecting a high-quality invisible text layer back into PDFs, effectively making them searchable and improving usability for indexing or accessibility. ...
    Downloads: 0 This Week
  • 8
    JSONPath Plus

    A fork of JSONPath

    Analyse, transform, and selectively extract data from JSON documents (and JavaScript objects). JSONPath Plus expands on the original specification by adding operators and making explicit some behaviors the original did not spell out. Try the browser demo or Runkit (Node).
    Downloads: 2 This Week
  • 9
    fx

    Command-line tool and terminal JSON viewer

    fx works in two modes: CLI and interactive. To start interactive mode, pipe any JSON into fx. A frequent operation is mapping a function over an array, and you can pass any number of anonymous functions to reduce the JSON. fx provides a save function that saves everything in place and returns the saved object; it can only be used when a filename is the first argument to the fx command. Create a .fxrc file in the $HOME directory to require packages or define global functions. To be...
    Downloads: 9 This Week
  • 10
    npmhub

    A browser extension to explore npm dependencies on GitHub

    npmhub is a browser extension that enhances GitHub repositories by displaying a list of npm dependencies directly on the repository page. This makes it easier for developers to inspect a project's dependencies without navigating to external sites.
    Downloads: 0 This Week
  • 11
    Article Extractor

    Extract the main article from a given URL with Node.js

    A Node.js library for extracting main content from web articles, removing unnecessary clutter like ads and navigation elements.
    Downloads: 0 This Week
  • 12
    PDFCraft

    PDFCraft is a free, privacy-focused PDF toolkit

    ...At its core, the project provides a clean, modern UI where you can rearrange pages, annotate text, insert images, fill forms, and export to multiple formats, all without needing a heavyweight commercial PDF suite. But beyond manual editing, it also offers a programmable layer so developers can write scripts to batch process documents, generate templated reports, or extract structured data from PDFs for integration in workflows. The design emphasizes quality and compatibility: output PDFs render accurately across readers, preserve metadata, and support interactive elements like hyperlinks and form fields.
    Downloads: 8 This Week
  • 13
    AgentQL MCP

    Model Context Protocol server that integrates AgentQL's data

    The AgentQL MCP Server is a Model Context Protocol (MCP) server that integrates AgentQL's data extraction capabilities, enabling users to extract structured data from web pages using natural language prompts.
    Downloads: 0 This Week
  • 14
    Pot Desktop

    Cross-platform software for text translation and recognition

    Pot-Desktop is a cross-platform productivity tool aimed at helping users quickly translate, perform OCR (optical character recognition), and synthesize speech for selected text or images — all with minimal friction. It supports picking text via mouse selection (“highlight-and-translate”), clipboard listening, or screenshot-based OCR; this makes it ideal for reading webpages, documents, images — or any on-screen text — and instantly getting translations or text extraction. The tool supports...
    Downloads: 11 This Week
  • 15
    Cloud Commander

    Cloud Commander file manager for the web with console and editor

    ...Adapts to screen size. 3 built-in editors with syntax highlighting support: Dword, Edward and Deepword. Console with support for the default OS command line. Written in JavaScript/Node.js. Built-in archive packing: zip and tar.gz. Built-in archive extraction: zip, tar, gz, bz2, .tar.gz and .tar.bz2 (with the help of inly). Cloud Commander can be used as middleware for Node.js applications based on socket.io and express. Docker images are provided for multiple architectures and types. The config is read from the home directory, the host's root file system is mounted to /mnt/fs, and port 8000 is exposed to the host's port.
    Downloads: 3 This Week
  • 16
    MCP Server RAG Web Browser

    An MCP Server for the RAG Web Browser Actor

    The MCP Server for the RAG Web Browser Actor allows AI assistants and LLMs to perform web searches and extract information from web pages. It facilitates interaction with the web, enabling up-to-date context retrieval for AI applications.
    Downloads: 0 This Week
  • 17
    Zotero

    Tool to help you collect, organize, annotate, cite, and share research

    Zotero is a powerful, free, open-source research management application designed to help students, academics, and professionals collect, organize, annotate, cite, and share research sources and materials for papers, projects, or books. It can save web pages, PDFs, books, articles, and more with metadata, automatically extract bibliographic information, and organize items into collections and tag systems, while supporting notes and annotations directly alongside references. Zotero’s interface integrates with word processors like Microsoft Word and LibreOffice to generate formatted citations and bibliographies in many styles, and it can sync libraries across devices or share them with collaborators. ...
    Downloads: 1 This Week
  • 18
    owllook

    Vertical novel search engine with unified reading and tracking tools

    ...It focuses on providing a simple and comfortable reading experience with features such as searching for books, following updates, bookmarking chapters, and maintaining a personal bookshelf. It aggregates results from multiple search engines and applies parsing rules to extract novel metadata, chapters, and content in a consistent format. Owllook also includes functionality for tracking reading history, displaying rankings based on search activity, and recommending books using a similarity-based approach. Owllook is built using asynchronous technologies to support efficient data retrieval and responsive interactions while reading or searching.
    Downloads: 2 This Week
  • 19
    Open Semantic Search

    Open source semantic search and text analytics for large document sets

    ...Open Semantic Search includes an ETL framework that can ingest documents, process them through analysis steps, and enrich the data with extracted information such as named entities and metadata. It also supports optical character recognition to extract text from images and scanned documents, including images embedded inside PDF files. It integrates text mining and analytics capabilities that allow users to examine relationships, topics, and structured data within document collections.
    Downloads: 3 This Week
  • 20
    PapersGPT

    A powerful Zotero AI and MCP plugin with ChatGPT, Gemini 3.1, Claude

    PapersGPT is an AI-powered plugin that integrates directly into Zotero to transform how researchers interact with academic papers and literature collections. It enables users to chat with individual PDFs or entire collections, allowing them to extract insights, generate summaries, and explore connections between documents without leaving the Zotero environment. The plugin supports a wide range of state-of-the-art language models, including GPT, Claude, Gemini, and open-source alternatives, giving users flexibility in choosing performance, cost, and privacy trade-offs. One of its most powerful features is its ability to process large volumes of academic content quickly, enabling tasks such as literature reviews, theoretical analysis, and research synthesis to be completed significantly faster. ...
    Downloads: 1 This Week
  • 21
    goober

    A less-than-1KB CSS-in-JS solution

    ...You'll find as, forwarded, CSS, keyframes, styled and so much more. A theme gives you easy access to your common sizes, colors, and anything else you need. On the server, you can easily extract the CSS for the current state with extractCss. The initial idea behind goober was a CSS-in-JS solution at the cost of peanuts, hence the peanuts emoji. By using goober, you effectively reclaim size that can go toward the features you need.
    Downloads: 0 This Week
  • 22
    browserable

    Open source and self-hostable browser automation library for AI agents

    Browserable is an open-source browser automation framework designed specifically for AI agents that need to interact with web interfaces in a human-like way. The project provides tools that allow automated agents to navigate websites, click buttons, fill out forms, and extract information from pages without manual scripting of each step. Built primarily in JavaScript, the framework offers both a developer-friendly SDK and a REST API that allow integration with AI applications and automation pipelines. It is designed to be self-hostable, which means developers can deploy and run it on their own infrastructure without relying on third-party services. ...
    Downloads: 0 This Week
  • 23
    Critical

    Extract & Inline Critical-path CSS in HTML pages

    Critical extracts and inlines critical-path (above-the-fold) CSS from HTML. It can generate critical-path CSS, optionally minify and inline it, and return the output via callback or promise. When your site is adaptive and you want to deliver critical CSS for multiple screen resolutions this is a useful option. Note: your final output will be minified as to eliminate duplicate...
    Downloads: 0 This Week
  • 24
    videodl

    Lightweight Python tool for downloading videos from many platforms

    ...It supports numerous video platforms across both Chinese and international streaming ecosystems, enabling users to fetch content from many popular services through a unified interface. Videodl works by implementing platform-specific client modules that extract video information and download links from supported services. Videodl can integrate with external command-line utilities to improve downloading performance, handle streaming formats such as HLS, and manage encrypted or segmented media streams. Additional utilities can also enable faster downloads, resume interrupted transfers, and process complex playlist structures.
    Downloads: 0 This Week
  • 25
    spider_collection

    Collection of Python web scraping scripts for data extraction tasks

    spider_collection is a collection of Python web crawler scripts created primarily for experimentation, learning, and practical scraping tasks. spider_collection gathers multiple independent spiders designed to collect data from different platforms and services, demonstrating a variety of scraping techniques and workflows. These crawlers make use of common Python scraping tools such as requests, parsel, BeautifulSoup, and the Scrapy framework to extract structured information from web pages. Several scripts also incorporate multi-threading and proxy usage to improve scraping efficiency and help avoid common anti-scraping limitations. In addition to raw data collection, some spiders include basic data processing and analysis using tools such as pandas and simple visualization with matplotlib. ...
    Downloads: 0 This Week