Showing 11 open source projects for "data extraction"

  • 1
    AgentQL MCP

    Model Context Protocol server that integrates AgentQL's data

    The AgentQL MCP Server is a Model Context Protocol (MCP) server that integrates AgentQL's data extraction capabilities, enabling users to extract structured data from web pages using natural language prompts.
    Downloads: 0 This Week
  • 2
    Vectorize MCP Server

    Official Vectorize MCP Server

    The Vectorize MCP Server is a Model Context Protocol server that integrates with Vectorize, offering advanced vector retrieval and text extraction capabilities.
    Downloads: 0 This Week
  • 3
    Wiseflow

    Enhance any agent's browser use skill

    Wiseflow is an open-source information extraction and knowledge discovery system designed to collect, filter, and organize valuable information from large volumes of online content. The platform continuously monitors specified sources such as websites, social platforms, and other digital channels to identify relevant data according to user-defined interests or topics. By combining web crawling, content parsing, and large language model analysis, the system extracts concise insights from raw information streams and converts them into structured data that can be stored or analyzed. ...
    Downloads: 1 This Week
  • 4
    Magnitude

    Vision AI browser agent for automation, testing, and extraction

    ...This approach allows the agent to generalize better across complex and modern websites, making it more robust than traditional selector-based automation tools. Browser Agent by Magnitude supports a wide range of capabilities including navigation, interaction, data extraction, and automated verification through built-in testing features. Developers can use it to automate repetitive web tasks, integrate services without APIs, or build advanced browser-based agents. It also provides flexible abstraction levels, allowing both high-level task execution and precise low-level control of actions like mouse movements and keyboard input.
    Downloads: 2 This Week
  • 5
    The Web MCP

    A powerful Model Context Protocol (MCP) server

    Bright Data’s Web MCP server gives AI assistants robust, real-time web capabilities through an MCP interface designed to avoid blocks, rate limits, and CAPTCHAs. It presents search, crawl, navigate, and extraction tools that agents can call directly, replacing brittle scraping prompts with typed operations. The README markets it as a “gateway” to the live web so assistants don’t fall back to stale training data. Bright Data also advertises a getting-started tier with a free monthly allotment, plus options for remote or self-hosted operation depending on governance needs. ...
    Downloads: 0 This Week
  • 6
    Open Semantic Search

    Open source semantic search and text analytics for large document sets

    ...It integrates text mining and analytics capabilities that allow users to examine relationships, topics, and structured data within document collections.
    Downloads: 3 This Week
  • 7
    WanGP

    AI video generator optimized for low-VRAM and older GPUs

    ...Wan2GP provides a full web-based interface that simplifies interaction with complex generative pipelines, making it easier to configure prompts, models, and rendering settings. It also integrates a wide range of utilities such as prompt enhancement, mask editing, motion design, and extraction tools for pose, depth, and flow data to support advanced video workflows.
    Downloads: 11 This Week
  • 8
    DeepCamera

    Open-Source AI Camera. Empower any camera/CCTV

    ...SharpAI yolov7_reid is an open-source Python application that uses AI to detect intruders with traditional surveillance cameras. It uses Yolov7 as a person detector, FastReID for person feature extraction, Milvus as a local vector database for self-supervised learning to identify unseen persons, and Labelstudio to host images locally for further use, such as labeling data and training your own classifier. It also integrates with Home-Assistant to bring AI to smart homes.
    Downloads: 10 This Week
  • 9
    Markdownify MCP Server

    Convert files and web content into clean, usable Markdown easily

    ...It supports formats such as PDFs, images, audio with transcription, DOCX, XLSX, and PPTX, along with web sources like YouTube transcripts, Bing results, and general webpages. Markdownify MCP is designed to simplify content extraction and make data easier to read, share, and reuse in structured workflows. Developers can install dependencies, build, and run the server locally, then extend functionality by modifying its TypeScript-based tools and server logic. It also allows retrieval of existing Markdown files, making it useful for documentation, research, and AI-assisted workflows. ...
    Downloads: 3 This Week
  • 10
    browserable

    Open source and self-hostable browser automation library for AI agents

    Browserable is an open-source browser automation framework designed specifically for AI agents that need to interact with web interfaces in a human-like way. The project provides tools that allow automated agents to navigate websites, click buttons, fill out forms, and extract information from pages without manual scripting of each step. Built primarily in JavaScript, the framework offers both a developer-friendly SDK and a REST API that allow integration with AI applications and automation...
    Downloads: 0 This Week
  • 11
    Firecrawl MCP Server

    Adds powerful web scraping and search to Cursor and Claude

    firecrawl-mcp-server is the official MCP integration for Firecrawl that brings high-recall web scraping, crawling, and search into IDEs and agent runtimes. It exposes tools for single-page scrape, multi-URL batch jobs, site discovery, and search enrichment, returning cleaned, structured content suitable for downstream LLM reasoning. The server is designed to run with Firecrawl’s hosted API or self-hosted deployments, making it flexible for enterprise data-governance requirements. Built-in...
    Downloads: 0 This Week