Showing 5 open source projects for "visual object net"

  • 1
    InternGPT

    Open source demo platform where you can easily showcase your AI models

    ...The framework connects multiple specialized AI models that perform tasks such as object detection, segmentation, captioning, and visual editing while coordinating them through a central conversational interface. This architecture enables the system to plan actions, execute visual operations, and return results in a coherent dialogue with the user.
    Downloads: 0 This Week
  • 2
    Qwen3-VL

    Qwen3-VL, the multimodal large language model series by Alibaba Cloud

    ...It also brings advanced perception capabilities, including spatial grounding, object recognition, OCR across 32 languages, and robust handling of challenging inputs like low-light or distorted text.
    Downloads: 6 This Week
  • 3
    LLM Vision

    Visual intelligence for your home.

    LLM Vision is an open-source integration for Home Assistant that adds multimodal large language model capabilities to smart home environments. The project enables Home Assistant to analyze images, video files, and live camera feeds using vision-capable AI models. Instead of relying only on traditional object detection pipelines, it allows users to send prompts about visual content and receive contextual descriptions or answers about what is happening in camera footage. The system can process events from surveillance platforms such as Frigate and convert them into meaningful summaries, notifications, or structured data for automation workflows. ...
    Downloads: 1 This Week
  • 4
    LISA

    LISA: Reasoning Segmentation via Large Language Model

    ...The project introduces a framework where a large language model can interpret natural language instructions and produce segmentation masks that highlight relevant regions in an image. Instead of relying solely on predefined object categories, the model is capable of reasoning about complex textual queries and translating them into visual segmentation outputs. This approach allows the system to identify objects or regions in images based on semantic descriptions, contextual reasoning, and world knowledge. The model integrates multimodal capabilities by combining language understanding with visual perception so that text instructions guide the segmentation process. ...
    Downloads: 0 This Week
  • 5
    hCaptcha Challenger

    Gracefully face hCaptcha challenges with multimodal LLMs

    hCaptcha Challenger is an open-source automation framework designed to solve hCaptcha verification challenges using computer vision models and multimodal reasoning techniques. The project integrates machine learning models capable of analyzing visual captcha tasks and identifying the correct responses required to pass the verification process. Instead of relying on third-party captcha-solving services or browser scripts, the system operates independently by using pretrained neural networks that can classify images, detect objects, and interpret spatial relationships. The framework includes support for multiple types of captcha challenges such as object selection, drag-and-drop puzzles, and image labeling tasks. ...
    Downloads: 0 This Week