object free download - SourceForge

hCaptcha Challenger

Gracefully face hCaptcha challenges with multimodal LLMs

...Instead of relying on third-party captcha-solving services or browser scripts, the system operates independently by using pretrained neural networks that can classify images, detect objects, and interpret spatial relationships. The framework includes support for multiple types of captcha challenges such as object selection, drag-and-drop puzzles, and image labeling tasks. It implements an agent-style workflow where the system interprets the challenge prompt, selects the appropriate vision model, and generates the required interaction automatically.

Downloads: 10 This Week

Last Update: 2026-03-06

Qwen-Image

Qwen-Image is a powerful image generation foundation model

...The model excels not only in text rendering but also in a wide range of artistic styles, including photorealistic, impressionist, anime, and minimalist aesthetics. Qwen-Image supports sophisticated editing tasks such as style transfer, object insertion and removal, detail enhancement, and even human pose manipulation, making it suitable for both professional and casual users. It also includes advanced image understanding capabilities like object detection, semantic segmentation, depth and edge estimation, and novel view synthesis.

1 Review

Downloads: 2 This Week

Last Update: 2026-02-10

InternGPT

Open source demo platform where you can easily showcase your AI models

...Unlike traditional chat systems that rely solely on text prompts, InternGPT allows users to interact with visual content using both language and nonverbal signals such as pointing or highlighting objects within images. The framework connects multiple specialized AI models that perform tasks such as object detection, segmentation, captioning, and visual editing while coordinating them through a central conversational interface. This architecture enables the system to plan actions, execute visual operations, and return results in a coherent dialogue with the user.

Downloads: 0 This Week

Last Update: 2026-03-05

LLM Vision

Visual intelligence for your home.

LLM Vision is an open-source integration for Home Assistant that adds multimodal large language model capabilities to smart home environments. The project enables Home Assistant to analyze images, video files, and live camera feeds using vision-capable AI models. Instead of relying only on traditional object detection pipelines, it allows users to send prompts about visual content and receive contextual descriptions or answers about what is happening in camera footage. The system can process events from surveillance platforms such as Frigate and convert them into meaningful summaries, notifications, or structured data for automation workflows. ...

Downloads: 5 This Week

Last Update: 2026-05-26

Qwen-2.5-VL

Qwen2.5-VL is the multimodal large language model series

Qwen2.5 is a series of large language models developed by the Qwen team at Alibaba Cloud, designed to enhance natural language understanding and generation across multiple languages. The models are available in various sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameters, catering to diverse computational requirements. Trained on a comprehensive dataset of up to 18 trillion tokens, Qwen2.5 models exhibit significant improvements in instruction following, long-text generation...

Downloads: 11 This Week

Last Update: 2026-01-30

Kaleidoscope-SDK

User toolkit for analyzing and interfacing with Large Language Models

...It provides a simple interface to launch LLMs on an HPC cluster and ask them to perform basic tasks such as text generation, as well as to retrieve intermediate information from inside the model, such as log probabilities and activations. Users must authenticate using their Vector Institute cluster credentials. This can be done interactively by instantiating a client object, which generates an authentication token that is used for all subsequent requests. The token expires after 30 days, at which point the user is prompted to re-authenticate.

Downloads: 0 This Week

Last Update: 2024-07-10

LISA

LISA: Reasoning Segmentation via Large Language Model

...The project introduces a framework where a large language model can interpret natural language instructions and produce segmentation masks that highlight relevant regions in an image. Instead of relying solely on predefined object categories, the model is capable of reasoning about complex textual queries and translating them into visual segmentation outputs. This approach allows the system to identify objects or regions in images based on semantic descriptions, contextual reasoning, and world knowledge. The model integrates multimodal capabilities by combining language understanding with visual perception so that text instructions guide the segmentation process. ...

Downloads: 0 This Week

Last Update: 2026-03-06

EvaDB

Database system for building simpler and faster AI-powered applications

...Running these deep learning models on large document or video datasets is costly and time-consuming. For example, a state-of-the-art object detection model takes multiple GPU years to process just a week’s worth of video from a single traffic monitoring camera. Besides the money spent on hardware, these models also increase the time that you spend waiting for model inference to finish.

Downloads: 6 This Week

Last Update: 2023-11-19

Search Results for "object"

Showing 8 open source projects for "object"
