visual\ free download

Screenshot to Code

A neural network that transforms a design mock-up into static websites

Screenshot-to-code is a prototype that attempts to convert UI screenshots (e.g., of mobile or web UIs) into code, likely generating layouts as HTML, CSS, or other markup from image inputs. It belongs to the research and proof-of-concept area of UI automation and image-to-UI code generation. The project maps visual design elements to code constructs, outputs UI layout code (HTML, CSS, or markup), and includes example and demo scripts that show the image-to-UI-code workflow.

Downloads: 0 This Week

Last Update: 2025-09-26

Phi-3-MLX

Phi-3.5 for Mac: Locally-run Vision and Language Models

Phi-3-Vision-MLX is an Apple MLX (machine learning on Apple silicon) implementation of Phi-3 Vision, a lightweight multi-modal model designed for vision and language tasks. It focuses on running vision-language AI efficiently on Apple hardware like M1 and M2 chips.

Downloads: 1 This Week

Last Update: 2025-03-13

MetaCLIP

ICLR2024 Spotlight: curation/training code, metadata, distribution

...It includes utilities to fine-tune vision-language embeddings, compute prompt or adapter updates, and benchmark across transfer and retention metrics. MetaCLIP is especially suited for real-world settings where a model must continuously incorporate new visual categories or domains over time.

Downloads: 0 This Week

Last Update: 2025-10-07

VGGT

[CVPR 2025 Best Paper Award] VGGT

VGGT is a transformer-based framework aimed at unifying classic visual geometry tasks—such as depth estimation, camera pose recovery, point tracking, and correspondence—under a single model. Rather than training separate networks per task, it shares an encoder and leverages geometric heads/decoders to infer structure and motion from images or short clips. The design emphasizes consistent geometric reasoning: outputs from one head (e.g., correspondences or tracks) reinforce others (e.g., pose or depth), making the system more robust to challenging viewpoints and textures. ...

Downloads: 0 This Week

Last Update: 2025-10-11

ConvNeXt

Code release for ConvNeXt model

...It revisits classic ResNet-style backbones through the lens of transformer design trends—large kernel sizes, inverted bottlenecks, layer normalization, and GELU activations—to bridge the performance gap between convolutions and attention-based models. ConvNeXt’s clean, hierarchical structure makes it efficient for both pretraining and fine-tuning across a wide range of visual recognition tasks. It achieves competitive or superior results on ImageNet and downstream datasets while being easier to deploy and train than transformers. The repository provides pretrained models, training recipes, and ablation studies demonstrating how incremental design choices collectively yield state-of-the-art performance.

Downloads: 0 This Week

Last Update: 2025-10-06

DensePose

A real-time approach for mapping all human pixels of 2D RGB images

...The repository includes the DensePose network architecture, training code, pretrained models, and dataset tools for annotation and visualization. DensePose is widely used in augmented reality, motion capture, virtual try-on, and visual effects applications because it enables real-time 3D human mapping from 2D inputs. The model architecture builds on Mask R-CNN, using additional regression heads to predict UV coordinates that map image pixels to 3D surfaces.

Downloads: 5 This Week

Last Update: 2025-10-06

Show Facebook Computer Vision Tags

Chrome Extension that displays automated image tags from Facebook

...Since Facebook uses a computer-vision model to analyse user-uploaded images and generate alt-text tags for accessibility (e.g., “Image may contain: golf, grass, outdoor and nature”), this extension surfaces those hidden tags directly in the UI—revealing what kind of information Facebook infers about images (objects present, activities being done, environment). The purpose is educational and somewhat cautionary: to help users understand the scope of visual inference and privacy issues. Once installed, the extension overlays those tags on images in the timeline, making visible what is typically hidden metadata. The project is relatively lightweight but has garnered attention due to its privacy transparency angle.
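As a rough illustration of what "surfacing the tags" involves, the sketch below splits an alt-text string of the form quoted above into individual tags. It is written in Python for readability and is not the extension's code (the extension itself runs in the browser); the function name `parse_alt_tags` is hypothetical, and the `Image may contain:` prefix is assumed only because it appears in the example above.

```python
import re

# Prefix taken from the example alt text quoted in the description (an assumption
# about the general format, not a documented Facebook API).
PREFIX = "Image may contain:"


def parse_alt_tags(alt_text: str) -> list[str]:
    """Split an auto-generated alt text into a list of individual tags.

    Returns an empty list when the text does not start with the expected prefix.
    """
    text = alt_text.strip()
    if not text.startswith(PREFIX):
        return []
    # Drop the prefix and any trailing period.
    body = text[len(PREFIX):].strip().rstrip(".")
    # Tags are separated by commas, with "and" before the last one
    # (with or without a preceding comma).
    parts = re.split(r"\s*,\s*(?:and\s+)?|\s+and\s+", body)
    return [p.strip() for p in parts if p.strip()]


if __name__ == "__main__":
    print(parse_alt_tags("Image may contain: golf, grass, outdoor and nature"))
```

Multi-word tags are kept intact because the split only happens on commas and on a standalone "and" surrounded by whitespace.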

Downloads: 0 This Week

Last Update: 2025-11-14

Search Results for "visual\"

7 projects for "visual\" with 2 filters applied:

Screenshot to Code

Phi-3-MLX

MetaCLIP

VGGT

ConvNeXt

DensePose

Show Facebook Computer Vision Tags
