data vision free download

40 projects for "data vision" with 2 filters applied:

Artificial Intelligence BSD Clear Filters & Widen Search

Our Free Plans just got better! | Auth0
With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.

Try free now
AI-generated apps that pass security review
Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.

Try Retool free
1

Vision Transformer Pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA

This repository provides a from-scratch, minimalist implementation of the Vision Transformer (ViT) in PyTorch, focusing on the core architectural pieces needed for image classification. It breaks down the model into patch embedding, positional encoding, multi-head self-attention, feed-forward blocks, and a classification head so you can understand each component in isolation. The code is intentionally compact and modular, which makes it easy to tinker with hyperparameters, depth, width, and...

Downloads: 3 This Week

Last Update: 2026-01-28
See Project
2

DeiT (Data-efficient Image Transformers)

Official DeiT repository

DeiT (Data-efficient Image Transformers) shows that Vision Transformers can be trained competitively on ImageNet-1k without external data by using strong training recipes and knowledge distillation. Its key idea is a specialized distillation strategy—including a learnable “distillation token”—that lets a transformer learn effectively from a CNN or transformer teacher on modest-scale datasets.

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
3

MetaCLIP

ICLR2024 Spotlight: curation/training code, metadata, distribution

MetaCLIP is a research codebase that extends the CLIP framework into a meta-learning / continual learning regime, aiming to adapt CLIP-style models to new tasks or domains efficiently. The goal is to preserve CLIP’s strong zero-shot transfer capability while enabling fast adaptation to domain shifts or novel class sets with minimal data and without catastrophic forgetting. The repository provides training logic, adaptation strategies (e.g. prompt tuning, adapter modules), and evaluation across base and target domains to measure how well the model retains its general knowledge while specializing as needed. It includes utilities to fine-tune vision-language embeddings, compute prompt or adapter updates, and benchmark across transfer and retention metrics. ...

Downloads: 1 This Week

Last Update: 2025-10-07
See Project
4

NVIDIA Isaac GR00T

NVIDIA Isaac GR00T N1.5 is the world's first open foundation model

NVIDIA Isaac‑GR00T N1.5 is an open-source foundation model engineered for generalized humanoid robot reasoning and manipulation skills. It accepts multimodal inputs—such as language and images—and uses a diffusion transformer architecture built upon vision-language encoders, enabling adaptive robot behaviors across diverse environments. It is designed to be customizable via post-training with real or synthetic data. The vision-language model remains frozen during both pretraining and finetuning, preserving language understanding and improving generalization. Streamlined MLP connection between vision encoder and LLM with added layer normalization.

Downloads: 0 This Week

Last Update: 2025-11-05
See Project
Find Hidden Risks in Windows Task Scheduler
Free diagnostic script reveals configuration issues, error patterns, and security risks. Instant HTML report.

Windows Task Scheduler might be hiding critical failures. Download the free JAMS diagnostic tool to uncover problems before they impact production—get a color-coded risk report with clear remediation steps in minutes.

Download Free Tool
5

DINOv3

Reference PyTorch implementation and models for DINOv3

DINOv3 is the third-generation iteration of Meta’s self-supervised visual representation learning framework, building upon the ideas from DINO and DINOv2. It continues the paradigm of learning strong image representations without labels using teacher–student distillation, but introduces a simplified and more scalable training recipe that performs well across datasets and architectures. DINOv3 removes the need for complex augmentations or momentum encoders, streamlining the pipeline while...

Downloads: 10 This Week

Last Update: 2025-11-20
See Project
6

VGGT

[CVPR 2025 Best Paper Award] VGGT

VGGT is a transformer-based framework aimed at unifying classic visual geometry tasks—such as depth estimation, camera pose recovery, point tracking, and correspondence—under a single model. Rather than training separate networks per task, it shares an encoder and leverages geometric heads/decoders to infer structure and motion from images or short clips. The design emphasizes consistent geometric reasoning: outputs from one head (e.g., correspondences or tracks) reinforce others (e.g., pose...

Downloads: 1 This Week

Last Update: 2025-10-11
See Project
7

DINOv2

PyTorch code and models for the DINOv2 self-supervised learning

DINOv2 is a self-supervised vision learning framework that produces strong, general-purpose image representations without using human labels. It builds on the DINO idea of student–teacher distillation and adapts it to modern Vision Transformer backbones with a carefully tuned recipe for data augmentation, optimization, and multi-crop training. The core promise is that a single pretrained backbone can transfer well to many downstream tasks—from linear probing on classification to retrieval, detection, and segmentation—often requiring little or no fine-tuning. ...

Downloads: 5 This Week

Last Update: 2025-12-22
See Project
8

DeepSeek-OCR

Contexts Optical Compression

DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body...

Downloads: 16 This Week

Last Update: 2026-01-27
See Project
9

OSWorld

Benchmarking Multimodal Agents for Open-Ended Tasks

OSWorld is an open-source synthetic world environment designed for embodied AI research and multi-agent learning. It provides a richly simulated 3D world where multiple agents can interact, perform tasks, and learn complex behaviors. OSWorld emphasizes multi-modal interaction, enabling agents to process visual, auditory, and symbolic data for grounded learning in a simulated world.

Downloads: 0 This Week

Last Update: 2025-03-13
See Project
Atera all-in-one platform IT management software with AI agents
Ideal for internal IT departments or managed service providers (MSPs)

Atera’s AI agents don’t just assist, they act. From detection to resolution, they handle incidents and requests instantly, taking your IT management from automated to autonomous.

Learn More
10

Seamless Communication

Foundational Models for State-of-the-Art Speech and Text Translation

Seamless Communication is a research project focused on building more integrated, low-latency multimodal communication between humans and AI agents. The motivation is to move beyond “text in, text out” and enable direct, live, multi-turn exchange involving language, gesture, gaze, vision, and modality switching without user friction. The system architecture includes a real-time multimodal signal pipeline for audio, video, and sensor data, a dialog manager that can decide when to act (speak, gesture, point) or query, and a cross-modal reasoning layer that fuses perception with semantic context. The research prototype includes components for visual grounding (understanding when a user references something in view), gesture recognition and synthesis, and turn-taking mechanisms that mirror human conversational timing. ...

Downloads: 0 This Week

Last Update: 2025-10-06
See Project
11

Armadillo

fast C++ library for linear algebra & scientific computing

* Fast C++ library for linear algebra (matrix maths) and scientific computing * Easy to use functions and syntax, deliberately similar to Matlab / Octave * Uses template meta-programming techniques to increase efficiency * Provides user-friendly wrappers for OpenBLAS, Intel MKL, LAPACK, ATLAS, ARPACK, SuperLU and FFTW libraries * Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc. * Downloads:...

Downloads: 2,294 This Week

Last Update: 2025-12-16
See Project
12

PIFuHD

High-Resolution 3D Human Digitization from A Single Image

PIFuHD (Pixel-Aligned Implicit Function for 3D human reconstruction at high resolution) is a method and codebase to reconstruct high-fidelity 3D human meshes from a single image. It extends prior PIFu work by increasing resolution and detail, enabling fine geometry in cloth folds, hair, and subtle surface features. The method operates by learning an implicit occupancy / surface function conditioned on the image and camera projection; at inference time it queries dense points to reconstruct a...

Downloads: 5 This Week

Last Update: 2025-10-06
See Project
13

Self-learning-Computer-Science

Resources to learn computer science in your spare time

Self-learning Computer Science is a curated, open-source guide repository designed to help learners independently study computer science topics using high-quality university-level resources. The author (an undergraduate CS student) assembled links to courses from institutions like MIT, UC Berkeley, Stanford, etc., covering mathematics, programming, data structures/algorithms, computer architecture, machine learning, software engineering and more. It’s aimed at learners who find traditional...

Downloads: 0 This Week

Last Update: 2025-11-05
See Project
14

MAE (Masked Autoencoders)

PyTorch implementation of MAE

MAE (Masked Autoencoders) is a self-supervised learning framework for visual representation learning using masked image modeling. It trains a Vision Transformer (ViT) by randomly masking a high percentage of image patches (typically 75%) and reconstructing the missing content from the remaining visible patches. This forces the model to learn semantic structure and global context without supervision. The encoder processes only the visible patches, while a lightweight decoder reconstructs the...

Downloads: 2 This Week

Last Update: 2025-10-06
See Project
15

PyCls

Codebase for Image Classification Research, written in PyTorch

pycls is a focused PyTorch codebase for image classification research that emphasizes reproducibility and strong, transparent baselines. It popularized families like RegNet and supports classic architectures (ResNet, ResNeXt) with clean implementations and consistent training recipes. The repository includes highly tuned schedules, augmentations, and regularization settings that make it straightforward to match reported accuracy without guesswork. Distributed training and mixed precision are...

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
16

VideoPose3D

Efficient 3D human pose estimation in video using 2D keypoint

...By using only 2D detections (such as those from OpenPose or Detectron), it enables markerless 3D pose estimation with relatively lightweight computational requirements. The framework includes pretrained models, data preprocessing utilities, visualization tools, and evaluation scripts for standard benchmarks like Human3.6M. VideoPose3D has been used widely in computer vision research for human motion understanding, activity recognition, and animation generation.

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
17

Integrating Vision Toolkit

The Integrating Vision Toolkit (IVT) is a powerful and fast C++ computer vision library with an easy-to-use object-oriented architecture. It offers its own multi-platform GUI toolkit. OpenCV is integrated optionally. Website: http://ivt.sourceforge.net

2 Reviews

Downloads: 4 This Week

Last Update: 2019-11-28
See Project
18

maskrcnn-benchmark

Fast, modular reference implementation of Instance Segmentation

Mask R-CNN Benchmark is a PyTorch-based framework that provides high-performance implementations of object detection, instance segmentation, and keypoint detection models. Originally built to benchmark Mask R-CNN and related models, it offers a clean, modular design to train and evaluate detection systems efficiently on standard datasets like COCO. The framework integrates critical components—region proposal networks (RPNs), RoIAlign layers, mask heads, and backbone architectures such as...

Downloads: 0 This Week

Last Update: 2025-10-06
See Project
19

MatlabFunc

Matlab codes for feature learning

MatlabFunc is a collection of MATLAB functions developed by the ZJULearning group to support various tasks in computer vision, machine learning, and numerical computation. The repository brings together a wide range of utility scripts, algorithms, and implementations that serve as building blocks for research and development. These functions cover areas such as matrix operations, optimization, data processing, and visualization, making them broadly applicable across different research domains. ...

Downloads: 3 This Week

Last Update: 6 days ago
See Project
20

ImageApp - Java Advanced Imaging GUI

An IDE for people interested in Machine Vision/Image Processing. Written in Java, using JAI. It allows users to view image data and also provides a drag and drop environment that users can create/execute graphs of JAI operators.

Downloads: 0 This Week

Last Update: 2017-01-21
See Project
21

nagiosvision

Nagios is a plugin designed to send alerts based on the identification of objects or faces your server using computer vision and a simple webcam.

1 Review

Downloads: 0 This Week

Last Update: 2015-08-31
See Project
22

Mobile Robot Programming Toolkit (MRPT)

**MOVED TO GITHUB** ==> https://github.com/MRPT/mrpt

**MOVED TO GITHUB** ==> https://github.com/MRPT/mrpt The Mobile Robot Programming Toolkit (MRPT) is an extensive, cross-platform, and open source C++ library aimed for robotics researchers to design and implement algorithms about Localization, SLAM, Navigation, computer vision. http://www.mrpt.org/

2 Reviews

Downloads: 6 This Week

Last Update: 2015-02-02
See Project
23

ProximityForest

Efficient Approximate Nearest Neighbors for General Metric Spaces

A proximity forest is a data structure that allows for efficient computation of approximate nearest neighbors of arbitrary data elements in a metric space. See: O'Hara and Draper, "Are You Using the Right Approximate Nearest Neighbor Algorithm?", WACV 2013 (best student paper award). One application of a ProximityForest is given in the following CVPR publication: Stephen O'Hara and Bruce A.

Downloads: 0 This Week

Last Update: 2015-03-26
See Project
24

Fish4Knowledge Project

Analysis of undersea fish videos

The Fish4knowledge project investigated: information abstraction and storage methods for analyzing undersea video data (from 10E+15 pixels to 10E+12 units of information), machine and human vocabularies for detecting & describing fish, flexible process architectures to process the data and scientific queries and effective specialised user query interfaces. A combination of computer vision, database storage, workflow and human computer interaction methods were used to achieve this. ...

Downloads: 0 This Week

Last Update: 2016-11-29
See Project
25

QVision: Computer Vision Library for Qt

Computer vision and image processing library for Qt.

This library contains among other things a set of graphical widgets for video output, performance evaluation and augmented reality. The library also provides classes for several data types usually required by computer vision and image processing applications such as vectors, matrices, quaternions and images. Thanks to a large number of wrapper functions these objects can be used with highly efficient functionality from third party libraries such as OpenCV, GNU Scientific Library, Computational Geometry Algorithms Library, Intel's Math Kernel Library and Integrated Performance Primitives, the Octave library, etc...

Downloads: 0 This Week

Last Update: 2013-07-02
See Project