51 projects for "data vision" with 1 filter applied:

  • 1
    Vision Transformer Pytorch

    Implementation of Vision Transformer, a simple way to achieve SOTA

    This repository provides a from-scratch, minimalist implementation of the Vision Transformer (ViT) in PyTorch, focusing on the core architectural pieces needed for image classification. It breaks down the model into patch embedding, positional encoding, multi-head self-attention, feed-forward blocks, and a classification head so you can understand each component in isolation. The code is intentionally compact and modular, which makes it easy to tinker with hyperparameters, depth, width, and...
    Downloads: 4 This Week
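
    The patch embedding step named in the description can be illustrated with a minimal sketch in plain Python. It is not taken from the repository: the function name `patchify` and the toy 4×4 single-channel image are assumptions made for illustration, and a real implementation would work on tensors and follow the split with a learned linear projection.

```python
def patchify(image, patch_size):
    """Split an H x W image (a list of rows) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    patches = []
    # Walk the image in row-major order, one patch-sized tile at a time.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[top + dy][left + dx]
                for dy in range(patch_size)
                for dx in range(patch_size)
            ]
            patches.append(patch)
    return patches


# Hypothetical toy input: a 4 x 4 image whose pixel values are 0..15.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = patchify(image, patch_size=2)
```

    Each flattened patch plays the role of one token; in a Vision Transformer these tokens are projected to the model width and combined with positional encodings before the attention blocks.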
  • 2
    Perfect Roadmap To Learn Data Science

    Basic To Intermediate Python data science guide

    Perfect Roadmap To Learn Data Science In 2025 is an extended, updated learning pathway curated for the modern data-science landscape — blending classical data-analysis, statistics, machine learning, deep learning, computer vision, NLP, as well as current deployment and MLOps practices to prepare learners for data-science careers in 2025. The roadmap is organized to guide learners systematically: starting with Python fundamentals and math/statistics, then progressing through classical machine-learning, deep-learning, data preprocessing, feature engineering, and onto domain-specific applications like computer vision or NLP, ending with deployment, real-world project construction, and best practices for production readiness. ...
    Downloads: 0 This Week
  • 3
    The Grand Complete Data Science Guide

    Data Science Guide With Videos And Materials

    The Grand Complete Data Science Materials is a repository curated by a data-science educator that aggregates a wide range of learning resources — from basic programming and math foundation to advanced topics in machine learning, deep learning, natural language processing, computer vision, and deployment practices — into a structured, centralized collection aimed at learners seeking a comprehensive path to data science mastery.
    Downloads: 1 This Week
  • 4
    DeiT (Data-efficient Image Transformers)
    DeiT (Data-efficient Image Transformers) shows that Vision Transformers can be trained competitively on ImageNet-1k without external data by using strong training recipes and knowledge distillation. Its key idea is a specialized distillation strategy—including a learnable “distillation token”—that lets a transformer learn effectively from a CNN or transformer teacher on modest-scale datasets.
    Downloads: 0 This Week
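
    A rough sketch of how the distillation token can be trained, assuming the hard-label variant of distillation: the class-token output is supervised by the ground-truth label, the distillation-token output by the teacher's predicted label, and the two terms are weighted equally. The function names and toy logits are hypothetical and are not the repository's API.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of a single example, computed with a stable log-sum-exp."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, label):
    """Average the label loss on the class token and the teacher loss on the distillation token."""
    # The teacher's hard decision is the index of its largest logit.
    teacher_label = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    label_term = cross_entropy(cls_logits, label)
    teacher_term = cross_entropy(dist_logits, teacher_label)
    return 0.5 * label_term + 0.5 * teacher_term


# Hypothetical toy logits for a 3-class problem.
loss = hard_distillation_loss(
    cls_logits=[0.0, 0.0, 0.0],
    dist_logits=[0.0, 0.0, 0.0],
    teacher_logits=[1.0, 5.0, 2.0],
    label=0,
)
```

    With uniform student logits both terms equal log(3), so the toy loss is log(3) regardless of which class the teacher prefers.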
  • 5
    MetaCLIP

    ICLR2024 Spotlight: curation/training code, metadata, distribution

    MetaCLIP is a research codebase from Meta that focuses on how CLIP's training data is curated rather than on changing the model. Starting from a raw pool of image-text pairs and a set of metadata entries (concept terms), it matches captions against the metadata and balances the matched pairs so that very frequent concepts do not dominate the resulting distribution. The repository provides the curation and training code, the metadata, and the training data distribution.
    Downloads: 1 This Week
  • 6
    NVIDIA Isaac GR00T

    NVIDIA Isaac GR00T N1.5 is the world's first open foundation model

    NVIDIA Isaac GR00T N1.5 is an open-source foundation model engineered for generalized humanoid robot reasoning and manipulation skills. It accepts multimodal inputs, such as language and images, and uses a diffusion transformer architecture built upon vision-language encoders, enabling adaptive robot behaviors across diverse environments. It is designed to be customizable via post-training with real or synthetic data. The vision-language model remains frozen during both pretraining and finetuning, preserving language understanding and improving generalization. It also uses a streamlined MLP connection between the vision encoder and the LLM, with added layer normalization.
    Downloads: 0 This Week
  • 7
    DINOv3

    Reference PyTorch implementation and models for DINOv3

    DINOv3 is the third-generation iteration of Meta’s self-supervised visual representation learning framework, building upon the ideas from DINO and DINOv2. It continues the paradigm of learning strong image representations without labels using teacher–student distillation, but introduces a simplified and more scalable training recipe that performs well across datasets and architectures. DINOv3 removes the need for complex augmentations or momentum encoders, streamlining the pipeline while...
    Downloads: 11 This Week
  • 8
    VGGT

    [CVPR 2025 Best Paper Award] VGGT

    VGGT is a transformer-based framework aimed at unifying classic visual geometry tasks—such as depth estimation, camera pose recovery, point tracking, and correspondence—under a single model. Rather than training separate networks per task, it shares an encoder and leverages geometric heads/decoders to infer structure and motion from images or short clips. The design emphasizes consistent geometric reasoning: outputs from one head (e.g., correspondences or tracks) reinforce others (e.g., pose...
    Downloads: 1 This Week
  • 9
    DINOv2

    PyTorch code and models for the DINOv2 self-supervised learning

    DINOv2 is a self-supervised vision learning framework that produces strong, general-purpose image representations without using human labels. It builds on the DINO idea of student–teacher distillation and adapts it to modern Vision Transformer backbones with a carefully tuned recipe for data augmentation, optimization, and multi-crop training. The core promise is that a single pretrained backbone can transfer well to many downstream tasks—from linear probing on classification to retrieval, detection, and segmentation—often requiring little or no fine-tuning. ...
    Downloads: 1 This Week
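
    In DINO-style student-teacher distillation, the teacher is commonly maintained as an exponential moving average (EMA) of the student's weights instead of being trained by gradient descent. The sketch below shows that update on a flat list of floats. The function name `ema_update`, the momentum value, and the toy weights are assumptions for illustration, not code from this repository.

```python
def ema_update(teacher_weights, student_weights, momentum=0.996):
    """Return new teacher weights as a moving average of the old teacher and the student."""
    return [
        momentum * teacher + (1.0 - momentum) * student
        for teacher, student in zip(teacher_weights, student_weights)
    ]


# Hypothetical toy weights: the teacher drifts slowly toward the student.
teacher = [1.0, 1.0]
student = [0.0, 2.0]
for _ in range(3):
    teacher = ema_update(teacher, student, momentum=0.5)
```

    A momentum close to 1 makes the teacher change slowly, which is the usual choice; the value 0.5 in the toy loop is only there to make the drift visible after a few steps.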
  • 10
    Datumaro

    Dataset Management Framework, a Python library and a CLI tool to build

    ...It’s especially useful when you’re dealing with heterogeneous data sources or need to prepare complex datasets for machine learning workflows, freeing you from writing custom scripts for every format conversion.
    Downloads: 2 This Week
  • 11
    CoreNet

    CoreNet: A library for training deep neural networks

    CoreNet is Apple’s open-source deep learning library for neural network training, designed for high scalability, low-latency communication, and strong hardware efficiency. It focuses on enabling large-scale model training across clusters of GPUs and accelerators by optimizing data flow and parallelism strategies. CoreNet provides abstractions for data, tensor, and pipeline parallelism, allowing models to scale without code duplication or heavy manual configuration. Its distributed runtime manages synchronization, load balancing, and mixed-precision computation to maximize throughput while minimizing communication bottlenecks. ...
    Downloads: 0 This Week
  • 12
    DeepSeek-OCR

    Contexts Optical Compression

    DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body...
    Downloads: 15 This Week
  • 13
    OSWorld

    Benchmarking Multimodal Agents for Open-Ended Tasks

    OSWorld is an open-source benchmark environment for multimodal agents that carry out open-ended tasks in real computer environments. Agents operate a desktop operating system and its applications from observations such as screenshots, issue keyboard and mouse actions, and are scored by execution-based evaluation of whether each task was actually completed.
    Downloads: 0 This Week
  • 14
    Seamless Communication

    Foundational Models for State-of-the-Art Speech and Text Translation

    Seamless Communication is Meta's family of foundational models for speech and text translation. It includes SeamlessM4T for multilingual, multimodal translation, SeamlessExpressive for carrying vocal style and prosody across languages, and SeamlessStreaming for low-latency simultaneous translation.
    Downloads: 0 This Week
  • 15
    Armadillo

    fast C++ library for linear algebra & scientific computing

    Fast C++ library for linear algebra (matrix maths) and scientific computing. Easy-to-use functions and syntax, deliberately similar to Matlab / Octave. Uses template meta-programming techniques to increase efficiency. Provides user-friendly wrappers for OpenBLAS, Intel MKL, LAPACK, ATLAS, ARPACK, SuperLU and FFTW libraries. Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc. ...
    Downloads: 2,218 This Week
  • 16
    iJEPA

    Official codebase for I-JEPA

    ...This objective sidesteps generative pixel losses and avoids heavy negative sampling, producing features that transfer strongly with linear probes and minimal fine-tuning. The design scales naturally with Vision Transformer backbones and flexible masking strategies, and it trains stably at large batch sizes. i-JEPA’s predictions are made in embedding space, which is computationally efficient and better aligned with downstream discrimination tasks. The repository provides training recipes, data pipelines, and evaluation code that clarify which masking patterns and architectural choices matter most.
    Downloads: 2 This Week
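
    To make "predictions in embedding space" concrete, the sketch below compares predicted patch embeddings with target embeddings directly, without decoding back to pixels. It assumes a mean squared L2 distance over patches; the function name and the toy vectors are hypothetical and do not come from the repository.

```python
def embedding_prediction_loss(predicted, target):
    """Mean squared L2 distance between predicted and target patch embeddings."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("predicted and target must be non-empty and the same length")
    total = 0.0
    for predicted_vec, target_vec in zip(predicted, target):
        # Squared distance for one masked patch, measured in feature space.
        total += sum((p - t) ** 2 for p, t in zip(predicted_vec, target_vec))
    return total / len(predicted)


# Hypothetical 2-dimensional embeddings for two masked patches.
predicted = [[1.0, 2.0], [0.0, 0.0]]
target = [[1.0, 0.0], [3.0, 4.0]]
loss = embedding_prediction_loss(predicted, target)
```

    Because the comparison happens on compact feature vectors, no pixel decoder is needed, which is the efficiency argument the description makes.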
  • 17
    PIFuHD

    High-Resolution 3D Human Digitization from A Single Image

    PIFuHD (Pixel-Aligned Implicit Function for 3D human reconstruction at high resolution) is a method and codebase to reconstruct high-fidelity 3D human meshes from a single image. It extends prior PIFu work by increasing resolution and detail, enabling fine geometry in cloth folds, hair, and subtle surface features. The method operates by learning an implicit occupancy / surface function conditioned on the image and camera projection; at inference time it queries dense points to reconstruct a...
    Downloads: 4 This Week
  • 18
    Self-learning-Computer-Science

    Resources to learn computer science in your spare time

    Self-learning Computer Science is a curated, open-source guide repository designed to help learners independently study computer science topics using high-quality university-level resources. The author (an undergraduate CS student) assembled links to courses from institutions like MIT, UC Berkeley, Stanford, etc., covering mathematics, programming, data structures/algorithms, computer architecture, machine learning, software engineering and more. It’s aimed at learners who find traditional...
    Downloads: 0 This Week
  • 19
    ml-surveys

    Survey papers summarizing advances in deep learning, NLP, CV, graphs

    ...It is particularly useful for researchers, data scientists, or engineers who want conceptual clarity about where a field stands, what problems remain, and what techniques are most established.
    Downloads: 0 This Week
  • 20
    MAE (Masked Autoencoders)

    PyTorch implementation of MAE

    MAE (Masked Autoencoders) is a self-supervised learning framework for visual representation learning using masked image modeling. It trains a Vision Transformer (ViT) by randomly masking a high percentage of image patches (typically 75%) and reconstructing the missing content from the remaining visible patches. This forces the model to learn semantic structure and global context without supervision. The encoder processes only the visible patches, while a lightweight decoder reconstructs the...
    Downloads: 0 This Week
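
    The random masking step described above can be sketched with the standard library alone. This is an illustrative sketch, not the repository's code: the function name `random_masking`, the seed argument, and the 196-patch example (a 14 × 14 grid) are assumptions.

```python
import random


def random_masking(num_patches, mask_ratio=0.75, seed=None):
    """Randomly split patch indices into a visible set and a masked set."""
    rng = random.Random(seed)
    num_keep = int(num_patches * (1 - mask_ratio))
    order = list(range(num_patches))
    rng.shuffle(order)
    # The encoder would see only `visible`; the decoder reconstructs `masked`.
    visible = sorted(order[:num_keep])
    masked = sorted(order[num_keep:])
    return visible, masked


# Hypothetical example: 196 patches with the 75% mask ratio from the description.
visible, masked = random_masking(196, mask_ratio=0.75, seed=0)
```

    With 196 patches and a 75% ratio, 49 patches stay visible and 147 are masked, so the encoder processes only a quarter of the tokens.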
  • 21
    Flux3D.jl

    3D computer vision library in Julia

    Flux3D.jl is a 3D vision library written entirely in Julia. It uses Flux.jl and Zygote.jl as building blocks for training 3D vision models and for automatic differentiation, and it supports CUDA GPU acceleration through CUDA.jl.
    Downloads: 0 This Week
  • 22
    PyCls

    Codebase for Image Classification Research, written in PyTorch

    pycls is a focused PyTorch codebase for image classification research that emphasizes reproducibility and strong, transparent baselines. It popularized families like RegNet and supports classic architectures (ResNet, ResNeXt) with clean implementations and consistent training recipes. The repository includes highly tuned schedules, augmentations, and regularization settings that make it straightforward to match reported accuracy without guesswork. Distributed training and mixed precision are...
    Downloads: 0 This Week
  • 23
    VideoPose3D

    Efficient 3D human pose estimation in video using 2D keypoint

    ...By using only 2D detections (such as those from OpenPose or Detectron), it enables markerless 3D pose estimation with relatively lightweight computational requirements. The framework includes pretrained models, data preprocessing utilities, visualization tools, and evaluation scripts for standard benchmarks like Human3.6M. VideoPose3D has been used widely in computer vision research for human motion understanding, activity recognition, and animation generation.
    Downloads: 2 This Week
  • 24
    maskrcnn-benchmark

    Fast, modular reference implementation of Instance Segmentation

    Mask R-CNN Benchmark is a PyTorch-based framework that provides high-performance implementations of object detection, instance segmentation, and keypoint detection models. Originally built to benchmark Mask R-CNN and related models, it offers a clean, modular design to train and evaluate detection systems efficiently on standard datasets like COCO. The framework integrates critical components—region proposal networks (RPNs), RoIAlign layers, mask heads, and backbone architectures such as...
    Downloads: 0 This Week
  • 25
    MatlabFunc

    Matlab codes for feature learning

    MatlabFunc is a collection of MATLAB functions developed by the ZJULearning group to support various tasks in computer vision, machine learning, and numerical computation. The repository brings together a wide range of utility scripts, algorithms, and implementations that serve as building blocks for research and development. These functions cover areas such as matrix operations, optimization, data processing, and visualization, making them broadly applicable across different research domains. ...
    Downloads: 2 This Week