Showing 28 open source projects for "model-builder"

  • 1
    Colossal-AI

    Making large AI models cheaper, faster and more accessible

    ...Colossal-AI provides a collection of parallel components for you. We aim to support you to write your distributed deep learning models just like how you write your model on your laptop.
    Downloads: 6 This Week
  • 2
    Raster Vision

    Open source framework for deep learning satellite and aerial imagery

    ...Raster Vision allows engineers to quickly and repeatably configure pipelines that go through core components of a machine learning workflow: analyzing training data, creating training chips, training models, creating predictions, evaluating models, and bundling the model files and configuration for easy deployment. The input to a Raster Vision pipeline is a set of images and training data, optionally with Areas of Interest (AOIs) that describe where the images are labeled. The output of a Raster Vision pipeline is a model bundle that allows you to easily utilize models in various deployment scenarios.
    Downloads: 1 This Week
  • 3
    SAM 2

    The repository provides code for running inference with SAM 2

    SAM2 is a next-generation version of the Segment Anything Model (SAM), designed to improve performance, generalization, and efficiency in promptable image segmentation tasks. It retains the core promptable interface—accepting points, boxes, or masks—but incorporates architectural and training enhancements to produce higher-fidelity masks, better boundary adherence, and robustness to complex scenes.
    Downloads: 7 This Week
  • 4
    FiftyOne

    The open-source tool for building high-quality datasets

    The open-source tool for building high-quality datasets and computer vision models. Nothing hinders the success of machine learning systems more than poor-quality data. And without the right tools, improving a model can be time-consuming and inefficient. FiftyOne supercharges your machine learning workflows by enabling you to visualize datasets and interpret models faster and more effectively. Improving data quality and understanding your model’s failure modes are the most impactful ways to boost the performance of your model. FiftyOne provides the building blocks for optimizing your dataset analysis pipeline. ...
    Downloads: 3 This Week
  • 5
    MetaCLIP

    ICLR2024 Spotlight: curation/training code, metadata, distribution

    ...The repository provides training logic, adaptation strategies (e.g. prompt tuning, adapter modules), and evaluation across base and target domains to measure how well the model retains its general knowledge while specializing as needed. It includes utilities to fine-tune vision-language embeddings, compute prompt or adapter updates, and benchmark across transfer and retention metrics. MetaCLIP is especially suited for real-world settings where a model must continuously incorporate new visual categories or domains over time.
    Downloads: 0 This Week
  • 6
    Phi-3-MLX

    Phi-3.5 for Mac: Locally-run Vision and Language Models

    Phi-3-Vision-MLX is an Apple MLX (machine learning on Apple silicon) implementation of Phi-3 Vision, a lightweight multi-modal model designed for vision and language tasks. It focuses on running vision-language AI efficiently on Apple hardware like M1 and M2 chips.
    Downloads: 3 This Week
  • 7
    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    ...Training Data is the art of supervising machines through data. This includes annotation, which produces structured data ready to be consumed by a machine learning model. Annotation is required because raw media is unstructured and not usable without it. That’s why training data is required for many modern machine learning use cases, including computer vision, natural language processing, and speech recognition.
    Downloads: 3 This Week
  • 8
    Vision Transformer Pytorch

    Implementation of Vision Transformer, a simple way to achieve SOTA

    This repository provides a from-scratch, minimalist implementation of the Vision Transformer (ViT) in PyTorch, focusing on the core architectural pieces needed for image classification. It breaks down the model into patch embedding, positional encoding, multi-head self-attention, feed-forward blocks, and a classification head so you can understand each component in isolation. The code is intentionally compact and modular, which makes it easy to tinker with hyperparameters, depth, width, and attention dimensions. Because it stays close to vanilla PyTorch, you can integrate custom datasets and training loops without framework lock-in. ...
    Downloads: 2 This Week
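
The first component named in the description above, patch embedding, starts by cutting the image into fixed-size patches. A minimal sketch of that splitting step follows, in plain Python on a single-channel image stored as a list of rows. The function name `image_to_patches` is hypothetical and is not taken from the repository, which works with PyTorch tensors; the learned linear projection that turns each flattened patch into an embedding is omitted.

```python
def image_to_patches(image, patch_size):
    """Split an H x W image (list of rows) into flattened, non-overlapping patches.

    Patches are returned in row-major order (left to right, then top to bottom).
    This is an illustrative helper, not the repository's API.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single list.
            patch = [
                image[top + dy][left + dx]
                for dy in range(patch_size)
                for dx in range(patch_size)
            ]
            patches.append(patch)
    return patches


if __name__ == "__main__":
    # A 4 x 4 image with pixel values 0..15, split into 2 x 2 patches.
    demo = [[row * 4 + col for col in range(4)] for row in range(4)]
    for patch in image_to_patches(demo, 2):
        print(patch)
```

In a full ViT, each flattened patch would then be projected to the model width and combined with a positional encoding before entering the attention blocks.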
  • 9
    Datasets

    Hub of ready-to-use datasets for ML models

    Datasets is a library for easily accessing and sharing datasets, and evaluation metrics for Natural Language Processing (NLP), computer vision, and audio tasks. Load a dataset in a single line of code, and use our powerful data processing methods to quickly get your dataset ready for training in a deep learning model. Backed by the Apache Arrow format, process large datasets with zero-copy reads without any memory constraints for optimal speed and efficiency. We also feature a deep integration with the Hugging Face Hub, allowing you to easily load and share a dataset with the wider NLP community. There are currently over 2658 datasets, and more than 34 metrics available. ...
    Downloads: 3 This Week
  • 10
    torchvision

    Datasets, transforms and models specific to Computer Vision

    The torchvision package consists of popular datasets, model architectures, and common image transformations for computer vision. We recommend Anaconda as the Python package management system. Torchvision currently supports the following image backends: Pillow (the default); Pillow-SIMD, a much faster drop-in replacement for Pillow with SIMD, which is used as the default if installed; accimage, which, if installed, can be activated by calling torchvision.set_image_backend('accimage'); libpng, which can be installed via conda (conda install libpng) or any of the package managers for Debian-based and RHEL-based Linux distributions; and libjpeg, which can be installed via conda (conda install jpeg) or any of the package managers for Debian-based and RHEL-based Linux distributions. ...
    Downloads: 1 This Week
  • 11
    VGGT

    [CVPR 2025 Best Paper Award] VGGT

    VGGT is a transformer-based framework aimed at unifying classic visual geometry tasks—such as depth estimation, camera pose recovery, point tracking, and correspondence—under a single model. Rather than training separate networks per task, it shares an encoder and leverages geometric heads/decoders to infer structure and motion from images or short clips. The design emphasizes consistent geometric reasoning: outputs from one head (e.g., correspondences or tracks) reinforce others (e.g., pose or depth), making the system more robust to challenging viewpoints and textures. ...
    Downloads: 0 This Week
  • 12
    SAHI

    A lightweight vision library for performing large object detection

    A lightweight vision library for performing large-scale object detection & instance segmentation. Object detection and instance segmentation are by far the most important application areas in computer vision. However, detecting small objects and running inference on large images remain major issues in practice. SAHI helps developers overcome these real-world problems with a range of vision utilities. Detection of small objects and objects far away in the scene is a major...
    Downloads: 0 This Week
  • 13
    fastai

    Deep learning library

    fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches. It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying...
    Downloads: 0 This Week
  • 14
    OpenFieldAI - AI Open Field Test Tracker

    OpenFieldAI is an AI-based open field test rodent tracker

    OpenFieldAI uses AI-CNN to track rodent movement with pretrained OFAI models, or users can create their own model with YOLOv8 for inference. The software generates a centroid graph, a heat map, a line path, and a spreadsheet containing all calculated parameters (speed, time in and out of the ROI, distance, and entries/exits) for single or multiple pre-recorded videos or live webcam video. The ROI is assigned automatically for multiple-video input and can be set manually for single input...
    Downloads: 17 This Week
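
The parameters listed in the description above (distance, speed, time in and out of the ROI, entries/exits) can all be derived from a per-frame centroid track. The following is a minimal sketch under stated assumptions, not OpenFieldAI's own code: the function name `track_metrics`, the rectangular ROI format, and the fixed frame rate are hypothetical choices made for illustration.

```python
import math


def track_metrics(centroids, fps, roi):
    """Summarise a centroid track.

    centroids: list of (x, y) positions, one per video frame (assumed input).
    fps: frames per second of the video (assumed constant).
    roi: (x_min, y_min, x_max, y_max) rectangle, an assumed ROI format.
    """
    x_min, y_min, x_max, y_max = roi

    def inside(point):
        return x_min <= point[0] <= x_max and y_min <= point[1] <= y_max

    # Total path length: sum of distances between consecutive centroids.
    distance = sum(math.dist(a, b) for a, b in zip(centroids, centroids[1:]))
    duration = (len(centroids) - 1) / fps if len(centroids) > 1 else 0.0

    # Per-frame ROI membership; a False -> True change is an entry,
    # a True -> False change is an exit.
    flags = [inside(p) for p in centroids]
    entries = sum(1 for prev, cur in zip(flags, flags[1:]) if not prev and cur)
    exits = sum(1 for prev, cur in zip(flags, flags[1:]) if prev and not cur)
    frames_in = sum(flags)

    return {
        "distance": distance,
        "mean_speed": distance / duration if duration else 0.0,
        "time_in_roi": frames_in / fps,
        "time_out_roi": (len(flags) - frames_in) / fps,
        "entries": entries,
        "exits": exits,
    }


if __name__ == "__main__":
    track = [(0, 0), (3, 4), (6, 8), (3, 4), (0, 0)]
    print(track_metrics(track, fps=1, roi=(2, 2, 10, 10)))
```

Distances here are in pixels; converting to physical units would need a calibration factor that this sketch does not model.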
  • 15
    Hiera

    A fast, powerful, and simple hierarchical vision transformer

    ...The core idea is to use straightforward hierarchical attention with a minimal set of architectural “bells and whistles,” achieving competitive or superior accuracy while being markedly faster at inference and often faster to train. The repository provides installation options (from source or Torch Hub), a model zoo with pre-trained checkpoints, and code for evaluation and fine-tuning on standard benchmarks. Documentation emphasizes that model weights may have separate licensing and that the code targets practical experimentation for both research and downstream tasks. Community discussions cover topics like dataset pretrains, integration in other frameworks, and comparisons with related implementations. ...
    Downloads: 5 This Week
  • 16
    LLaVA

    Visual Instruction Tuning: Large Language-and-Vision Assistant

    Visual instruction tuning towards large language and vision models with GPT-4 level capabilities.
    Downloads: 8 This Week
  • 17
    pipeless

    A computer vision framework to create and deploy apps in minutes

    ...You provide some functions that are executed for new video frames and Pipeless takes care of everything else. You can easily use industry-standard models, such as YOLO, or load your custom model in one of the supported inference runtimes. Pipeless ships some of the most popular inference runtimes, such as the ONNX Runtime, allowing you to run inference with high performance on CPU or GPU out-of-the-box. You can deploy your Pipeless application with a single command to edge and IoT devices or the cloud.
    Downloads: 8 This Week
  • 18
    CoTracker

    CoTracker is a model for tracking any point (pixel) on a video

    CoTracker is a learning-based point tracking system that jointly follows many user-specified points across a video, rather than tracking each point independently. By reasoning about all tracks together, it can maintain temporal consistency, handle mutual occlusions, and reduce identity swaps when trajectories cross. The model takes sparse point queries on one frame and predicts their sub-pixel locations and a visibility score for every subsequent frame, producing long, coherent trajectories. Its transformer-style architecture aggregates information both along time and across points, allowing it to recover tracks even after brief disappearances. The repository ships with inference scripts, pretrained weights, and simple interfaces to seed points, run tracking, and export trajectories for downstream tasks. ...
    Downloads: 0 This Week
  • 19
    Detectron

    FAIR's research platform for object detection research

    ...Built on Caffe2 with custom CUDA/C++ operators, it provided reference implementations for models like Faster R-CNN, Mask R-CNN, RetinaNet, and Feature Pyramid Networks. The framework emphasized a clean configuration system, strong baselines, and a “model zoo” so researchers could compare results under consistent settings. It includes training and evaluation pipelines that handle multi-GPU setups, standard datasets, and common augmentations, which helped standardize experimental practice in detection research. Visualization utilities and diagnostic scripts make it straightforward to inspect predictions, proposals, and losses while training. ...
    Downloads: 0 This Week
  • 20
    PIFuHD

    High-Resolution 3D Human Digitization from A Single Image

    ...The method operates by learning an implicit occupancy / surface function conditioned on the image and camera projection; at inference time it queries dense points to reconstruct a mesh via marching cubes. It also uses a two-stage architecture: a coarse global model followed by local refinement patches to capture fine detail, balancing global consistency and local detail. The repo includes training pipelines, dataset loaders (for Multi-POP, etc.), and inference scripts for mesh output including depth maps for postprocessing. To help practical use, there are utilities for normal estimation, texture back-projection, mesh cleanup, and integration with rendering pipelines.
    Downloads: 8 This Week
  • 21
    ImageAI

    A python library built to empower developers

    ...ImageAI is widely used around the world by professionals, students, research groups and businesses. ImageAI provides an API to recognize 1000 different objects in a picture using pre-trained models that were trained on the ImageNet-1000 dataset. The model implementations provided are SqueezeNet, ResNet, InceptionV3 and DenseNet. ImageAI also provides an API to detect, locate and identify the 80 most common objects in everyday life in a picture using pre-trained models that were trained on the COCO dataset.
    Downloads: 8 This Week
  • 22
    ConvNeXt

    Code release for ConvNeXt model

    ConvNeXt is a modernized convolutional neural network (CNN) architecture designed to rival Vision Transformers (ViTs) in accuracy and scalability while retaining the simplicity and efficiency of CNNs. It revisits classic ResNet-style backbones through the lens of transformer design trends—large kernel sizes, inverted bottlenecks, layer normalization, and GELU activations—to bridge the performance gap between convolutions and attention-based models. ConvNeXt’s clean, hierarchical structure...
    Downloads: 0 This Week
  • 23
    Face Mask Detection

    Face Mask Detection system based on computer vision and deep learning

    ...The absence of large datasets of ‘with_mask’ images has made this task cumbersome and challenging. Our face mask detector does not use any morphed masked-image dataset, and the model is accurate. Because it uses the MobileNetV2 architecture, it is computationally efficient, which makes it easier to deploy the model to embedded systems (Raspberry Pi, Google Coral, etc.).
    Downloads: 1 This Week
  • 24
    PyCls

    Codebase for Image Classification Research, written in PyTorch

    ...The repository includes highly tuned schedules, augmentations, and regularization settings that make it straightforward to match reported accuracy without guesswork. Distributed training and mixed precision are first-class, enabling fast experiments on multi-GPU setups with simple, declarative configs. Model definitions are concise and modular, making it easy to prototype new blocks or swap backbones while keeping the rest of the pipeline unchanged. Pretrained weights and evaluation scripts cover common datasets, and the logging/metric stack is designed for quick comparison across runs. Practitioners use pycls both as a baseline factory and as a scaffold for new classification backbones.
    Downloads: 0 This Week
  • 25
    DensePose

    A real-time approach for mapping all human pixels of 2D RGB images

    DensePose is a computer vision system that maps all human pixels in an RGB image to the 3D surface of a human body model. It extends human pose estimation from predicting joint keypoints to providing dense correspondences between 2D images and a canonical 3D mesh (such as the SMPL model). This enables detailed understanding of human shape, motion, and surface appearance directly from images or videos. The repository includes the DensePose network architecture, training code, pretrained models, and dataset tools for annotation and visualization. ...
    Downloads: 42 This Week