Showing 120 open source projects for "resolution"

  • 1
    Point-E

    Point cloud diffusion for 3D model synthesis

    point-e is the official repository for Point-E, a generative model developed by OpenAI that produces 3D point clouds from textual (or image) prompts. Its principal advantage is speed: it can generate 3D assets in just 1–2 minutes on a single GPU, which is significantly faster than many competing text-to-3D models. The model works via a two-stage diffusion approach: first, it uses a text → image diffusion network to produce a synthetic 2D view consistent with the prompt; then a second...
    Downloads: 0 This Week
  • 2
    Karlo

    Text-conditional image generation model based on OpenAI's unCLIP

    Karlo is a text-conditional image generation model based on OpenAI's unCLIP architecture. It improves on the standard super-resolution model, which upscales from 64px to 256px, by recovering high-frequency details in only a small number of denoising steps. We train all components from scratch on 115M image-text pairs including COYO-100M, CC3M, and CC12M. For the Prior and Decoder, we use ViT-L/14 provided by OpenAI’s CLIP repository. Unlike the original implementation of unCLIP, we replace the trainable transformer in the decoder with the text encoder in ViT-L/14 for efficiency. ...
    Downloads: 0 This Week
  • 3
    BCI

    BCI: Breast Cancer Immunohistochemical Image Generation

    Breast Cancer Immunohistochemical Image Generation through Pyramid Pix2pix. We have released the trained model on the BCI and LLVIP datasets. We host a competition for breast cancer immunohistochemistry image generation on Grand Challenge. The pix2pix project provides a Python script to generate pix2pix training data in the form of pairs of images {A,B}, where A and B are two different depictions of the same underlying scene; these can be pairs {HE, IHC}. Then we can learn to translate A(HE images)...
    Downloads: 1 This Week
  • 4
    ruDALL-E

    Generate images from text in Russian

    We present a family of generative models from SberDevices and Sber AI! Models allow you to create images that did not exist before. All you need is a text description in Russian or another language. Try to create unique images together with generative artists using your own formulations. Ask generative artists to depict something special for you as well. The Kandinsky 2.0 model uses the reverse diffusion method and creates colorful images on various topics in a matter of seconds by text...
    Downloads: 1 This Week
  • 5
    Mask2Former

    Code release for "Masked-attention Mask Transformer

    ...A pixel decoder fuses multi-scale features and feeds masked attention in the transformer so each query focuses computation on its current spatial support. This leads to accurate masks with sharp boundaries and strong small-object performance while remaining efficient on high-resolution inputs. The project provides extensive configurations and pretrained models across popular benchmarks like COCO, ADE20K, and Cityscapes. Built on top of Detectron2, it includes training scripts, inference tools, and visualization utilities that make experimentation straightforward.
    Downloads: 0 This Week
  • 6
    RQ-Transformer

    Implementation of RQ Transformer, autoregressive image generation

    ...I also think there is something deeper going on, and have generalized this to any number of dimensions. You can use it by importing the HierarchicalCausalTransformer. For autoregressive (AR) modeling of high-resolution images, vector quantization (VQ) represents an image as a sequence of discrete codes. A short sequence length is important for an AR model to reduce its computational costs to consider long-range interactions of codes. However, we postulate that previous VQ cannot shorten the code sequence and generate high-fidelity images together in terms of the rate-distortion trade-off.
    Downloads: 0 This Week
  • 7
    Piano transcription

    Task of transcribing piano recordings into MIDI files

    Piano transcription is an open-source high-resolution piano transcription system by ByteDance that converts raw audio recordings of piano performance into symbolic MIDI files — detecting note onsets, offsets, pitch, velocity, and even pedal usage. The system is implemented in Python (PyTorch) and is capable of accurate transcription of polyphonic piano recordings, even with complex passages and pedal techniques, making it suitable for classical piano music.
    Downloads: 5 This Week
  • 8
    Fashion-MNIST

    A MNIST-like fashion product database

    ...It was designed as a direct replacement for the original MNIST handwritten digits dataset, maintaining the same structure and image size so that researchers could easily switch datasets without modifying their experimental pipelines. The dataset consists of 70,000 images in total, with 60,000 examples used for training and 10,000 reserved for testing. Each image has a resolution of 28 by 28 pixels and belongs to one of ten clothing classes, making it suitable for evaluating classification models. Because the dataset represents real-world objects rather than handwritten digits, it offers a more challenging benchmark for testing machine learning algorithms.
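    The layout described here can be checked with a few lines of code. The sketch below is a minimal illustration, assuming the images are stored in the IDX container used by the original MNIST files (a 16-byte big-endian header followed by raw uint8 pixels). The function name `parse_idx_images` and the synthetic three-image buffer are made up for this example; no real dataset file is read.

```python
import struct

import numpy as np

IDX_IMAGE_MAGIC = 2051  # 0x00000803: unsigned bytes, 3 dimensions


def parse_idx_images(buf: bytes) -> np.ndarray:
    """Parse an IDX image buffer into an array of shape (count, rows, cols)."""
    magic, count, rows, cols = struct.unpack(">IIII", buf[:16])
    if magic != IDX_IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number: {magic}")
    pixels = np.frombuffer(buf, dtype=np.uint8, offset=16)
    return pixels.reshape(count, rows, cols)


# Synthetic stand-in for a dataset file: 3 blank 28x28 images.
header = struct.pack(">IIII", IDX_IMAGE_MAGIC, 3, 28, 28)
fake_file = header + bytes(3 * 28 * 28)

images = parse_idx_images(fake_file)
print(images.shape, images.dtype)
```

    Under the same assumption, a real training file would yield 60,000 images of 28 by 28 pixels and a test file would yield 10,000.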
    Downloads: 11 This Week
  • 9
    GANformer

    Generative Adversarial Transformers

    ...The network employs a bipartite structure that enables long-range interactions across the image while maintaining linear computational efficiency, so it can readily scale to high-resolution synthesis. The model iteratively propagates information from a set of latent variables to the evolving visual features and vice versa, to support the refinement of each in light of the other and encourage the emergence of compositional representations of objects and scenes. In contrast to the classic transformer architecture, it utilizes multiplicative integration that allows flexible region-based modulation and can thus be seen as a generalization of the successful StyleGAN network. ...
    Downloads: 0 This Week
  • 10
    GiantMIDI-Piano

    Classical piano MIDI dataset

    GiantMIDI-Piano is a large-scale symbolic classical piano music dataset built by applying the piano_transcription system on a vast collection of piano performance recordings. The dataset contains thousands of piano works, spanning a large number of composers and styles, with each piece transcribed into high-precision MIDI files capturing note events, pedal usage, velocities, etc. It provides a resource for music information retrieval (MIR), symbolic music modeling, composer classification,...
    Downloads: 9 This Week
  • 11
    DnCNN

    Beyond a Gaussian Denoiser: Residual Learning of Deep CNN

    ...DnCNN is a feedforward convolutional neural network that learns to predict the residual noise (i.e. the noise map) from a noisy input image, which is then subtracted from the input to yield a clean image. This formulation allows efficient denoising, supports blind Gaussian denoising (i.e. unknown noise levels), and can be extended in some variants to related tasks like image super-resolution or JPEG deblocking. The repository includes training code (using MatConvNet / MATLAB), demo scripts, pretrained models, and evaluation routines. A single model can handle multiple noise levels.
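    The residual formulation can be sketched in a few lines. This is a minimal NumPy illustration, not the repository's MATLAB code: `denoise_with_residual` and the `oracle` predictor are hypothetical names, and the oracle simply returns the true noise map in place of a trained network.

```python
import numpy as np


def denoise_with_residual(noisy, predict_noise):
    """Residual learning: estimate the noise map, then subtract it from the input."""
    return noisy - predict_noise(noisy)


rng = np.random.default_rng(0)
clean = rng.random((8, 8))                 # toy "clean" image
noise = 0.1 * rng.standard_normal((8, 8))  # additive Gaussian noise
noisy = clean + noise


def oracle(image):
    """Hypothetical stand-in for the trained CNN: returns the exact noise map."""
    return image - clean


restored = denoise_with_residual(noisy, oracle)
print(np.allclose(restored, clean))
```

    A trained network would only approximate the noise map, so the restored image would be close to the clean one rather than identical.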
    Downloads: 1 This Week
  • 12
    VoiceFixer

    General Speech Restoration

    VoiceFixer is a machine-learning framework for “speech restoration”: given a degraded or distorted audio recording (with noise, clipping, low sampling rate, reverberation, or other artifacts), it attempts to recover high-fidelity, clean speech. The architecture works in two stages: first an analysis stage that tries to extract “clean” intermediate features from the degraded audio (e.g. through denoising, dereverberation, and upsampling), and then a neural vocoder-based synthesis stage that...
    Downloads: 2 This Week
  • 13
    qiji-font

    Typeface from Ming Dynasty woodblock printed books

    ...A work in progress. Named in honor of 閔齊伋, a 16th-century printer. Intended to be used with wenyan-lang, the Classical Chinese programming language. Download high-resolution PDFs and split pages into images. Manually lay a grid on top of each page to generate bounding boxes for characters (potentially replaceable by an automatic corner-detection algorithm). Generate a low-poly mask for each character on the grid, and save the thumbnails (using OpenCV). First, the red channel is subtracted from the grayscale, in order to clean the annotations printed in red ink. ...
    Downloads: 2 This Week
  • 14
    Old Photo Restoration

    Bringing Old Photos Back to Life (CVPR 2020 oral)

    We propose to restore old photos that suffer from severe degradation through a deep learning approach. Unlike conventional restoration tasks that can be solved through supervised learning, the degradation in real photos is complex and the domain gap between synthetic images and real old photos makes the network fail to generalize. Therefore, we propose a novel triplet domain translation network by leveraging real photos along with massive synthetic image pairs. Specifically, we train two...
    Downloads: 0 This Week
  • 15
    YOLOv4-large

    Scaled-YOLOv4: Scaling Cross Stage Partial Network

    ...The project provides a PyTorch implementation of the Scaled-YOLOv4 framework, which extends the original YOLOv4 architecture using Cross Stage Partial (CSP) networks and new scaling techniques. Unlike earlier object detection systems that only scale depth or width, this architecture scales multiple aspects of the neural network including structure, resolution, and channel configuration. This scaling strategy enables the model to adapt to different hardware environments while maintaining a strong balance between speed and detection accuracy. The repository includes multiple model variants such as YOLOv4-tiny, YOLOv4-CSP, and large-scale configurations designed for high-performance detection tasks.
    Downloads: 0 This Week
  • 16
    Pytorch Points 3D

    Pytorch framework for doing deep learning on point clouds

    ...We aim to build a tool that can be used for benchmarking SOTA models, while also allowing practitioners to efficiently pursue research into point cloud analysis, with the end goal of building models which can be applied to real-life applications. Task-driven implementation with dynamic model and dataset resolution from arguments. Core implementation of common components for point cloud deep learning, greatly simplifying the creation of new models. Four base convolution classes to simplify the implementation of new convolutions. Each base class supports a different data format.
    Downloads: 1 This Week
  • 17
    FixRes

    Reproduces results of "Fixing the train-test resolution discrepancy"

    ...The repository includes pretrained models, feature embeddings, and evaluation scripts corresponding to the experiments reported in the NeurIPS 2019 paper “Fixing the train-test resolution discrepancy.”
    Downloads: 5 This Week
  • 18
    fastNLP

    fastNLP: A Modularized and Extensible NLP Framework

    ...Provides a variety of neural network components and reproduced models (covering tasks such as Chinese word segmentation, named entity recognition, syntactic analysis, text classification, text matching, coreference resolution, summarization, etc.). The Trainer provides a variety of built-in Callback functions to facilitate experiment recording, exception capture, etc. Some datasets and pre-trained models are downloaded automatically.
    Downloads: 0 This Week
  • 19
    GIMP ML

    AI for GNU Image Manipulation Program

    ...It brings recent advances in computer vision to the conventional image editing pipeline. Applications from deep learning such as monocular depth estimation, semantic segmentation, mask generative adversarial networks, image super-resolution, de-noising and coloring have been incorporated with GIMP through Python-based plugins. Additionally, operations on images such as edge detection and color clustering have also been added. GIMP-ML relies on standard Python packages such as numpy, scikit-image, pillow, pytorch, open-cv, scipy. In addition, GIMP-ML also aims to bring the benefits of deep learning networks used for computer vision tasks to routine image processing workflows.
    Downloads: 3 This Week
  • 20
    EfficientNet Keras

    Implementation of EfficientNet model. Keras and TensorFlow Keras

    ...Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. In this paper, we systematically study model scaling and identify that carefully balancing network depth, width, and resolution can lead to better performance. Based on this observation, we propose a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient. We demonstrate the effectiveness of this method on scaling up MobileNets and ResNet.
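    The compound coefficient can be illustrated with a short calculation. The sketch below assumes the base multipliers reported in the EfficientNet paper (1.2 for depth, 1.1 for width, 1.15 for resolution), which this description does not state; the function names are hypothetical and are not the API of this Keras package.

```python
# Base multipliers as reported in the EfficientNet paper (assumed here).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_factor(phi):
    """Approximate FLOPS growth: linear in depth, quadratic in width and resolution."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2


for phi in range(4):
    depth, width, resolution = compound_scale(phi)
    print(phi, round(depth, 3), round(width, 3), round(resolution, 3),
          round(flops_factor(phi), 3))
```

    With these values, each increment of the coefficient roughly doubles the compute cost, because 1.2 × 1.1² × 1.15² is about 1.92.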
    Downloads: 0 This Week
  • 21
    Linux-Intelligent-Ocr-Solution

    Easy-OCR solution and Tesseract trainer for GNU/Linux

    Linux-Intelligent-Ocr-Solution (Lios) is free and open source software for converting print into text using either a scanner or a camera. It can also produce text from scanned images from other sources, such as a PDF, an image, a folder containing images, or a screenshot. The program is designed to be fully accessible to visually impaired users. A Tesseract Trainer GUI is also shipped with this package. Forum : https://groups.google.com/forum/#!forum/lios Video Tutorial :...
    Downloads: 8 This Week
  • 22
    Image Super-Resolution (ISR)

    Super-scale your images and run experiments with Residual Dense

    The goal of this project is to upscale and improve the quality of low-resolution images. This project contains Keras implementations of different Residual Dense Networks for Single Image Super-Resolution (ISR) as well as scripts to train these networks using content and adversarial loss components. Docker scripts and Google Colab notebooks are available to carry out training and prediction. Also, we provide scripts to facilitate training on the cloud with AWS and Nvidia-docker with only a few commands. ...
    Downloads: 2 This Week
  • 23
    Tiny

    Tiny Face Detector, CVPR 2017

    ...Pretrained model provided (ResNet101-based, plus alternatives). Demo and evaluation scripts for benchmark datasets. Use of “foveal descriptors” to incorporate context for low-resolution faces.
    Downloads: 0 This Week
  • 24
    vid2vid

    Pytorch implementation of our method for high-resolution

    ...It uses generative adversarial networks combined with temporal modeling strategies to maintain coherence and reduce flickering artifacts. The framework is capable of producing high-resolution outputs and is widely used in research related to video synthesis, animation, and simulation. It also supports diverse input modalities, making it flexible for different types of video generation tasks.
    Downloads: 0 This Week
  • 25
    Replica Dataset

    High-fidelity indoor 3D dataset for AI simulation and robotics

    Replica Dataset is a high-quality 3D dataset of realistic indoor environments designed to advance research in computer vision, robotics, and embodied AI. Developed by Facebook Research (now Meta AI), it features accurate geometric reconstructions, high-resolution and high dynamic range textures, and comprehensive semantic annotations. Each environment contains detailed models of real-world spaces, including rooms, furniture, glass, and mirror surfaces. The dataset also provides semantic and instance segmentations, planar decomposition, and navigation meshes, making it highly suitable for simulation, visual perception, and autonomous navigation tasks. ...
    Downloads: 764 This Week