Showing 114 open source projects for "paper"

  • 1
    Deep Feature Rotation Multimodal Image

    Implementation of Deep Feature Rotation for Multimodal Image

    ...Some are provided in data/content and data/style, so you can try them easily. We provide a visual comparison of other rotation angles that do not appear in the paper. Different rotation angles produce a very diverse set of outputs, which demonstrates the effectiveness of our method compared with other methods.
    Downloads: 0 This Week
  • 2
    Paperless-ng

    A supercharged version of paperless: scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have to worry about finding stuff again. I feed documents right from the post box into the scanner and then shred them. ...
    Downloads: 0 This Week
  • 3
    SimSiam

    PyTorch implementation of SimSiam

    SimSiam is a PyTorch implementation of “Exploring Simple Siamese Representation Learning” by Xinlei Chen and Kaiming He. The project introduces a minimalist approach to self-supervised learning that avoids negative pairs, momentum encoders, or large memory banks—key complexities of prior contrastive methods. SimSiam learns image representations by maximizing similarity between two augmented views of the same image through a Siamese neural network with a stop-gradient operation, preventing...
    Downloads: 4 This Week
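The similarity objective described above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the function names `negative_cosine` and `symmetric_loss` are hypothetical, and the stop-gradient step can only be noted in a comment because plain Python has no autograd.

```python
import math


def negative_cosine(p, z):
    """Negative cosine similarity between a predictor output p and a target z."""
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)


def symmetric_loss(p1, z1, p2, z2):
    """Average the loss over both view orderings.

    In an autograd framework z1 and z2 would be detached (stop-gradient),
    so gradients flow only through the predictor outputs p1 and p2.
    """
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)
```

Two views that map to the same direction give the minimum loss of -1, while orthogonal vectors give 0.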
  • 4
    TimeSformer

    The official PyTorch implementation of our paper

    TimeSformer is a vision transformer architecture for video that extends the standard attention mechanism into spatiotemporal attention. The model alternates attention along spatial and temporal dimensions (or designs variants like divided attention) so that it can capture both appearance and motion cues in video. Because the attention is global across frames, TimeSformer can reason about dependencies across long time spans, not just local neighborhoods. The official implementation in PyTorch...
    Downloads: 0 This Week
  • 5
    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers. It uses a model trained by machine learning techniques based on Baidu's Deep Speech research paper, and relies on Google's TensorFlow to make the implementation easier. A pre-trained English model is available and can be downloaded, along with other important inference material, from the DeepSpeech releases page by following the instructions in the usage docs.
    Downloads: 4 This Week
  • 6
    FixRes

    Reproduces results of "Fixing the train-test resolution discrepancy"

    ...FixRes demonstrates that a mismatch between training and testing resolutions often leads to suboptimal accuracy, and fine-tuning the classifier and batch normalization layers at higher test resolutions significantly enhances performance. The repository includes pretrained models, feature embeddings, and evaluation scripts corresponding to the experiments reported in the NeurIPS 2019 paper “Fixing the train-test resolution discrepancy.”
    Downloads: 4 This Week
  • 7
    Deep Exemplar-based Video Colorization

    The source code of the CVPR 2019 paper "Deep Exemplar-based Video Colorization"

    An end-to-end network for exemplar-based video colorization. The main challenge is to achieve temporal consistency while remaining faithful to the reference style. To address this issue, we introduce a recurrent framework that unifies the semantic correspondence and color propagation steps.
    Downloads: 0 This Week
  • 8
    Image GPT

    Large-scale autoregressive pixel model for image generation by OpenAI

    Image-GPT is the official research code and models from OpenAI’s paper Generative Pretraining from Pixels. The project adapts GPT-2 to the image domain, showing that the same transformer architecture can model sequences of pixels without altering its fundamental structure. It provides scripts to download pretrained checkpoints of different model sizes (small, medium, large) trained on large-scale datasets and includes utilities for handling color quantization with a 9-bit palette. ...
    Downloads: 8 This Week
  • 9
    ALAE

    Adversarial Latent Autoencoders

    ALAE (Adversarial Latent Autoencoders) is a deep learning research implementation that combines autoencoders with generative adversarial networks to produce high-quality image synthesis models. The project implements the architecture introduced in the CVPR research paper on Adversarial Latent Autoencoders, which focuses on improving generative modeling by learning latent representations aligned with adversarial training objectives. Unlike traditional GANs that directly generate images from random noise, ALAE uses an encoder-decoder architecture that maps images into a structured latent space and then reconstructs them through adversarial training. ...
    Downloads: 0 This Week
  • 10
    Multi-Agent Emergence Environments

    Environment generation code for the paper "Emergent Tool Use"

    multi-agent-emergence-environments is an open source research environment framework developed by OpenAI for the study of emergent behaviors in multi-agent systems. It was designed for the experiments described in the paper and blog post “Emergent Tool Use from Multi-Agent Autocurricula”, which investigated how complex cooperative and competitive behaviors can evolve through self-play. The repository provides environment generation code that builds on the mujoco-worldgen package, enabling dynamic creation of simulated physical environments. Developers can construct custom environments by combining modular components such as Boxes, Ramps, and RandomWalls using a flexible layering approach that reduces code duplication. ...
    Downloads: 0 This Week
  • 11
    CC-Net

    Tools to download and clean up Common Crawl data

    cc_net provides tools to download, segment, clean, and filter Common Crawl to build large-scale text corpora, including monolingual datasets and the multilingual CC-100 collection introduced in the associated paper. It includes pipelines to fetch snapshots, extract text, de-duplicate, identify language, and apply quality filtering based on heuristics and language models. The outputs are intended for pretraining language models and for creating standardized corpora that can be reproduced or updated with new crawls. The repository documents practical concerns like HTTP failures, snapshot differences, and stats JSONs, reflecting community use across many languages. ...
    Downloads: 0 This Week
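The de-duplication step mentioned above can be illustrated with a hash-based filter. This is only a sketch under stated assumptions: the `normalize` rule (lowercasing and whitespace collapsing), the 8-byte SHA-1 prefix, and both function names are hypothetical and are not taken from the cc_net code.

```python
import hashlib


def normalize(paragraph):
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return " ".join(paragraph.lower().split())


def deduplicate(paragraphs):
    """Keep the first occurrence of each normalized paragraph, preserving order."""
    seen = set()
    unique = []
    for paragraph in paragraphs:
        # Assumption: a truncated digest is enough to identify a paragraph.
        digest = hashlib.sha1(normalize(paragraph).encode("utf-8")).digest()[:8]
        if digest not in seen:
            seen.add(digest)
            unique.append(paragraph)
    return unique
```

Storing short digests instead of full text keeps the memory footprint small, which matters when the input is a web-scale crawl.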
  • 12
    EfficientNet Keras

    Implementation of EfficientNet model. Keras and TensorFlow Keras

    ...Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. In this paper, we systematically study model scaling and identify that carefully balancing network depth, width, and resolution can lead to better performance. Based on this observation, we propose a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient. We demonstrate the effectiveness of this method on scaling up MobileNets and ResNet.
    Downloads: 0 This Week
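The compound scaling idea above can be written out as a few lines of arithmetic. This is a minimal sketch, not code from the repository: the function names are hypothetical, and the default constants (1.2, 1.1, 1.15) are the values reported in the EfficientNet paper for its baseline network, used here for illustration only.

```python
def compound_scale(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return alpha ** phi, beta ** phi, gamma ** phi


def flops_multiplier(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Approximate FLOPs growth for a given phi.

    Depth scales cost linearly, while width and resolution scale it quadratically.
    """
    return (alpha * beta ** 2 * gamma ** 2) ** phi
```

With these constants, `alpha * beta**2 * gamma**2` is close to 2, so each unit increase in `phi` roughly doubles the compute budget.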
  • 13
    Multilingual Speech Synthesis

    An implementation of Tacotron 2 that supports multilingual experiments

    This repository provides synthesized samples, training and evaluation data, source code, and parameters for the paper “One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech.” It contains an implementation of Tacotron 2 that supports multilingual experiments and implements different approaches to encoder parameter sharing. It presents a model combining ideas from “Learning to speak fluently in a foreign language: Multilingual speech synthesis and cross-language voice cloning,” “End-to-End Code-Switched TTS with Mix of Monolingual Recordings,” and “Contextual Parameter Generation for Universal Neural Machine Translation.” ...
    Downloads: 0 This Week
  • 14
    Magnitude

    A fast, efficient universal vector embedding utility package

    ...It is primarily intended to be a simpler / faster alternative to Gensim but can be used as a generic key-vector store for domains outside NLP. It offers unique features like out-of-vocabulary lookups and streaming of large models over HTTP. Published in our paper at EMNLP 2018 and available on arXiv.
    Downloads: 0 This Week
  • 15
    Reliable Metrics for Generative Models

    Code base for the precision, recall, density, and coverage metrics

    ...Because it does not differentiate the fidelity and diversity aspects of the generated images, recent papers have introduced variants of precision and recall metrics to diagnose those properties separately. In this paper, we show that even the latest versions of the precision and recall metrics (Kynkäänniemi et al., 2019) are not yet reliable. For example, they fail to detect the match between two identical distributions, they are not robust against outliers, and their evaluation hyperparameters are selected arbitrarily. We propose density and coverage metrics that solve the above issues.
    Downloads: 0 This Week
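The density and coverage metrics named above can be sketched on small point sets. This is a simplified illustration rather than the project's code: the function names are hypothetical, distances are plain Euclidean distances on raw points (real evaluations would normally use feature embeddings), and the strict `<` comparison is an assumption about how ties are handled.

```python
import math


def kth_neighbor_radius(points, index, k):
    """Distance from points[index] to its k-th nearest neighbor, excluding itself."""
    distances = sorted(
        math.dist(points[index], other)
        for j, other in enumerate(points)
        if j != index
    )
    return distances[k - 1]


def density_and_coverage(real, fake, k=1):
    """Compute density and coverage of fake samples with respect to real samples."""
    radii = [kth_neighbor_radius(real, i, k) for i in range(len(real))]
    # Density: count how many real neighborhoods contain each fake sample.
    hits = sum(
        1
        for f in fake
        for r, radius in zip(real, radii)
        if math.dist(r, f) < radius
    )
    density = hits / (k * len(fake))
    # Coverage: fraction of real neighborhoods holding at least one fake sample.
    covered = sum(
        1
        for r, radius in zip(real, radii)
        if any(math.dist(r, f) < radius for f in fake)
    )
    coverage = covered / len(real)
    return density, coverage
```

Unlike precision, density counts every real neighborhood a fake sample falls into, so a single outlier neighborhood has less influence on the score.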
  • 16
    PixelCNN

    Code for the paper "PixelCNN++: A PixelCNN Implementation..."

    ...It also includes scripts for reproducing key experimental results from the paper, such as conditional sampling on datasets like CIFAR-10. The project serves as both a research reference and a practical framework for experimenting with autoregressive generative models. Although archived, PixelCNN has influenced a wide range of later work in generative modeling, including advancements in image transformers and diffusion models.
    Downloads: 2 This Week
  • 17
    DeepSDF

    Learning Continuous Signed Distance Functions for Shape Representation

    DeepSDF is a deep learning framework for continuous 3D shape representation using Signed Distance Functions (SDFs), as presented in the CVPR 2019 paper DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation by Park et al. The framework learns a continuous implicit function that maps 3D coordinates to their corresponding signed distances from object surfaces, allowing compact, high-fidelity shape modeling. Unlike traditional discrete voxel grids or meshes, DeepSDF encodes shapes as continuous neural representations that can be smoothly interpolated and used for reconstruction, generation, and analysis. ...
    Downloads: 4 This Week
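To make the notion of a signed distance function concrete, here is a closed-form example for a sphere. It is only an illustration of what such a function returns: DeepSDF learns the mapping with a neural network, whereas `sphere_sdf` and `classify` are hypothetical names for an analytic stand-in.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from a 3D point to the surface of a sphere.

    Negative inside the sphere, zero on the surface, positive outside.
    """
    return math.dist(point, center) - radius


def classify(point):
    """Label a point relative to the unit sphere using the sign of its distance."""
    d = sphere_sdf(point)
    if d < 0:
        return "inside"
    if d > 0:
        return "outside"
    return "surface"
```

The surface is recovered as the zero level set of the function, which is why a continuous representation can be sampled at any resolution.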
  • 18
    gpt2-client

    Easy-to-use TensorFlow Wrapper for GPT-2 117M, 345M, 774M, etc.

    ...It is the successor to the GPT (Generative Pre-trained Transformer) model trained on 40GB of text from the internet. It features a Transformer model that was brought to light by the “Attention Is All You Need” paper in 2017. The model has 4 versions - 124M, 345M, 774M, and 1558M - that differ in terms of the amount of training data fed to it and the number of parameters they contain. Finally, gpt2-client is a wrapper around the original gpt-2 repository that features the same functionality but with more accessibility, comprehensibility, and utility. ...
    Downloads: 0 This Week
  • 19
    DCVGAN

    DCVGAN: Depth Conditional Video Generation, ICIP 2019.

    This paper proposes a new GAN architecture for video generation with depth videos and color videos. The proposed model explicitly uses the depth information in a video sequence as additional information for a GAN-based video generation scheme, so that the model understands scene dynamics more accurately. The model is trained on pairs of color and depth videos and generates a video in two steps.
    Downloads: 0 This Week
  • 20
    PyTorch pretrained BigGAN

    PyTorch implementation of BigGAN with pretrained weights

    This repository contains an op-for-op PyTorch reimplementation of DeepMind's BigGAN, the model released with the paper “Large Scale GAN Training for High Fidelity Natural Image Synthesis.” It is provided with the pretrained 128x128, 256x256 and 512x512 models from DeepMind, along with the scripts used to download and convert these models from the TensorFlow Hub models. The reimplementation was done from the raw computation graph of the TensorFlow version and behaves similarly to it (variance of the output difference on the order of 1e-5). ...
    Downloads: 0 This Week
  • 21
    SSD

    A PyTorch Implementation of Single Shot MultiBox Detector

    SSD is a PyTorch implementation of the Single Shot MultiBox Detector, a well-known object detection architecture introduced in the original SSD paper. It is built to help users train, evaluate, and experiment with object detection models using PyTorch rather than the original Caffe implementation. The repository includes the major components needed for an object detection workflow, including training scripts, evaluation scripts, demos, and utility modules. It supports commonly used benchmark datasets such as PASCAL VOC and MS COCO, and it also provides scripts to simplify downloading and setting up those datasets. ...
    Downloads: 0 This Week
  • 22
    DetectAndTrack

    The implementation of an algorithm presented in the CVPR18 paper

    DetectAndTrack is the reference implementation for the CVPR 2018 paper “Detect-and-Track: Efficient Pose Estimation in Videos,” focusing on human keypoint detection and tracking across video frames. The system combines per-frame pose detection with a tracking mechanism to maintain identities over time, enabling efficient multi-person pose estimation in video. Code and instructions are organized to replicate paper results and to serve as a starting point for researchers working on pose in video. ...
    Downloads: 0 This Week
  • 23
    Tacotron-2

    DeepMind's Tacotron-2 Tensorflow implementation

    Tacotron-2 is a TensorFlow implementation of DeepMind’s Tacotron-2 end-to-end text-to-speech architecture, which predicts mel spectrograms from raw text and then feeds them to a neural vocoder such as WaveNet. It reproduces the original paper’s hyperparameters exactly via paper_hparams.py, while also offering a tuned hparams.py with extra improvements that often yield better audio quality in practice. The repository is structured as a full training pipeline: dataset preparation,...
    Downloads: 0 This Week
  • 24
    Improved GAN

    Code for the paper "Improved Techniques for Training GANs"

    Improved-GAN is the official code release from OpenAI accompanying the research paper Improved Techniques for Training GANs. It provides implementations of experiments conducted on datasets such as MNIST, SVHN, CIFAR-10, and ImageNet. The project focuses on demonstrating enhanced training methods for Generative Adversarial Networks, addressing stability and performance issues that were common in earlier GAN models.
    Downloads: 2 This Week
  • 25
    Finetune Transformer LM

    Code for "Improving Language Understanding by Generative Pre-Training"

    finetune-transformer-lm is a research codebase that accompanies the paper “Improving Language Understanding by Generative Pre-Training,” providing a minimal implementation focused on fine-tuning a transformer language model for evaluation tasks. The repository centers on reproducing the ROCStories Cloze Test result and includes a single-command training workflow to run the experiment end to end. It documents that runs are non-deterministic due to certain GPU operations and reports a median accuracy over multiple trials that is slightly below the single-run result in the paper, reflecting expected variance in practice. ...
    Downloads: 3 This Week