Page 4 | paper free download

Showing 114 open source projects for "paper"

1

Deep Feature Rotation Multimodal Image

Implementation of Deep Feature Rotation for Multimodal Image

...Sample images are provided in data/content and data/style so you can try the method easily. We provide a visual comparison of additional rotation angles that do not appear in the paper. Different rotation angles produce a diverse range of outputs, which supports the effectiveness of our method compared with other methods.

Downloads: 0 This Week

Last Update: 2023-03-23
2

Paperless-ng

A supercharged version of paperless: scan, index and archive docs

Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have to worry about finding stuff again. I feed documents right from the post box into the scanner and then shred them. ...

Downloads: 0 This Week

Last Update: 2022-03-04
3

SimSiam

PyTorch implementation of SimSiam

SimSiam is a PyTorch implementation of “Exploring Simple Siamese Representation Learning” by Xinlei Chen and Kaiming He. The project introduces a minimalist approach to self-supervised learning that avoids negative pairs, momentum encoders, or large memory banks—key complexities of prior contrastive methods. SimSiam learns image representations by maximizing similarity between two augmented views of the same image through a Siamese neural network with a stop-gradient operation, preventing...

Downloads: 4 This Week

Last Update: 6 days ago
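
The SimSiam entry above describes maximizing similarity between two augmented views with a stop-gradient. A minimal sketch of a symmetric negative-cosine-similarity objective in plain Python follows. It only computes the loss value; the stop-gradient is noted in comments because it only has an effect inside an autograd framework. The names `p1`, `p2` (predictor outputs) and `z1`, `z2` (encoder outputs) are assumptions chosen for illustration and are not taken from the repository.

```python
import math


def neg_cosine(p, z):
    """Negative cosine similarity between a prediction p and a target z.

    In a real training loop the target z would be detached from the
    computation graph (stop-gradient); plain Python has no gradients,
    so that step appears here only as a comment. Assumes non-zero vectors.
    """
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)


def simsiam_loss(p1, z1, p2, z2):
    """Symmetric loss: each view's prediction is compared with the other view's target."""
    return 0.5 * neg_cosine(p1, z2) + 0.5 * neg_cosine(p2, z1)
```

The loss reaches its minimum of -1 when each prediction points in the same direction as the other view's target.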
4

TimeSformer

The official PyTorch implementation of our paper

TimeSformer is a vision transformer architecture for video that extends the standard attention mechanism into spatiotemporal attention. The model alternates attention along spatial and temporal dimensions (or designs variants like divided attention) so that it can capture both appearance and motion cues in video. Because the attention is global across frames, TimeSformer can reason about dependencies across long time spans, not just local neighborhoods. The official implementation in PyTorch...

Downloads: 0 This Week

Last Update: 2025-10-07
5

DeepSpeech

Open source embedded speech-to-text engine

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers. It uses a model trained with machine learning techniques based on Baidu's Deep Speech research paper, and relies on Google's TensorFlow to make the implementation easier. A pre-trained English model, along with other important inference material, can be downloaded from the DeepSpeech releases page by following the instructions in the usage docs.

Downloads: 4 This Week

Last Update: 2021-04-08
6

FixRes

Reproduces results of "Fixing the train-test resolution discrepancy"

...FixRes demonstrates that a mismatch between training and testing resolutions often leads to suboptimal accuracy, and fine-tuning the classifier and batch normalization layers at higher test resolutions significantly enhances performance. The repository includes pretrained models, feature embeddings, and evaluation scripts corresponding to the experiments reported in the NeurIPS 2019 paper “Fixing the train-test resolution discrepancy.”

Downloads: 4 This Week

Last Update: 6 days ago
7

Deep Exemplar-based Video Colorization

The source code of CVPR 2019 paper "Deep Exemplar-based Video Colorization"

The source code of CVPR 2019 paper "Deep Exemplar-based Video Colorization". End-to-end network for exemplar-based video colorization. The main challenge is to achieve temporal consistency while remaining faithful to the reference style. To address this issue, we introduce a recurrent framework that unifies the semantic correspondence and color propagation steps.

Downloads: 0 This Week

Last Update: 2023-03-23
8

Image GPT

Large-scale autoregressive pixel model for image generation by OpenAI

Image-GPT is the official research code and models from OpenAI’s paper Generative Pretraining from Pixels. The project adapts GPT-2 to the image domain, showing that the same transformer architecture can model sequences of pixels without altering its fundamental structure. It provides scripts to download pretrained checkpoints of different model sizes (small, medium, large) trained on large-scale datasets and includes utilities for handling color quantization with a 9-bit palette. ...

Downloads: 8 This Week

Last Update: 7 days ago
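
The Image GPT entry above mentions color quantization with a 9-bit palette, which means each pixel is represented by one of 512 palette indices instead of three 8-bit channels. The sketch below illustrates the idea with a uniform grid that keeps the top 3 bits of each channel. This uniform grid is purely an assumption for illustration and is not claimed to be the palette the project uses; the function names are hypothetical.

```python
def quantize_rgb(r, g, b):
    """Map an 8-bit RGB pixel to a 9-bit index (0..511).

    Assumption: a uniform palette that keeps the top 3 bits per channel.
    """
    return ((r >> 5) << 6) | ((g >> 5) << 3) | (b >> 5)


def dequantize(index):
    """Return the center color of the palette cell for a 9-bit index."""
    r = (index >> 6) & 0b111
    g = (index >> 3) & 0b111
    b = index & 0b111
    # Each cell spans 32 intensity levels; +16 picks the middle of the cell.
    return tuple((c << 5) + 16 for c in (r, g, b))
```

With this encoding an image becomes a sequence of integers below 512, which is the kind of token sequence a GPT-style model can consume.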
9

ALAE

Adversarial Latent Autoencoders

ALAE (Adversarial Latent Autoencoders) is a deep learning research implementation that combines autoencoders with generative adversarial networks to produce high-quality image synthesis models. The project implements the architecture introduced in the CVPR research paper on Adversarial Latent Autoencoders, which focuses on improving generative modeling by learning latent representations aligned with adversarial training objectives. Unlike traditional GANs that directly generate images from random noise, ALAE uses an encoder-decoder architecture that maps images into a structured latent space and then reconstructs them through adversarial training. ...

Downloads: 0 This Week

Last Update: 2026-03-11
10

Multi-Agent Emergence Environments

Environment generation code for the paper "Emergent Tool Use"

multi-agent-emergence-environments is an open source research environment framework developed by OpenAI for the study of emergent behaviors in multi-agent systems. It was designed for the experiments described in the paper and blog post “Emergent Tool Use from Multi-Agent Autocurricula”, which investigated how complex cooperative and competitive behaviors can evolve through self-play. The repository provides environment generation code that builds on the mujoco-worldgen package, enabling dynamic creation of simulated physical environments. Developers can construct custom environments by combining modular components such as Boxes, Ramps, and RandomWalls using a flexible layering approach that reduces code duplication. ...

Downloads: 0 This Week

Last Update: 6 days ago
11

CC-Net

Tools to download and clean up Common Crawl data

cc_net provides tools to download, segment, clean, and filter Common Crawl to build large-scale text corpora, including monolingual datasets and the multilingual CC-100 collection introduced in the associated paper. It includes pipelines to fetch snapshots, extract text, de-duplicate, identify language, and apply quality filtering based on heuristics and language models. The outputs are intended for pretraining language models and for creating standardized corpora that can be reproduced or updated with new crawls. The repository documents practical concerns like HTTP failures, snapshot differences, and stats JSONs, reflecting community use across many languages. ...

Downloads: 0 This Week

Last Update: 2025-10-11
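
The CC-Net entry above lists de-duplication as one pipeline stage. A common way to do this at scale is to hash a normalized form of each line and drop lines whose hash was already seen. The sketch below shows that general technique under stated assumptions: the normalization rule, the 8-byte truncated SHA-1 digest, and the function names are illustrative choices, not the repository's implementation.

```python
import hashlib


def normalize(line):
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return " ".join(line.lower().split())


def dedup_lines(lines):
    """Keep the first occurrence of each line, comparing by a short content hash."""
    seen = set()
    kept = []
    for line in lines:
        # Storing a truncated digest instead of the text keeps memory use low.
        digest = hashlib.sha1(normalize(line).encode("utf-8")).digest()[:8]
        if digest in seen:
            continue
        seen.add(digest)
        kept.append(line)
    return kept
```

For example, `dedup_lines(["Hello  World", "hello world", "Other"])` keeps only the first and third lines.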
12

EfficientNet Keras

Implementation of EfficientNet model. Keras and TensorFlow Keras

...Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. In this paper, we systematically study model scaling and identify that carefully balancing network depth, width, and resolution can lead to better performance. Based on this observation, we propose a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient. We demonstrate the effectiveness of this method on scaling up MobileNets and ResNet.

Downloads: 0 This Week

Last Update: 2022-08-10
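
The EfficientNet entry above describes scaling depth, width, and resolution together with a single compound coefficient. A small sketch of that arithmetic follows. The base constants 1.2, 1.1, and 1.15 are the values I recall being reported in the EfficientNet paper and should be treated as assumptions here; the function names and the baseline resolution of 224 are illustrative.

```python
# Assumed per-dimension base multipliers (depth, width, resolution).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi, base_depth=1.0, base_width=1.0, base_resolution=224):
    """Scale all three dimensions by one compound coefficient phi."""
    depth = base_depth * ALPHA ** phi
    width = base_width * BETA ** phi
    resolution = base_resolution * GAMMA ** phi
    return depth, width, resolution


def flops_factor(phi):
    """Approximate growth in compute: linear in depth, quadratic in width and resolution."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi
```

With these constants each unit increase in `phi` slightly less than doubles the estimated compute, which is what makes a single coefficient a convenient knob for a resource budget.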
13

Multilingual Speech Synthesis

An implementation of Tacotron 2 that supports multilingual experiments

This repository provides synthesized samples, training and evaluation data, source code, and parameters for the paper One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech. It contains an implementation of Tacotron 2 that supports multilingual experiments and that implements different approaches to encoder parameter sharing. It presents a model combining ideas from Learning to speak fluently in a foreign language: Multilingual speech synthesis and cross-language voice cloning, End-to-End Code-Switched TTS with Mix of Monolingual Recordings, and Contextual Parameter Generation for Universal Neural Machine Translation. ...

Downloads: 0 This Week

Last Update: 2023-03-24
14

Magnitude

A fast, efficient universal vector embedding utility package

...It is primarily intended to be a simpler / faster alternative to Gensim but can be used as a generic key-vector store for domains outside NLP. It offers unique features like out-of-vocabulary lookups and streaming of large models over HTTP. Published in our paper at EMNLP 2018 and available on arXiv.

Downloads: 0 This Week

Last Update: 2024-08-16
15

Reliable Metrics for Generative Models

Code base for the precision, recall, density, and coverage metrics

...Because it does not differentiate the fidelity and diversity aspects of the generated images, recent papers have introduced variants of precision and recall metrics to diagnose those properties separately. In this paper, we show that even the latest version of the precision and recall (Kynkäänniemi et al., 2019) metrics are not reliable yet. For example, they fail to detect the match between two identical distributions, they are not robust against outliers, and the evaluation hyperparameters are selected arbitrarily. We propose density and coverage metrics that solve the above issues.

Downloads: 0 This Week

Last Update: 2023-03-21
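
The entry above proposes density and coverage metrics. The sketch below follows my reading of the definitions in the paper, using k-nearest-neighbour balls around the real samples: density counts how many real-sample balls contain each generated sample (normalized by k), and coverage is the fraction of real samples whose ball contains at least one generated sample. It is an unoptimized pure-Python illustration, not the repository's code, and the function names are hypothetical.

```python
import math


def kth_nn_radius(points, k):
    """Distance from each point to its k-th nearest neighbour among the other points."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii


def density_and_coverage(real, fake, k):
    """Return (density, coverage) for lists of equal-dimension points."""
    radii = kth_nn_radius(real, k)
    # Density: total number of (fake, real-ball) containments, normalized by k * M.
    hits = sum(
        1
        for y in fake
        for x, r in zip(real, radii)
        if math.dist(x, y) < r
    )
    density = hits / (k * len(fake))
    # Coverage: fraction of real samples with at least one fake inside their ball.
    covered = sum(
        1
        for x, r in zip(real, radii)
        if any(math.dist(x, y) < r for y in fake)
    )
    coverage = covered / len(real)
    return density, coverage
```

For two identical point sets on a unit square with `k=1`, both values come out as 1.0, which is the identical-distributions case the description says earlier metrics fail to detect.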
16

PixelCNN

Code for the paper "PixelCNN++: A PixelCNN Implementation..."

...It also includes scripts for reproducing key experimental results from the paper, such as conditional sampling on datasets like CIFAR-10. The project serves as both a research reference and a practical framework for experimenting with autoregressive generative models. Although archived, PixelCNN has influenced a wide range of later work in generative modeling, including advancements in image transformers and diffusion models.

Downloads: 2 This Week

Last Update: 1 day ago
17

DeepSDF

Learning Continuous Signed Distance Functions for Shape Representation

DeepSDF is a deep learning framework for continuous 3D shape representation using Signed Distance Functions (SDFs), as presented in the CVPR 2019 paper DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation by Park et al. The framework learns a continuous implicit function that maps 3D coordinates to their corresponding signed distances from object surfaces, allowing compact, high-fidelity shape modeling. Unlike traditional discrete voxel grids or meshes, DeepSDF encodes shapes as continuous neural representations that can be smoothly interpolated and used for reconstruction, generation, and analysis. ...

Downloads: 4 This Week

Last Update: 6 days ago
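
The DeepSDF entry above describes a function that maps 3D coordinates to signed distances from an object surface. To make the target of that learning concrete, the sketch below gives the analytic signed distance function of a sphere, using the convention that values are negative inside the shape and positive outside. DeepSDF learns such a function with a neural network; this closed-form example is only an illustration, and the function names are hypothetical.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside."""
    return math.dist(point, center) - radius


def is_inside(point, sdf=sphere_sdf):
    """A point lies inside the shape when its signed distance is negative."""
    return sdf(point) < 0.0
```

The surface itself is the zero level set of the function, so a shape can be recovered by finding where the sign changes.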
18

gpt2-client

Easy-to-use TensorFlow Wrapper for GPT-2 117M, 345M, 774M, etc.

...It is the successor to the GPT (Generative Pre-trained Transformer) model and was trained on 40GB of text from the internet. It features a Transformer model that was brought to light by the Attention Is All You Need paper in 2017. The model has 4 versions - 124M, 345M, 774M, and 1558M - that differ in terms of the amount of training data fed to them and the number of parameters they contain. Finally, gpt2-client is a wrapper around the original gpt-2 repository that features the same functionality but with more accessibility, comprehensibility, and utility. ...

Downloads: 0 This Week

Last Update: 2023-03-23
19

DCVGAN

DCVGAN: Depth Conditional Video Generation, ICIP 2019.

This paper proposes a new GAN architecture for video generation with depth videos and color videos. The proposed model explicitly uses depth information from a video sequence as an additional input to a GAN-based video generation scheme, so that the model understands scene dynamics more accurately. The model is trained on pairs of color and depth videos and generates a video in two steps.

Downloads: 0 This Week

Last Update: 2023-03-22
20

PyTorch pretrained BigGAN

PyTorch implementation of BigGAN with pretrained weights

This repository contains an op-for-op PyTorch reimplementation of DeepMind's BigGAN that was released with the paper Large Scale GAN Training for High Fidelity Natural Image Synthesis. It is provided with the pretrained 128x128, 256x256 and 512x512 models by DeepMind, along with the scripts used to download and convert these models from the TensorFlow Hub models. The reimplementation was done from the raw computation graph of the TensorFlow version and behaves similarly to it (variance of the output difference of the order of 1e-5). ...

Downloads: 0 This Week

Last Update: 2023-03-21
21

SSD

A PyTorch Implementation of Single Shot MultiBox Detector

SSD is a PyTorch implementation of the Single Shot MultiBox Detector, a well-known object detection architecture introduced in the original SSD paper. It is built to help users train, evaluate, and experiment with object detection models using PyTorch rather than the original Caffe implementation. The repository includes the major components needed for an object detection workflow, including training scripts, evaluation scripts, demos, and utility modules. It supports commonly used benchmark datasets such as PASCAL VOC and MS COCO, and it also provides scripts to simplify downloading and setting up those datasets. ...

Downloads: 0 This Week

Last Update: 2026-03-11
22

DetectAndTrack

The implementation of an algorithm presented in the CVPR18 paper

DetectAndTrack is the reference implementation for the CVPR 2018 paper “Detect-and-Track: Efficient Pose Estimation in Videos,” focusing on human keypoint detection and tracking across video frames. The system combines per-frame pose detection with a tracking mechanism to maintain identities over time, enabling efficient multi-person pose estimation in video. Code and instructions are organized to replicate paper results and to serve as a starting point for researchers working on pose in video. ...

Downloads: 0 This Week

Last Update: 2025-10-08
23

Tacotron-2

DeepMind's Tacotron-2 TensorFlow implementation

Tacotron-2 is a TensorFlow implementation of DeepMind’s Tacotron-2 end-to-end text-to-speech architecture, which predicts mel spectrograms from raw text and then feeds them to a neural vocoder such as WaveNet. It reproduces the original paper’s hyperparameters exactly via paper_hparams.py, while also offering a tuned hparams.py with extra improvements that often yield better audio quality in practice. The repository is structured as a full training pipeline: dataset preparation,...

Downloads: 0 This Week

Last Update: 2025-11-28
24

Improved GAN

Code for the paper "Improved Techniques for Training GANs"

Improved-GAN is the official code release from OpenAI accompanying the research paper Improved Techniques for Training GANs. It provides implementations of experiments conducted on datasets such as MNIST, SVHN, CIFAR-10, and ImageNet. The project focuses on demonstrating enhanced training methods for Generative Adversarial Networks, addressing stability and performance issues that were common in earlier GAN models.

Downloads: 2 This Week

Last Update: 7 days ago
25

Finetune Transformer LM

Code for "Improving Language Understanding by Generative Pre-Training"

finetune-transformer-lm is a research codebase that accompanies the paper “Improving Language Understanding by Generative Pre-Training,” providing a minimal implementation focused on fine-tuning a transformer language model for evaluation tasks. The repository centers on reproducing the ROCStories Cloze Test result and includes a single-command training workflow to run the experiment end to end. It documents that runs are non-deterministic due to certain GPU operations and reports a median accuracy over multiple trials that is slightly below the single-run result in the paper, reflecting expected variance in practice. ...

Downloads: 3 This Week

Last Update: 6 days ago