Showing 359 open source projects for "encoder"

  • 1
    Bert-VITS2

    VITS2 backbone with multilingual-bert

    Bert-VITS2 is a neural text-to-speech project that combines a VITS2 backbone with a multilingual BERT front-end to produce high-quality speech in multiple languages. The core idea is to use BERT-style contextual embeddings for text encoding while relying on a refined VITS2 architecture for acoustic generation and vocoding. The repository includes everything needed to train, fine-tune, and run the model, from configuration files to preprocessing scripts, spectrogram utilities, and training...
    Downloads: 0 This Week
  • 2
    LAME (Lame Aint an MP3 Encoder)

    A high quality MP3 encoder

    LAME is an educational tool to be used for learning about MP3 encoding. The goal of the LAME project is to improve the psychoacoustics, quality and speed of MP3 encoding. Note: we provide source code only!
    Downloads: 24,974 This Week
  • 3
    Jingo

    This package provides the ability to encode golang structs

    ...Its main benefit is that it has pooling built in. This goes a long way toward making jingo fast by reducing its allocations and ensuring good write speeds. When you create an instance of an encoder, it recursively generates an instruction set that defines how to iteratively encode your structs. This gives it the ability to provide a clear API but with the same benefits as a build-time optimized encoder. It is able to do almost all type assertions and reflection during this compile step, then makes ample use of the unsafe package during instruction-set execution (the Marshal call) to make reading and writing very fast.
    Downloads: 0 This Week
  • 4
    Shap-E

    Generate 3D objects conditioned on text or images

    The shap-e repository provides the official code and model release for Shap-E, a conditional generative model designed to produce 3D assets (implicit functions, meshes, neural radiance fields) from text or image prompts. The model is built with a two-stage architecture: first an encoder that maps existing 3D assets into parameterizations of implicit functions, and then a conditional diffusion model trained on those parameterizations to generate new assets. Because it works at the level of implicit functions, Shap-E can render output both as textured meshes and NeRF-style volumetric renderings. The repository contains sample notebooks (e.g. sample_text_to_3d.ipynb, sample_image_to_3d.ipynb) so users can try out text → 3D or image → 3D generation. ...
    Downloads: 0 This Week
  • 5
    microenc

    Batch audio encoding script for Linux/BSD

    microenc is a small Bash shell script for Linux/BSD that encodes directories of audio files to other formats, using FFmpeg as the encoder.
    Downloads: 0 This Week
  • 6
    eCxx

    A C++ library for AVR and NodeMCU

    NOTE: This project is marked with 'Status: Abandoned' on SourceForge because not enough time can be dedicated to it. However, it may still get sporadic commits to the repository. eCxx is a library for AVR and NodeMCU tailored for micro LED displays and lighting effects. eCxx uses a Makefile build system. Java- and Python-based applications/tools are also included to ease development and debugging on the host PC. On one side, eCxx supports the original...
    Downloads: 0 This Week
  • 7
    rotary-turbo-PCA9685

    LED dimming code for a rotary encoder with turbo mode

    Applies a logarithmic scale to a rotary encoder with turbo mode. Turbo mode starts when the rotary wheel is turned rapidly and stops automatically after 3 seconds. Tested with stm32f103c8t6 (blue-pill) and PCA9685. A Gnumeric spreadsheet is included for checking that the double-precision scale values match the integers used in the Arduino sketch. You can modify this code according to your own needs. Please consider the following: a - when used for LED dimming, the logarithmically derived scale models the apparent light intensity not of the light source itself but of the whole light-reflecting 3D area; by adjusting the scale range (here 500 steps) and the turbo-mode timing, you can adapt this code to the light-reflecting characteristics of that area and to how the rotary is used, which also reduces the number of 360° turns needed without requiring a very large luminosity difference per step; b - with motors, ... be careful using turbo. :-)
    Downloads: 0 This Week
  • 8
    OpenFlamingo

    An open-source framework for training large multimodal models

    ...If you have any questions, please feel free to open an issue. We also welcome contributions! We provide an initial OpenFlamingo 9B model using a CLIP ViT-Large vision encoder and a LLaMA-7B language model. In general, we support any CLIP vision encoder. For the language model, we support LLaMA, OPT, GPT-Neo, GPT-J, and Pythia models. OpenFlamingo is a multimodal language model that can be used for a variety of tasks. It is trained on a large multimodal dataset.
    Downloads: 0 This Week
  • 9
    jbig2enc

    JBIG2 Encoder

    This is an encoder for JBIG2. JBIG2 encodes bi-level (1 bpp) images using a number of clever tricks to get better compression than G4. This encoder can: generate JBIG2 files, or fragments for embedding in PDFs; do generic region encoding; perform symbol extraction, classification and text region coding; perform refinement coding; and compress multipage documents. It uses the (Apache-ish licensed) Leptonica library: http://leptonica.com/
    Downloads: 22 This Week
  • 10
    Basaran

    Basaran, an open-source alternative to the OpenAI text completion API

    ...The open source community will eventually witness the Stable Diffusion moment for large language models (LLMs), and Basaran allows you to replace OpenAI's service with the latest open-source model to power your application without modifying a single line of code. Stream generation using various decoding strategies. Supports both decoder-only and encoder-decoder models. Detokenizer that handles surrogates and whitespace. Multi-GPU support with optional 8-bit quantization. Real-time partial progress using server-sent events. Compatible with OpenAI API and client libraries. Comes with a fancy web-based playground. Docker images are available on Docker Hub and GitHub Packages.
    Downloads: 0 This Week
  • 11
    MetaTransformer

    Meta-Transformer for Unified Multimodal Learning

    We're thrilled to present OneLLM, an ensembling Meta-Transformer framework with Multimodal Large Language Models, which performs multimodal joint training, supports more modalities including fMRI, Depth, and Normal Maps, and demonstrates very impressive performances on 25 benchmarks.
    Downloads: 0 This Week
  • 12
    Go-QRCode

    QRCode Encoder written in Pure Go

    A QR Code is a matrix (two-dimensional) barcode. Arbitrary content may be encoded, with URLs being a popular choice :) Each QR Code contains error recovery information to aid reading damaged or obscured codes. There are four levels of error recovery: Low, medium, high and highest. QR Codes with a higher recovery level are more robust to damage, at the cost of being physically larger. Copyright (c) 2014 Tom Harwood
    Downloads: 1 This Week
  • 13
    OpenNMT-tf

    Neural machine translation and sequence learning using TensorFlow

    ...Models are described with code to allow training custom architectures and overriding default behavior. For example, the following instance defines a sequence-to-sequence model with 2 concatenated input features, a self-attentional encoder, and an attentional RNN decoder sharing its input and output embeddings. Sequence-to-sequence models can be trained with guided alignment, and alignment information is returned as part of the translation API.
    Downloads: 0 This Week
  • 14
    Arduino ASCOM Focuser Pro DIY

    Arduino Focuser, fully ASCOM compliant

    myFocuserPro is an ASCOM and Moonlite compatible stepper motor telescope focus controller (DIY) based on Arduino Nano/Uno. A popular DIY ASCOM focuser with more than 121,000 downloads. (c) Copyright Robert Brown 2014-2024. All Rights reserved. Permission is granted for personal and Academic use only. Spreadsheet to calculate what stepper motor to use. https://sourceforge.net/projects/arduinoascomfocuserpro2diy/files/Documentation/Nema-Stepper-Motors.xlsx/download
    Downloads: 165 This Week
  • 15
    iJEPA

    Official codebase for I-JEPA

    i-JEPA (Image Joint-Embedding Predictive Architecture) is a self-supervised learning framework that predicts missing high-level representations rather than reconstructing pixels. A context encoder sees visible regions of an image and predicts target embeddings for masked regions produced by a slowly updated target encoder, focusing learning on semantics instead of texture. This objective sidesteps generative pixel losses and avoids heavy negative sampling, producing features that transfer strongly with linear probes and minimal fine-tuning. ...
    Downloads: 0 This Week
  • 16
    lora-svc

    Singing voice change based on whisper, lora for singing voice clone

    Singing voice change based on Whisper, with LoRA for singing voice clone. You will feel the beauty of the code from this project. The Uni-SVC main branch is for singing voice clone based on Whisper with a speaker encoder and speaker adapter. Uni-SVC's main target is to develop LoRA for SVC. With LoRA, cloning a singer may need only 10 sentences and 10 minutes of training. Each singer is a plug-in of the base model.
    Downloads: 0 This Week
  • 17
    Stable-Dreamfusion

    Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion

    ...Different from Imagen, Stable-Diffusion is a latent diffusion model, which diffuses in a latent space instead of the original image space. Therefore, we need the loss to propagate back from the VAE's encoder part too, which introduces extra time costs in training. We use the multi-resolution grid encoder to implement the NeRF backbone (implementation from torch-ngp), which enables much faster rendering.
    Downloads: 0 This Week
  • 18
    FasterTransformer

    Transformer related optimization, including BERT, GPT

    FasterTransformer is a high-performance inference library designed to accelerate transformer-based models such as BERT, GPT, and T5 on NVIDIA GPUs. It provides optimized implementations of transformer encoder and decoder layers using CUDA, cuBLAS, and custom kernels to maximize throughput and minimize latency. The library supports multiple deep learning frameworks, including TensorFlow, PyTorch, and Triton, allowing developers to integrate it into existing pipelines without major changes. It includes advanced optimization techniques such as mixed precision, tensor parallelism, and efficient memory management, enabling large models to run across multiple GPUs and nodes. ...
    Downloads: 0 This Week
  • 19
    CPT

    CPT: A Pre-Trained Unbalanced Transformer

    A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation. We replace the old BERT vocabulary with a larger one of size 51271 built from the training data, in which we 1) add 6800+ missing Chinese characters (most of them are traditional Chinese characters); 2) remove redundant tokens (e.g. Chinese character tokens with ## prefix); 3) add some English tokens to reduce OOV. Position Embeddings: We extend the max_position_embeddings from 512 to 1024. We...
    Downloads: 0 This Week
  • 20
    Lyra

    A Very Low-Bitrate Codec for Speech Compression

    ...Its architecture is resilient to packet loss and jitter through framing strategies and error concealment, helping conversations remain understandable under adverse conditions. The codebase includes encoder and decoder components, along with tools for data preparation and evaluation. By pushing bitrates down to just a few kilobits per second while retaining quality, Lyra expands access to voice calls where bandwidth is scarce.
    Downloads: 1 This Week
  • 21
    SmartKnob

    Haptic input knob with software-defined endstops and virtual detents

    SmartKnob is an open‑source hardware/software hybrid rotary input device featuring software-configurable end-stops and virtual detents, facilitated by a brushless gimbal motor and magnetic encoder—offering programmable haptic feedback and a novel, tactile user interface. 240x240 round LCD ("GC9A01"), protected by 39.5mm watch glass on rotor. BLDC gimbal motor, with a hollow shaft for mechanically & electrically connecting the LCD. PCB flexure and SMD resistors used as strain gauges for press detection (haptic feedback provided via the motor). ...
    Downloads: 0 This Week
  • 22
    Grip
    Grip is a GTK-based CD-player and CD-ripper / MP3 encoder. It has the ripping capabilities of cdparanoia built in, but can also use external rippers (such as cdda2wav). Encoder presets are provided for oggenc, bladeenc, lame, l3enc, xingmp3enc, mp3encode, gogo, flac, faac and opusenc. The main developers can be found in #grip on Libera (irc.libera.chat).
    Downloads: 22 This Week
  • 23
    Karlo

    Text-conditional image generation model based on OpenAI's unCLIP

    ...In the case of Prior and Decoder, we use ViT-L/14 provided by OpenAI’s CLIP repository. Unlike the original implementation of unCLIP, we replace the trainable transformer in the decoder with the text encoder in ViT-L/14 for efficiency. In the case of the SR module, we first train the model using the DDPM objective for 1M steps, followed by an additional 234K steps to fine-tune the additional component.
    Downloads: 0 This Week
  • 24
    Stable Diffusion

    A latent text-to-image diffusion model

    Stable Diffusion is a widely used open-source latent text-to-image diffusion model developed by the CompVis group for generating high-quality images from natural language prompts. The model operates by conditioning a diffusion process on text embeddings produced by a CLIP text encoder, enabling detailed and controllable image synthesis. It was trained on large-scale image datasets and later fine-tuned to produce 512×512 images with strong visual fidelity. Because the system runs efficiently on consumer hardware compared to earlier generative models, it helped popularize local AI image generation workflows. The repository includes reference scripts and model configurations that allow researchers and developers to reproduce, modify, or extend the architecture. ...
    Downloads: 4 This Week
  • 25
    LightSeq

    A High Performance Library for Sequence Processing and Generation

    LightSeq is a high-performance library focused on efficient inference and training for deep learning models, especially large language models (LLMs) and transformer-based architectures. Its goal is to optimize both memory usage and computational throughput, enabling faster training or inference on limited hardware while maintaining model quality. LightSeq provides optimized CUDA kernels, quantization strategies, and runtime optimizations tailored for transformer operations — which often are...
    Downloads: 0 This Week