Showing 49 open source projects for "encoding"

  • 1
    Tiktoken

    Tiktoken

    tiktoken is a fast BPE tokeniser for use with OpenAI's models

    tiktoken is a high-performance tokenizer library (based on byte-pair encoding, BPE) designed for use with OpenAI’s models. It handles encoding and decoding text to token IDs efficiently, with minimal overhead. Because tokenization is a fundamental step in preparing text for models, tiktoken is optimized for speed, memory, and correctness in model contexts (e.g. matching OpenAI’s internal tokenization). The repo supports multiple encodings (e.g.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    SpikingJelly

    SpikingJelly

    SpikingJelly is an open-source deep learning framework

    ...This makes it especially relevant for researchers interested in biologically inspired computing, event-driven processing, and energy-efficient AI systems. The framework includes neuron models, surrogate gradient training methods, encoding strategies, network components, and utilities for simulation and experimentation, allowing users to develop a wide variety of spiking architectures. It also supports integration with familiar PyTorch workflows, which lowers the barrier for machine learning practitioners who want to explore spiking approaches without abandoning mainstream tooling.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 3
    UltraRAG

    UltraRAG

    Less Code, Lower Barrier, Faster Deployment

    UltraRAG 2.0 is a low-code, MCP-enabled RAG framework that aims to lower the barrier to building complex retrieval pipelines for research and production. It provides end-to-end recipes—from encoding and indexing corpora to deploying retrievers and LLMs—so users can reproduce baselines and iterate rapidly. The toolkit comes with built-in support for popular RAG datasets, large corpora, and canonical baselines, plus documentation that walks from “quick start” to debugging and case analysis. It encourages pipeline composition via configuration, enabling researchers to swap retrievers, rerankers, and generators without heavy refactoring. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Instant Neural Graphics Primitives

    Instant Neural Graphics Primitives

    Instant neural graphics primitives: lightning fast NeRF and more

    ...The system implements several neural graphics primitives including neural radiance fields, signed distance functions, neural images, and neural volumes. These representations are trained using a compact neural network combined with a multiresolution hash encoding that dramatically accelerates both training and rendering processes. The framework is capable of reconstructing detailed 3D scenes from images and generating realistic views of those scenes in real time. Compared with earlier neural radiance field approaches, instant-ngp significantly reduces training time and computational requirements, enabling models to be trained within seconds or minutes on modern GPUs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 5
    Token-Oriented Object Notation

    Token-Oriented Object Notation

    Token-Oriented Object Notation (TOON)

    ...This design allows prompts containing structured data to use significantly fewer tokens, which can reduce inference costs and improve efficiency in LLM applications. The project includes a formal specification, encoding rules, and reference implementations that developers can use to serialize and parse TOON data in their applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    FastVLM

    FastVLM

    This repository contains the official implementation of FastVLM

    FastVLM is an efficiency-focused vision-language modeling stack that introduces FastViTHD, a hybrid vision encoder engineered to emit fewer visual tokens and slash encoding time, especially for high-resolution images. Instead of elaborate pruning stages, the design trades off resolution and token count through input scaling, simplifying the pipeline while maintaining strong accuracy. Reported results highlight dramatic speedups in time-to-first-token and competitive quality versus contemporary open VLMs, including comparisons across small and larger variants. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    LTX-Video

    LTX-Video

    Official repository for LTX-Video

    LTX-Video is a sophisticated multimedia processing framework from Lightricks designed to handle high-quality video editing, compositing, and transformation tasks with performance and scalability. It provides runtime components that efficiently decode, encode, and manipulate video streams, frame buffers, and audio tracks while exposing a rich API for building customized editing features like transitions, effects, color grading, and keyframe automation. The toolkit is built with both real-time...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 8
    UForm

    UForm

    Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion

    ...It comes with a set of homonymous pre-trained networks available on the HuggingFace portal and extends the transformers package to support mid-fusion models. Late-fusion models encode each modality independently, but into one shared vector space. Because of this independent encoding, late-fusion models are good at capturing coarse-grained features but often neglect fine-grained ones. This type of model is well suited for retrieval in large collections. The most famous example of such a model is CLIP by OpenAI. Early-fusion models encode both modalities jointly so they can take into account fine-grained features. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Tiny CUDA Neural Networks

    Tiny CUDA Neural Networks

    Lightning fast C++/CUDA neural network framework

    ...Lower-end cards must reduce the n_neurons parameter or use the CutlassMLP (better compatibility but slower) instead. tiny-cuda-nn comes with a PyTorch extension that allows using the fast MLPs and input encodings from within a Python context. These bindings can be significantly faster than full Python implementations; in particular for the multiresolution hash encoding.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 10
    caret

    caret

    caret (Classification And Regression Training) R package

    The caret (Classification And Regression Training) R package streamlines the process of building predictive machine learning models. It provides uniform interfaces for model training, tuning, evaluation, preprocessing, and variable importance. With support for over 200 models, caret is foundational for R workflows in modeling and machine learning.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    reverse-SynthID

    reverse-SynthID

    Reverse engineering Gemini's SynthID detection

    Reverse-SynthID is a research-focused project that analyzes and reverse-engineers Google’s SynthID watermarking system used in AI-generated images. It leverages signal processing and spectral analysis techniques to identify hidden watermark patterns without access to proprietary encoding methods. The project introduces a multi-resolution “SpectralCodebook” that maps watermark characteristics across different image sizes. Using this approach, it can detect SynthID watermarks with high accuracy and selectively reduce or remove them through frequency-domain manipulation. Unlike traditional image degradation methods, it performs targeted, minimally invasive adjustments that preserve image quality. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 12
    Transformers.jl

    Transformers.jl

    Julia Implementation of Transformer models

    Transformers.jl is a Julia library that implements Transformer models for natural language processing tasks. Inspired by architectures like BERT, GPT, and T5, the library offers a modular and flexible interface for building, training, and using transformer-based deep learning models. It supports training from scratch and fine-tuning pretrained models, and integrates with Flux.jl for automatic differentiation and optimization.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Qwen-2.5-VL

    Qwen-2.5-VL

    Qwen2.5-VL is the multimodal large language model series

    Qwen2.5 is a series of large language models developed by the Qwen team at Alibaba Cloud, designed to enhance natural language understanding and generation across multiple languages. The models are available in various sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameters, catering to diverse computational requirements. Trained on a comprehensive dataset of up to 18 trillion tokens, Qwen2.5 models exhibit significant improvements in instruction following, long-text generation...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 14
    SentencePiece

    SentencePiece

    Unsupervised text tokenizer for Neural Network-based text generation

    SentencePiece is an unsupervised text tokenizer and detokenizer mainly for Neural Network-based text generation systems where the vocabulary size is predetermined prior to the neural model training. SentencePiece implements subword units (e.g., byte-pair encoding (BPE) [Sennrich et al.] and unigram language model [Kudo]) with the extension of direct training from raw sentences. SentencePiece allows us to make a purely end-to-end system that does not depend on language-specific pre/postprocessing. Purely data driven, SentencePiece trains tokenization and detokenization models from sentences. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Janus

    Janus

    Unified Multimodal Understanding and Generation Models

    Janus is a sophisticated open-source project from DeepSeek AI that aims to unify both visual understanding and image generation in a single model architecture. Rather than having separate systems for “look and describe” and “prompt and generate”, Janus uses an autoregressive transformer framework with a decoupled visual encoder—allowing it to ingest images for comprehension and to produce images from text prompts with shared internal representations. The design tackles long-standing...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 16
    AI YouTube Shorts Generator

    AI YouTube Shorts Generator

    A python tool that uses GPT-4, FFmpeg, and OpenCV

    AI-YouTube-Shorts-Generator is a Python-based tool that automates the creation of short-form vertical video clips (“shorts”) from longer source videos — ideal for adapting content for platforms like YouTube Shorts, Instagram Reels, or TikTok. It analyzes input video (whether a local file or a YouTube URL), transcribes audio (with optional GPU-accelerated speech-to-text), uses an AI model to identify the most compelling or engaging segments, and then crops/resizes the video and applies...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 17
    WavTokenizer

    WavTokenizer

    SOTA discrete acoustic codec models with 40/75 tokens per second

    WavTokenizer is a state-of-the-art discrete acoustic codec designed specifically for audio language modeling, capable of compressing 24 kHz audio into just 40 or 75 tokens per second while preserving high perceptual quality. It is built to represent speech, music, and general audio with extremely low bitrate, making it ideal as a front-end for large audio language models like GPT-4o and similar architectures. The model uses a single-quantizer design together with temporal compression to...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    AutoMLPipeline.jl

    AutoMLPipeline.jl

    Package that makes it trivial to create and evaluate machine learning

    ...To illustrate, here is a pipeline expression and evaluation of a typical machine learning workflow that extracts numerical features (numf) for ica (Independent Component Analysis) and pca (Principal Component Analysis) transformations, respectively, concatenated with the one-hot encoding (ohe) of the categorical features (catf) of a given dataset for rf (Random Forest) modeling.
    Downloads: 0 This Week
    Last Update:
    See Project
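
The AutoMLPipeline.jl entry above mentions concatenating numerical features with a one-hot ("hot-bit") encoding of categorical features. As a rough illustration of that idea only, here is a minimal Python sketch; it is not the package's Julia API, and the helper names `one_hot_encode` and `concat_features` are hypothetical.

```python
def one_hot_encode(values):
    """Return (categories, rows): one 0/1 column per distinct category."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one "hot" bit per row
        rows.append(row)
    return categories, rows


def concat_features(numeric_rows, onehot_rows):
    """Concatenate numeric columns with one-hot columns, row by row."""
    return [list(num) + hot for num, hot in zip(numeric_rows, onehot_rows)]


# Hypothetical toy data: one numeric column and one categorical column.
colors = ["red", "green", "red", "blue"]
sizes = [[1.5], [2.0], [0.5], [3.0]]

categories, encoded = one_hot_encode(colors)
features = concat_features(sizes, encoded)
```

Each row of `features` holds the numeric value followed by one column per category, which is the shape a downstream model such as a random forest would consume.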
  • 19
    Vision Transformer Pytorch

    Vision Transformer Pytorch

    Implementation of Vision Transformer, a simple way to achieve SOTA

    This repository provides a from-scratch, minimalist implementation of the Vision Transformer (ViT) in PyTorch, focusing on the core architectural pieces needed for image classification. It breaks down the model into patch embedding, positional encoding, multi-head self-attention, feed-forward blocks, and a classification head so you can understand each component in isolation. The code is intentionally compact and modular, which makes it easy to tinker with hyperparameters, depth, width, and attention dimensions. Because it stays close to vanilla PyTorch, you can integrate custom datasets and training loops without framework lock-in. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    LLaMA-Mesh

    LLaMA-Mesh

    Unifying 3D Mesh Generation with Language Models

    LLaMA-Mesh is a research framework that extends large language models so they can understand and generate 3D mesh data alongside text. The system introduces a method for representing 3D meshes in a textual format by encoding vertex coordinates and face definitions as sequences that can be processed by a language model. By serializing 3D geometry into text tokens, the approach allows existing transformer architectures to generate and interpret 3D models without requiring specialized visual tokenizers. The project includes a supervised fine-tuning dataset composed of interleaved text and mesh data, allowing the model to learn relationships between textual descriptions and 3D structures. ...
    Downloads: 0 This Week
    Last Update:
    See Project
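
The LLaMA-Mesh entry above describes serializing vertex coordinates and face definitions as text a language model can process. The sketch below shows one way such a serialization could look, assuming an OBJ-style layout (`v x y z` and `f a b c` lines) and integer quantization of coordinates; the bin count of 64 and the function names are assumptions made for illustration, not details taken from the listing.

```python
def quantize(value, lo, hi, bins=64):
    """Map a float in [lo, hi] to an integer bin in [0, bins - 1]."""
    if hi == lo:
        return 0
    return int(round((value - lo) / (hi - lo) * (bins - 1)))


def mesh_to_text(vertices, faces, bins=64):
    """Serialize a mesh as OBJ-style text with quantized integer coordinates."""
    coords = [c for vertex in vertices for c in vertex]
    lo, hi = min(coords), max(coords)
    lines = [
        "v " + " ".join(str(quantize(c, lo, hi, bins)) for c in vertex)
        for vertex in vertices
    ]
    # OBJ face indices are 1-based, so shift the 0-based input indices.
    lines += ["f " + " ".join(str(i + 1) for i in face) for face in faces]
    return "\n".join(lines)


def text_to_mesh(text):
    """Parse the text form back into quantized vertices and 0-based faces."""
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            vertices.append(tuple(int(p) for p in parts[1:4]))
        elif parts[0] == "f":
            faces.append(tuple(int(p) - 1 for p in parts[1:]))
    return vertices, faces


# A unit tetrahedron as a toy example.
tetra_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tetra_faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
mesh_text = mesh_to_text(tetra_vertices, tetra_faces)
```

Quantizing to small integers keeps each coordinate short in text form, at the cost of some geometric precision.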
  • 21
    MiniMax-01

    MiniMax-01

    Large-language-model & vision-language-model based on Linear Attention

    MiniMax-01 is the official repository for two flagship models: MiniMax-Text-01, a long-context language model, and MiniMax-VL-01, a vision-language model built on top of it. MiniMax-Text-01 uses a hybrid attention architecture that blends Lightning Attention, standard softmax attention, and Mixture-of-Experts (MoE) routing to achieve both high throughput and long-context reasoning. It has 456 billion total parameters with 45.9 billion activated per token and is trained with advanced parallel...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Chinese-LLaMA-Alpaca-3

    Chinese-LLaMA-Alpaca-3

    Chinese Llama-3 LLMs developed from Meta Llama 3

    Chinese-LLaMA-Alpaca-3 is an open-source project that provides Mandarin-focused large language models based on Meta’s LLaMA-3 architecture, with both foundational and instruction-tuned variants to support high-quality Chinese natural language understanding and generation. It extends the original LLaMA models with expanded Chinese vocabularies and additional pretraining on Chinese corpora to improve semantic encoding and decoding specifically for Chinese text. Alongside the base models, the project also releases Chinese Alpaca models that are fine-tuned on instruction datasets so they behave more like conversational and instruction-following AI assistants. It includes scripts and tooling that let researchers or developers run training, fine-tuning, quantization, and deployment on local machines (CPU or GPU), making experimentation and testing accessible without requiring large clusters.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    minbpe

    minbpe

    Minimal, clean code for the Byte Pair Encoding (BPE) algorithm

    minbpe is a minimal, clean implementation of byte-level Byte Pair Encoding (BPE), the tokenization approach widely used in modern language models. It operates on UTF-8 encoded bytes rather than Unicode characters, which makes it robust to arbitrary text inputs and avoids needing a language-specific character vocabulary. The repository is structured as a teaching-oriented implementation that shows how to train a tokenizer by learning merge rules, then apply those merges to encode text into token IDs and decode tokens back into text. ...
    Downloads: 0 This Week
    Last Update:
    See Project
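
The minbpe entry above outlines the byte-level BPE workflow: learn merge rules, apply them to encode text into token IDs, and decode IDs back into text. The following is a simplified sketch of that workflow written for this listing; it is not minbpe's actual code, and the function names are illustrative.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train(text, num_merges):
    """Learn up to `num_merges` merge rules from the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))
    merges = {}  # (left_id, right_id) -> new token ID, in learned order
    for step in range(num_merges):
        counts = Counter(zip(ids, ids[1:]))
        if not counts:
            break
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        new_id = 256 + step                 # IDs 0..255 are the raw bytes
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges


def encode(text, merges):
    """Encode text by applying the merges in the order they were learned."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():
        ids = merge(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """Expand token IDs back into bytes, then decode as UTF-8."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")


merges = train("aaabdaaabac", num_merges=3)
token_ids = encode("aaabdaaabac", merges)
round_trip = decode(token_ids, merges)
```

Because the base vocabulary is the 256 possible byte values, any input string can be encoded even if it contains characters never seen during training.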
  • 24
    Streamline Analyst

    Streamline Analyst

    AI agent that streamlines the entire process of data analysis

    Streamline Analyst is an open-source application powered by Large Language Models (LLMs) and designed for data analysis. This data analysis agent automates tasks such as data cleaning and preprocessing, as well as more complex operations like identifying target objects, partitioning test sets, and selecting the best-fit models for your data. Streamline Analyst also handles results visualization and evaluation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    GPT-2

    GPT-2

    Code for the paper Language Models are Unsupervised Multitask Learners

    This repository contains the code and model weights for GPT-2, a large-scale unsupervised language model described in the OpenAI paper “Language Models are Unsupervised Multitask Learners.” The intent is to provide a starting point for researchers and engineers to experiment with GPT-2: generate text, fine‐tune on custom datasets, explore model behavior, or study its internal phenomena. The repository includes scripts for sampling, training, downloading pre-trained models, and utilities for...
    Downloads: 12 This Week
    Last Update:
    See Project