Showing 56 open source projects for "encoding"

  • 1
    MiniMax-01

    Large-language-model & vision-language-model based on Linear Attention

    MiniMax-01 is the official repository for two flagship models: MiniMax-Text-01, a long-context language model, and MiniMax-VL-01, a vision-language model built on top of it. MiniMax-Text-01 uses a hybrid attention architecture that blends Lightning Attention, standard softmax attention, and Mixture-of-Experts (MoE) routing to achieve both high throughput and long-context reasoning. It has 456 billion total parameters with 45.9 billion activated per token and is trained with advanced parallel...
    Downloads: 0 This Week
  • 2
    Chinese-LLaMA-Alpaca-3

    Chinese Llama-3 LLMs developed from Meta Llama 3

    Chinese-LLaMA-Alpaca-3 is an open-source project that provides Mandarin-focused large language models based on Meta’s LLaMA-3 architecture, with both foundational and instruction-tuned variants to support high-quality Chinese natural language understanding and generation. It extends the original LLaMA models with expanded Chinese vocabularies and additional pretraining on Chinese corpora to improve semantic encoding and decoding specifically for Chinese text. Alongside the base models, the project also releases Chinese Alpaca models that are fine-tuned on instruction datasets so they behave more like conversational and instruction-following AI assistants. It includes scripts and tooling that let researchers or developers run training, fine-tuning, quantization, and deployment on local machines (CPU or GPU), making experimentation and testing accessible without requiring large clusters.
    Downloads: 0 This Week
  • 3
    minbpe

    Minimal, clean code for the Byte Pair Encoding (BPE) algorithm

    minbpe is a minimal, clean implementation of byte-level Byte Pair Encoding (BPE), the tokenization approach widely used in modern language models. It operates on UTF-8 encoded bytes rather than Unicode characters, which makes it robust to arbitrary text inputs and avoids needing a language-specific character vocabulary. The repository is structured as a teaching-oriented implementation that shows how to train a tokenizer by learning merge rules, then apply those merges to encode text into token IDs and decode tokens back into text. ...
    Downloads: 0 This Week
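The train, encode, and decode steps described for minbpe can be illustrated with a minimal sketch of byte-level BPE. This is not minbpe's actual code or API: the function names `pair_counts`, `merge`, `train`, `encode`, and `decode` are hypothetical, and details such as tie-breaking and the choice of 256 as the first new token ID are assumptions made for the example.

```python
from collections import Counter


def pair_counts(ids):
    """Count how often each adjacent pair of token IDs occurs."""
    return Counter(zip(ids, ids[1:]))


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train(text, num_merges):
    """Learn up to `num_merges` merge rules from the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))  # start from raw bytes: IDs 0..255
    merges = {}
    for step in range(num_merges):
        counts = pair_counts(ids)
        if not counts:
            break  # nothing left to merge
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        new_id = 256 + step  # assumption: new tokens start after the byte range
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges


def encode(text, merges):
    """Apply the learned merges, in training order, to new text."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():  # dicts keep insertion order
        ids = merge(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """Expand token IDs back into bytes, then into text."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (a, b), new_id in merges.items():
        vocab[new_id] = vocab[a] + vocab[b]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")


merges = train("aaabdaaabac", num_merges=3)
ids = encode("aaabdaaabac", merges)
text = decode(ids, merges)
```

Because the base vocabulary is the 256 possible byte values, any input string can be encoded, including text in scripts that never appeared in the training data.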
  • 4
    Streamline Analyst

    AI agent that streamlines the entire process of data analysis

    Streamline Analyst is a cutting-edge, open-source application powered by Large Language Models (LLMs) designed to revolutionize data analysis. This Data Analysis Agent effortlessly automates all the tasks such as data cleaning, preprocessing, and even complex operations like identifying target objects, partitioning test sets, and selecting the best-fit models based on your data. With Streamline Analyst, results visualization and evaluation become seamless.
    Downloads: 0 This Week
  • 5
    GPT-2

    Code for the paper Language Models are Unsupervised Multitask Learners

    This repository contains the code and model weights for GPT-2, a large-scale unsupervised language model described in the OpenAI paper “Language Models are Unsupervised Multitask Learners.” The intent is to provide a starting point for researchers and engineers to experiment with GPT-2: generate text, fine‐tune on custom datasets, explore model behavior, or study its internal phenomena. The repository includes scripts for sampling, training, downloading pre-trained models, and utilities for...
    Downloads: 15 This Week
  • 6
    Bert-VITS2

    VITS2 backbone with multilingual-bert

    Bert-VITS2 is a neural text-to-speech project that combines a VITS2 backbone with a multilingual BERT front-end to produce high-quality speech in multiple languages. The core idea is to use BERT-style contextual embeddings for text encoding while relying on a refined VITS2 architecture for acoustic generation and vocoding. The repository includes everything needed to train, fine-tune, and run the model, from configuration files to preprocessing scripts, spectrogram utilities, and training entrypoints for multi-GPU and multi-node setups. It provides emotional modeling through “emo embeddings,” allowing voices to be conditioned on different affective states during synthesis. ...
    Downloads: 1 This Week
  • 7
    towhee

    Framework that is dedicated to making neural data processing

    ...From images to text to 3D molecular structures, Towhee supports data transformation for nearly 20 different unstructured data modalities. We provide end-to-end pipeline optimizations, covering everything from data decoding/encoding, to model inference, making your pipeline execution 10x faster. Towhee provides out-of-the-box integration with your favorite libraries, tools, and frameworks, making development quick and easy. Towhee includes a pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making processing unstructured data as easy as handling tabular data.
    Downloads: 0 This Week
  • 8
    ConsistencyDecoder

    Consistency Distilled Diff VAE

    ...Instead of relying solely on the standard GAN or VAE decoder, this approach leverages a Consistency Distilled Diff VAE, designed to produce higher-quality and more stable outputs from encoded latents. The project provides a simple API for encoding with a Stable Diffusion VAE and decoding using the new consistency model, allowing for side-by-side comparisons with traditional decoders. It demonstrates how consistency models can enhance visual fidelity while maintaining efficiency, reducing artifacts common in GAN-decoded outputs. The repository includes installation instructions, usage examples, and visual comparisons to highlight improvements. ...
    Downloads: 2 This Week
  • 9
    Chinese-LLaMA-Alpaca-2 v2.0

    Chinese LLaMA & Alpaca large language model + local CPU/GPU training

    This project has open-sourced the Chinese LLaMA model and the instruction-tuned Alpaca large model to further promote open research on large models in the Chinese NLP community. Based on the original LLaMA, these models expand the Chinese vocabulary and use Chinese data for secondary pre-training, which further improves basic semantic understanding of Chinese. At the same time, the Chinese Alpaca model further uses Chinese instruction data for fine-tuning, which...
    Downloads: 0 This Week
  • 10
    minGPT

    A minimal PyTorch re-implementation of the OpenAI GPT

    minGPT is a minimalist, educational re-implementation of the GPT (Generative Pretrained Transformer) architecture built in PyTorch, designed by Andrej Karpathy to expose the core structure of a transformer-based language model in as few lines of code as possible. It strips away extraneous bells and whistles, aiming to show how a sequence of token indices is fed into a stack of transformer blocks and then decoded into the next token probabilities, with both training and inference supported....
    Downloads: 0 This Week
  • 11
    Mozilla JPEG Encoder Project

    Improved JPEG encoder

    ...We include a demo cjpeg command-line tool, but it's not intended for serious use. We encourage authors of graphics programs to use libjpeg's C API and link with MozJPEG library instead. Progressive encoding with "jpegrescan" optimization. It can be applied to any JPEG file (with jpegtran) to losslessly reduce file size.
    Downloads: 13 This Week
  • 12
    Auto-PyTorch

    Automatic architecture search and hyperparameter optimization

    While early AutoML frameworks focused on optimizing traditional ML pipelines and their hyperparameters, another trend in AutoML is to focus on neural architecture search. To bring the best of these two worlds together, we developed Auto-PyTorch, which jointly and robustly optimizes the network architecture and the training hyperparameters to enable fully automated deep learning (AutoDL). Auto-PyTorch is mainly developed to support tabular data (classification, regression) and time series...
    Downloads: 0 This Week
  • 13
    Alphafold2

    Unofficial PyTorch implementation / replication of Alphafold2

    ...DeepMind has open sourced the official code in JAX, along with the weights! This repository will now be geared towards a straight PyTorch translation with some improvements on positional encoding. lhatsk has reported training a modified trunk of this repository, using the same setup as trRosetta, with competitive results. The underlying assumption is that the trunk works on the residue level and is then resolved to the atomic level for the structure module, whether it be SE3 Transformers, E(n)-Transformer, or EGNN doing the refinement.
    Downloads: 0 This Week
  • 14

    avio

    Python version of ffplay with built-in AI

    See the Files tab above for installation instructions
    Downloads: 1 This Week
  • 15
    OpenPrompt

    An Open-Source Framework for Prompt-Learning

    ...In the future, we will also support PLMs implemented by other libraries. The template is one of the most important modules in prompt learning; it wraps the original input with a textual or soft-encoding sequence. Use the implementations of current prompt-learning approaches: we have implemented various prompting methods, including templating, verbalizing, and optimization strategies, under a unified standard. You can easily call and understand these methods.
    Downloads: 0 This Week
  • 16
    VideoSrt

    Windows-GUI

    ...Recognize speech in video/audio to generate subtitle files (supports Chinese-English translation and bilingual subtitles). Extract speech text from video/audio. Batch translation, filtering, and encoding of SRT subtitle files. It uses the Alibaba Cloud speech recognition interface, so accuracy is high: the recognition rate for standard Mandarin and English is over 95%. Video recognition does not require uploading the original video, which is convenient and saves time.
    Downloads: 12 This Week
  • 17
    Kite

    Primary Kite repo, private bits replaced with XXXXXXX

    ...However, we do have cloud instances & VMs available for running larger jobs and for testing our cloud services. We bundle a lot of pre-computed datasets & machine learning models into the Kite app through the use of a custom filemap & encoding on top of go-bindata. The data, located in kite-go/client/datadeps, is kept in Git-LFS.
    Downloads: 7 This Week
  • 18
    Feature-engine

    Feature engineering package with sklearn like functionality

    Feature-engine is a Python library with multiple transformers to engineer and select features for use in machine learning models. Feature-engine's transformers follow Scikit-learn's functionality with fit() and transform() methods to learn the transforming parameters from the data and then transform it.
    Downloads: 0 This Week
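The fit()/transform() pattern mentioned for Feature-engine can be illustrated with a toy, dependency-free sketch. The `MeanImputer` class below is hypothetical and is not Feature-engine's or Scikit-learn's implementation; it only shows the idea that fit() learns parameters from training data and transform() applies them to new data.

```python
from statistics import mean


class MeanImputer:
    """Toy transformer (hypothetical, not part of Feature-engine)."""

    def __init__(self, variables):
        self.variables = variables  # names of the columns to impute

    def fit(self, rows):
        # Learn one parameter per variable: the mean of its non-missing values.
        self.means_ = {
            v: mean(r[v] for r in rows if r[v] is not None)
            for v in self.variables
        }
        return self  # returning self allows chaining: Imputer(...).fit(data)

    def transform(self, rows):
        # Apply the learned means; the input rows are left unmodified.
        return [
            {
                k: (self.means_[k] if k in self.means_ and val is None else val)
                for k, val in r.items()
            }
            for r in rows
        ]


train_rows = [
    {"age": 20, "city": "A"},
    {"age": None, "city": "B"},
    {"age": 40, "city": "C"},
]
imputer = MeanImputer(["age"]).fit(train_rows)
new_rows = imputer.transform([{"age": None, "city": "D"}])
```

Keeping the learned values on the fitted object means the same parameters can be reused on any later data set, which is what keeps training and prediction data consistent.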
  • 19

    Face recognition with mask

    Face recognition with mask

    Face recognition that works even when the person is wearing a mask. Put portrait photos in the images folder, naming each file after the person. On the main screen, click "encoding" to encode the facial features; the recognition results then appear on the live webcam feed.
    Downloads: 0 This Week
  • 20
    YouTokenToMe

    Unsupervised text tokenizer focused on computational efficiency

    YouTokenToMe is a fast and efficient unsupervised text tokenization library designed for training subword embeddings, particularly useful for NLP models.
    Downloads: 0 This Week
  • 21
    Texar

    Toolkit for Machine Learning, Natural Language Processing

    ...Texar-TensorFlow (this repo) and Texar-PyTorch have mostly the same interfaces. Both further combine the best design of TF and PyTorch. Rich Pre-trained Models, Rich Usage with Uniform Interfaces. BERT, GPT2, XLNet, etc, for encoding, classification, generation, and composing complex models with other Texar components!
    Downloads: 0 This Week
  • 22
    automl-gs

    Provide an input CSV and a target field to predict, generate a model

    Give an input CSV file and a target field you want to predict to automl-gs, and get a trained high-performing machine learning or deep learning model plus native Python code pipelines allowing you to integrate that model into any prediction workflow. No black box: you can see exactly how the data is processed, and how the model is constructed, and you can make tweaks as necessary. automl-gs is an AutoML tool which, unlike Microsoft's NNI, Uber's Ludwig, and TPOT, offers a zero code/model...
    Downloads: 0 This Week
  • 23
    Seq2seq Chatbot for Keras

    This repository contains a new generative model of chatbot

    ...The architecture presented here assumes the same prior distributions for input and output words. Therefore, it shares an embedding layer (GloVe pre-trained word embeddings) between the encoding and decoding processes through the adoption of a new model.
    Downloads: 0 This Week
  • 24
    Hunspell is a spell checker and morphological analyzer library and program designed for languages with rich morphology and complex compounding or character encoding. Hunspell interfaces: Curses, Ispell compatible pipe interface, OpenOffice.org UNO module
    Downloads: 284 This Week
  • 25

    Darkbot

    The IRC's Talking Robot

    [ Please read https://sourceforge.net/p/darkbot/news/2014/01/darkbots-revitalization/ ] Darkbot is a portable IRC chat robot written in the C language that can be taught responses to user inquiries, and even have conversations with them. Darkbot was originally created by Jason Hamilton as an aid for help channels on Internet Relay Chat.
    Downloads: 7 This Week