Showing 28 open source projects for "text encoding"

View related business solutions
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • Auth0 B2B Essentials: SSO, MFA, and RBAC Built In Icon
    Auth0 B2B Essentials: SSO, MFA, and RBAC Built In

    Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.

    Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.
    Sign Up Free
  • 1
    Tiktoken

    Tiktoken

    tiktoken is a fast BPE tokeniser for use with OpenAI's models

    tiktoken is a high-performance, tokenizer library (based on byte-pair encoding, BPE) designed for use with OpenAI’s models. It handles encoding and decoding text to token IDs efficiently, with minimal overhead. Because tokenization is a fundamental step in preparing text for models, tiktoken is optimized for speed, memory, and correctness in model contexts (e.g. matching OpenAI’s internal tokenization). The repo supports multiple encodings (e.g.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    zpdf

    zpdf

    Zero-copy PDF text extraction library written in Zig

    ...It implements multiple PDF decompression filters and handles common font encoding pathways, which are essential for turning raw PDF content streams into readable text. It also understands both classic cross-reference tables and newer cross-reference streams, including PDF 1.5+ features, and it offers configurable strict vs permissive error handling depending on whether you prioritize correctness or robustness.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    LLaMA-Mesh

    LLaMA-Mesh

    Unifying 3D Mesh Generation with Language Models

    LLaMA-Mesh is a research framework that extends large language models so they can understand and generate 3D mesh data alongside text. The system introduces a method for representing 3D meshes in a textual format by encoding vertex coordinates and face definitions as sequences that can be processed by a language model. By serializing 3D geometry into text tokens, the approach allows existing transformer architectures to generate and interpret 3D models without requiring specialized visual tokenizers. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    ComfyUI-HunyuanVideoWrapper

    ComfyUI-HunyuanVideoWrapper

    ComfyUI wrapper nodes for HunyuanVideo

    The ComfyUI-HunyuanVideoWrapper project is a ComfyUI extension that integrates Hunyuan-based multimodal video generation models into node-based workflows. It allows users to generate or manipulate video content by combining text prompts with one or more input images, enabling flexible conditioning of outputs. The system introduces specialized nodes such as text-image encoders that allow multiple image inputs to be referenced directly within prompts. This makes it possible to guide generation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Go from Code to Production URL in Seconds Icon
    Go from Code to Production URL in Seconds

    Cloud Run deploys apps in any language instantly. Scales to zero. Pay only when code runs.

    Skip the Kubernetes configs. Cloud Run handles HTTPS, scaling, and infrastructure automatically. Two million requests free per month.
    Try it free
  • 5
    Step-Video-T2V

    Step-Video-T2V

    State-of-the-art (SoTA) text-to-video pre-trained model

    Step-Video-T2V is a state-of-the-art text-to-video foundation model developed to generate videos from natural-language prompts; its 30B-parameter architecture is designed to produce coherent, temporally extended video sequences — up to around 204 frames — based on input text. Under the hood it uses a compressed latent representation (a Video-VAE) to reduce spatial and temporal redundancy, and a denoising diffusion (or similar) process over that latent space to generate smooth, plausible...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Qwen-2.5-VL

    Qwen-2.5-VL

    Qwen2.5-VL is the multimodal large language model series

    Qwen2.5 is a series of large language models developed by the Qwen team at Alibaba Cloud, designed to enhance natural language understanding and generation across multiple languages. The models are available in various sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameters, catering to diverse computational requirements. Trained on a comprehensive dataset of up to 18 trillion tokens, Qwen2.5 models exhibit significant improvements in instruction following, long-text generation...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 7
    MiniMax-01

    MiniMax-01

    Large-language-model & vision-language-model based on Linear Attention

    MiniMax-01 is the official repository for two flagship models: MiniMax-Text-01, a long-context language model, and MiniMax-VL-01, a vision-language model built on top of it. MiniMax-Text-01 uses a hybrid attention architecture that blends Lightning Attention, standard softmax attention, and Mixture-of-Experts (MoE) routing to achieve both high throughput and long-context reasoning. It has 456 billion total parameters with 45.9 billion activated per token and is trained with advanced parallel...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    WavTokenizer

    WavTokenizer

    SOTA discrete acoustic codec models with 40/75 tokens per second

    WavTokenizer is a state-of-the-art discrete acoustic codec designed specifically for audio language modeling, capable of compressing 24 kHz audio into just 40 or 75 tokens per second while preserving high perceptual quality. It is built to represent speech, music, and general audio with extremely low bitrate, making it ideal as a front-end for large audio language models like GPT-4o and similar architectures. The model uses a single-quantizer design together with temporal compression to...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    Janus

    Janus

    Unified Multimodal Understanding and Generation Models

    Janus is a sophisticated open-source project from DeepSeek AI that aims to unify both visual understanding and image generation in a single model architecture. Rather than having separate systems for “look and describe” and “prompt and generate”, Janus uses an autoregressive transformer framework with a decoupled visual encoder—allowing it to ingest images for comprehension and to produce images from text prompts with shared internal representations. The design tackles long-standing...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.
    Start Free
  • 10
    minbpe

    minbpe

    Minimal, clean code for the Byte Pair Encoding (BPE) algorithm

    minbpe is a minimal, clean implementation of byte-level Byte Pair Encoding (BPE), the tokenization approach widely used in modern language models. It operates on UTF-8 encoded bytes rather than Unicode characters, which makes it robust to arbitrary text inputs and avoids needing a language-specific character vocabulary. The repository is structured as a teaching-oriented implementation that shows how to train a tokenizer by learning merge rules, then apply those merges to encode text into token IDs and decode tokens back into text.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    AI YouTube Shorts Generator

    AI YouTube Shorts Generator

    A python tool that uses GPT-4, FFmpeg, and OpenCV

    AI-YouTube-Shorts-Generator is a Python-based tool that automates the creation of short-form vertical video clips (“shorts”) from longer source videos — ideal for adapting content for platforms like YouTube Shorts, Instagram Reels, or TikTok. It analyzes input video (whether a local file or a YouTube URL), transcribes audio (with optional GPU-accelerated speech-to-text), uses an AI model to identify the most compelling or engaging segments, and then crops/resizes the video and applies...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 12
    Step-Audio 2

    Step-Audio 2

    Multi-modal large language model designed for audio understanding

    Step-Audio2 is an advanced, end-to-end multimodal large language model designed for high-fidelity audio understanding and natural speech conversation: unlike many pipelines that separate speech recognition, processing, and synthesis, Step-Audio2 processes raw audio, reasons about semantic and paralinguistic content (like emotion, speaker characteristics, non-verbal cues), and can generate contextually appropriate responses — including potentially generating or transforming audio output. It...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Chinese-LLaMA-Alpaca-3

    Chinese-LLaMA-Alpaca-3

    Chinese Llama-3 LLMs) developed from Meta Llama 3

    Chinese-LLaMA-Alpaca-3 is an open-source project that provides Mandarin-focused large language models based on Meta’s LLaMA-3 architecture, with both foundational and instruction-tuned variants to support high-quality Chinese natural language understanding and generation. It extends the original LLaMA models with expanded Chinese vocabularies and additional pretraining on Chinese corpora to improve semantic encoding and decoding specifically for Chinese text. Alongside the base models, the project also releases Chinese Alpaca models that are fine-tuned on instruction datasets so they behave more like conversational and instruction-following AI assistants. It includes scripts and tooling that let researchers or developers run training, fine-tuning, quantization, and deployment on local machines (CPU or GPU), making experimentation and testing accessible without requiring large clusters.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    xSTUDIO

    xSTUDIO

    xSTUDIO is a high performance playback and review tool.

    xSTUDIO is a high performance playback and review tool designed by and for Visual Effects, Animation and Post Production professionals. The application can load and play large collections of media files. The efficient playback engine allows you to quickly load and play high resolution image formats with a wide range of file formats and encoding. Intuitive tools allow you to create and organise playlists and media sub-sets within playlists to build interactive review sessions, image and video...
    Downloads: 17 This Week
    Last Update:
    See Project
  • 15
    Bert-VITS2

    Bert-VITS2

    VITS2 backbone with multilingual-bert

    Bert-VITS2 is a neural text-to-speech project that combines a VITS2 backbone with a multilingual BERT front-end to produce high-quality speech in multiple languages. The core idea is to use BERT-style contextual embeddings for text encoding while relying on a refined VITS2 architecture for acoustic generation and vocoding. The repository includes everything needed to train, fine-tune, and run the model, from configuration files to preprocessing scripts, spectrogram utilities, and training entrypoints for multi-GPU and multi-node setups. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    GPT-2

    GPT-2

    Code for the paper Language Models are Unsupervised Multitask Learners

    This repository contains the code and model weights for GPT-2, a large-scale unsupervised language model described in the OpenAI paper “Language Models are Unsupervised Multitask Learners.” The intent is to provide a starting point for researchers and engineers to experiment with GPT-2: generate text, fine‐tune on custom datasets, explore model behavior, or study its internal phenomena. The repository includes scripts for sampling, training, downloading pre-trained models, and utilities for...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 17
    towhee

    towhee

    Framework that is dedicated to making neural data processing

    Towhee is an open-source machine-learning pipeline that helps you encode your unstructured data into embeddings. You can use our Python API to build a prototype of your pipeline and use Towhee to automatically optimize it for production-ready environments. From images to text to 3D molecular structures, Towhee supports data transformation for nearly 20 different unstructured data modalities. We provide end-to-end pipeline optimizations, covering everything from data decoding/encoding, to model inference, making your pipeline execution 10x faster. Towhee provides out-of-the-box integration with your favorite libraries, tools, and frameworks, making development quick and easy. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Chinese-LLaMA-Alpaca-2 v2.0

    Chinese-LLaMA-Alpaca-2 v2.0

    Chinese LLaMA & Alpaca large language model + local CPU/GPU training

    This project has open-sourced the Chinese LLaMA model and the Alpaca large model with instruction fine-tuning to further promote the open research of large models in the Chinese NLP community. Based on the original LLaMA , these models expand the Chinese vocabulary and use Chinese data for secondary pre-training, which further improves the basic semantic understanding of Chinese. At the same time, the Chinese Alpaca model further uses Chinese instruction data for fine-tuning, which...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    OpenPrompt

    OpenPrompt

    An Open-Source Framework for Prompt-Learning

    ...The template is one of the most important modules in prompt learning, which wraps the original input with textual or soft-encoding sequence. Use the implementations of current prompt-learning approaches.* We have implemented various of prompting methods, including templating, verbalizing and optimization strategies under a unified standard. You can easily call and understand these methods.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    q - Text as Data

    q - Text as Data

    Run SQL directly on CSV or TSV files

    q is a command line tool that allows direct execution of SQL-like queries on CSVs/TSVs (and any other tabular text files). q treats ordinary files as database tables, and supports all SQL constructs, such as WHERE, GROUP BY, JOINs etc. It supports automatic column name and column type detection, and provides full support for multiple encodings. q fully supports all types of encoding. Use -e data-encoding to set the input data encoding, -Q query-encoding to set the query encoding, and use -E output-encoding to set the output encoding. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Big List of Naughty Strings

    Big List of Naughty Strings

    List of strings which have a high probability of causing issues

    ...It exists so developers and QA engineers can easily test edge cases that normal test data would miss, such as zero-width characters, right-to-left marks, emojis, foreign alphabets, and long or malformed strings. By throwing these strings at forms, APIs, databases, and UIs, teams can discover encoding bugs, sanitizer gaps, rendering issues, and security oversights early. The list is language-agnostic and repository-friendly, meaning you can consume it from CI pipelines or local scripts with minimal setup. Because it’s crowdsourced, it reflects real issues practitioners have faced in production, not just theoretical cases. Using the list regularly helps harden applications against the fragile edges of text processing and user input.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 22
    Texar

    Texar

    Toolkit for Machine Learning, Natural Language Processing

    ...Texar-TensorFlow (this repo) and Texar-PyTorch have mostly the same interfaces. Both further combine the best design of TF and PyTorch. Rich Pre-trained Models, Rich Usage with Uniform Interfaces. BERT, GPT2, XLNet, etc, for encoding, classification, generation, and composing complex models with other Texar components!
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    TeXML is an XML vocabulary for TeX. The processor transforms the TeXML markup into the TeX markup, escaping special and out-of-encoding characters. The intended audience is developers who automatically generate [La]TeX or ConTeXt files.
    Leader badge
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    pyshot, windows session recorder/auditor
    pySHOT is a session recorder for windows. (soon linux session recorder also) It's a client/server python app using gearman. To use pyshot you must install pyshot-client from https://sourceforge.net/projects/pyshot-client/ on monitored server
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    note taking simplified
    *nts* provides a simple format for using text files to store notes, a command line interface for viewing notes in a variety of convenient ways and a cross-platform, wx(python)-based GUI for creating and modifying notes as well as viewing them.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next
MongoDB Logo MongoDB