Page 5 | python text parser free download

Stable Diffusion

A latent text-to-image diffusion model

Stable Diffusion is a widely used open-source latent text-to-image diffusion model developed by the CompVis group for generating high-quality images from natural language prompts. The model operates by conditioning a diffusion process on text embeddings produced by a CLIP text encoder, enabling detailed and controllable image synthesis. It was trained on large-scale image datasets and later fine-tuned to produce 512×512 images with strong visual fidelity. Because the system runs efficiently...

Downloads: 6 This Week

Last Update: 2026-02-23

See Project

GLIDE (Text2Im)

GLIDE: a diffusion-based text-conditional image synthesis model

glide-text2im is an open source implementation of OpenAI’s GLIDE model, which generates photorealistic images from natural language text prompts. It demonstrates how diffusion-based generative models can be conditioned on text to produce highly detailed and coherent visual outputs. The repository provides both model code and pretrained checkpoints, making it possible for researchers and developers to experiment with text-to-image synthesis. GLIDE includes advanced techniques such as...

Downloads: 0 This Week

Last Update: 2 days ago

See Project

GPT Neo

An implementation of model parallel GPT-2 and GPT-3-style models

An implementation of model & data parallel GPT3-like models using the mesh-tensorflow library. If you're just here to play with our pre-trained models, we strongly recommend you try out the HuggingFace Transformer integration. Training and inference are officially supported on TPU and should work on GPU as well. This repository will be (mostly) archived as we move focus to our GPU-specific repo, GPT-NeoX. NB, while neo can technically run a training step at 200B+ parameters, it is very...

Downloads: 1 This Week

Last Update: 2023-03-24

See Project

SG2Im

Code for "Image Generation from Scene Graphs", Johnson et al, CVPR 201

sg2im is a research codebase that learns to synthesize images from scene graphs—structured descriptions of objects and their relationships. Instead of conditioning on free-form text alone, it leverages graph structure to control layout and interactions, generating scenes that respect constraints like “person left of dog” or “cup on table.” The pipeline typically predicts object layouts (bounding boxes and masks) from the graph, then renders a realistic image conditioned on those layouts....

Downloads: 0 This Week

Last Update: 2025-10-10

See Project

Retrieval-Based Conversational Model

Dual LSTM Encoder for Dialog Response Generation

This project implements a retrieval-based conversational model in TensorFlow using a dual LSTM encoder architecture. It illustrates how neural networks can be trained to select appropriate responses from a fixed set of candidate replies rather than generate them from scratch. The core idea is to embed both the conversation context and potential replies into vector representations, then score how well each candidate fits the current dialogue,...

Downloads: 0 This Week

Last Update: 2026-02-13

See Project

Dia-1.6B

Dia-1.6B generates lifelike English dialogue and vocal expressions

Dia-1.6B is a 1.6 billion parameter text-to-speech model by Nari Labs that generates high-fidelity dialogue directly from transcripts. Designed for realistic vocal performance, Dia supports expressive features like emotion, tone control, and non-verbal cues such as laughter, coughing, or sighs. The model accepts speaker conditioning through audio prompts, allowing limited voice cloning and speaker consistency across generations.

Downloads: 0 This Week

Last Update: 2025-06-27

See Project

mms-300m-1130-forced-aligner

CTC-based forced aligner for audio-text in 158 languages

mms-300m-1130-forced-aligner is a multilingual forced alignment model based on Meta’s MMS-300M wav2vec2 checkpoint, adapted for Hugging Face’s Transformers library. It supports forced alignment between audio and corresponding text across 158 languages. The model produces word- or phoneme-level timestamps from Connectionist Temporal Classification (CTC) emissions, and it is considerably more memory-efficient than the TorchAudio forced alignment API. Users can integrate it through the Python package ctc-forced-aligner, and it supports GPU acceleration via PyTorch. ...

Downloads: 0 This Week

Last Update: 2025-07-02

See Project

OpenVLA 7B

Vision-language-action model for robot control via images and text

OpenVLA 7B is a multimodal vision-language-action model trained on 970,000 robot manipulation episodes from the Open X-Embodiment dataset. It takes camera images and natural language instructions as input and outputs normalized 7-DoF robot actions, enabling control of multiple robot types across various domains. Built on a LLaMA-2 language backbone with DINOv2 and SigLIP visual encoders, it allows both zero-shot inference for known robot setups and parameter-efficient fine-tuning for new domains. The model...

Downloads: 0 This Week

Last Update: 2025-07-23

See Project

Search Results for "python text parser" - Page 5

Showing 108 open source projects for "python text parser"