python text parser free download

Kitten TTS

State-of-the-art TTS model under 25MB

KittenTTS is an open-source, ultra-lightweight, and high-quality text-to-speech model featuring just 15 million parameters and a binary size under 25 MB, designed for real-time CPU-based deployment across diverse platforms.

Downloads: 7 This Week

Last Update: 3 days ago

MiniCPM-o

A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming

... text and audio inputs to generate outputs in various forms, including voice cloning, emotion control, and interactive role-playing.

Downloads: 1 This Week

Last Update: 2025-05-15

Phi-3-MLX

Phi-3.5 for Mac: Locally-run Vision and Language Models

Phi-3-Vision-MLX is an Apple MLX (machine learning on Apple silicon) implementation of Phi-3 Vision, a lightweight multi-modal model designed for vision and language tasks. It focuses on running vision-language AI efficiently on Apple hardware like M1 and M2 chips.

Downloads: 0 This Week

Last Update: 2025-03-13

MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training

MedicalGPT trains large medical GPT models with a ChatGPT-style training pipeline that covers secondary (continued) pre-training, supervised fine-tuning, reward modeling, and reinforcement learning.

Downloads: 0 This Week

Last Update: 2025-02-16

Chinese-LLaMA-Alpaca-2 v2.0

Chinese LLaMA & Alpaca large language model + local CPU/GPU training

This project open-sources the Chinese LLaMA model and the instruction-tuned Alpaca model to further promote open research on large models in the Chinese NLP community. Building on the original LLaMA, these models expand the Chinese vocabulary and use Chinese data for secondary pre-training, which further improves basic semantic understanding of Chinese. The Chinese Alpaca model is additionally fine-tuned on Chinese instruction data, which...

Downloads: 0 This Week

Last Update: 2023-08-21

VALL-E

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)

We introduce a language modeling approach for text to speech synthesis (TTS). Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modeling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech which is hundreds of times larger than existing systems. VALL...

Downloads: 9 This Week

Last Update: 2023-04-14

Stable-Dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion

A PyTorch implementation of the text-to-3D model Dreamfusion, powered by the Stable Diffusion text-to-2D model. This project is a work in progress and differs from the paper in many ways. The current generation quality cannot match the results of the original paper, and many prompts still fail badly. Since the Imagen model is not publicly available, we use Stable Diffusion in its place (implementation from diffusers). Unlike Imagen, Stable Diffusion is a latent diffusion...

Downloads: 0 This Week

Last Update: 2023-05-15

Wan2.2

Wan2.2: Open and Advanced Large-Scale Video Generative Model

..., color tone, and more, for high-quality, customizable video styles. The model is trained on significantly larger datasets than its predecessor, greatly enhancing motion complexity, semantic understanding, and aesthetic diversity. Wan2.2 also open-sources a 5-billion parameter high-compression VAE-based hybrid text-image-to-video (TI2V) model that supports 720P video generation at 24fps on consumer-grade GPUs like the RTX 4090. It supports multiple video generation tasks including text-to-video.

1 Review

Downloads: 61 This Week

Last Update: 2025-07-30

Stable Diffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Stable Diffusion Version 2. The Stable Diffusion project, developed by Stability AI, is a cutting-edge image synthesis model that utilizes latent diffusion techniques for high-resolution image generation. It offers an advanced method of generating images based on text input, making it highly flexible for various creative applications. The repository contains pretrained models, various checkpoints, and tools to facilitate image generation tasks, such as fine-tuning and modifying the models...

2 Reviews

Downloads: 56 This Week

Last Update: 2025-02-28

HunyuanWorld 1.0

Generating Immersive, Explorable, and Interactive 3D Worlds from Words

HunyuanWorld-1.0 is an open-source, simulation-capable 3D world generation model developed by Tencent Hunyuan that creates immersive, explorable, and interactive 3D environments from text or image inputs. It combines the strengths of video-based diversity and 3D-based geometric consistency through a novel framework using panoramic world proxies and semantically layered 3D mesh representations. This approach enables 360° immersive experiences, seamless mesh export for graphics pipelines...

Downloads: 37 This Week

Last Update: 2025-07-30

Qwen-Image

Powerful image generation foundation model

Qwen-Image is a powerful 20-billion parameter foundation model designed for advanced image generation and precise editing, with a particular strength in complex text rendering across diverse languages, especially Chinese. Built on the MMDiT architecture, it achieves remarkable fidelity in integrating text seamlessly into images while preserving typographic details and layout coherence. The model excels not only in text rendering but also in a wide range of artistic styles, including...

1 Review

Downloads: 23 This Week

Last Update: 6 days ago

Qwen3

Powerful large language model (LLM) from Alibaba Cloud

Qwen3 is a cutting-edge large language model (LLM) series developed by the Qwen team at Alibaba Cloud. The latest updated version, Qwen3-235B-A22B-Instruct-2507, features significant improvements in instruction-following, reasoning, knowledge coverage, and long-context understanding up to 256K tokens. It delivers higher quality and more helpful text generation across multiple languages and domains, including mathematics, coding, science, and tool usage.

1 Review

Downloads: 25 This Week

Last Update: 2025-07-23

Wan2.1

Wan2.1: Open and Advanced Large-Scale Video Generative Model

Wan2.1 is a foundational open-source large-scale video generative model developed by the Wan team, providing high-quality video generation from text and images. It employs advanced diffusion-based architectures to produce coherent, temporally consistent videos with realistic motion and visual fidelity. Wan2.1 focuses on efficient video synthesis while maintaining rich semantic and aesthetic detail, enabling applications in content creation, entertainment, and research. The model supports text...

1 Review

Downloads: 14 This Week

Last Update: 2025-07-30

CSM (Conversational Speech Model)

A Conversational Speech Generation Model

The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.

Downloads: 4 This Week

Last Update: 2025-03-19

Janus-Pro

Janus-Series: Unified Multimodal Understanding and Generation Models

.... Its latest iteration, Janus-Pro, improves on this with a more optimized training strategy, expanded data, and larger model scaling, leading to significant advancements in both multimodal understanding and text-to-image generation.

1 Review

Downloads: 4 This Week

Last Update: 2025-03-04

Qwen

Qwen (通义千问) chat/pretrained large language model from Alibaba Cloud

Qwen is a series of large language models developed by Alibaba Cloud, consisting of various pretrained versions like Qwen-1.8B, Qwen-7B, Qwen-14B, and Qwen-72B. These models, which range from smaller to larger configurations, are designed for a wide range of natural language processing tasks. They are openly available for research and commercial use, with Qwen's code and model weights shared on GitHub. Qwen's capabilities include text generation, comprehension, and conversation, making...

1 Review

Downloads: 2 This Week

Last Update: 2025-03-03

GPT Neo

An implementation of model parallel GPT-2 and GPT-3-style models

An implementation of model- and data-parallel GPT-3-like models using the mesh-tensorflow library. If you're just here to play with our pre-trained models, we strongly recommend you try out the HuggingFace Transformer integration. Training and inference are officially supported on TPU and should work on GPU as well. This repository will be (mostly) archived as we move focus to our GPU-specific repo, GPT-NeoX. NB: while neo can technically run a training step at 200B+ parameters, it is very...

Downloads: 6 This Week

Last Update: 2023-03-24

stable-diffusion-v1-4

Text-to-image diffusion model for high-quality image generation

stable-diffusion-v1-4 is a high-performance text-to-image latent diffusion model developed by CompVis. It generates photo-realistic images from natural language prompts using a pretrained CLIP ViT-L/14 text encoder and a UNet-based denoising architecture. This version builds on v1-2, fine-tuned over 225,000 steps at 512×512 resolution on the “laion-aesthetics v2 5+” dataset, with 10% text-conditioning dropout for improved classifier-free guidance. It is optimized for use with Hugging Face’s...

Downloads: 0 This Week

Last Update: 2025-06-26

stable-diffusion-xl-base-1.0

Advanced base model for high-quality text-to-image generation

stable-diffusion-xl-base-1.0 is a next-generation latent diffusion model developed by Stability AI for producing highly detailed images from text prompts. It forms the core of the SDXL pipeline and can be used on its own or paired with a refinement model for enhanced results. This base model utilizes two pretrained text encoders—OpenCLIP-ViT/G and CLIP-ViT/L—for richer text understanding and improved image quality. The model supports two-stage generation, where the base model creates initial...

Downloads: 0 This Week

Last Update: 2025-06-26

Llama-2-7b-hf

Llama-2-7B is a 7B-parameter transformer model for text generation

Llama-2-7B is a foundational large language model developed by Meta as part of the Llama 2 family, designed for general-purpose text generation tasks. It is a 7 billion parameter auto-regressive transformer trained on 2 trillion tokens from publicly available sources, using an optimized architecture without Grouped-Query Attention (GQA). This model is the pretrained version, intended for research and commercial use in English, and can be adapted for downstream applications such as summarization...

Downloads: 0 This Week

Last Update: 2025-06-27

stable-diffusion-3-medium

Efficient text-to-image model with enhanced quality and typography

Stable Diffusion 3 Medium is a next-generation text-to-image model by Stability AI, designed using a Multimodal Diffusion Transformer (MMDiT) architecture. It offers notable improvements in image quality, prompt comprehension, typography, and computational efficiency over previous versions. The model integrates three fixed, pretrained text encoders—OpenCLIP-ViT/G, CLIP-ViT/L, and T5-XXL—to interpret complex prompts more effectively. Trained on 1 billion synthetic and filtered public images...

Downloads: 0 This Week

Last Update: 2025-06-26

Llama-2-7b

7B-parameter foundational LLM by Meta for text generation tasks

Llama-2-7B is a foundational large language model developed by Meta as part of the Llama 2 family, designed for general-purpose text generation in English. It has 7 billion parameters and uses an optimized transformer-based, autoregressive architecture. Trained on 2 trillion tokens of publicly available data, it serves as the base for fine-tuned models like Llama-2-Chat. The model is pretrained only, meaning it is not optimized for dialogue but can be adapted for various natural language...

Downloads: 0 This Week

Last Update: 2025-06-27

GPT-2

GPT-2 is a 124M parameter English language model for text generation

GPT-2 is a pretrained transformer-based language model developed by OpenAI for generating natural language text. Trained on 40GB of internet data from outbound Reddit links (excluding Wikipedia), it uses causal language modeling to predict the next token in a sequence. The model was trained without human labels and learns representations of English that support text generation, feature extraction, and fine-tuning. GPT-2 uses a byte-level BPE tokenizer with a vocabulary of 50,257 and handles...

Downloads: 0 This Week

Last Update: 2025-06-27

Dia-1.6B

Dia-1.6B generates lifelike English dialogue and vocal expressions

Dia-1.6B is a 1.6 billion parameter text-to-speech model by Nari Labs that generates high-fidelity dialogue directly from transcripts. Designed for realistic vocal performance, Dia supports expressive features like emotion, tone control, and non-verbal cues such as laughter, coughing, or sighs. The model accepts speaker conditioning through audio prompts, allowing limited voice cloning and speaker consistency across generations. It is optimized for English and built for real-time performance...

Downloads: 0 This Week

Last Update: 2025-06-27

ERNIE-4.5-0.3B-Base-PT

Compact 360M text model with high efficiency and fine-tuning support

ERNIE-4.5-0.3B-Base-PT is a compact, fully dense transformer model with 360 million parameters, optimized for general-purpose text generation tasks. It belongs to the ERNIE 4.5 series by Baidu and leverages advanced pretraining techniques without relying on a Mixture-of-Experts (MoE) structure. The model features 18 transformer layers, 16 attention heads, and a maximum context length of 131,072 tokens, offering strong language understanding for its size. It can be fine-tuned using ERNIEKit...

Downloads: 0 This Week

Last Update: 2025-06-30

Search Results for "python text parser"

Showing 42 open source projects for "python text parser"

