Step-Audio-EditX is an open-source, 3-billion-parameter audio model from StepFun AI that aims to make expressive, precise editing of speech and audio as easy as editing text. Rather than treating audio editing as low-level waveform manipulation, the model converts speech into a sequence of discrete “audio tokens” using a dual-codebook tokenizer, which combines a linguistic token stream with a semantic (prosody/emotion/style) token stream. Editing then becomes a set of high-level operations on tokens. Users can change not only what is said (the text) but also how it is said: emotion, tone, speaking style, prosody, accent, and even paralinguistic cues. Because the model is trained with a “large-margin learning” objective over many synthesized and natural speech samples, it gains robust control over expressive attributes and supports iterative editing: for example, you could record a line, then ask the model to “make it sadder,” “speak slower,” or “change accent to X.”
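
To make the idea of two discrete token streams concrete, here is a toy sketch that interleaves two lists of integer IDs into a single sequence and splits them back apart. It is not the model's tokenizer or API. The function names, the chunk sizes (`n_ling`, `n_sem`), and the ID `offset` are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def interleave(linguistic: List[int], semantic: List[int],
               n_ling: int = 2, n_sem: int = 3,
               offset: int = 10_000) -> List[int]:
    """Merge two token streams chunk by chunk (hypothetical layout).

    Each chunk holds up to n_ling linguistic IDs followed by up to n_sem
    semantic IDs. Semantic IDs are shifted by `offset` so the two
    vocabularies do not collide; linguistic IDs must be below `offset`.
    """
    merged: List[int] = []
    i = j = 0
    while i < len(linguistic) or j < len(semantic):
        merged.extend(linguistic[i:i + n_ling])
        merged.extend(t + offset for t in semantic[j:j + n_sem])
        i += n_ling
        j += n_sem
    return merged


def split(merged: List[int], offset: int = 10_000) -> Tuple[List[int], List[int]]:
    """Recover the two streams from a merged sequence."""
    ling = [t for t in merged if t < offset]
    sem = [t - offset for t in merged if t >= offset]
    return ling, sem


merged = interleave([1, 2, 3, 4], [7, 8, 9, 10, 11, 12])
ling, sem = split(merged)
```

In this toy layout, an edit such as swapping a style-related token is an ordinary list operation on `sem`, which is the sense in which token-level editing resembles text editing.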

Features

  • Token-based audio editing: converts speech to discrete token streams for high-level, language-like editing operations on audio
  • Dual-codebook tokenizer design: separates linguistic content from prosody/style, enabling control over both what is said and how it is said
  • Expressive editing: modifies emotion, tone, accent, speaking style, prosody, pacing, and other vocal attributes without re-recording
  • Iterative editing workflow: supports multiple rounds of edits, e.g. change the style, then adjust the emotion, then the pace
  • Zero-shot TTS: generates speech directly from text plus optional style/emotion instructions, in a controlled expressive voice
  • Open-source model and code under the permissive Apache License 2.0, enabling integration, customization, and use in research, creative workflows, or production
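
The iterative editing workflow listed above can be pictured as a chain of instruction-driven edits applied to the same clip. The sketch below is a minimal illustration under stated assumptions: `EditSession`, `EditFn`, and `fake_edit` are hypothetical names, and `fake_edit` is a stand-in that does not call any real model.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical signature: takes audio tokens and a text instruction,
# returns the edited tokens. A real model call would go behind this.
EditFn = Callable[[List[int], str], List[int]]


@dataclass
class EditSession:
    """Tracks the current tokens and the instructions applied so far."""
    tokens: List[int]
    edit_fn: EditFn
    history: List[str] = field(default_factory=list)

    def apply(self, instruction: str) -> "EditSession":
        # Each round edits the output of the previous round.
        self.tokens = self.edit_fn(self.tokens, instruction)
        self.history.append(instruction)
        return self


def fake_edit(tokens: List[int], instruction: str) -> List[int]:
    # Stand-in only: appends the instruction length as a marker token.
    return tokens + [len(instruction)]


session = EditSession(tokens=[1, 2, 3], edit_fn=fake_edit)
session.apply("make it sadder").apply("speak slower")
```

Returning `self` from `apply` lets rounds be chained, and `history` keeps a record of which instructions produced the current result.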

Categories

AI Models

License

Apache License V2.0

Additional Project Details

Operating Systems

Linux

Programming Language

Python

Related Categories

Python AI Models

Registered

2 days ago