WhisperJAV is a subtitle generator for Japanese Adult Videos (JAV).
Transformer-based ASR architectures such as Whisper degrade significantly on the spontaneous, noisy audio typical of JAV. The degradation stems from specific acoustic and temporal characteristics that fall outside the statistical distribution of standard training data.

Features

  • Non-Verbal Vocalizations (NVVs): A high density of physiological sounds (heavy breathing, gasps, sighs) and "obscene sounds" that lack clear harmonic structure.
  • Spectral Mimicry: These vocalizations often possess "curve-like spectrum features" that resemble the spectral shape of fricative consonants or Japanese syllables (e.g., fu), acting as accidental adversarial examples that trick the model into recognizing words where none exist.
  • Extreme Dynamics: Volatile shifts in audio intensity, ranging from faint whispers (sasayaki) to high-decibel screams, which confuse standard gain control and attention mechanisms.
  • Linguistic Variance: The prevalence of theatrical onomatopoeia and Role Language (Yakuwarigo) containing exaggerated intonations and slang absent from standard corpora.
  • Acoustic Filtering: Deploys scene-based segmentation and VAD clamping under the hypothesis that distinct scenes possess uniform acoustic characteristics, ensuring the model processes coherent audio environments rather than mixed streams [1-3].
  • Linguistic Adaptation: Normalizes domain-specific terminology and preserves onomatopoeia, specifically correcting dialect-induced tokenization errors (e.g., in Kansai-ben) that standard BPE tokenizers fail to parse [4, 5].
  • Defensive Decoding: Tunes log-probability thresholding and no_speech_threshold to systematically discard low-confidence outputs (hallucinations), while utilizing regex filters to clean non-lexical markers (e.g., (moans)) from the final subtitle track [6, 7].
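
The Defensive Decoding step can be illustrated with a minimal post-processing sketch. It is not WhisperJAV's actual code. It assumes segments shaped like the dictionaries returned by openai-whisper's transcribe() (keys text, avg_logprob, no_speech_prob), and the function names filter_segments and strip_markers, the marker pattern, and the strict "either threshold fails" rule are all hypothetical choices made for illustration. The threshold values mirror openai-whisper's defaults for logprob_threshold and no_speech_threshold.

```python
import re

# Illustrative thresholds (assumption): same values as openai-whisper's defaults.
LOGPROB_THRESHOLD = -1.0
NO_SPEECH_THRESHOLD = 0.6

# Hypothetical pattern: one parenthesised annotation such as "(moans)" or "（笑）".
MARKER_RE = re.compile(r"[(（][^()（）]*[)）]")


def strip_markers(text):
    """Remove parenthesised non-lexical markers and collapse leftover whitespace."""
    cleaned = MARKER_RE.sub(" ", text)
    return re.sub(r"\s+", " ", cleaned).strip()


def filter_segments(segments,
                    logprob_threshold=LOGPROB_THRESHOLD,
                    no_speech_threshold=NO_SPEECH_THRESHOLD):
    """Keep only segments that pass both confidence checks and still contain text."""
    kept = []
    for seg in segments:
        # Low average token log-probability: likely a hallucinated segment.
        if seg["avg_logprob"] < logprob_threshold:
            continue
        # High no-speech probability: the window is probably not speech at all.
        if seg["no_speech_prob"] > no_speech_threshold:
            continue
        text = strip_markers(seg["text"])
        # Drop segments that consisted only of markers.
        if not text:
            continue
        kept.append({**seg, "text": text})
    return kept


if __name__ == "__main__":
    demo = [
        {"text": "こんにちは (moans)", "avg_logprob": -0.3, "no_speech_prob": 0.1},
        {"text": "ふ", "avg_logprob": -1.8, "no_speech_prob": 0.2},
    ]
    for seg in filter_segments(demo):
        print(seg["text"])
```

This filter runs after transcription and is stricter than Whisper's internal handling, which treats a window as silence only when both conditions fail together. Tightening either threshold trades missed quiet speech against fewer hallucinated lines.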

License

MIT License

WhisperJAV Web Site

Additional Project Details

Operating Systems

Windows

Programming Language

Python

Related Categories

Python Speech Recognition Software

Registered

2026-01-01