I don't know what it is, but no matter what file I give it, video or audio, the transcription is not accurate at all. Even though the files are clear English, it's really bad. Do I have to train it? If yes, can someone tell me how to do it in Python?
That's my code below ↓↓↓

Also, it's 2024. Aren't you planning on moving the conversation over to Slack or something else better?