We need to state this explicitly in the user guide: "When you upload a transcript as a .txt or .srt file, please make sure the file is saved with UTF-8 encoding." Otherwise, some characters may show up as invalid or garbled.
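As a possible complement to the user-guide note, the upload step could check the encoding before accepting a file. The following is only a minimal sketch, assuming the check runs in Python on the raw uploaded bytes; the function names `is_utf8` and `check_transcript` are hypothetical and not part of any existing code.

```python
from pathlib import Path


def is_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8.

    "utf-8-sig" is used so that a leading byte-order mark, which some
    editors add when saving as UTF-8, is tolerated.
    """
    try:
        data.decode("utf-8-sig")
    except UnicodeDecodeError:
        return False
    return True


def check_transcript(path: str) -> bool:
    """Hypothetical helper: read a .txt or .srt file and test its encoding."""
    return is_utf8(Path(path).read_bytes())
```

A file that fails this check could be rejected with a message pointing the user to the UTF-8 instruction in the guide. Note that passing the check only means the bytes are valid UTF-8; it cannot prove that the file was not saved in another encoding that happens to produce valid bytes, such as plain ASCII text.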