Dia-1.6B generates lifelike English dialogue and vocal expressions
Very lightweight and minimalistic Python browser with WebKit
Converts CSV files into one XLS file (or several, if split)
CTC-based forced aligner for audio-text in 158 languages
Vision-language-action model for robot control via images and text