Audience
Natural language processing practitioners and researchers who need a tool for learning word embeddings and building text classifiers
About fastText
fastText is a free, open-source, lightweight library developed by Facebook's AI Research (FAIR) lab for efficient learning of word representations and text classification. It supports both unsupervised learning of word vectors and supervised training of text classifiers. A key feature of fastText is its use of subword information: each word is represented as a bag of character n-grams, which improves the handling of morphologically rich languages and lets the model build vectors for out-of-vocabulary words from the n-grams they share with known words. The library is optimized for speed and can train on large datasets quickly, and the resulting models can be reduced in size for deployment on mobile devices. Pre-trained word vectors, trained on Common Crawl and Wikipedia data, are available for 157 languages and can be downloaded for immediate use. fastText also offers aligned word vectors for 44 languages, which facilitate cross-lingual natural language processing tasks.
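To make the bag-of-character-n-grams idea concrete, here is a minimal sketch in plain Python. It is not fastText's own code: the function name char_ngrams is hypothetical, and the default range of 3 to 6 characters is an assumption chosen to mirror what is commonly described as the library's default for unsupervised training. The word is wrapped in "<" and ">" boundary markers so that prefixes and suffixes are distinguishable from n-grams in the middle of a word, and the whole bracketed word is kept as one extra feature.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, plus the whole bracketed word.

    Illustrative only: `min_n` and `max_n` are assumed defaults, and the
    real library additionally hashes n-grams into a fixed number of buckets.
    """
    # Boundary markers distinguish word-initial and word-final n-grams.
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the wrapped word.
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    # Keep the full word as its own feature, without duplicating it
    # when it is short enough to have been emitted as an n-gram already.
    if wrapped not in grams:
        grams.append(wrapped)
    return grams


if __name__ == "__main__":
    # With n fixed at 3, "where" yields its trigrams plus the full word.
    print(char_ngrams("where", min_n=3, max_n=3))
    # → ['<wh', 'whe', 'her', 'ere', 're>', '<where>']
```

Because a word's vector is composed from the vectors of its n-grams, an unseen word such as "wherever" still shares features like "<wh" and "whe" with "where", which is what allows a representation to be produced for it.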