With just 2 clicks (not including the Colab auth process), the 1.5B pretrained Chinese model demo is ready to go. The contents of this repository are for academic research purposes only, and we do not provide any conclusive remarks.
Features
- Simplified GPT2 training scripts (based on Grover, supporting TPUs)
- Ported BERT tokenizer, compatible with multilingual corpora
- 1.5B GPT2 pretrained Chinese model (~15 GB corpus, 100k steps)
- Batteries-included Colab demo
- 1.5B GPT2 pretrained Chinese model (~30 GB corpus, 220k steps)
- Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC)
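
To illustrate why a BERT-style tokenizer suits mixed-language corpora, here is a minimal sketch of its two well-known steps: isolating each CJK character as its own word, then splitting remaining words by greedy longest-match WordPiece lookup. This is not the repository's tokenizer code; the function names and the toy vocabulary are hypothetical, and the sketch omits punctuation handling, accent stripping, and the extended CJK ranges a real implementation covers.

```python
def is_cjk(ch: str) -> bool:
    # Assumption: only the basic CJK Unified Ideographs block is checked here.
    return 0x4E00 <= ord(ch) <= 0x9FFF


def basic_split(text: str) -> list[str]:
    # Surround each CJK character with spaces so it becomes its own word,
    # then split on whitespace.
    spaced = "".join(f" {ch} " if is_cjk(ch) else ch for ch in text)
    return spaced.split()


def wordpiece(word: str, vocab: set[str], unk: str = "[UNK]") -> list[str]:
    # Greedy longest-match-first: repeatedly take the longest vocabulary
    # entry that prefixes the remaining characters. Non-initial pieces
    # carry the "##" continuation prefix.
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            # No piece fits: the whole word maps to the unknown token.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


def tokenize(text: str, vocab: set[str]) -> list[str]:
    tokens = []
    for word in basic_split(text.lower()):
        tokens.extend(wordpiece(word, vocab))
    return tokens


# Hypothetical toy vocabulary, for illustration only.
TOY_VOCAB = {"gpt", "##2", "模", "型", "train", "##ing", "[UNK]"}

print(tokenize("GPT2 模型 training", TOY_VOCAB))
```

Because Chinese characters are split individually while Latin-script words fall back to subword pieces, one vocabulary can serve several languages at once.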
License
Apache License V2.0