Audience
Anyone interested in an open source AI chatbot
About StableVicuna
StableVicuna is the first large-scale open source chatbot trained via reinforced learning from human feedback (RHLF). StableVicuna is a further instruction fine tuned and RLHF trained version of Vicuna v0 13b, which is an instruction fine tuned LLaMA 13b model.
In order to achieve StableVicuna’s strong performance, we utilize Vicuna as the base model and follow the typical three-stage RLHF pipeline outlined by Steinnon et al. and Ouyang et al. Concretely, we further train the base Vicuna model with supervised finetuning (SFT) using a mixture of three datasets:
OpenAssistant Conversations Dataset (OASST1), a human-generated, human-annotated assistant-style conversation corpus comprising 161,443 messages distributed across 66,497 conversation trees, in 35 different languages;
GPT4All Prompt Generations, a dataset of 437,605 prompts and responses generated by GPT-3.5 Turbo;
And Alpaca, a dataset of 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003.