Code for "Learning to summarize from human feedback"
The summarize-from-feedback repository implements the methods from the paper “Learning to Summarize from Human Feedback”. Its purpose is to train a summarization model that better aligns with human preferences by first collecting human feedback (comparisons between summaries) to train a reward model, and then fine-tuning a policy (summarizer) to maximize that learned reward. The code includes different stages: a supervised baseline (i.e. standard summarization training), the reward modeling...
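As a rough illustration of the reward-modeling step described above, a reward model trained on human comparisons is commonly fit with a pairwise logistic loss: the loss is small when the summary the labeler preferred gets the higher score. The sketch below is a minimal plain-Python version under that assumption. The function names `pairwise_preference_loss` and `mean_preference_loss` are hypothetical and are not taken from the repository.

```python
import math


def pairwise_preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_preferred - reward_rejected)).

    Hypothetical helper for illustration only; not the repository's API.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of the
    # margin so that exp() is never called with a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_preference_loss(pairs):
    """Average the pairwise loss over (preferred, rejected) reward pairs."""
    return sum(pairwise_preference_loss(p, r) for p, r in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up reward scores, purely to exercise the functions.
    example_pairs = [(1.5, 0.2), (0.1, 0.4), (2.0, 2.0)]
    print(mean_preference_loss(example_pairs))
```

In a real training loop the two rewards would come from the model's scalar output for each candidate summary, and the loss would be minimized with a gradient-based optimizer. The fine-tuning stage then treats the trained reward model's output as the quantity the policy is optimized against.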
... [BATCH], [ANALYZE] and [OPTIMIZE] cases in OpenFOAM and others
'caseGen' allows you to simplify your numerical simulation problem!
It has not yet been thoroughly tested, so please contact me if you have suggestions for improvement or find any bugs.