ImageReward is the first general-purpose human preference reward model (RM) for evaluating text-to-image generation, introduced in the NeurIPS 2023 paper "ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation". Trained on 137k expert-annotated comparison pairs, ImageReward outperforms existing scoring methods such as CLIP, Aesthetic, and BLIP at capturing human visual preferences. It ships as a Python package (image-reward) for quickly scoring generated images against their prompts, with APIs for ranking, scoring, and filtering outputs. Beyond evaluation, ImageReward supports Reward Feedback Learning (ReFL), a method for fine-tuning diffusion models such as Stable Diffusion directly on human-preference feedback, which yields demonstrable improvements in image quality.
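A minimal quick-start sketch of the scoring and ranking APIs, assuming the package is installed via `pip install image-reward`; the prompt and image paths below are placeholders:

```python
import torch
import ImageReward as RM

# Load the pretrained reward model (weights are downloaded on first use).
model = RM.load("ImageReward-v1.0")

prompt = "a painting of an ocean with clouds and birds, daytime"   # placeholder prompt
images = ["sample_1.png", "sample_2.png", "sample_3.png"]          # placeholder image paths

with torch.no_grad():
    # Score a single image against the prompt (higher = preferred by the model).
    score = model.score(prompt, images[0])

    # Rank a batch of candidate images generated for the same prompt.
    ranking, rewards = model.inference_rank(prompt, images)

print(f"score = {score:.2f}")
print(f"ranking = {ranking}, rewards = {rewards}")
```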
Features
- Human preference reward model trained on 137k expert comparisons
- Outperforms CLIP, Aesthetic, and BLIP in preference scoring accuracy
- Easy-to-use Python package for scoring, ranking, and filtering generated images (see the filtering sketch after this list)
- Reward Feedback Learning (ReFL) for fine-tuning diffusion models with preference signals
- Integration into Stable Diffusion WebUI for auto-scoring and filtering images
- Full training and evaluation scripts to reproduce published benchmarks
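The filtering workflow referenced above can be built on the same scoring API; the following is a sketch only, with the threshold value and file paths chosen purely for illustration:

```python
import ImageReward as RM

model = RM.load("ImageReward-v1.0")

prompt = "a watercolor landscape at sunset"           # illustrative prompt
candidates = ["gen_0.png", "gen_1.png", "gen_2.png"]  # illustrative output paths
threshold = 0.5                                       # illustrative cutoff; tune per use case

# Keep only the generations whose reward exceeds the threshold.
kept = [path for path in candidates if model.score(prompt, path) > threshold]
print(f"kept {len(kept)} of {len(candidates)} images:", kept)
```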