Easily turn large sets of image URLs into an image dataset. It can download, resize, and package 100M URLs in 20 hours on one machine.
Also supports saving captions for url+caption datasets.
Opt-out directives:
Websites can pass the HTTP headers X-Robots-Tag: noai, X-Robots-Tag: noindex, X-Robots-Tag: noimageai and X-Robots-Tag: noimageindex. By default, img2dataset ignores images served with such headers.
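The header check described above can be illustrated with a minimal sketch. This is not img2dataset's actual implementation: the function name is_opted_out and the constant DISALLOWED_DIRECTIVES are hypothetical, and the sketch assumes each X-Robots-Tag value is a plain comma-separated list of directives (it does not handle user-agent-scoped directives).

```python
# Hypothetical sketch, not img2dataset's own code.
# Directives named in the opt-out list above.
DISALLOWED_DIRECTIVES = {"noai", "noindex", "noimageai", "noimageindex"}


def is_opted_out(headers):
    """Return True if any X-Robots-Tag header carries an opt-out directive.

    `headers` is an iterable of (name, value) pairs, for example the list
    returned by http.client.HTTPResponse.getheaders().
    """
    for name, value in headers:
        # Header names are case-insensitive.
        if name.lower() != "x-robots-tag":
            continue
        # Assumption: directives are separated by commas.
        for directive in value.split(","):
            if directive.strip().lower() in DISALLOWED_DIRECTIVES:
                return True
    return False
```

A downloader following the default behaviour would leave out any image whose response makes is_opted_out return True, for example one served with X-Robots-Tag: noai.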
License
MIT License