When updocing a website, most of the time is spent on html-to-text conversion of pages that contain no embedded Updoc test cases. It would be good for Updoc to remember which pages these are, so that it can skip them quickly. To do this robustly, it should maintain a list of the cryptographic hashes of these pages and skip any page whose hash is already on the list. The reason for using cryptohashes is not security; it is that accidental collisions are so improbable that a matching hash can be trusted to mean the page content is unchanged.
Note that pages that do contain test cases should still, of course, be rerun every time, even if they haven't changed, since their results depend on the code being tested and not just on the page text.
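
The skip logic described above could be sketched roughly as follows. This is only an illustration in Python, not Updoc's actual code: the names `page_hash`, `process_page`, `extract_tests`, and `run_tests` are hypothetical, SHA-256 is an assumed choice of hash, and the html-to-text conversion and test extraction are represented by a caller-supplied function.

```python
import hashlib


def page_hash(raw: bytes) -> str:
    """Return the SHA-256 hex digest of the raw page bytes (assumed hash choice)."""
    return hashlib.sha256(raw).hexdigest()


def process_page(raw: bytes, no_test_hashes: set, extract_tests, run_tests) -> str:
    """Process one page, skipping it if it is known to hold no test cases.

    no_test_hashes: hashes of pages previously found to contain no tests.
    extract_tests:  hypothetical callable doing the expensive html-to-text
                    conversion and returning a list of test cases.
    run_tests:      hypothetical callable that runs the extracted tests.
    """
    digest = page_hash(raw)
    if digest in no_test_hashes:
        # Identical content was already found to have no tests: skip cheaply.
        return "skipped"

    tests = extract_tests(raw)  # the expensive step
    if not tests:
        # Remember this page so the conversion is not repeated next time.
        no_test_hashes.add(digest)
        return "no-tests"

    # Pages with tests are never recorded, so they are rerun on every pass.
    run_tests(tests)
    return "ran"
```

Only pages without test cases are ever recorded, so a page that does contain tests is rerun on every pass whether or not it has changed. If a skipped page is later edited, its hash changes and it is converted again. Persisting `no_test_hashes` between runs (for example, one digest per line in a file) is left out of this sketch.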