On Oct 30, 2011, at 1:19, humanengr wrote:
>> I'm not sure how you calculate the sha1 hash.
> I'm using openssl sha1 <<file>>.
>> But usually it's calculated from the data fork content. That changes
>> when you save embedded notes, but not when you save just as PDF
>> (i.e. not embedded or without notes). (Unless you've converted
> Yes, that's what I'm seeing. The odd thing is that if one takes, say,
> an rtf and calculates a hash, edits, saves (yielding a different
> hash), and undoes the edit, the hash is restored to the original
> value. But the corresponding process with a pdf (adding a Skim note,
> embedding, and removing) doesn't restore the hash. It's as if the
> process of editing leaves a "residue".
> If it's not restored to the original value, it's missed as a
> duplicate when I use, say FileBuddy or EagleFiler. My tentative
> solution is to rely on other metadata to identify possible duplicates.
PDF data is far from uniquely determined by its information content. So there is no reason why saving the same PDF at different times should produce exactly the same data (and with different programs or libraries the output will vary even more). This is very different for plain text and RTF. So there's nothing odd about it; it's the way it is and what you should expect.
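
To illustrate, here is a minimal Python sketch of what a byte-level hash such as openssl sha1 sees. The function names (sha1_of_file, sha1_of_bytes) and the two byte strings are made up for the example. They are not real Skim output, and the differing /ModDate entry is only one assumed way in which two saves of the same document could differ.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA-1 hex digest of a file's raw bytes (the data fork)."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so large PDFs need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha1_of_bytes(data):
    """Return the SHA-1 hex digest of an in-memory byte string."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical stand-ins for "the same PDF" written out twice.
# Assumption: only a modification date differs; the visible content is identical.
save_1 = b"%PDF-1.4\n... /ModDate (D:20111030011900) ...\n%%EOF\n"
save_2 = b"%PDF-1.4\n... /ModDate (D:20111030012000) ...\n%%EOF\n"

# The hash covers every byte, so any difference in the written data changes it.
print(sha1_of_bytes(save_1) == sha1_of_bytes(save_2))
```

A duplicate finder that compares such hashes can only detect byte-identical files, so for PDFs that have been re-saved, comparing other metadata (as you suggest) is the more reliable approach.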