From: Mojca M. <moj...@gm...> - 2015-11-24 22:31:28
On 24 November 2015 at 23:15, Eric S. Raymond <es...@th...> wrote:
> Daniel J Sebald <dan...@ie...>:
>> >But the real show-stopper is that, if this project is anything like
>> >typical, a single ChangeLog entry often actually summarizes work from
>> >*multiple* commits. It may not be clear which ones, since one of
>> >the bedevilling quirks of older CVSes was that commit timestamps were
>> >taken *client-side* and tended to be flaky.
>>
>> Could the documentation be integrated into the git commit history in an
>> approximate way? Say, associate all the comments for a particular day with
>> the last CVS entry for that date? But it would have to be that the full
>> Changelog header and date appears in the comment. So for example, one
>> commit message might have three Changelog entries listed. It would only be
>> an approximate alignment of commit messages and the changelog, but anyone
>> who would go through the history to manually decipher and reconstruct things
>> would be faced with an approximate association with what is in CVS anyway.
>
> Yes, something like that could be attempted. The reason it's never been done
> is that the implementation would be a swamp of complexity, and the results of
> very dubious quality.
>
> The easiest way I could imagine to do it (all the alternatives would require
> a larger volume of custom code) would be to:
>
> (1) Write a Python program that could convert the entire sequence of ChangeLog
> into a shelf object keyed by date, with the values being a pair consisting of
> a name and the comment text.
>
> (2) Write a custom plugin for reposurgeon that would load the shelf object
> produced by the previous program and walk through the commit sequence looking
> for where a copy of each item should be inserted.
>
> The heart of the code would be a predicate that takes as arguments the
> following:
>
> * the git commit date
> * the git commit committer ID
> * a ChangeLog entry date
> * a ChangeLog author ID
>
> and returns yes or no according as the entry should or should not be appended
> to the comment of the specified commit.
>
> Good luck writing a predicate that produces consistently reasonable results.
> Here are some of the complications:
>
> * ChangeLog entry dates only have resolution to a day
>
> * The commit dates are unreliable both due to clock skew and unreported
>   time-zone offsets (remember CVS commit stamps are taken client side)
>
> * Git committer IDs may not match ChangeLog committer IDs even if they
>   were the same person. There are just too many ways for email addresses
>   and personal names to have variations that are transparent to a human
>   but not to a string-matching algorithm.
>
> As an example of the latter, I've often run into situations where a committer
> used a correct spelling of his name (featuring, for example, a Latin-1 umlaut)
> in one context and a plain-ASCII approximation in the other.
>
> My prediction is that the attempt will not end well.

Honestly I don't see the reason to attempt this. Gnuplot's ChangeLog
seems like a manually written document to me, usually citing the real
authors of patches, while the CVS log would mention the committer.

Unless someone goes through the list semi-manually, I don't see a way
to make this work reliably, and I also don't see any reason to do so
now as a prerequisite for the conversion. Anyone can check the old
ChangeLog and then find the relevant release in the git log if needed.

Mojca
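
For illustration only, the predicate Eric describes could be sketched in Python roughly as follows. Nothing here comes from reposurgeon: the names `fold_name` and `entry_matches_commit`, the one-day tolerance, and the accent-stripping name comparison are all assumptions made up for this sketch, and it compares personal names only, ignoring email addresses.

```python
import unicodedata
from datetime import date, datetime, timedelta, timezone


def fold_name(name: str) -> str:
    """Reduce a personal name to lowercase ASCII for a loose comparison.

    NFKD splits an accented letter into base letter + combining mark;
    encoding to ASCII with errors ignored then drops the mark.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return " ".join(ascii_only.lower().split())


def entry_matches_commit(commit_time: datetime, committer: str,
                         entry_date: date, entry_author: str,
                         slack_days: int = 1) -> bool:
    """Hypothetical predicate: should this ChangeLog entry be appended
    to the comment of this commit?

    commit_time is expected to be timezone-aware. slack_days is an
    assumed tolerance for clock skew and for the one-day resolution
    of ChangeLog dates.
    """
    commit_day = commit_time.astimezone(timezone.utc).date()
    close_enough = abs((commit_day - entry_date).days) <= slack_days
    same_person = fold_name(committer) == fold_name(entry_author)
    return close_enough and same_person
```

Even this toy version shows the problem. It treats "Jürgen Müller" and "Jurgen Muller" (invented names) as the same person, but not "Juergen Mueller", and it can never match an entry that credits the patch author rather than the committer.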