Rudimentary commit message linting
Sending GitHub PRs to the Git mailing list
Brought to you by:
ashba22
Originally created by: dscho
To make it easier for new contributors, we should probably have our own "linter" for commit messages, i.e. validate that at least some of the conventions are met:
tests:), it should continue in lower-caseThere is also the convention that the commit message should talk in the imperative tone (as opposed to the past tense, as many other projects prefer). I am not sure that there is an easy way to test for that.
If any of these conventions are not met, GitGitGadget should talk about that, in a way that helps the contributors (e.g. suggest amending the commit, or using an interactive rebase if it is not the latest commit, force-push to update the PR, etc).
Originally posted by: webstech
There are some github apps for commit linting but they do not appear to match these requirements. Not sure if they can be extended to include these requirements but a config file should be able to control what is needed for a generic linter. Just wondering if this issue is better addressed with a github action responding to a pull request or a push to a pull request versus a specific PR comment. There is also the question of whether it would not be a good commit hook for git in general.
Originally posted by: dscho
I would be in favor, but that GitHub Action needs to be in
master, which is controlled by the Git maintainer, not by GitGitGadget.Sure. But there is an entire ecosystem of
npmpackages out there that Unix shell scripts typically cannot access (read: the common Git hooks), but GitGitGadget can.Originally posted by: webstech
I have an initial POC/WIP at https://github.com/webstech/gitgitgadget/commit/888c4f53429f6f5801c89f6e7319e9f24a380713. The message linting is done in a separate class with methods to perform one or more checks based on complexity. Just want to make sure you agree on the basic design before proceeding further. As always, suggestions are welcome.
Originally posted by: dscho
Very nice! I left a couple of comments there.
That makes a total lot of sense!
I love it.
At some stage, I think we will want to make the tests a bit more DRY and maybe split up
ci-helper.tests.tsto allow for more parallel testing (or does Jest already allow parallelizing tests within a single script?), but that can wait for later, it is not pressing yet, I would think.Originally posted by: webstech
Thanks for the input.
I agree on splitting up ci-helper.test.ts and reducing the DRY of the pieces. The q&d of getting tests in place makes it easy to end up with bloated test code. I am thinking about this (and possible defaults mentioned in your comments on the WIP). This is post this change.
I have pushed an update to the WIP. If you have no major issues (or in spite of, to make them more public), I will open a PR after doing some code/doc cleanup.
Just trying to get the linting framework agreed to and in place. It would help if you can prioritize the initial list above (you are more familiar with the concerns/issues from the git list).
After that we can all compete to implement new linting asap. \<insert laughing emoji here/>
One linting request is for imperative - would that mean a negative dictionary for possessive nouns? Sounds like this could be a separate project (if one is not already out there).
Originally posted by: dscho
Could you open the PR right away? That would make it much easier for me to keep track of changes. It does not have to be perfect, of course, if you want to, you can even mark it as a Draft PR.
🤣
I think that people already tried to work on this, see e.g. https://github.com/spencermountain/compromise/issues/499. It also looks as if https://www.npmjs.com/package/node-nlp should have something to help with that task, but I did not really find anything.
Originally posted by: dscho
I guess https://github.com/clns/node-commit-msg would be a much more pragmatic route to take ;-)
Originally posted by: webstech
Java required but no source for the parser? Didn't really look at what the parser is supposed to do.
Originally posted by: dscho
Oy, sorry, I failed to see that it is not implemented in Javascript :-(
OTOH I wonder whether
parser.jaris simply a pre-compiled version of the Stanford Parser, as mentioned in the npm package description:I found a couple other npm modules to validate commit messages, but they seem to follow the "Conventional Commits" guidelines (which disagree with Git's own requirements for commit messages).
In any case, this is not relevant for the initial commit linting ;-)
Originally posted by: dscho
I experimented with
wink-pos-tagger:outputs
Note the tag
VBP, which according to https://github.com/trinker/tagger#interpreting-tags means "verb, present tense, not 3rd person singular".Replacing the word
havewithhasturns this tag intoVBZwhich means "verb, present tense, 3rd person singular".It might be possible to come up with a set of simple rules what
VB*tags we want to allow in the first line of the commit message. My idea was to experiment with this a bit by using Git's own commit history and just generating some statistics next, like, which tags occur how often, then see if e.g. "VBD,VBGandVBNare not okay" would result in any false positives (I am more concerned with suppressing false positives than with preventing false negatives).