The general idea is to store the test results for selected revisions on the server, where they can be reviewed manually. If the result images of a test case are identical, they share the same mark.
The client-side code is based on Guiu Rocafort's solution. I added a module that packs the output directory, adds some metadata, and sends the archive to the server. Instead of the provided input files, I use vanilla W3C test cases from . The server-side application is implemented as a Django project. It stores information about test cases, test results, and image marks in a database. Test results can be browsed and reviewed through a web interface; changing an image mark is available to logged-in users only.
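The packing step could be sketched roughly as follows. This is a minimal illustration using only the Python standard library; the function name `pack_results`, the `revision` field, and the exact metadata layout are assumptions, not the actual module's API:

```python
import io
import json
import tarfile
import time
from pathlib import Path

def pack_results(output_dir, revision):
    """Pack a test-output directory into an in-memory tar.gz archive,
    adding a metadata.json member describing the run.
    (Hypothetical sketch; the real module may differ.)"""
    output_dir = Path(output_dir)
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        # Add every result file produced by the test run.
        for path in sorted(output_dir.rglob("*")):
            if path.is_file():
                tar.add(path, arcname=path.relative_to(output_dir).as_posix())
        # Attach run metadata as an extra archive member.
        meta = json.dumps({"revision": revision,
                           "timestamp": int(time.time())}).encode()
        info = tarfile.TarInfo("metadata.json")
        info.size = len(meta)
        tar.addfile(info, io.BytesIO(meta))
    return buf.getvalue()
```

The resulting bytes can then be sent to the server in a single POST request.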
To aid in reviewing images, I used image comparison functions from the resemble.js library. When an Inkscape-generated image is first clicked, it is replaced by its raw difference from the reference image. A second click switches to a diff that ignores antialiasing, and a third click returns to the original image.
The next step will be to implement the generation of a reference image library on the server, consisting of known passes and known fails. When a local test suite is run, the client will download the up-to-date library and compare the test results to the reference images. It can also calculate regressions and new passes relative to the latest revision stored on the server.
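The regression/new-pass calculation could look something like this. It is a sketch under the assumption that each run can be reduced to a map from test-case name to a pass/fail boolean; the function name `compare_runs` is hypothetical:

```python
def compare_runs(reference, current):
    """Compare a reference pass/fail map with a new run's results.

    Both arguments map test-case names to True (pass) or False (fail).
    Returns (regressions, new_passes): tests that flipped from pass to
    fail, and tests that flipped from fail to pass, respectively.
    (Hypothetical sketch of the planned comparison step.)"""
    # Only compare tests present in both runs; added/removed tests
    # would need separate reporting.
    common = reference.keys() & current.keys()
    regressions = sorted(t for t in common if reference[t] and not current[t])
    new_passes = sorted(t for t in common if not reference[t] and current[t])
    return regressions, new_passes
```

For example, if a test passed in the stored revision but fails locally, it appears in the regressions list.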
I have noticed that test results are greatly influenced by the fonts used (e.g. ). Since the test fonts are supplied with the W3C test suite, it would be nice to use them. Does Inkscape support SVG and WOFF fonts? If so, is there a command-line switch or environment variable for selecting the font directory? Do you think it would be sensible to add such an option?
I haven't rated the images yet; I'll do it over the weekend.