We were seeing test3 "Get the /MediaBox of pdf files" fail on the Ubuntu autopkgtest infrastructure -- first the test was producing unexpected stderr of "XtCreatePopupShell requires non-NULL parent", which I think is because the code is trying to report an error without the UI being around. I changed test3 to set "update_figs" to 1, which caused this message on stderr:
Cannot open pipe with command:
gs -q -dNODISPLAY "--permit-file-read=../.././data/cross.pdf" -c "(../.././data/cross.pdf) (r) file runpdfbegin 1 pdfgetpage /MediaBox pget pop == quit"
Adding some printfs to u_ghostscript suggested that gs --version was dying with SIGPIPE (possibly because in this case it was printing "9.53.3" and the fscanf that u_ghostscript does only reads the "9.53" part, although that does seem a bit strange). Anyway, I added "(void) signal(SIGPIPE, SIG_IGN);" to test3.c and the test started passing again. I'll attach the patch, but it's pretty trivial.
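To illustrate what the SIG_IGN workaround changes, here is a minimal sketch, assuming a POSIX system. The helper name errno_of_write_without_reader is hypothetical and is not part of test3.c or xfig. With SIGPIPE ignored, a write to a pipe that has no reader fails with EPIPE instead of killing the writer:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only (not xfig code).
 * Ignores SIGPIPE, writes to a pipe whose read end is already closed,
 * and returns the resulting errno (0 if the write succeeded, -1 on
 * setup errors). */
int errno_of_write_without_reader(void)
{
    int fds[2];
    int err;
    void (*old)(int) = signal(SIGPIPE, SIG_IGN);

    if (old == SIG_ERR)
        return -1;
    if (pipe(fds) < 0) {
        signal(SIGPIPE, old);
        return -1;
    }
    close(fds[0]);                  /* no reader is left on the pipe */
    err = (write(fds[1], "3\n", 2) < 0) ? errno : 0;
    close(fds[1]);
    signal(SIGPIPE, old);           /* restore the previous disposition */
    return err;
}
```

An ignored signal disposition is inherited across fork and exec, which is presumably why setting it in test3.c also affects the gs process started from there. It hides the symptom rather than removing the early close of the pipe.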
Thank you for the detailed report. This ticket comprises three issues.
Could you try adding, after the line
n = fscanf(fp, "%lf", &rev);
the line
while (fgetc(fp) != EOF) ;
? This avoids the broken pipe in the first place. fscanf only reads the "9.53" part of the output "9.53.3", hence the output from the pipe is not entirely consumed and SIGPIPE results. I will also have to look for other places in the code where the same might happen.
That might be my fault: I build xfig in the Debian package with libgs.
But in the autopkgtest I forgot to depend on libgs-dev, so in the testing environment we have a different package set installed than when building xfig.
As a result, test3 behaves differently in the autopkgtest than in the build environment.
Maybe I should add test1, test2, test3 to the xfig package at build time and restore them to the test dir when doing the autopkgtest. But this means that we ship three useless test binaries in the xfig package, which I also do not like.
But this may be the only way to make autopkgtest a realistic test for regressions, since building test1/2/3 during the autopkgtest isn't a realistic test for the xfig binary.
But my main problem currently is that I cannot reproduce the issue on my system, which makes it hard to check whether a change solves the issue...
Neither can I reproduce a SIGPIPE failure. Perhaps some compiler sanitizer flag or an environment variable is needed, who knows.
If the autopkgtest tries to test the distributed binary, tests 1-3 are mostly useless: these tests are useful at compilation time, if at all, and provide only tiny coverage. Hence I would suggest omitting these tests in autopkgtest. I am not familiar with autopkgtest, though.
In any case, it makes sense to apply the above lines to read the entire output in the pipe. It would be good to test the fix.
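For reference, the read-then-drain pattern could look roughly like the sketch below. This is not the actual u_ghostscript code: the function name read_version and its return convention are assumptions made for illustration, and only the fscanf and fgetc lines come from this ticket.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical helper, for illustration only (not xfig code).
 * Runs cmd, parses a leading floating-point number from its output
 * into *rev, then consumes the rest of the output before closing.
 * Returns 0 on success, -1 on any failure. */
int read_version(const char *cmd, double *rev)
{
    FILE *fp = popen(cmd, "r");
    int n;

    if (fp == NULL)
        return -1;
    n = fscanf(fp, "%lf", rev);     /* reads "9.53" out of "9.53.3" */
    while (fgetc(fp) != EOF)        /* drain, so the writer never sees a closed pipe */
        ;
    if (pclose(fp) != 0 || n != 1)
        return -1;
    return 0;
}
```

Called as read_version("gs --version", &rev), this would leave nothing unread in the pipe when pclose runs, so the writer has no reason to receive SIGPIPE.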
I tested removing the SIGPIPE handling and added "while (fgetc(fp) != EOF) ;" in the relevant test and the test passed in the setup where it failed before, so that provides some evidence that my diagnosis was correct. I did notice that gs --version produces its output in two syscalls:
I assume if it did it all in one, the SIGPIPE would never occur in practice. I have no idea what is causing it to occur in our test environment with such consistency -- it still seems very surprising to me.
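The effect of a two-part write can be reproduced deterministically with a small sketch, assuming a POSIX system. Everything here is hypothetical: the function name, the split of the output into "9.53.3" and a newline, and the second pipe used to force the ordering are illustration choices, not what gs or xfig actually do.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo (not xfig code). A child writes its output in two
 * write() calls; the parent reads the first part and then either drains
 * the pipe (drain != 0) or closes it early (drain == 0).
 * Returns 1 if the child was killed by SIGPIPE, 0 if not, -1 on errors. */
int writer_killed_by_sigpipe(int drain)
{
    int data[2], go[2], status;
    char buf[16];
    pid_t pid;

    if (pipe(data) < 0)
        return -1;
    if (pipe(go) < 0) {
        close(data[0]); close(data[1]);
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(data[0]); close(data[1]); close(go[0]); close(go[1]);
        return -1;
    }
    if (pid == 0) {                         /* child: the "gs" stand-in */
        char c;
        signal(SIGPIPE, SIG_DFL);           /* make sure SIGPIPE is fatal here */
        close(data[0]);
        close(go[1]);
        if (write(data[1], "9.53.3", 6) != 6)
            _exit(2);
        if (read(go[0], &c, 1) < 0)         /* wait until the parent closes go[1] */
            _exit(4);
        if (write(data[1], "\n", 1) != 1)   /* second write; may raise SIGPIPE */
            _exit(3);
        _exit(0);
    }
    close(data[1]);
    close(go[0]);
    if (read(data[0], buf, sizeof buf) < 0) {   /* first part of the output */
        close(data[0]); close(go[1]);
        waitpid(pid, &status, 0);
        return -1;
    }
    if (drain) {
        close(go[1]);                       /* let the child continue ... */
        while (read(data[0], buf, sizeof buf) > 0)
            ;                               /* ... and consume the rest */
        close(data[0]);
    } else {
        close(data[0]);                     /* reader gives up early */
        close(go[1]);                       /* then let the child continue */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGPIPE;
}
```

In this sketch the early close always wins the race because the child is held back on purpose. In a real run the ordering depends on scheduling, which may be why the failure shows up in one environment and not in another.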
Looking at how test1/test2/test3 operate I agree that they are not really testing the installed package, which is the goal of an autopkgtest. But I guess they are not without value, as they should still catch a change in a dependency that breaks xfig. And presumably the fact that these tests fail in an autopkgtest but pass during build is just down to luck.
Commits [b1ab4f] and [21059d] should fix the issues that (i) a GUI is demanded on error, and (ii) the broken pipe.
Related
Commit: [21059d]
Commit: [b1ab4f]