|
From: Nicholas N. <n.n...@gm...> - 2023-03-06 11:03:32
|
Hi, Perl was a reasonable choice for `cg_annotate` when I first wrote it 20+ years ago. But it's unfortunate now, with Perl being (a) a pretty weird and horrible language, and (b) moribund. I'd like to rewrite it (and `cg_diff`) in Python, which will make maintenance easier. I see that we already have some Python in Valgrind: `coregrind/m_gdbserver/valgrind-monitor.py` and `coregrind/m_gdbserver/valgrind-monitor-def.py`. Therefore I don't think this should be controversial. But I might as well ask, just in case: any objections or advice? Because these are single file scripts, we avoid all the usual problems of Python packaging, and just use `cp` as the package manager :) On a related note, the `cg_annotate.in`/`cg_annotate` split is annoying. The only reason for it now is to auto-embed the version number into the script, via the configure `@VERSION@` variable, for `cg_annotate --version` output. Does anyone know of a way to achieve that without requiring configure? Thanks. Nick |
|
From: Paul F. <pj...@wa...> - 2023-03-12 12:01:02
|
On 06-03-23 12:03, Nicholas Nethercote wrote: > Hi, > > Perl was a reasonable choice for `cg_annotate` when I first wrote it 20+ > years ago. But it's unfortunate now, with Perl being (a) a pretty weird > and horrible language, and (b) moribund. > > I'd like to rewrite it (and `cg_diff`) in Python, which will make > maintenance easier. I see that we already have some Python in Valgrind: > `coregrind/m_gdbserver/valgrind-monitor.py` and > `coregrind/m_gdbserver/valgrind-monitor-def.py`. Therefore I don't think > this should be controversial. > > But I might as well ask, just in case: any objections or advice? Because > these are single file scripts, we avoid all the usual problems of Python > packaging, and just use `cp` as the package manager :) > > On a related note, the `cg_annotate.in > <http://cg_annotate.in>`/`cg_annotate` split is annoying. The only > reason for it now is to auto-embed the version number into the script, > via the configure `@VERSION@` variable, for `cg_annotate --version` > output. Does anyone know of a way to achieve that without requiring > configure? Hi Nick Scripting languages aren't my strong point, but I'd say for me Python is the least worst choice. The only think I can think of to get the version is to use something like pkg-config --modversion valgrind Cheers Paul |
|
From: Nicholas N. <n.n...@gm...> - 2023-03-14 19:16:17
|
On Sun, 12 Mar 2023 at 23:01, Paul Floyd <pj...@wa...> wrote: > > The only think I can think of to get the version is to use something like > > pkg-config --modversion valgrind > Thanks for the suggestion. Unfortunately this could cause misleading results. E.g. if I have Valgrind installed on my system but I also have a development version, when I run the development version of `cg_annotate --version` it will claim to be the installed version. I think the `@VERSION@` junk is unavoidable. Nick |
|
From: Nicholas N. <n.n...@gm...> - 2023-03-25 00:24:13
|
On Fri, 24 Mar 2023 at 22:52, Mark Wielaard <ma...@kl...> wrote: > > I completely agree with this sentiment. But how do you get there? Ruthless pragmatism and an incremental approach :) > And > how do you cross the psychological barrier. I mean that it feels like > cheating to just disable failing or flaky tests. > They might fail on some, but not all setups. Or they might even be just > flaky depending on CPU model (I think I saw some failures with an AMD > Ryzen processor, which succeeded on an Intel Xeon processor). > > What should our policy be to get to zero fail? > Does that mean a test should always pass on any arch/setup? > Or do we make exceptions for tests that fail on some setups? > Do we keep an "exception list" based on...? > What do we do with the "removed" (or excepted) tests? > Do those turn into high priority bugs instead? > What about new ports, they often start with a bunch of failing tests. > One way to do it is to divide the tests into "must pass on CI" and "the rest". I suspect there are plenty of tests that work on all platforms, which would give a lot of useful coverage from the start. Over time you can hopefully move tests from the first category to the second. The other way to do it is to divide the tests into "run on CI" and "don't run on CI", i.e. exceptions, which does require a mechanism for specifying those exceptions. In practice I think this works out much the same as the first approach, because a test that consistently fails on one platform isn't much use. (In fact, it can have negative value if its presence masks new failures in other tests.) One consequence of all this is that the CI platforms become gospel. E.g. if a test passes on CI but fails locally, that's good enough. This is fine in practice, assuming the CI platforms are reasonable choices. Flaky tests can be a problem. For rare failures you can always just trigger another CI run. For regular failures you should either fix the test or disable it. Nick |
|
From: Mark W. <ma...@kl...> - 2023-04-01 22:56:07
|
Hi Nick, On Sat, Mar 25, 2023 at 11:23:54AM +1100, Nicholas Nethercote wrote: > One way to do it is to divide the tests into "must pass on CI" and "the > rest". I suspect there are plenty of tests that work on all platforms, > which would give a lot of useful coverage from the start. Over time you can > hopefully move tests from the first category to the second. So for the original buildbot CI I set it up to only run the auxtests (that is build gsl and run the testsuite under valgrind). This used to work on all setups, but after a gcc (or glibc?) update some setups started failing as you can see at: https://builder.sourceware.org/buildbot/#/builders?tags=valgrind For the try builders I took out the now failing builders (debian-testing-x86_64, fedora-arm64, fedora-x86_64, ibm-power10, opensusetw-x86_64 and rawhide-x86_64). https://builder.sourceware.org/buildbot/#/builders?tags=valgrind-try Hopefully we can figure out why the auxtests fail now on some of these setups. > The other way to do it is to divide the tests into "run on CI" and "don't > run on CI", i.e. exceptions, which does require a mechanism for specifying > those exceptions. In practice I think this works out much the same as the > first approach, because a test that consistently fails on one platform > isn't much use. (In fact, it can have negative value if its presence masks > new failures in other tests.) So should we make a new check target that runs a subset of make regtests? Or maybe have a "ci mode" for regtests? So you would run CI_MODE=true make regtests and it would skip any vgtest that has prereq: test -z "$CI_MODE" And then add make regtests to the ci and try buildbots? Cheers, Mark |
|
From: Floyd, P. <pj...@wa...> - 2023-04-03 16:04:19
|
On 02/04/2023 00:55, Mark Wielaard wrote: > > So should we make a new check target that runs a subset of make > regtests? Or maybe have a "ci mode" for regtests? So you would run > CI_MODE=true make regtests and it would skip any vgtest that has > prereq: test -z "$CI_MODE" > > And then add make regtests to the ci and try buildbots? How long does make regtest take, compared to make auxtests? There is a patchset in bugzilla to get regtest to run in parallel https://bugs.kde.org/show_bug.cgi?id=319307 that could speed things up. If we do want to have a CI mode the way I'd do it is to add an option to tests/vg_regtest to run tests from a file containing a list and then put a list of tests in tests/ci_list It would be good to do more kinds of testing. Adding unit tests would be a bit of a challenge - we'd probably have to write out own VG_(unit_test) framework. We could do a bit of fuzz testing on the command line interface, but probably not much with the testcases. Fuzzing the testcase source files would be more of a test of the compiler. Maybe ELF fuzzing e.g., https://github.com/IOActive/Melkor_ELF_Fuzzer. On the larger scale, it would be good to be able to run more 3rd party suites with Valgrind. I've run the FreeBSD libc tests by simply running everything under Valgrind. Can we do something similar for glibc or other popular libraries? A+ Paul |
|
From: Nicholas N. <n.n...@gm...> - 2023-03-17 05:24:31
|
I have finished the rewrite. I am happy with the new code, it is much better than the old code. You can see it at https://bugs.kde.org/show_bug.cgi?id=467472. I plan to merge it by the end of next week, and I am happy to hear any suggestions. I also have some good news about the `cg_annotate.in`/`cg_annotate` split. I learned that you can generate the latter from the former very quickly with `config.status cachegrind/cg_annotate.in`. Also, this can be done automatically from some make targets. So I ended up creating a new make target `make ann` that can be run within the `cachegrind` directory. It runs the various Python formatters, type-checkers, and linters I am using on `cg_annotate.in` and then generates `cg_annotate`. It's a one-step "build" command that runs quickly, which is great. Nick On Wed, 15 Mar 2023 at 06:15, Nicholas Nethercote <n.n...@gm...> wrote: > On Sun, 12 Mar 2023 at 23:01, Paul Floyd <pj...@wa...> wrote: > >> >> The only think I can think of to get the version is to use something like >> >> pkg-config --modversion valgrind >> > > Thanks for the suggestion. Unfortunately this could cause misleading > results. E.g. if I have Valgrind installed on my system but I also have a > development version, when I run the development version of `cg_annotate > --version` it will claim to be the installed version. I think the `@VERSION@` > junk is unavoidable. > > Nick > |
|
From: Nicholas N. <n.n...@gm...> - 2023-03-21 23:23:08
|
I have merged the new version of `cg_annotate`: https://sourceware.org/git/?p=valgrind.git;a=commit;h=4650b7949ae3a41326e52ae454a9202493c41444 Nick On Fri, 17 Mar 2023 at 16:24, Nicholas Nethercote <n.n...@gm...> wrote: > I have finished the rewrite. I am happy with the new code, it is much > better than the old code. You can see it at > https://bugs.kde.org/show_bug.cgi?id=467472. I plan to merge it by the > end of next week, and I am happy to hear any suggestions. > > I also have some good news about the `cg_annotate.in`/`cg_annotate` > split. I learned that you can generate the latter from the former very > quickly with `config.status cachegrind/cg_annotate.in`. Also, this can be > done automatically from some make targets. So I ended up creating a new > make target `make ann` that can be run within the `cachegrind` directory. > It runs the various Python formatters, type-checkers, and linters I am > using on `cg_annotate.in` and then generates `cg_annotate`. It's a > one-step "build" command that runs quickly, which is great. > > Nick > > On Wed, 15 Mar 2023 at 06:15, Nicholas Nethercote <n.n...@gm...> > wrote: > >> On Sun, 12 Mar 2023 at 23:01, Paul Floyd <pj...@wa...> wrote: >> >>> >>> The only think I can think of to get the version is to use something like >>> >>> pkg-config --modversion valgrind >>> >> >> Thanks for the suggestion. Unfortunately this could cause misleading >> results. E.g. if I have Valgrind installed on my system but I also have a >> development version, when I run the development version of `cg_annotate >> --version` it will claim to be the installed version. I think the `@VERSION@` >> junk is unavoidable. >> >> Nick >> > |
|
From: Nicholas N. <n.n...@gm...> - 2023-04-03 10:04:56
|
On Sun, 2 Apr 2023 at 08:55, Mark Wielaard <ma...@kl...> wrote: > > So for the original buildbot CI I set it up to only run the auxtests > (that is build gsl and run the testsuite under valgrind). This used to > work on all setups, but after a gcc (or glibc?) update some setups > started failing as you can see at: > https://builder.sourceware.org/buildbot/#/builders?tags=valgrind > > For the try builders I took out the now failing builders > (debian-testing-x86_64, fedora-arm64, fedora-x86_64, ibm-power10, > opensusetw-x86_64 and rawhide-x86_64). > https://builder.sourceware.org/buildbot/#/builders?tags=valgrind-try > > Hopefully we can figure out why the auxtests fail now on some of these > setups. > What are the auxtests? I don't remember hearing about them and a Google search didn't turn up anything helpful. I looked at some of the failing builders, they all fail like this: > for gsl_test in block cblas cdf cheb combination complex const deriv dht diff eigen err fft fit histogram ieee-utils integration interpolation linalg matrix min monte multifit multimin multiroots ntuple ode-initval permutation poly qrng rng roots sort specfunc statistics sum sys vector wavelet; do echo $gsl_test; done \ > | cmp - /home/builder/valgrind-auxtests/gsl-build-g-O3/valgrind-gsl.out || \ > diff -u /home/builder/shared/bb1-2/worker/valgrind-debian-testing-x86_64/build/auxprogs/gsl-1.6.out.x86.exp \ > /home/builder/valgrind-auxtests/gsl-build-g-O3/valgrind-gsl.out > - /home/builder/valgrind-auxtests/gsl-build-g-O3/valgrind-gsl.out differ: char 226, line 32 > --- /home/builder/shared/bb1-2/worker/valgrind-debian-testing-x86_64/build/auxprogs/gsl-1.6.out.x86.exp 2022-08-09 19:45:48.726405142 +0000 > +++ /home/builder/valgrind-auxtests/gsl-build-g-O3/valgrind-gsl.out 2023-04-02 13:43:46.609646296 +0000 > @@ -15,7 +15,6 @@ > histogram > ieee-utils > integration > -FAIL: qawo(f456) elist (7.25063790881233303e-15 observed vs 7.25922435194575979e-15 expected) > interpolation > linalg > matrix > @@ -30,6 +29,10 @@ > poly > qrng > rng > +FAIL: random32-bsd, 10000 steps (852261210 observed vs 1663114331 expected) > +FAIL: random64-bsd, 10000 steps (210970120 observed vs 864469165 expected) > +FAIL: random32-libc5, 10000 steps (367802360 observed vs 1967452027 expected) > +FAIL: random64-libc5, 10000 steps (221021662 observed vs 2106639801 expected) > roots > sort > specfunc > make[1]: Leaving directory '/home/builder/shared/bb1-2/worker/valgrind-debian-testing-x86_64/build/auxprogs' > > I don't know why it's comparing against `gsl-1.6.out.x86.exp` even on x86-64, arm64, ppc64le, and so on. So should we make a new check target that runs a subset of make > regtests? Or maybe have a "ci mode" for regtests? So you would run > CI_MODE=true make regtests and it would skip any vgtest that has > prereq: test -z "$CI_MODE" > > And then add make regtests to the ci and try buildbots? > Seems reasonable. Nick |
|
From: Mark W. <ma...@kl...> - 2023-04-03 21:08:50
|
Hi Nick, On Mon, Apr 03, 2023 at 08:04:37PM +1000, Nicholas Nethercote wrote: > On Sun, 2 Apr 2023 at 08:55, Mark Wielaard <ma...@kl...> wrote: > > Hopefully we can figure out why the auxtests fail now on some of these > > setups. > > > > What are the auxtests? I don't remember hearing about them and a Google > search didn't turn up anything helpful. It is what you get when you do make auxchecks which is a target inside the auxprogs directory. See #---------------------------------------------------------------------------- # Auxiliary testsuits #---------------------------------------------------------------------------- auxchecks: gsl-check It basically downloads and builds the (very old) GNU Scientific Library 1.6. Then it runs the testsuite under valgrind. > I looked at some of the failing builders, they all fail like this: > > > for gsl_test in block cblas cdf cheb combination complex const deriv dht diff eigen err fft fit histogram ieee-utils integration interpolation linalg matrix min monte multifit multimin multiroots ntuple ode-initval permutation poly qrng rng roots sort specfunc statistics sum sys vector wavelet; do echo $gsl_test; done \ > > | cmp - /home/builder/valgrind-auxtests/gsl-build-g-O3/valgrind-gsl.out || \ > > diff -u /home/builder/shared/bb1-2/worker/valgrind-debian-testing-x86_64/build/auxprogs/gsl-1.6.out.x86.exp \ > > /home/builder/valgrind-auxtests/gsl-build-g-O3/valgrind-gsl.out > > - /home/builder/valgrind-auxtests/gsl-build-g-O3/valgrind-gsl.out differ: char 226, line 32 > > --- /home/builder/shared/bb1-2/worker/valgrind-debian-testing-x86_64/build/auxprogs/gsl-1.6.out.x86.exp 2022-08-09 19:45:48.726405142 +0000 > > +++ /home/builder/valgrind-auxtests/gsl-build-g-O3/valgrind-gsl.out 2023-04-02 13:43:46.609646296 +0000 > > @@ -15,7 +15,6 @@ > > histogram > > ieee-utils > > integration > > -FAIL: qawo(f456) elist (7.25063790881233303e-15 observed vs 7.25922435194575979e-15 expected) > > interpolation > > linalg > > matrix > > @@ -30,6 +29,10 @@ > > poly > > qrng > > rng > > +FAIL: random32-bsd, 10000 steps (852261210 observed vs 1663114331 expected) > > +FAIL: random64-bsd, 10000 steps (210970120 observed vs 864469165 expected) > > +FAIL: random32-libc5, 10000 steps (367802360 observed vs 1967452027 expected) > > +FAIL: random64-libc5, 10000 steps (221021662 observed vs 2106639801 expected) > > roots > > sort > > specfunc > > make[1]: Leaving directory '/home/builder/shared/bb1-2/worker/valgrind-debian-testing-x86_64/build/auxprogs' > > > > I don't know why it's comparing against `gsl-1.6.out.x86.exp` even on > x86-64, arm64, ppc64le, and so on. That is explained in the auxprogs/Makefile just above the gsl-check target: # We hope all tests PASS (so don't produce output except for the test names). # But on x86 we get one FAIL, so that is "fine" too. # We currently don't check stderr, but we probably should. So the (now new) failures are the random tests. Cheers, Mark |
|
From: Nicholas N. <n.n...@gm...> - 2023-03-22 20:32:11
|
Thanks to Paul and Mark for a couple of small fixes to my commit. Yesterday I was idly dreaming about the quality-of-life improvements that would be available if Valgrind was hosted on GitHub: - ability to upload commits ahead of time, in a fashion nicer than "attach patch to bugzilla" - ability to do reviews - CI support for pre-merge testing runs - easier entry for newcomers Does sourceware.org have support for any of these things? Nick On Wed, 22 Mar 2023 at 10:22, Nicholas Nethercote <n.n...@gm...> wrote: > I have merged the new version of `cg_annotate`: > > https://sourceware.org/git/?p=valgrind.git;a=commit;h=4650b7949ae3a41326e52ae454a9202493c41444 > > Nick > > On Fri, 17 Mar 2023 at 16:24, Nicholas Nethercote <n.n...@gm...> > wrote: > >> I have finished the rewrite. I am happy with the new code, it is much >> better than the old code. You can see it at >> https://bugs.kde.org/show_bug.cgi?id=467472. I plan to merge it by the >> end of next week, and I am happy to hear any suggestions. >> >> I also have some good news about the `cg_annotate.in`/`cg_annotate` >> split. I learned that you can generate the latter from the former very >> quickly with `config.status cachegrind/cg_annotate.in`. Also, this can >> be done automatically from some make targets. So I ended up creating a new >> make target `make ann` that can be run within the `cachegrind` directory. >> It runs the various Python formatters, type-checkers, and linters I am >> using on `cg_annotate.in` and then generates `cg_annotate`. It's a >> one-step "build" command that runs quickly, which is great. >> >> Nick >> >> On Wed, 15 Mar 2023 at 06:15, Nicholas Nethercote <n.n...@gm...> >> wrote: >> >>> On Sun, 12 Mar 2023 at 23:01, Paul Floyd <pj...@wa...> wrote: >>> >>>> >>>> The only think I can think of to get the version is to use something >>>> like >>>> >>>> pkg-config --modversion valgrind >>>> >>> >>> Thanks for the suggestion. Unfortunately this could cause misleading >>> results. E.g. if I have Valgrind installed on my system but I also have a >>> development version, when I run the development version of `cg_annotate >>> --version` it will claim to be the installed version. I think the `@VERSION@` >>> junk is unavoidable. >>> >>> Nick >>> >> |
|
From: Paul F. <pj...@wa...> - 2023-03-22 21:02:53
|
On 22-03-23 00:22, Nicholas Nethercote wrote: > I have merged the new version of `cg_annotate`: > https://sourceware.org/git/?p=valgrind.git;a=commit;h=4650b7949ae3a41326e52ae454a9202493c41444 <https://sourceware.org/git/?p=valgrind.git;a=commit;h=4650b7949ae3a41326e52ae454a9202493c41444> Hi Nick I've seen two problems. The first I've already fixed with "env". The other is with typing / TypeAlias. This was added in Python 3.10. On my FreeBSD system "python3" defaults to Python 3.9, so I get ImportError: cannot import name 'TypeAlias' from 'typing' (/usr/local/lib/python3.9/typing.py) That will probably also cause problems on old Linux systems as well. I don't know much Python, but would it be OK to use non-explicit (implicit?) type aliases like in the diff below: The alternative would be to add python3.10 detection, something like AC_CHECK_PROGS([PYTHON3],[python3.10 python3] and then to use @PYTHON3@ in cg_annotate.in and the 3 ann[123]/vgtest files. A+ Paul diff --git a/cachegrind/cg_annotate.in b/cachegrind/cg_annotate.in index 91d75aecd..c3d5f71d4 100755 --- a/cachegrind/cg_annotate.in +++ b/cachegrind/cg_annotate.in @@ -73,7 +73,7 @@ import re import sys from argparse import ArgumentParser, BooleanOptionalAction, Namespace from collections import defaultdict -from typing import Callable, DefaultDict, NewType, NoReturn, TextIO, TypeAlias +from typing import Callable, DefaultDict, NewType, NoReturn, TextIO class Args(Namespace): @@ -323,11 +323,11 @@ class Cc: Flfn = NewType("Flfn", tuple[str, str]) # Per-function CCs. -DictFlfnCc: TypeAlias = DefaultDict[Flfn, Cc] +DictFlfnCc = DefaultDict[Flfn, Cc] # Per-line CCs, organised by filename and line number. -DictLineCc: TypeAlias = DefaultDict[int, Cc] -DictFlDictLineCc: TypeAlias = DefaultDict[str, DictLineCc] +DictLineCc = DefaultDict[int, Cc] +DictFlDictLineCc = DefaultDict[str, DictLineCc] def die(msg: str) -> NoReturn: A+ Paul |
|
From: Nicholas N. <n.n...@gm...> - 2023-03-22 22:52:55
|
Thanks for the suggestion. I have removed the use of `TypeAlias` like you suggested, and documented version expectations. Nick On Thu, 23 Mar 2023 at 08:03, Paul Floyd <pj...@wa...> wrote: > > > On 22-03-23 00:22, Nicholas Nethercote wrote: > > I have merged the new version of `cg_annotate`: > > > https://sourceware.org/git/?p=valgrind.git;a=commit;h=4650b7949ae3a41326e52ae454a9202493c41444 > < > https://sourceware.org/git/?p=valgrind.git;a=commit;h=4650b7949ae3a41326e52ae454a9202493c41444 > > > > Hi Nick > > I've seen two problems. The first I've already fixed with "env". > > The other is with typing / TypeAlias. This was added in Python 3.10. On > my FreeBSD system "python3" defaults to Python 3.9, so I get > > ImportError: cannot import name 'TypeAlias' from 'typing' > (/usr/local/lib/python3.9/typing.py) > > That will probably also cause problems on old Linux systems as well. > > I don't know much Python, but would it be OK to use non-explicit > (implicit?) type aliases like in the diff below: > > The alternative would be to add python3.10 detection, something like > > AC_CHECK_PROGS([PYTHON3],[python3.10 python3] > > and then to use @PYTHON3@ in cg_annotate.in and the 3 ann[123]/vgtest > files. > > A+ > Paul > > > > diff --git a/cachegrind/cg_annotate.in b/cachegrind/cg_annotate.in > index 91d75aecd..c3d5f71d4 100755 > --- a/cachegrind/cg_annotate.in > +++ b/cachegrind/cg_annotate.in > @@ -73,7 +73,7 @@ import re > import sys > from argparse import ArgumentParser, BooleanOptionalAction, Namespace > from collections import defaultdict > -from typing import Callable, DefaultDict, NewType, NoReturn, TextIO, > TypeAlias > +from typing import Callable, DefaultDict, NewType, NoReturn, TextIO > > > class Args(Namespace): > @@ -323,11 +323,11 @@ class Cc: > Flfn = NewType("Flfn", tuple[str, str]) > > # Per-function CCs. > -DictFlfnCc: TypeAlias = DefaultDict[Flfn, Cc] > +DictFlfnCc = DefaultDict[Flfn, Cc] > > # Per-line CCs, organised by filename and line number. > -DictLineCc: TypeAlias = DefaultDict[int, Cc] > -DictFlDictLineCc: TypeAlias = DefaultDict[str, DictLineCc] > +DictLineCc = DefaultDict[int, Cc] > +DictFlDictLineCc = DefaultDict[str, DictLineCc] > > > def die(msg: str) -> NoReturn: > > > > A+ > Paul > > > _______________________________________________ > Valgrind-developers mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-developers > |
|
From: Paul F. <pj...@wa...> - 2023-03-22 21:18:25
|
On 22-03-23 21:31, Nicholas Nethercote wrote: > Thanks to Paul and Mark for a couple of small fixes to my commit. > > Yesterday I was idly dreaming about the quality-of-life improvements > that would be available if Valgrind was hosted on GitHub: > - ability to upload commits ahead of time, in a fashion nicer than > "attach patch to bugzilla" > - ability to do reviews > - CI support for pre-merge testing runs > - easier entry for newcomers > > Does sourceware.org <http://sourceware.org> have support for any of > these things? Hi GH does have a lot going for it, and when they add code browsing it will be even better. One concern though is their owner and creeping commercialization. As far as I know, a lot of GNU projects and Linux still work mostly by by patches sent to mailing lists. One other fairly common system is phabricator (used by LLVM and FreeBSD amonst others). However, the company that developed phabricator closed down so I don't know where that is going (there is a fork). FWIW FreeBSD is increasingly using GH. A+ Paul |
|
From: Nicholas N. <n.n...@gm...> - 2023-04-03 21:42:36
|
On Tue, 4 Apr 2023 at 07:08, Mark Wielaard <ma...@kl...> wrote: > > > I looked at some of the failing builders, they all fail like this: > > > > > [...] > > > integration > > > -FAIL: qawo(f456) elist (7.25063790881233303e-15 observed vs > 7.25922435194575979e-15 expected) > > > interpolation > > > [...] > > > rng > > > +FAIL: random32-bsd, 10000 steps (852261210 observed vs 1663114331 > expected) > > > +FAIL: random64-bsd, 10000 steps (210970120 observed vs 864469165 > expected) > > > +FAIL: random32-libc5, 10000 steps (367802360 observed vs 1967452027 > expected) > > > +FAIL: random64-libc5, 10000 steps (221021662 observed vs 2106639801 > expected) > > > roots > > So the (now new) failures are the random tests. > And it looks like `qawo` has changed to passing, too? Nick |
|
From: Mark W. <ma...@kl...> - 2023-04-03 22:26:53
|
On Tue, Apr 04, 2023 at 07:42:15AM +1000, Nicholas Nethercote wrote: > On Tue, 4 Apr 2023 at 07:08, Mark Wielaard <ma...@kl...> wrote: > > > > > > I looked at some of the failing builders, they all fail like this: > > > > > > > [...] > > > > integration > > > > -FAIL: qawo(f456) elist (7.25063790881233303e-15 observed vs > > 7.25922435194575979e-15 expected) > > > > interpolation > > > > [...] > > > > rng > > > > +FAIL: random32-bsd, 10000 steps (852261210 observed vs 1663114331 > > expected) > > > > +FAIL: random64-bsd, 10000 steps (210970120 observed vs 864469165 > > expected) > > > > +FAIL: random32-libc5, 10000 steps (367802360 observed vs 1967452027 > > expected) > > > > +FAIL: random64-libc5, 10000 steps (221021662 observed vs 2106639801 > > expected) > > > > roots > > > > So the (now new) failures are the random tests. > > > > And it looks like `qawo` has changed to passing, too? qawo was only failing on i386. so the logic in the makefile assumes everything is passing and if not, that it must be a run on i386 where there is only one difference, the FAIL: qawo. See the (still passing) i386 run: https://builder.sourceware.org/buildbot/#/builders/40/builds/310/steps/6/logs/stdio Cheers, Mark |
|
From: Nicholas N. <n.n...@gm...> - 2023-03-22 23:09:11
|
All the Rust projects I work on use GitHub, and in terms of usability and productivity it's miles ahead of how Valgrind development works. If I were king of the world here's how I would drag Valgrind's development practices forward by 10-20 years. - Move the repository to GitHub. Require all changes to be done via pull requests, with no direct pushing. - Set up some CI testing via GitHub Actions. Require that all pull requests pass these tests before merging. - Lots of projects require a review approval before a pull request can be merged. But that might be too hard for Valgrind to start with, given the small number of active contributors. - Switch from KDE bugzilla to GitHub issues for bug reporting. Not sure what I'd do with existing open bug reports, whether it would be worth importing them to GitHub issues somehow or not. - Use auto-formatting tools, such as clang-format. (Possibly even moving from 3 space indents in C code to 2 or 4!) - Change the docs from that XML-based thing we use (groan) to something nicer, probably involving Markdown. - Website: not sure... a lot of it could be naturally hosted on the main GitHub page. It might be nice to still have valgrind.org, though, but perhaps greatly stripped back. I understand the concerns about GitHub and commercialization, but I also worry about Valgrind's future viability if it doesn't attract some level of new contributors. The Linux kernel will never have that problem, but I suspect lots of GNU projects also face that risk. Nick On Thu, 23 Mar 2023 at 08:19, Paul Floyd <pj...@wa...> wrote: > > > On 22-03-23 21:31, Nicholas Nethercote wrote: > > Thanks to Paul and Mark for a couple of small fixes to my commit. > > > > Yesterday I was idly dreaming about the quality-of-life improvements > > that would be available if Valgrind was hosted on GitHub: > > - ability to upload commits ahead of time, in a fashion nicer than > > "attach patch to bugzilla" > > - ability to do reviews > > - CI support for pre-merge testing runs > > - easier entry for newcomers > > > > Does sourceware.org <http://sourceware.org> have support for any of > > these things? > > Hi > > GH does have a lot going for it, and when they add code browsing it will > be even better. One concern though is their owner and creeping > commercialization. > > As far as I know, a lot of GNU projects and Linux still work mostly by > by patches sent to mailing lists. > > One other fairly common system is phabricator (used by LLVM and FreeBSD > amonst others). However, the company that developed phabricator closed > down so I don't know where that is going (there is a fork). FWIW FreeBSD > is increasingly using GH. > > A+ > Paul > > > > _______________________________________________ > Valgrind-developers mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-developers > |
|
From: Floyd, P. <pj...@wa...> - 2023-03-24 10:55:31
|
On 23/03/2023 00:08, Nicholas Nethercote wrote: > > I understand the concerns about GitHub and commercialization, but I > also worry about Valgrind's future viability if it doesn't attract > some level of new contributors. The Linux kernel will never have that > problem, but I suspect lots of GNU projects also face that risk. I also think that is a big problem. Both for contributing and reviewing code. 'Casual' contributors that have medium to large contributions either never get merged or take ages. We really could do with some experts to help with ARM and the latest Intel/AMD instructions as well. A+ Paul |
|
From: Mark W. <ma...@kl...> - 2023-03-23 08:45:25
|
Hi Nick, On Thu, Mar 23, 2023 at 07:31:47AM +1100, Nicholas Nethercote wrote: > Thanks to Paul and Mark for a couple of small fixes to my commit. Thank you for the big fixes/rewrite! > Yesterday I was idly dreaming about the quality-of-life improvements that > would be available if Valgrind was hosted on GitHub: > - ability to upload commits ahead of time, in a fashion nicer than "attach > patch to bugzilla" > - ability to do reviews > - CI support for pre-merge testing runs > - easier entry for newcomers > > Does sourceware.org have support for any of these things? Yes, sourceware offers patchwork.sourceware.org (in combination with public-inbox support) which can be used for patch tracking and can be configured to do pre-commit CI. And through builder.sourceware.org you can do builds for "try branches". Note that valgrind already is using the sourceware CI for arm64, armhf, i386, ppc64, ppc64le, power10, power9, x86_64 on some gnu/linux distros: https://builder.sourceware.org/buildbot/#/builders?tags=valgrind And there are of course the nightly builders which report to valgrind-testresults: https://sourceforge.net/p/valgrind/mailman/valgrind-testresults/ We have been pondering moving bugzilla and the mailinglists to sourceware so we can have better integration (for example to automatically link bugs, patches and commits). But weren't sure those improvements were enough to "break" old links/habits. Cheers, Mark |
|
From: Nicholas N. <n.n...@gm...> - 2023-03-23 08:51:15
|
Nice, I wasn't aware of any of those facilities. Are they documented anywhere, both their existence and how to use them? I couldn't find anything on valgrind.org about them, but maybe I overlooked something. Nick On Thu, 23 Mar 2023 at 19:45, Mark Wielaard <ma...@kl...> wrote: > Hi Nick, > > On Thu, Mar 23, 2023 at 07:31:47AM +1100, Nicholas Nethercote wrote: > > Thanks to Paul and Mark for a couple of small fixes to my commit. > > Thank you for the big fixes/rewrite! > > > Yesterday I was idly dreaming about the quality-of-life improvements that > > would be available if Valgrind was hosted on GitHub: > > - ability to upload commits ahead of time, in a fashion nicer than > "attach > > patch to bugzilla" > > - ability to do reviews > > - CI support for pre-merge testing runs > > - easier entry for newcomers > > > > Does sourceware.org have support for any of these things? > > Yes, sourceware offers patchwork.sourceware.org (in combination with > public-inbox support) which can be used for patch tracking and can be > configured to do pre-commit CI. And through builder.sourceware.org you > can do builds for "try branches". > > Note that valgrind already is using the sourceware CI for arm64, > armhf, i386, ppc64, ppc64le, power10, power9, x86_64 on some gnu/linux > distros: > https://builder.sourceware.org/buildbot/#/builders?tags=valgrind > > And there are of course the nightly builders which report to > valgrind-testresults: > https://sourceforge.net/p/valgrind/mailman/valgrind-testresults/ > > We have been pondering moving bugzilla and the mailinglists to > sourceware so we can have better integration (for example to > automatically link bugs, patches and commits). But weren't sure those > improvements were enough to "break" old links/habits. > > Cheers, > > Mark > |
|
From: Mark W. <ma...@kl...> - 2023-03-23 09:15:18
|
Hi Nick, On Thu, Mar 23, 2023 at 10:08:51AM +1100, Nicholas Nethercote wrote: > All the Rust projects I work on use GitHub, and in terms of usability and > productivity it's miles ahead of how Valgrind development works. I think github is unacceptable for Free Software projects, which I believe should use free software tools and not a corporate controlled proprietary platform: https://mako.cc/writing/hill-free_tools.html https://giveupgithub.com/ But that doesn't mean I don't agree with you. There are some really nice concepts in these "code forges". And there are free software implementations like https://codeberg.org/ https://docs.pagure.org/pagure/ https://sourcehut.org/ and (self managed) gitlab ce. valgrind already has an official mirror on sourcehut (like all other sourceware projects): https://sr.ht/~sourceware/valgrind/ > If I were king of the world here's how I would drag Valgrind's development > practices forward by 10-20 years. > > - Move the repository to GitHub. Require all changes to be done via pull > requests, with no direct pushing. > - Set up some CI testing via GitHub Actions. Require that all pull > requests pass these tests before merging. > - Lots of projects require a review approval before a pull request can > be merged. But that might be too hard for Valgrind to start with, given the > small number of active contributors. Modulo using github these sound like interesting ideas. Forcing this workflow might be a bit heavy weight. But we could use sourceware try-branches so all commits go through the CI builders. Requiring all passing builds will be a bit of a puzzle since out testsuite(s) aren't zero-fail. > - Switch from KDE bugzilla to GitHub issues for bug reporting. Not sure > what I'd do with existing open bug reports, whether it would be worth > importing them to GitHub issues somehow or not. I would consider moving to sourceware bugzilla, which would make it easier to connect commits/patches with issues. The downside is that all bugs get renumbered. What benefit do you see from GitHub issues over bugzilla? > - Use auto-formatting tools, such as clang-format. (Possibly even moving > from 3 space indents in C code to 2 or 4!) Sourceware builder could help with that, we can setup a CI bot that runs such a formatter over all commits to check formatting. I have seen that work very nicely with the python black formatter. But for C code/clang-format it seems the formatter seems not that good/stable. > - Change the docs from that XML-based thing we use (groan) to something > nicer, probably involving Markdown. Not against, but a lot of work. And with the make check xml linter checks writing new docs has become a lot nicer. We should however setup a buildbot to always create the docs on each commit. > - Website: not sure... a lot of it could be naturally hosted on the main > GitHub page. It might be nice to still have valgrind.org, though, but > perhaps greatly stripped back. What would you propose? The website is in git now, but uses php for a few things. https://sourceware.org/cgit/valgrind-htdocs we could drop the php and replace it with a more simpler markdown based site? Cheers, Mark |
|
From: Nicholas N. <n.n...@gm...> - 2023-03-23 10:51:44
|
On Thu, 23 Mar 2023 at 20:15, Mark Wielaard <ma...@kl...> wrote: > > Modulo using github these sound like interesting ideas. Forcing this > workflow might be a bit heavy weight. But we could use sourceware > try-branches so all commits go through the CI builders. Requiring all > passing builds will be a bit of a puzzle since out testsuite(s) aren't > zero-fail. > I threw a lot of ideas out in my earlier email, but this is the most important one. Graydon Hoare expressed <https://graydon2.dreamwidth.org/1597.html> this years ago as: *The Not Rocket Science Rule Of Software Engineering:* > automatically maintain a repository of code that always passes all the > tests > This requires that all the tests pass before merging a change. Having worked on projects that follow this and projects that don't, I say with confidence that it's a good idea. If we could get that happening for Valgrind, that alone would be a huge improvement over the status quo. I looked at the sites you linked, but couldn't work out much about how they work. W.r.t. failing tests, this would give great incentive to fix currently failing tests (or disable them if they cannot be made reliable) and to keep them passing. Sourceware builder could help with that, we can setup a CI bot that > runs such a formatter over all commits to check formatting. I have > seen that work very nicely with the python black formatter. But for C > code/clang-format it seems the formatter seems not that good/stable. > Mozilla's been using clang-format for Firefox C and C++ code for several years. Nick |
|
From: Floyd, P. <pj...@wa...> - 2023-03-24 10:39:14
|
On 23/03/2023 11:51, Nicholas Nethercote wrote: > > This requires that all the tests pass before merging a change. Having > worked on projects that follow this and projects that don't, I say > with confidence that it's a good idea. If we could get that happening > for Valgrind, that alone would be a huge improvement over the status > quo. I looked at the sites you linked, but couldn't work out much > about how they work. CI tests are good, as long as they aren't too slow or flaky - we do have the occasional test that fails intermittently, but no huge muti-day tests. The problem I see is for the 'other' platforms that CI probably won't cover: x86, FreeBSD, Solaris, macOS. Manually testing on all those is tedious which can result in a bit of git ping-pong to get things like expected filters to converge on working on all platforms. > Mozilla's been using clang-format for Firefox C and C++ code for > several years. I'd be happy to have anything like K&R or Allman formatting - not keen on GNU style. I really do appreciate being able to grep for patterns like "variable = " knowing that there will always be just one space to find all assignments without having to wrestle with RE whitespaces or wade through all uses of 'variable'. And I presume that this can be done with a git hook. A+ Paul |
|
From: Mark W. <ma...@kl...> - 2023-03-24 11:26:00
|
On Thu, 2023-03-23 at 19:50 +1100, Nicholas Nethercote wrote: > Nice, I wasn't aware of any of those facilities. Are they documented > anywhere, both their existence and how to use them? I couldn't find > anything on valgrind.org about them, but maybe I overlooked > something. We aren't (yet?) using all of them (and some of them would mean moving over bugzilla and the mailinglist, which might be controversial). But I'll at least add the buildbot CI testers to the website (and we should at least make use of the try-branches) this weekend. Cheers, Mark |
|
From: Mark W. <ma...@kl...> - 2023-03-24 11:52:56
|
Hi Nick, On Thu, 2023-03-23 at 21:51 +1100, Nicholas Nethercote wrote: > I threw a lot of ideas out in my earlier email, but this is the most > important one. Graydon Hoare expressed this years ago as: > > > The Not Rocket Science Rule Of Software Engineering: > > automatically maintain a repository of code that always passes all > > the tests > > This requires that all the tests pass before merging a change. Having > worked on projects that follow this and projects that don't, I say > with confidence that it's a good idea. If we could get that happening > for Valgrind, that alone would be a huge improvement over the status > quo. I looked at the sites you linked, but couldn't work out much > about how they work. > > W.r.t. failing tests, this would give great incentive to fix > currently failing tests (or disable them if they cannot be made > reliable) and to keep them passing. I completely agree with this sentiment. But how do you get there? And how do you cross the psychological barrier. I mean that it feels like cheating to just disable failing or flaky tests. They might fail on some, but not all setups. Or they might even be just flaky depending on CPU model (I think I saw some failures with an AMD Ryzen processor, which succeeded on an Intel Xeon processor). What should our policy be to get to zero fail? Does that mean a test should always pass on any arch/setup? Or do we make exceptions for tests that fail on some setups? Do we keep an "exception list" based on...? What do we do with the "removed" (or excepted) tests? Do those turn into high priority bugs instead? What about new ports, they often start with a bunch of failing tests. e.g. for x86_64 we do have memcheck/tests/overlap which fails on newer glibc with certain processors where glibc might use an ifunc to point both memcpy and memmove to the same function, which confuses our intercept code. https://bugs.kde.org/show_bug.cgi?id=402833 It works fine on some (older or not x86_64) setups though. I would love to get to "zero fail" but what should we do to get there and what should our policy be to keep it given that we don't fully control our environment and some (new) failures simply come from upgrading glibc or the compiler or even the cpu. Cheers, Mark |