## Re: [Libmesh-devel] Regression in reduced_basis_ex4


[Libmesh-devel] Regression in reduced_basis_ex4 From: Roy Stogner - 2011-12-15 03:38:52
```
We're hitting this with my plain standard --enable-everything build, too:

...

---- Basis dimension: 19 ----
Performing RB solves on training set
Maximum (absolute) error bound is 0.199127

Performing truth solve at parameter:
mu[0] = 0.5
mu[1] = -1

Enriching the RB space
Updating RB matrices

---- Basis dimension: 20 ----
Performing RB solves on training set
Maximum (absolute) error bound is 0.198364

Maximum number of basis functions reached: Nmax = 20.
Perform one more Greedy iteration for error bounds.
Performing truth solve at parameter:
mu[0] = -1
mu[1] = -0.5

Enriching the RB space
Updating RB matrices

---- Basis dimension: 20 ----
Performing RB solves on training set
Maximum (absolute) error bound is 0.198364

Extra Greedy iteration finished.
In RBEvaluation::write_offline_data_to_files, directory eim_data already exists, overwriting contents.
Assertion `theta_q_f[i] != NULL' failed.
[0] src/reduced_basis/rb_theta_expansion.C, line 89, compiled Dec 14 2011 at 20:58:16
terminate called after throwing an instance of 'libMesh::LogicError'
what(): Error in libMesh internal logic
make[2]: *** [run] Aborted
make[2]: Leaving directory `/workspace/buildbot/slave/libmesh-trunk/build/examples/reduced_basis/reduced_basis_ex4'
```

Re: [Libmesh-devel] Regression in reduced_basis_ex4 From: David Knezevic - 2011-12-15 13:37:02
```
hmm, I can't seem to reproduce this error...?

On 12/14/2011 10:38 PM, Roy Stogner wrote:
>
> We're hitting this with my plain standard --enable-everything build,
> too:
>
> ...
>
> ---- Basis dimension: 19 ----
> Performing RB solves on training set
> Maximum (absolute) error bound is 0.199127
>
> Performing truth solve at parameter:
> mu[0] = 0.5
> mu[1] = -1
>
> Enriching the RB space
> Updating RB matrices
>
> ---- Basis dimension: 20 ----
> Performing RB solves on training set
> Maximum (absolute) error bound is 0.198364
>
> Maximum number of basis functions reached: Nmax = 20.
> Perform one more Greedy iteration for error bounds.
> Performing truth solve at parameter:
> mu[0] = -1
> mu[1] = -0.5
>
> Enriching the RB space
> Updating RB matrices
>
> ---- Basis dimension: 20 ----
> Performing RB solves on training set
> Maximum (absolute) error bound is 0.198364
>
> Extra Greedy iteration finished.
> In RBEvaluation::write_offline_data_to_files, directory eim_data
> already exists, overwriting contents.
> Assertion `theta_q_f[i] != NULL' failed.
> [0] src/reduced_basis/rb_theta_expansion.C, line 89, compiled Dec 14
> 2011 at 20:58:16
> terminate called after throwing an instance of 'libMesh::LogicError'
> what(): Error in libMesh internal logic
> make[2]: *** [run] Aborted
> make[2]: Leaving directory
> `/workspace/buildbot/slave/libmesh-trunk/build/examples/reduced_basis/reduced_basis_ex4'
```

Re: [Libmesh-devel] Regression in reduced_basis_ex4 From: Roy Stogner - 2011-12-15 14:36:10
```
I can't seem to reproduce it easily, myself! BuildBot is showing a
failure every time with my default build:

loadmodules intel tbb mpich2/1.2.1 mkl-pecos petsc slepc trilinos glpk vtk &&./configure --enable-everything

but it's showing success with literally every other build I've
configured it to try. They're all being run with
"LIBMESH_RUN='mpirun -np 2'", even.

The failing build and the success build seem to differ starting early:

FAILURE:

---- Basis dimension: 5 ----
Performing RB solves on training set
Maximum (absolute) error bound is 0.946406

Performing truth solve at parameter:
mu[0] = 0
mu[1] = 1

SUCCESS:

---- Basis dimension: 5 ----
Performing RB solves on training set
Maximum (absolute) error bound is 0.946406

Performing truth solve at parameter:
mu[0] = -1
mu[1] = 0

but I've no idea what causes the difference.
---
Roy

On Thu, 15 Dec 2011, David Knezevic wrote:

> hmm, I can't seem to reproduce this error...?
```
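
For anyone wanting to repeat this comparison on longer logs, here is a minimal sketch that finds the first basis dimension at which two greedy runs pick different parameters. It assumes only the log layout quoted in this thread; the function names `greedy_selections` and `first_divergence` are made up for illustration and are not part of libMesh.

```python
import re

def greedy_selections(log_text):
    """Map each basis dimension to the mu values printed after it."""
    selections = {}
    dim = None
    for raw in log_text.splitlines():
        line = raw.strip()
        header = re.match(r"-+ Basis dimension: (\d+) -+", line)
        if header:
            dim = int(header.group(1))
            continue
        mu = re.match(r"mu\[(\d+)\] = (\S+)", line)
        if mu and dim is not None:
            selections.setdefault(dim, []).append(float(mu.group(2)))
    return {d: tuple(values) for d, values in selections.items()}

def first_divergence(log_a, log_b):
    """Return (dimension, mu_a, mu_b) for the first differing pick, else None."""
    a, b = greedy_selections(log_a), greedy_selections(log_b)
    for dim in sorted(set(a) & set(b)):
        if a[dim] != b[dim]:
            return dim, a[dim], b[dim]
    return None

# Excerpts copied from the failing and passing builds quoted above.
FAILING = """
---- Basis dimension: 5 ----
Performing RB solves on training set
Maximum (absolute) error bound is 0.946406

Performing truth solve at parameter:
mu[0] = 0
mu[1] = 1
"""

PASSING = """
---- Basis dimension: 5 ----
Performing RB solves on training set
Maximum (absolute) error bound is 0.946406

Performing truth solve at parameter:
mu[0] = -1
mu[1] = 0
"""

print(first_divergence(FAILING, PASSING))
```

The reported error bound is identical in both excerpts, so the sketch compares only the selected parameters.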

Re: [Libmesh-devel] Regression in reduced_basis_ex4 From: John Peterson - 2011-12-15 15:35:37
```
On Thu, Dec 15, 2011 at 7:35 AM, Roy Stogner wrote:
>
> but I've no idea what causes the difference.

Hmm... just a thought: the RB stuff uses some random number generation stuff.

Perhaps this could explain different greedy parameter selection order
on different systems, but not outright failure?

--
John
```

Re: [Libmesh-devel] Regression in reduced_basis_ex4 From: David Knezevic - 2011-12-15 15:57:23
```
John's right that there are random numbers in general, but in that
example the training sets are not randomly generated.

But the way the algorithm chooses the next parameter is by finding the
one with the maximum error bound, and as you can see this problem has
the same error bound at two different parameter values, so rounding
error would determine which one you end up with. So I think the
"early" difference is not surprising.

I don't see where the NULL pointer is coming from in the failure case
though...

On 12/15/2011 09:35 AM, Roy Stogner wrote:
>
> I can't seem to reproduce it easily, myself! BuildBot is showing a
> failure every time with my default build:
>
> loadmodules intel tbb mpich2/1.2.1 mkl-pecos petsc slepc trilinos glpk
> vtk &&./configure --enable-everything
>
> but it's showing success with literally every other build I've
> configured it to try. They're all being run with "LIBMESH_RUN='mpirun
> -np 2'", even.
>
> The failing build and the success build seem to differ starting early:
>
> FAILURE:
>
> ---- Basis dimension: 5 ----
> Performing RB solves on training set
> Maximum (absolute) error bound is 0.946406
>
> Performing truth solve at parameter:
> mu[0] = 0
> mu[1] = 1
>
> SUCCESS:
>
> ---- Basis dimension: 5 ----
> Performing RB solves on training set
> Maximum (absolute) error bound is 0.946406
>
> Performing truth solve at parameter:
> mu[0] = -1
> mu[1] = 0
>
> but I've no idea what causes the difference.
> ---
> Roy
```
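
To make the tie-breaking argument concrete, here is a small sketch of a greedy arg-max over two error bounds that are equal in exact arithmetic but are accumulated in a different order. Everything in it is hypothetical: the contribution values, the `reverse` switch standing in for "a different build", and the function names are illustration only, not how libMesh computes its error bounds.

```python
import math

def error_bound(contributions, reverse=False):
    """Accumulate contributions in order; `reverse` mimics a build that sums the other way."""
    terms = reversed(contributions) if reverse else contributions
    total = 0.0
    for term in terms:
        total += term
    return total

def greedy_pick(bounds):
    """Index of the largest bound; on an exact tie the earlier index wins."""
    best = 0
    for i in range(1, len(bounds)):
        if bounds[i] > bounds[best]:
            best = i
    return best

# Two training parameters whose bounds agree in exact arithmetic (hypothetical values).
training_set = [(0.0, 1.0), (-1.0, 0.0)]
contributions = [[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]]

bounds_a = [error_bound(c) for c in contributions]                # "build A"
bounds_b = [error_bound(c, reverse=True) for c in contributions]  # "build B"

pick_a = training_set[greedy_pick(bounds_a)]
pick_b = training_set[greedy_pick(bounds_b)]
print(pick_a, pick_b)
```

The two bounds differ only in the last bit, which is far below the six digits printed in the logs, yet that is enough to flip which parameter the greedy step selects.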

Re: [Libmesh-devel] Regression in reduced_basis_ex4 From: Roy Stogner - 2011-12-15 16:13:21
```
On Thu, 15 Dec 2011, John Peterson wrote:

> On Thu, Dec 15, 2011 at 7:35 AM, Roy Stogner wrote:
>>
>> but I've no idea what causes the difference.
>
> Hmm... just a thought: the RB stuff uses some random number generation stuff.
>
> Perhaps this could explain different greedy parameter selection order
> on different systems, but not outright failure?

That could explain why we're triggering failure in some cases but not
in others, though.
---
Roy
```

Re: [Libmesh-devel] Regression in reduced_basis_ex4 From: Cody Permann - 2011-12-15 16:51:51
```
Random number generation has caused us many issues in MOOSE as well.
Long ago we bundled in a free platform-independent random number
generator which resolved all of these issues. I don't know if we'd
want to go to those extremes in the libMesh library but it has worked
well for us.

Cody

On Dec 15, 2011, at 9:13 AM, Roy Stogner wrote:

> On Thu, 15 Dec 2011, John Peterson wrote:
>
>> On Thu, Dec 15, 2011 at 7:35 AM, Roy Stogner wrote:
>>>
>>> but I've no idea what causes the difference.
>>
>> Hmm... just a thought: the RB stuff uses some random number generation stuff.
>>
>> Perhaps this could explain different greedy parameter selection order
>> on different systems, but not outright failure?
>
> That could explain why we're triggering failure in some cases but not
> in others, though.
> ---
> Roy
```
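
The thread does not say which generator MOOSE bundles, so the following is only a sketch of the general idea: a generator defined purely by integer arithmetic gives the same sequence on every platform. It uses the Park-Miller "minimal standard" recurrence as a stand-in; the class `MinStdRand` and the helper `training_set` are hypothetical names, not MOOSE or libMesh API.

```python
class MinStdRand:
    """Park-Miller "minimal standard" generator: x <- 16807 * x mod (2**31 - 1)."""

    MODULUS = 2**31 - 1
    MULTIPLIER = 16807

    def __init__(self, seed=1):
        # The state must lie in [1, MODULUS - 1]; fall back to 1 if the seed reduces to 0.
        self.state = seed % self.MODULUS or 1

    def next_int(self):
        """Advance the state and return it; exact integer arithmetic only."""
        self.state = (self.MULTIPLIER * self.state) % self.MODULUS
        return self.state

    def uniform(self, lo, hi):
        """Map the next integer to a float in [lo, hi)."""
        return lo + (hi - lo) * (self.next_int() / self.MODULUS)

def training_set(n, lo, hi, seed=1):
    """Hypothetical helper: n reproducible sample points in [lo, hi)."""
    rng = MinStdRand(seed)
    return [rng.uniform(lo, hi) for _ in range(n)]

print(training_set(3, -1.0, 1.0))
```

A fixed seed then yields the same training set on every build, which removes the generator as a source of build-to-build differences. It would not remove the rounding-driven tie-breaking David describes.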

Re: [Libmesh-devel] Regression in reduced_basis_ex4 From: Roy Stogner - 2012-01-25 20:59:36
```
This bug has gone away, and I reluctantly have to ask: did it actually
get identified and fixed, or did it just randomly stop manifesting
when other changes were made? ;-)
---
Roy

> On 12/15/2011 09:35 AM, Roy Stogner wrote:
>>
>> I can't seem to reproduce it easily, myself! BuildBot is showing a
>> failure every time with my default build:
>>
>> loadmodules intel tbb mpich2/1.2.1 mkl-pecos petsc slepc trilinos glpk vtk
>> &&./configure --enable-everything
>>
>> but it's showing success with literally every other build I've
>> configured it to try. They're all being run with "LIBMESH_RUN='mpirun -np
>> 2'", even.
>>
>> The failing build and the success build seem to differ starting early:
>>
>> FAILURE:
>>
>> ---- Basis dimension: 5 ----
>> Performing RB solves on training set
>> Maximum (absolute) error bound is 0.946406
>>
>> Performing truth solve at parameter:
>> mu[0] = 0
>> mu[1] = 1
>>
>> SUCCESS:
>>
>> ---- Basis dimension: 5 ----
>> Performing RB solves on training set
>> Maximum (absolute) error bound is 0.946406
>>
>> Performing truth solve at parameter:
>> mu[0] = -1
>> mu[1] = 0
>>
>> but I've no idea what causes the difference.
```

Re: [Libmesh-devel] Regression in reduced_basis_ex4 From: David Knezevic - 2012-01-25 21:16:13
```
OK, interesting. It wasn't identified and fixed as far as I know (I
wasn't able to reproduce the bug on my system)... sounds like it
randomly stopped manifesting!

Dave

On 01/25/2012 03:59 PM, Roy Stogner wrote:
>
> This bug has gone away, and I reluctantly have to ask: did it actually
> get identified and fixed, or did it just randomly stop manifesting
> when other changes were made? ;-)
> ---
> Roy
```