r2596 cause performance issues

2012-04-23
2013-04-11
  • Ruge, Vitalij

    Ruge, Vitalij - 2012-04-23

    Hi,
    my function lose performance after the update from r2594 to r2596:
    I make a cProfile (usediPython Magic Shortcut %prun).

    r2594:
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.055    0.055    0.055    0.055 {_casadi.FX_solve}
            1    0.029    0.029    0.029    0.029 {_casadi.SharedObject_init}
            1    0.003    0.003    0.003    0.003 {_casadi.new_IpoptSolver}
            5    0.001    0.000    0.001    0.000 {_casadi.FX_setInput}
            2    0.000    0.000    0.000    0.000 {numpy.core.multiarray.concatenate}
            1    0.000    0.000    0.089    0.089 OpenSolver.py:149(solver_start)
            2    0.000    0.000    0.000    0.000 {_casadi.OptionsFunctionality_setOption}
            1    0.000    0.000    0.003    0.003 casadi.py:23022(__init__)
            2    0.000    0.000    0.000    0.000 {_casadi.FX_output}
            4    0.000    0.000    0.000    0.000 casadi.py:34(_swig_setattr_nondynamic)
            1    0.000    0.000    0.000    0.000 {_casadi.DMatrix_set}
            1    0.000    0.000    0.000    0.000 {_casadi.FX_evaluate}
            2    0.000    0.000    0.000    0.000 casadi.py:50(_swig_getattr)
            5    0.000    0.000    0.001    0.000 casadi.py:12631(setInput)
            3    0.000    0.000    0.000    0.000 casadi.py:7382(<lambda>)
            1    0.000    0.000    0.000    0.000 {_casadi.FX_input}
            1    0.000    0.000    0.000    0.000 casadi.py:8173(__float__)
            1    0.000    0.000    0.089    0.089 <string>:1(<module>)
            2    0.000    0.000    0.000    0.000 casadi.py:4745(setOption)
            1    0.000    0.000    0.000    0.000 casadi.py:22997(<lambda>)
            1    0.000    0.000    0.029    0.029 casadi.py:4278(init)
            4    0.000    0.000    0.000    0.000 casadi.py:47(_swig_setattr)
            1    0.000    0.000    0.000    0.000 casadi.py:12509(input)
            1    0.000    0.000    0.000    0.000 casadi.py:22994(<lambda>)
            2    0.000    0.000    0.000    0.000 casadi.py:12518(output)
            1    0.000    0.000    0.055    0.055 casadi.py:12330(solve)
            1    0.000    0.000    0.000    0.000 {_casadi.PyDECREFParent}
            1    0.000    0.000    0.000    0.000 casadi.py:7423(toScalar)
            1    0.000    0.000    0.000    0.000 casadi.py:7385(<lambda>)
            1    0.000    0.000    0.000    0.000 casadi.py:8055(__del__)
            1    0.000    0.000    0.000    0.000 casadi.py:12320(evaluate)
            1    0.000    0.000    0.000    0.000 casadi.py:5834(numel)
            1    0.000    0.000    0.000    0.000 casadi.py:7935(set)
            5    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
            1    0.000    0.000    0.000    0.000 {_casadi.DMatrix_toScalar}
            1    0.000    0.000    0.000    0.000 casadi.py:5858(size)
            1    0.000    0.000    0.000    0.000 {_casadi.GenDMatrix_numel}
            1    0.000    0.000    0.000    0.000 {_casadi.GenDMatrix_size}
            2    0.000    0.000    0.000    0.000 {time.time}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

    r2596:
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.877    0.877    0.877    0.877 {_casadi.FX_solve}
            1    0.043    0.043    0.043    0.043 {_casadi.SharedObject_init}
            1    0.003    0.003    0.003    0.003 {_casadi.new_IpoptSolver}
            5    0.001    0.000    0.001    0.000 {_casadi.FX_setInput}
            2    0.000    0.000    0.000    0.000 {numpy.core.multiarray.concatenate}
            1    0.000    0.000    0.925    0.925 OpenSolver.py:149(solver_start)
            2    0.000    0.000    0.000    0.000 {_casadi.FX_output}
            1    0.000    0.000    0.000    0.000 {_casadi.DMatrix_set}
            2    0.000    0.000    0.000    0.000 {_casadi.OptionsFunctionality_setOption}
            1    0.000    0.000    0.003    0.003 casadi.py:23093(__init__)
            4    0.000    0.000    0.000    0.000 casadi.py:34(_swig_setattr_nondynamic)
            1    0.000    0.000    0.000    0.000 {_casadi.FX_evaluate}
            2    0.000    0.000    0.000    0.000 casadi.py:50(_swig_getattr)
            5    0.000    0.000    0.001    0.000 casadi.py:12675(setInput)
            4    0.000    0.000    0.000    0.000 casadi.py:47(_swig_setattr)
            1    0.000    0.000    0.000    0.000 {_casadi.FX_input}
            2    0.000    0.000    0.000    0.000 casadi.py:12562(output)
            1    0.000    0.000    0.000    0.000 casadi.py:8217(__float__)
            1    0.000    0.000    0.877    0.877 casadi.py:12374(solve)
            1    0.000    0.000    0.043    0.043 casadi.py:4278(init)
            1    0.000    0.000    0.000    0.000 casadi.py:7979(set)
            1    0.000    0.000    0.925    0.925 <string>:1(<module>)
            2    0.000    0.000    0.000    0.000 casadi.py:4789(setOption)
            1    0.000    0.000    0.000    0.000 casadi.py:8099(__del__)
            1    0.000    0.000    0.000    0.000 casadi.py:12364(evaluate)
            5    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
            1    0.000    0.000    0.000    0.000 {_casadi.PyDECREFParent}
            1    0.000    0.000    0.000    0.000 casadi.py:5902(size)
            1    0.000    0.000    0.000    0.000 casadi.py:7429(<lambda>)
            3    0.000    0.000    0.000    0.000 casadi.py:7426(<lambda>)
            1    0.000    0.000    0.000    0.000 {_casadi.GenDMatrix_numel}
            1    0.000    0.000    0.000    0.000 casadi.py:23068(<lambda>)
            2    0.000    0.000    0.000    0.000 {time.time}
            1    0.000    0.000    0.000    0.000 {_casadi.DMatrix_toScalar}
            1    0.000    0.000    0.000    0.000 casadi.py:7467(toScalar)
            1    0.000    0.000    0.000    0.000 casadi.py:5878(numel)
            1    0.000    0.000    0.000    0.000 casadi.py:12553(input)
            1    0.000    0.000    0.000    0.000 casadi.py:23065(<lambda>)
            1    0.000    0.000    0.000    0.000 {_casadi.GenDMatrix_size}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

    -> {_casadi.FX_solve}, {_casadi.SharedObject_init} is now much slower.

    In my function:
    -Continuity constraints  as SXFunction
    -Objective function as SXFunction
    -Solver options:
    -solver.setOption("max_iter",3e3)
    -solver.setOption("generate_hessian",True)

    Best
    Vitalij

     
  • Joel Andersson

    Joel Andersson - 2012-04-23

    Hello Vitalij!

    I had a looked at the r2594-r2596 commits but couldn't find anything which would explain a slowdown. A few commits later, there were indeed some stuff that could have caused the slowdown during the initialization that you describe (a new way of calculating sparsity patterns for Jacobians of SXFunction.

    Did you also check with the latest version of CasADi? Does the problem still remain? If so, can please you check again exactly which commit caused the slowdown.

    Best,
    Joel

     
  • Ruge, Vitalij

    Ruge, Vitalij - 2012-04-23

    Hello Joel,
    Thank you so much for your quick response!

    In fact, you are right it's update from r2599 to r2601, that cause the issues above.
    With the latest version it's the same as in r2601.

    Best,
    Vitalij

     
  • Joel Andersson

    Joel Andersson - 2012-04-23

    Hello again,

    I see. This is quite worrying. Since you are using IPOPT, did you see any difference in iterations due to the commit? And from the IPOPT header, did the reported number of nonzeros for the Jacobian and Hessian change? You don't happen to have a minimal failing example?

    Best,
    Joel

     
  • Joel Andersson

    Joel Andersson - 2012-04-23

    Hello again,

    I have created a ticket for this: https://sourceforge.net/apps/trac/casadi/ticket/374 and also commited a tentative fix that might solve the problem. When running CasADi's testsuite after the commit, I saw a very small speed-up, so it might have resolved the issue. If so, please confirm this.

    Best,
    Joel

     
  • Ruge, Vitalij

    Ruge, Vitalij - 2012-04-23

    r2612:

    This is Ipopt version 3.10.2, running with linear solver ma27.

    Number of nonzeros in equality constraint Jacobian…:    16770
    Number of nonzeros in inequality constraint Jacobian.:        0
    Number of nonzeros in Lagrangian Hessian………….:      396

    Total number of variables……………………….:      693
                         variables with only lower bounds:        0
                    variables with lower and upper bounds:      693
                         variables with only upper bounds:        0
    Total number of equality constraints……………..:      594
    Total number of inequality constraints……………:        0
            inequality constraints with only lower bounds:        0
       inequality constraints with lower and upper bounds:        0
            inequality constraints with only upper bounds:        0

    iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
       0 -9.9999900e-03 4.06e+02 1.01e-04  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
       1 -1.1080083e-02 3.91e+00 3.27e+02  -1.0 9.90e-01    -  1.10e-02 9.90e-01f  1
       2 -5.8257565e-02 3.65e-03 8.08e+02  -1.0 8.46e-02   4.0 9.93e-01 1.00e+00f  1
       3 -2.9243043e-01 1.07e-01 7.95e+02  -1.0 4.57e-01    -  1.86e-01 1.00e+00f  1
       4 -2.6309905e-01 4.36e-05 9.78e+01  -1.0 2.93e-02   3.5 1.00e+00 1.00e+00h  1
       5 -2.8590562e-01 1.94e-02 8.82e+01  -1.0 2.95e-01    -  8.57e-01 1.00e+00f  1
       6 -4.4472845e-01 3.57e-01 1.33e+02  -1.0 8.83e-01    -  3.57e-01 1.00e+00f  1
       7 -4.1779514e-01 9.57e-05 2.99e+01  -1.0 2.69e-02   3.0 1.00e+00 1.00e+00h  1
       8 -4.1804770e-01 3.51e-05 3.29e+00  -1.0 8.44e-03   2.6 1.00e+00 1.00e+00f  1
       9 -4.1885148e-01 3.22e-04 3.16e+00  -1.0 2.56e-02   2.1 1.00e+00 1.00e+00f  1
    iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
      10 -4.2113912e-01 2.52e-03 2.94e+00  -1.0 7.16e-02   1.6 1.00e+00 1.00e+00f  1
      11 -4.2710544e-01 1.60e-02 2.45e+00  -1.0 1.80e-01   1.1 1.00e+00 1.00e+00f  1
      12 -4.4003391e-01 6.99e-02 1.65e+00  -1.0 3.78e-01   0.7 1.00e+00 1.00e+00f  1
      13 -5.3237768e-01 1.21e+00 5.46e+00  -1.0 1.59e+00    -  8.11e-01 1.00e+00f  1
      14 -4.8344532e-01 2.56e-01 3.15e-01  -1.0 7.53e-01    -  1.00e+00 1.00e+00f  1
      15 -4.7883473e-01 6.90e-03 1.65e-02  -1.7 1.21e-01    -  1.00e+00 1.00e+00h  1
      16 -4.8530914e-01 5.18e-03 2.93e-01  -3.8 2.21e-01    -  9.17e-01 1.00e+00h  1
      17 -5.4450802e-01 6.91e-01 4.80e-02  -3.8 1.31e+00    -  8.14e-01 1.00e+00h  1
      18 -5.7385118e-01 4.90e-01 1.07e-03  -3.8 1.01e+00    -  1.00e+00 1.00e+00h  1
      19 -5.6976904e-01 3.72e-02 2.15e-04  -3.8 4.29e-01    -  1.00e+00 1.00e+00h  1
    iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
      20 -5.6920622e-01 3.12e-04 1.59e-06  -3.8 8.05e-02    -  1.00e+00 1.00e+00h  1
      21 -5.7496313e-01 8.41e-02 4.58e-03  -5.7 4.64e-01    -  9.06e-01 1.00e+00h  1
      22 -5.7374609e-01 1.04e-02 4.12e-05  -5.7 3.03e-01    -  1.00e+00 1.00e+00h  1
      23 -5.7352651e-01 2.57e-04 4.44e-06  -5.7 1.12e-01    -  1.00e+00 1.00e+00h  1
      24 -5.7352367e-01 1.21e-05 2.00e-07  -5.7 2.23e-02    -  1.00e+00 1.00e+00h  1
      25 -5.7353537e-01 6.22e-05 9.49e-06  -8.6 3.77e-02    -  9.94e-01 1.00e+00h  1
      26 -5.7353414e-01 2.47e-07 4.86e-09  -8.6 3.53e-03    -  1.00e+00 1.00e+00h  1
      27 -5.7353414e-01 2.20e-11 4.64e-13  -8.6 3.39e-05    -  1.00e+00 1.00e+00h  1

    Number of Iterations….: 27

                                       (scaled)                 (unscaled)
    Objective……………:  -5.7353413769090589e-01   -5.7353413769090589e-01
    Dual infeasibility……:   4.6395371261802831e-13    4.6395371261802831e-13
    Constraint violation….:   5.3690647535438870e-12    2.2002399902021352e-11
    Complementarity………:   2.5060328417816396e-09    2.5060328417816396e-09
    Overall NLP error…….:   2.5060328417816396e-09    2.5060328417816396e-09

    Number of objective function evaluations             = 28
    Number of objective gradient evaluations             = 28
    Number of equality constraint evaluations            = 28
    Number of inequality constraint evaluations          = 0
    Number of equality constraint Jacobian evaluations   = 28
    Number of inequality constraint Jacobian evaluations = 0
    Number of Lagrangian Hessian evaluations             = 27
    Total CPU secs in IPOPT (w/o function evaluations)   =      0.812
    Total CPU secs in NLP function evaluations           =      0.024

    EXIT: Optimal Solution Found.
    time spent in eval_f: 0 s.
    time spent in eval_grad_f: 0 s.
    time spent in eval_g: 0 s.
    time spent in eval_jac_g: 0 s.
    time spent in eval_h: 0 s.
    time spent in callback function: 0 s.
    time spent in callback preparation: 0 s.
    ########
    Solution: -0.573534
    Solver time: 0.853206

    r2599:
    ******************************************************************************
    This program contains Ipopt, a library for large-scale nonlinear optimization.
    Ipopt is released as open source code under the Eclipse Public License (EPL).
             For more information visit http://projects.coin-or.org/Ipopt
    ******************************************************************************

    This is Ipopt version 3.10.2, running with linear solver ma27.

    Number of nonzeros in equality constraint Jacobian…:     3261
    Number of nonzeros in inequality constraint Jacobian.:        0
    Number of nonzeros in Lagrangian Hessian………….:      396

    Total number of variables……………………….:      693
                         variables with only lower bounds:        0
                    variables with lower and upper bounds:      693
                         variables with only upper bounds:        0
    Total number of equality constraints……………..:      594
    Total number of inequality constraints……………:        0
            inequality constraints with only lower bounds:        0
       inequality constraints with lower and upper bounds:        0
            inequality constraints with only upper bounds:        0

    iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
       0 -9.9999900e-03 4.06e+02 1.01e-04  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
       1 -1.1080083e-02 3.91e+00 3.27e+02  -1.0 9.90e-01    -  1.10e-02 9.90e-01f  1
       2 -5.8257565e-02 3.65e-03 8.08e+02  -1.0 8.46e-02   4.0 9.93e-01 1.00e+00f  1
       3 -2.9243043e-01 1.07e-01 7.95e+02  -1.0 4.57e-01    -  1.86e-01 1.00e+00f  1
       4 -2.6309905e-01 4.36e-05 9.78e+01  -1.0 2.93e-02   3.5 1.00e+00 1.00e+00h  1
       5 -2.8590562e-01 1.94e-02 8.82e+01  -1.0 2.95e-01    -  8.57e-01 1.00e+00f  1
       6 -4.4472845e-01 3.57e-01 1.33e+02  -1.0 8.83e-01    -  3.57e-01 1.00e+00f  1
       7 -4.1779514e-01 9.57e-05 2.99e+01  -1.0 2.69e-02   3.0 1.00e+00 1.00e+00h  1
       8 -4.1804770e-01 3.51e-05 3.29e+00  -1.0 8.44e-03   2.6 1.00e+00 1.00e+00f  1
       9 -4.1885148e-01 3.22e-04 3.16e+00  -1.0 2.56e-02   2.1 1.00e+00 1.00e+00f  1
    iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
      10 -4.2113912e-01 2.52e-03 2.94e+00  -1.0 7.16e-02   1.6 1.00e+00 1.00e+00f  1
      11 -4.2710544e-01 1.60e-02 2.45e+00  -1.0 1.80e-01   1.1 1.00e+00 1.00e+00f  1
      12 -4.4003391e-01 6.99e-02 1.65e+00  -1.0 3.78e-01   0.7 1.00e+00 1.00e+00f  1
      13 -5.3237768e-01 1.21e+00 5.46e+00  -1.0 1.59e+00    -  8.11e-01 1.00e+00f  1
      14 -4.8344532e-01 2.56e-01 3.15e-01  -1.0 7.53e-01    -  1.00e+00 1.00e+00f  1
      15 -4.7883473e-01 6.90e-03 1.65e-02  -1.7 1.21e-01    -  1.00e+00 1.00e+00h  1
      16 -4.8530914e-01 5.18e-03 2.93e-01  -3.8 2.21e-01    -  9.17e-01 1.00e+00h  1
      17 -5.4450802e-01 6.91e-01 4.80e-02  -3.8 1.31e+00    -  8.14e-01 1.00e+00h  1
      18 -5.7385118e-01 4.90e-01 1.07e-03  -3.8 1.01e+00    -  1.00e+00 1.00e+00h  1
      19 -5.6976904e-01 3.72e-02 2.15e-04  -3.8 4.29e-01    -  1.00e+00 1.00e+00h  1
    iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
      20 -5.6920622e-01 3.12e-04 1.59e-06  -3.8 8.05e-02    -  1.00e+00 1.00e+00h  1
      21 -5.7496313e-01 8.41e-02 4.58e-03  -5.7 4.64e-01    -  9.06e-01 1.00e+00h  1
      22 -5.7374609e-01 1.04e-02 4.12e-05  -5.7 3.03e-01    -  1.00e+00 1.00e+00h  1
      23 -5.7352651e-01 2.57e-04 4.44e-06  -5.7 1.12e-01    -  1.00e+00 1.00e+00h  1
      24 -5.7352367e-01 1.21e-05 2.00e-07  -5.7 2.23e-02    -  1.00e+00 1.00e+00h  1
      25 -5.7353537e-01 6.22e-05 9.49e-06  -8.6 3.77e-02    -  9.94e-01 1.00e+00h  1
      26 -5.7353414e-01 2.47e-07 4.86e-09  -8.6 3.53e-03    -  1.00e+00 1.00e+00h  1
      27 -5.7353414e-01 2.20e-11 4.64e-13  -8.6 3.39e-05    -  1.00e+00 1.00e+00h  1

    Number of Iterations….: 27

                                       (scaled)                 (unscaled)
    Objective……………:  -5.7353413769090467e-01   -5.7353413769090467e-01
    Dual infeasibility……:   4.6393983494974810e-13    4.6393983494974810e-13
    Constraint violation….:   5.3681436303143958e-12    2.1998625143737627e-11
    Complementarity………:   2.5060328417816566e-09    2.5060328417816566e-09
    Overall NLP error…….:   2.5060328417816566e-09    2.5060328417816566e-09

    Number of objective function evaluations             = 28
    Number of objective gradient evaluations             = 28
    Number of equality constraint evaluations            = 28
    Number of inequality constraint evaluations          = 0
    Number of equality constraint Jacobian evaluations   = 28
    Number of inequality constraint Jacobian evaluations = 0
    Number of Lagrangian Hessian evaluations             = 27
    Total CPU secs in IPOPT (w/o function evaluations)   =      0.036
    Total CPU secs in NLP function evaluations           =      0.016

    EXIT: Optimal Solution Found.
    time spent in eval_f: 0 s.
    time spent in eval_grad_f: 0.01 s.
    time spent in eval_g: 0 s.
    time spent in eval_jac_g: 0 s.
    time spent in eval_h: 0.01 s.
    time spent in callback function: 0 s.
    time spent in callback preparation: 0 s.
    Solution: -0.573534
    Solver time: 0.055137

     
  • Joel Andersson

    Joel Andersson - 2012-04-23

    Ah, I see. Please check again after updating. I've commited a fix.

     
  • Ruge, Vitalij

    Ruge, Vitalij - 2012-04-23

    r2599:
    Number of nonzeros in equality constraint Jacobian…:    11991

    r2016:
    Number of nonzeros in equality constraint Jacobian…:   286917

    r2017:
    Number of nonzeros in equality constraint Jacobian…:   286917

    (Example ist in ticket)

     
  • Joel Andersson

    Joel Andersson - 2012-04-23

    Thanks. I've committed another fix.

     
  • Ruge, Vitalij

    Ruge, Vitalij - 2012-04-23

    work very well!
    thank you very much!

    Best,
    Vitalij

     
  • Ruge, Vitalij

    Ruge, Vitalij - 2012-04-23

    work very well!
    thank you very much!

    Best,
    Vitalij

     
  • Joel Andersson

    Joel Andersson - 2012-04-23

    Ok! Thank you for your help Vitalij, and sorry for screwing up the trunk.

    Best,
    Joel

     

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:





No, thanks