
Convergence #2: time steps

Note: This post should probably be merged with the Wiki, in particular the page on the time-dependent Schrödinger equation. The second part should be easy, but the first part takes a rather different angle, so I first post the whole thing here.

This document describes how to propagate a wavefunction in time. The questions to be answered are:

  • What time step should you use?
  • How can you monitor during a propagation that your chosen time step is ok?
  • What are advantages and disadvantages of different propagation schemes and when should you choose which?

To restrict the scope, I mostly skip convergence issues with respect to the grid. So either you have already converged the grid, or I simply take the Hamiltonian plus the numerical errors from the finite grid as The System and disregard deviations from reality until they come back for revenge.

Furthermore, I will only consider time-independent systems here. Basically, if you can converge a time-independent system, you can also converge a time-dependent system: Guess a slightly smaller time step and adjust until you are done. However, introducing the required formalities for proper reasoning (cavity-dressed states etc.) would bloat this already long description even more.

Some theory

Model system

As a model system throughout the text, I take one of the Wavepacket demos as blueprint. The demo, HarmOscillator/Gaussian_1D/1, describes a one-dimensional harmonic oscillator (omega = m = 1 a.u.), where the initial state is a displaced Gaussian wavepacket.
The native time scale is the classical oscillation period, which is 2pi a.u. for the given parameters. Apart from an offset of 1/2, the eigenenergies are spaced by the energy omega, hence the time scale of the n-th quantum eigenstate is approximately the n-th fraction of the classical oscillation period.

I keep the total simulation time to about one classical period. Longer times would make the simulations slightly slower, but do not contribute qualitatively new features.

Figure 1: Projection of the initial state onto the energy eigenstates as a function of the eigenstate's energy. Red crosses: 128 grid points, interval [-10,10] a.u. Blue dots: 512 points, interval [-20,20] a.u.

The first figure shows the projection of the initial state onto the eigenstates of the harmonic oscillator, that is, their populations, as red crosses. The Python script that generated this plot can be found here. The populations decay roughly exponentially down to about 10^-9 at an energy of 50 a.u. and remain on this level. This latter feature is not what we would expect from an analytic solution; instead, the populations should decrease monotonically with energy. In fact, this feature is an artefact of the finite grid, which is particularly pronounced for high energies. To demonstrate this, we can use a brute-force approach and double the size and density of the grid (increasing the number of points fourfold). In this case (blue dots), the populations decrease further as expected.
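If you want to reproduce something like figure 1 without the linked script, the following is a minimal sketch of the idea: discretize the Hamiltonian on the grid, diagonalize it, and project the initial state onto the eigenvectors. The finite-difference kinetic energy and the displacement of the Gaussian are illustrative assumptions, not the exact settings of the Wavepacket demo.

```python
import numpy as np

# Grid matching the red crosses in figure 1; m = omega = 1 a.u. as in the demo.
n_points, x = 128, np.linspace(-10.0, 10.0, 128)
dx = x[1] - x[0]

# Crude finite-difference kinetic energy (the demo uses an FFT-based FBR instead,
# so the high-energy artefacts differ in detail, but the qualitative picture is similar).
T = -0.5 * (np.diag(np.ones(n_points - 1), -1)
            - 2.0 * np.eye(n_points)
            + np.diag(np.ones(n_points - 1), 1)) / dx**2
H = T + np.diag(0.5 * x**2)

energies, states = np.linalg.eigh(H)          # eigenstates of the discretized Hamiltonian

# Displaced Gaussian initial state; the displacement of 3 a.u. is an assumption.
psi0 = np.exp(-0.5 * (x - 3.0)**2)
psi0 /= np.sqrt(np.sum(psi0**2) * dx)

# Populations |<phi_n|psi0>|^2 as plotted in figure 1.
populations = np.abs(states.T @ psi0)**2 * dx
```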

This observation highlights two issues that we will encounter again further below:

  1. We have an unnaturally high population of high-energy states. This will cause trouble during propagation.
  2. We can safely truncate the energies of the Hamiltonian at 50 a.u. This will perturb all higher energy states, but they are garbage anyway.

As a final remark, this system lacks some of the real-world complications for simplicity. The Hamiltonian is Hermitian and time-independent, and there are neither open-system terms nor absorbing boundary conditions. We will briefly touch on some of these complications later when discussing different solution schemes.

Finding a good time step and monitoring convergence

A first intuitive approach to determine a good time step is based on a qualitative understanding of the system dynamics. If you plot the spectrum of your initial state, as in figure 1, the fastest dynamics can be expected to come from the populated eigenstates with the highest energies. So if you choose a time step that is considerably smaller than the corresponding oscillation period, say one tenth of it, you would expect the simulation to converge. For the harmonic oscillator example, you could generously say that states above 50 a.u. have negligible population. An energy of 50 a.u. corresponds to an oscillation time of 2pi / 50 a.u. ~= 0.1 a.u. Taking one tenth of that, you could reasonably guess that a time step of 0.01 a.u. should be good enough for reasonably accurate numerical schemes.

This approach is certainly useful for getting a good order-of-magnitude guess; however, it is not correct, as we will see when we examine our example system in the next section.

Another standard approach is to try out different parameters, here time steps, and to check how much the results differ as a function of the time step. The differences should become smaller as we go to smaller time steps, i.e., more accurate solutions. Hence, we only need to set some target for when we are satisfied, and run a couple of calculations.

As a simple measure for convergence, we can take the induced metric of the L2 scalar product. If v_1 and v_2 are two results with different time steps, then their distance is

\| v_1 - v_2 \| = \sqrt{ \langle v_1 - v_2 | v_1 - v_2 \rangle } = \sqrt{ \langle v_1 | v_1 \rangle - \langle v_1 | v_2 \rangle - \langle v_2 | v_1 \rangle + \langle v_2 | v_2 \rangle } = \sqrt{2} \sqrt{ 1 - \mathrm{Re} \langle v_1 | v_2 \rangle } \approx \sqrt{ 1 - \mathrm{Re} \langle v_1 | v_2 \rangle }

Here, I assume that the norm of the states is approximately one. Since only relative quantities and order-of-magnitude estimates are important here, I also dropped the prefactor sqrt{2} for simplicity.
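In code, this convergence measure is a one-liner. A minimal sketch, assuming both results live on the same uniform grid with spacing dx and are approximately normalized:

```python
import numpy as np

def convergence_distance(psi_a, psi_b, dx):
    """Order-of-magnitude distance sqrt(1 - Re<psi_a|psi_b>) between two results.

    The prefactor sqrt(2) of the exact L2 distance is dropped, as in the text.
    """
    overlap = np.sum(np.conj(psi_a) * psi_b) * dx
    return np.sqrt(max(1.0 - overlap.real, 0.0))
```

You would typically propagate with time steps dt and dt/2 and accept dt once this distance drops below your target.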

However, there is also a much simpler measure available. For this, we first note that the norm of a wavefunction is in general constant over time. This is a direct consequence of the time evolution operator exp(-iHt) being unitary. All numerical schemes formally produce approximations to the time evolution operator, and unless they are carefully crafted, these approximations are not unitary. Hence, most numerical schemes fail to preserve the norm exactly, and we can easily find an acceptable time step by checking that the norm of our wavepacket does not change.

However, like all good ideas, this one has certain cases where it fails. In particular, as soon as we use absorbing boundary conditions, the norm ceases to be a conserved quantity.
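For completeness, the norm check itself is even simpler; a minimal sketch, again assuming a uniform grid with spacing dx:

```python
import numpy as np

def norm_drift(psi, dx):
    """Deviation of the wavefunction norm from 1; a cheap convergence monitor.

    Only meaningful without absorbing boundary conditions or other
    norm-changing terms, as noted above.
    """
    return abs(np.sum(np.abs(psi)**2) * dx - 1.0)
```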

Application to model system

Now let us see what happens when we play around with the harmonic oscillator model system. Figure 2 shows the norm of the wavefunction as a function of time for different time steps (the Python script can be found here). We used essentially a fifth-order Runge-Kutta solver to evolve the wavefunction in time.

Figure 2: Norm of the wavepacket as a function of time for different time steps.

Recall that we estimated a good time step as 0.01 a.u. For a too large time step of 0.1 a.u., the norm diverges essentially immediately, showing the immediate usefulness of the norm as a monitoring tool. However, even for our estimated time step of 0.01 a.u., the norm diverges rapidly at some point, while a slightly smaller time step of 0.005 a.u. shows no divergence.

So, what has happened here?


Figure 3: Projection of the time-dependent state onto eigenstates of the Hamiltonian, similar to figure 1, for different times. Left plot: dt = 0.01 a.u. Right plot: dt = 0.005 a.u.

To gain some further insight, we calculate spectra similar to figure 1, but at different times. This is done in figure 3 for the two relevant time steps of 0.01 a.u. and 0.005 a.u. (Python script here).

These spectra tell a story. Apparently, the numerical errors that lead to the norm deviation accumulate not in the energy range where most of the wavepacket resides (< 50 a.u.), but at the highest energies. Consider an eigenstate with a given energy: we approximate its correct time evolution, exp(-iEt), by discrete time steps, which introduces an error. An example is shown below for the exceedingly bad but simple Euler method; fifth-order Runge-Kutta methods behave qualitatively similarly, although their error is much smaller. In particular, you need to sample one oscillation with multiple steps, say 5 or so; otherwise you cannot hope to describe the dynamics accurately.

Figure 3b: Approximation of the correct time evolution (black circle) with the Euler method for different time steps, given as fractions of the oscillation time T.

Now for the highest-energy states, we have E ≈ 200 a.u. and hence an oscillation period of about 0.03 a.u. Such an oscillation is poorly sampled by the time step of 0.01 a.u., leading to a rather large error in the norm. With the time step half as large, we sample one oscillation with about 6 discrete steps, which is just enough for reasonably accurate results. Note that if you look closely, the highest energies also show a norm increase for dt = 0.005 a.u., but a much smaller one, maybe one order of magnitude every 10 a.u. Hence, if we simulated much longer, say for 100 a.u., the norm would also diverge for the smaller time step.
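To see the mechanism in isolation, you can play the same game for a single eigenstate. The sketch below uses the Euler method from figure 3b because its error is easy to write down; the fifth-order Runge-Kutta solver used above behaves qualitatively similarly, just with a much smaller per-step error.

```python
import numpy as np

# Exact evolution of an eigenstate with energy E is exp(-i E t): the norm stays 1.
# An Euler step replaces this with c -> (1 - i E dt) c, which multiplies the norm
# by sqrt(1 + (E dt)^2) per step; for poorly sampled oscillations this blows up fast.
E = 200.0                              # a.u., roughly the top of the grid spectrum in figure 3
for dt in (0.01, 0.005):
    steps = int(round(1.0 / dt))       # propagate for 1 a.u.
    c = 1.0 + 0.0j
    for _ in range(steps):
        c = (1.0 - 1j * E * dt) * c    # Euler step for i dc/dt = E c
    print(f"dt = {dt} a.u.: |c| after 1 a.u. = {abs(c):.3g}")
```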

Corollary

The main result from trying out the theory can be summarized as follows:

It is not the energy or dynamics of your initial state that dictates the required time step, but the whole spectrum of your system.

This observation has an important consequence: If you are interested in performance, that is, large time steps, you should always truncate the spectrum of your Hamiltonian or its components! There are two reasons why truncating operators should not bother you too much:

  1. On the purely numerical side, these high-energy states are often garbage anyway, as discussed in figure 1. Hence, there is little harm done if you screw them up completely by truncating the operators.
  2. On the physics side, you need to recall that most model systems are approximations. For our harmonic oscillator example, we observed severe numerical errors at energies larger than E = 50 a.u., which corresponds to roughly the fiftieth harmonic oscillator eigenstate. Recall that harmonic oscillators are low-order Taylor expansions of the true potential. So whatever the real system is, we can safely assume that, say, its fiftieth excited state is not going to look like a harmonic oscillator eigenstate.

    No matter how accurately you model your system, you will always have such severe approximations. Maybe you use the Born-Oppenheimer approximation and ignore non-adiabatic couplings: This approximation fails for large energies. Maybe you use a model for your thermal bath: Rest assured that your model is not quantitatively correct at high energies.

So: always know and pick the energy range that you consider important. Then be brave and truncate the spectrum of your operators (typically kinetic energy and potential) outside of this range, knowing full well that you limit the validity of your calculations to sufficiently low energies. What you lose in correctness, you will definitely gain in computation speed and numerical robustness.

Figure 4: Norm as a function of time for different time steps. The kinetic and potential energies have each been truncated at 50 a.u., that is, values larger than 50 a.u. have been set to exactly 50 a.u. in the corresponding DVR/FBR grid representations.

For the model system, figure 4 shows the effect of truncating the kinetic energy and potential to 50 a.u. each (Python script here).
We can clearly see that the norm no longer diverges for the time step of 0.01 a.u., giving us a factor of 2 decrease in computation cost.
In practice we might truncate at considerably lower energies (20 a.u. or so); here we chose 50 a.u. because that is where the obvious numerical artefacts start.
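The truncation itself is trivial to implement, since both operators are diagonal in their respective representations. A minimal sketch, with function and cutoff names chosen for illustration:

```python
import numpy as np

def truncate(values, e_max=50.0):
    """Clip a diagonal operator at e_max: values above e_max are set to exactly e_max."""
    return np.minimum(values, e_max)

# Potential on the position (DVR) grid, kinetic energy on the momentum (FBR) grid.
x = np.linspace(-10.0, 10.0, 128, endpoint=False)
dx = x[1] - x[0]
V = truncate(0.5 * x**2)
k = 2.0 * np.pi * np.fft.fftfreq(x.size, d=dx)
T = truncate(0.5 * k**2)
```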

Solution schemes

To solve the time-dependent Schroedinger equation or the Liouville-von Neumann equation, different methods or method classes exist. In the following, we present some important solvers and discuss their advantages and disadvantages.

Runge-Kutta and other ODE solvers

The basic idea for a Runge-Kutta or similar scheme is to expand the time-dependent wavefunction in some orthonormal basis

\psi(t) = \sum_n c_n(t) \varphi_n

Note that the grid representation is such an expansion with the coefficients c_n(t) being the values of the wavefunction at the grid points psi(x_n, t). If we plug this expansion into the time-dependent Schroedinger equation, we get

i \dot c_n(t) = \sum_m \langle \varphi_n | \hat H | \varphi_m \rangle c_m(t)

The latter is a set of ordinary differential equations (ODEs) coupled by the matrix elements of the Hamiltonian.

There is a long history of numerical approaches to solve ODEs, and many solvers exist. Prominent general-purpose solvers are the Runge-Kutta schemes, the Bulirsch-Stoer method or various multistep methods.

As a particular case, there are also adaptive solvers, for example adaptive Runge-Kutta methods. Here, a Runge-Kutta method of order N gives you the solution for a given time step, then a second, embedded method of order N+1 checks whether the solution looks reasonably converged. If the error estimate is outside of a given bound, the solution is recalculated with a smaller time step. If, on the other hand, the solution seems extremely well converged, the time step can be increased instead.

Typical parameters that you supply are the order N of the scheme and the time step \Delta t. Alternatively, adaptive methods require an error bound, usually consisting of a relative and an absolute error. Generally, for a (Runge-Kutta) scheme of order N, the error per time step scales as O(\Delta t^{N+1}).
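As an illustration of this workflow, here is a minimal sketch using SciPy's adaptive RK45 solver on the harmonic oscillator model. The finite-difference kinetic energy, the displacement of the Gaussian, and the tolerances are assumptions for the sketch, not the settings of the linked Wavepacket scripts.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Harmonic oscillator (m = omega = 1 a.u.) on a 128-point grid with a crude
# finite-difference kinetic energy.
n, x = 128, np.linspace(-10.0, 10.0, 128)
dx = x[1] - x[0]
T = -0.5 * (np.diag(np.ones(n - 1), -1) - 2.0 * np.eye(n)
            + np.diag(np.ones(n - 1), 1)) / dx**2
H = T + np.diag(0.5 * x**2)

# Displaced Gaussian initial state (displacement assumed).
psi0 = np.exp(-0.5 * (x - 3.0)**2).astype(complex)
psi0 /= np.sqrt(np.sum(np.abs(psi0)**2) * dx)

def rhs(t, psi):
    # i d/dt psi = H psi, written as a set of coupled ODEs for the coefficients.
    return -1j * (H @ psi)

# Adaptive embedded Runge-Kutta pair; only error bounds are supplied, not a time step.
sol = solve_ivp(rhs, (0.0, 2.0 * np.pi), psi0, method="RK45", rtol=1e-8, atol=1e-10)
print(np.sum(np.abs(sol.y[:, -1])**2) * dx)   # norm after one classical period
```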

Advantages

ODE solvers are incredibly robust workhorses. They work with time-independent problems, time-dependent problems, complex Hamiltonians (absorbing boundary conditions), open systems, ... Just throw a problem at them, and you get a reasonable solution.

If you take an adaptive solver, you do not even need to bother with the timestep directly. Just supply an error bound, and the adaptive solver spits out a reasonable result.

In short: ODE solvers tend to always work, and they are very easy to use.

Disadvantages

ODE solvers do not use any knowledge about your system or about quantum mechanics in general. This is a boon because it makes them robust, but also a bane because it makes them the slowest of all solvers. There are also some cases where convergence is poor, which again translates into small time steps and low performance.

In short: If computation time is an issue, you should generally avoid ODE solvers.

Split-operator method

The basic idea of the split-operator method is to separate the Hamiltonian into components that you can diagonalize easily,

\hat H = \hat H_1 + \hat H_2 + ... + \hat H_n

If you can diagonalize an operator, that is, decompose it into a basis transformation \hat T_i and a diagonal operator \hat D_i in this basis, \hat H_i = \hat T_i^{-1} \hat D_i \hat T_i, you can also easily exponentiate it as

exp(\hat H_i) = \hat T_i^{-1} exp(\hat D_i) \hat T_i,

as can be seen by writing out the Taylor series of the exponential function.
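A quick numerical sanity check of this identity, here with the propagator exp(-i \hat H_i) instead of plain exp(\hat H_i); note that numpy returns the eigenvector matrix that plays the role of \hat T_i^{-1} in the notation above:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H_i = 0.5 * (A + A.conj().T)                   # some Hermitian operator

d, U = np.linalg.eigh(H_i)                     # H_i = U diag(d) U^dagger, i.e. U = T_i^{-1}
via_diag = U @ np.diag(np.exp(-1j * d)) @ U.conj().T
print(np.allclose(via_diag, expm(-1j * H_i)))  # True
```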

The split operator then consists of writing the short-term propagator as a product of individual terms. For the common Strang splitting, we get

exp(-i \hat H \Delta t) = exp(-i \hat H_1 \Delta t/2) ... exp(-i \hat H_{n-1} \Delta t/2) exp(-i \hat H_n \Delta t) exp(-i \hat H_{n-1} \Delta t/2) ... exp(-i \hat H_1 \Delta t/2) + O(\Delta t^3)

Note that the right-hand side is by construction unitary. As a consequence, the split operator method preserves the norm of the wavefunction.

Besides the time step, there are two further "parameters" to choose. One is the order of the splitting, though in practice everything beyond the Strang splitting tends to become complicated. The other is the choice of the operators \hat H_i; again, in practice, this is usually intuitive. You normally choose the potential as one component (diagonal in the grid representation / DVR), the kinetic energy as the second (diagonal in the underlying basis / FBR), and, for time-dependent problems, dipole couplings or similar terms as a third one.
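To make this concrete, here is a minimal sketch of a Strang-split propagation for the two-component case (kinetic plus potential energy), with FFTs playing the role of the DVR <-> FBR transformation. Grid, displacement, and time step are illustrative assumptions; the sketch also does not merge the adjacent half-steps of consecutive time steps, which is the optimization discussed below.

```python
import numpy as np

def strang_step(psi, V, k, dt):
    """One step exp(-i V dt/2) exp(-i T dt) exp(-i V dt/2) for H = k^2/2 + V(x)."""
    psi = np.exp(-0.5j * V * dt) * psi                               # half potential step (DVR)
    psi = np.fft.ifft(np.exp(-0.5j * k**2 * dt) * np.fft.fft(psi))   # full kinetic step (FBR)
    return np.exp(-0.5j * V * dt) * psi                              # second half potential step

# Harmonic oscillator (m = omega = 1 a.u.) with a displaced Gaussian initial state.
x = np.linspace(-10.0, 10.0, 128, endpoint=False)
dx = x[1] - x[0]
k = 2.0 * np.pi * np.fft.fftfreq(x.size, d=dx)
V = 0.5 * x**2
psi = np.exp(-0.5 * (x - 3.0)**2).astype(complex)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)

dt = 0.01
for _ in range(int(round(2.0 * np.pi / dt))):      # one classical period
    psi = strang_step(psi, V, k, dt)
print(np.sum(np.abs(psi)**2) * dx)                 # 1.0 up to rounding: unitary by construction
```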

Advantages

The split-operator method has a number of unique advantages.

First of all, despite the rather low order of O(\Delta t^3), it is generally rather accurate. As a hand-waving argument, consider the artificial case where the full Hamiltonian is chosen as the only component and diagonalized; then the split-operator method reproduces the exact time propagation. As an alternative hand-waving argument, the conservation of the norm means that the approximate solution is searched for only in a reduced-dimensional manifold of states with unit norm, which gives more accurate results.

A second advantage is that the split-operator method with Strang splitting is rather cheap for simple systems. Consider a Hamiltonian decomposable into two terms, kinetic and potential energy. Consider further that you are generally not interested in the value of the wavefunction at the end of each small time step, but only every hundredth or so. Then you can combine the last term of the n-th time step and the first term of the (n+1)-th time step into one exponential. As the exponentiation can be performed beforehand, you need to apply two diagonal operators and perform two FBR <-> DVR transformations to propagate a single time step \Delta t. Contrast this with a second-order Runge-Kutta scheme with a similar error of O(\Delta t^3): there you would have to apply the Hamiltonian twice, for a total of four diagonal operators and four FBR <-> DVR transformations.

The split-operator scheme can also be extended to time-dependent systems, but becomes less efficient there. In practice, you then have to exponentiate a diagonal operator for every small time step.

Disadvantages

The split-operator method cannot be applied if your system has terms that cannot be easily diagonalized. This prevents its use with certain Hamiltonians, for example products of momentum and coordinate operators. Also, open-system terms tend not to have an accessible diagonal representation. Hence, although you could in principle generalize the split-operator method to density operators (replace Hamiltonians by Liouvillians), this is not very useful in practice.

Another issue concerns the implementation. An efficient diagonalization of certain operators like dipole couplings is possible, but requires some fiddling. This is, for example, the reason that the method is not yet implemented in the C++ version of Wavepacket as of version 0.3.2.

Polynomial methods

The basic idea of the various polynomial methods is to expand the short-time propagator in some polynomials. In its simplest form, this can be written as

exp(-i \hat H \Delta t) \approx \sum_{n=0}^{N} c_n P_n(-i \hat H \Delta t),

although in some schemes the coefficients also depend on \Delta t. In particular, for various classical polynomials, the P_n can be calculated through a recursion relation, and the coefficients c_n can be precomputed. In practice, you then apply the Hamiltonian N times with some simple algebra in-between.

The free parameters are the choice of polynomials, the order of the expansion N, and the time step. The error of an N-th order expansion is generally O(\Delta t^{N+1}). Polynomials are typically chosen based on some additional properties.

Particularly useful are Chebychev polynomials. These have the property that they converge uniformly to the exponential function within a certain domain. The details are a bit involved; for example, you have to rescale the spectrum of your Hamiltonian to the range [-1, 1]. However, as this is implemented in Wavepacket, you just need to specify a maximum error, from which the order of the expansion is calculated automatically. For a given expansion order, the usable time step is inversely proportional to the spectral range of your Hamiltonian, hence this method benefits a lot from aggressive truncation.

Note that the order of the expansion should generally be rather large (N > 20) for this method to be efficient. However, if you go to too large values (N > 100), you can run into numerical instabilities: the polynomials themselves attain huge function values that can even overflow the double-precision range. So you should try to stay with moderately sized expansions, say N < 60.

For these large orders, you will typically choose a time step that is much larger than for, say, an ODE solver. Often, this is not a problem, as even these time steps tend to be much smaller than the time scale of the relevant dynamics. However, if you do need the wavefunction at high time resolution, you can generally get there with additional interpolation steps. These require some work to set up, but are very efficient; in particular, you should not need to apply the Hamiltonian to a wavefunction for the interpolation.
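To make the recipe more tangible, below is a minimal sketch of a single Chebychev propagation step. It assumes that apply_H applies the (possibly truncated) Hamiltonian to a state vector and that e_min and e_max enclose its spectrum; all names are chosen for illustration and are not Wavepacket's API.

```python
import numpy as np
from scipy.special import jv   # Bessel functions of the first kind

def chebyshev_step(psi, apply_H, e_min, e_max, dt, n_terms):
    """Approximate exp(-i H dt) psi by an n_terms-term Chebychev expansion."""
    half_width = 0.5 * (e_max - e_min)
    center = 0.5 * (e_max + e_min)
    alpha = half_width * dt

    def apply_H_norm(phi):
        # Hamiltonian rescaled so that its spectrum lies in [-1, 1].
        return (apply_H(phi) - center * phi) / half_width

    # Chebychev recursion, with the phase factors (-i)^n absorbed into the vectors.
    phi_prev, phi_cur = psi, -1j * apply_H_norm(psi)
    result = jv(0, alpha) * phi_prev + 2.0 * jv(1, alpha) * phi_cur
    for n in range(2, n_terms):
        phi_next = -2j * apply_H_norm(phi_cur) + phi_prev
        result += 2.0 * jv(n, alpha) * phi_next
        phi_prev, phi_cur = phi_cur, phi_next

    # Undo the shift of the spectrum by the global phase exp(-i * center * dt).
    return np.exp(-1j * center * dt) * result
```

The expansion order n_terms must be somewhat larger than alpha = (e_max - e_min) dt / 2 for the coefficients to have decayed; see the estimate in the next subsection.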

Advantages

The huge advantage of polynomial expansions is their efficiency.

To get an estimate, let us consider the Runge-Kutta procedure with the truncated Hamiltonian of figure 4. At a time step of 0.01 a.u., and with the fifth-order Runge-Kutta procedure, we need to apply the Hamiltonian 5 x 100 times to propagate the wavefunction for 1 a.u., with a corresponding number of vector additions, multiplications, etc. In contrast, if we take an expansion in Chebychev polynomials with a spectral range of 50 a.u., a time step of 1 a.u., and an accuracy of 10^-8 (about the order of the numerical errors, see figure 1), we need just 46 polynomials. That is, to propagate for 1 a.u., we need to apply the Hamiltonian about 50 times and require only one tenth of the computation time of the Runge-Kutta method.
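You can reproduce the order of magnitude of that polynomial count yourself: the expansion coefficients are proportional to Bessel functions J_n(alpha) with alpha = (spectral range) x (time step) / 2, and these decay super-exponentially once n exceeds alpha. A minimal sketch of the estimate (the exact count depends on the implementation):

```python
import numpy as np
from scipy.special import jv

spectral_range, dt, tol = 50.0, 1.0, 1e-8
alpha = 0.5 * spectral_range * dt      # 25 for the numbers quoted above

n = int(np.ceil(alpha))
while abs(2.0 * jv(n, alpha)) > tol:   # find where the coefficients drop below the accuracy
    n += 1
print(n)                               # close to the 46 polynomials quoted in the text
```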

Disadvantages

Polynomial methods in general, and the Chebychev expansion in particular, also have severe downsides.

Most importantly, all polynomial methods fail for time-dependent Hamiltonians. The reason is that the simple Taylor series for the short-time propagator turns into a hideous expression with time integrations of time-ordered operator products, which does not allow for simple numerical schemes. There are so-called Krylov subspace methods, like the short iterative Lanczos method, which offer a way out here, but these are pretty complex on their own.

The Chebychev method has some additional limitations. It works only if the Hamiltonian has either a purely real or a purely imaginary spectrum. This excludes important special cases like absorbing boundary conditions or open quantum systems. Also, the method requires knowledge of lower and upper bounds of the spectrum; otherwise the solution may diverge.

Conclusion

As a general rule, you should use a polynomial solver whenever your problem fits. In particular, when you need imaginary-time propagation to relax to the ground state or some thermal state, Chebychev polynomials are the way to go. For the many cases where polynomial solvers cannot be used, you can try the split-operator method, which tends to be surprisingly efficient, or one of the ODE solvers, which are more convenient to use. For small systems in particular, the convenience of an adaptive Runge-Kutta method easily outweighs its rather poor performance.

In any case, you should generally truncate your Hamiltonian. Truncation makes the propagation more stable and allows larger time steps. Plus, in many cases you only distort energy eigenstates that are poorly represented on the grid anyway.

Posted by Ulf Lorenz 2021-04-19
