Numerics.Thermal

Burkhard Schmidt, Ulf Lorenz

How to set up a thermal initial state

In many problems, the system under study is not in a well-defined initial state, but at a well-defined temperature.
In such a case, we need either a density operator or an ensemble of wave functions to describe our system.
The most straightforward expression of a thermal state uses a thermalized density operator

\hat \varrho_\mathrm{th} = \frac{\mathrm{e}^{- \beta \hat H}}{\mathrm{Tr} ( \mathrm{e}^{-\beta \hat H} )}

Here, β = 1/(k_B T) is the inverse of the typical thermal excitation energy (k_B is Boltzmann's constant, T the temperature), and H is the system Hamiltonian.

This page is supposed to give you an overview of some of the more common methods for constructing the thermal density operator. It is a less technical, extended version of the tutorial that accompanied a particular demo for the C++ version of Wavepacket (MorseOscillator/Thermal).
There are basically two different ways to construct the thermal state: We can either attempt to construct the density operator directly, or we can represent the thermal state with an ensemble of wave functions.

Constructing the thermal density operator

Diagonalizing the Hamiltonian

If we have the eigenstates of the Hamiltonian, we can easily express the exponential. We find

\hat H |i\rangle = E_i |i\rangle \qquad \rightarrow \qquad \hat H = \sum_i E_i |i\rangle\langle i| \qquad \rightarrow \qquad \mathrm{e}^{-\beta \hat H} = \sum_i \mathrm{e}^{-\beta E_i} |i\rangle\langle i|

Hence, the construction goes as follows (a minimal code sketch follows the list):

  1. Determine the eigenstates and energies of the Hamiltonian.
  2. For each eigenstate, calculate the exponential exp(-β E_i) and the direct product of the ket and bra state.
  3. Sum the result over all eigenstates.
  4. Normalize the result / divide by the trace of the operator.
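
To make the recipe concrete, here is a minimal sketch in plain NumPy (not Wavepacket code); the matrix H and the inverse temperature beta are placeholders for whatever your problem provides:

    import numpy as np

    def thermal_density(H, beta):
        """Thermal density matrix by diagonalizing a Hermitian matrix H."""
        E, V = np.linalg.eigh(H)             # step 1: eigenenergies and eigenstates
        w = np.exp(-beta * (E - E.min()))    # step 2: Boltzmann factors, shifted
                                             # by E.min() for numerical stability
        rho = (V * w) @ V.conj().T           # steps 2+3: sum of w_i |i><i|
        return rho / np.trace(rho)           # step 4: normalize by the trace

    # Example: harmonic oscillator, H = diag(n + 1/2) in its own eigenbasis
    H = np.diag(np.arange(50) + 0.5)
    rho = thermal_density(H, beta=2.0)
    print(np.trace(rho))                     # -> 1.0

Note that the shift by E.min() leaves the normalized result unchanged, since the same factor appears in both the numerator and the trace.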

If you happen to perform the calculations in the basis of energy eigenstates, the thermal density operator is represented by a diagonal matrix whose non-zero entries are exactly the exponential terms. Then the direct construction is the trivial way to go. This method is for example used in the Matlab version of Wavepacket, where we routinely use the energy representation for open systems.

If you happen to work in any other basis, you should avoid this construction. Piecing the density operator together like this is cumbersome, and you need a diagonalization of the Hamiltonian to even start, which scales with the cube of the number of basis functions. In this case, you would rather relax a simple initial guess for the density operator.

Relaxing the unit operator

It can easily be verified that the thermal density operator without the trace normalization is the solution of the initial-value problem

\frac{\partial \hat \varrho}{\partial t} = - \hat H \hat \varrho(t) \qquad \mathrm{with} \quad \hat \varrho(0) = \hat 1

at time t=β.

Furthermore, note that this equation has the form of a Liouville-von-Neumann equation (LvNE) with two important specialties:

  • we do not take the commutator with the Hamiltonian, but only multiply it from the left
  • the equation is set up in imaginary time instead of the common real time (we "relax" the density operator instead of "propagating" it)

There are efficient solvers for such imaginary-time equations based on Chebychev polynomials. If you can set up the equation and a suitable solver without too much effort, this approach is the way to go.
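
As an illustration of the relaxation idea, here is a minimal NumPy sketch; it replaces the Chebychev solver by naive short-time steps with a matrix exponential, which works for small matrices but lacks the efficiency of the polynomial expansion:

    import numpy as np
    from scipy.linalg import expm

    def relax_to_thermal(H, beta, n_steps=100):
        """Integrate d(rho)/dt = -H rho with rho(0) = 1 up to t = beta."""
        step = expm(-(beta / n_steps) * H)   # short-time imaginary-time propagator
        rho = np.eye(H.shape[0])             # initial value: the unit operator
        for _ in range(n_steps):
            rho = step @ rho                 # multiply from the left only
        return rho / np.trace(rho)           # normalize by the trace at the end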

The C++ version of Wavepacket allows a relatively smooth setup and solution of such a LvNE, so this is the preferred mechanism there.

Representing the density operator with an ensemble of wavefunctions

Density operators are rather unwieldy entities. Given a basis of N functions, a density operator is represented in this basis by an N×N matrix with N² entries. Propagation is also rather slow, with efforts (depending on the Hamiltonian) in the range between N² for a diagonal operator and N³ for a general operator in matrix representation. We can conclude that a density operator requires a lot of storage and is slow to propagate.

Contrast this with a wavefunction. It is represented by an N-dimensional vector (N entries), and the propagation effort scales between N and N². If we could represent the thermal density operator by an ensemble of a few wave functions, we could speed up the computations considerably.

So where is the catch? In fact, there are some:

  • The handling of such an ensemble of wave functions (that is: the code you need to write) is somewhat more involved, since you need to juggle multiple wave functions instead of a single density operator. Also, the theory is a bit less straightforward.
  • It is not guaranteed to work. That is, you can construct cases where you need so many wave functions for convergence that you might just as well stick to the density operator. Mind you, very often you can get substantial savings, but you do not always have hard guarantees.
  • You can only treat closed quantum systems easily. As soon as you have an interaction with an (implicit) environment, you get relaxation and dephasing effects that are not captured by wave functions. However, if you enlarge your system to also include a discretized bath, the new "system" can be treated as a closed system, yet still captures the dissipation and relaxation effects on the original subsystem. Implementations of this method have used efficient methods such as MCTDH (see for example here, here, or here) or Kosloff's spin bath (e.g., here or here).

While slightly different in the details, these wavefunction approaches have some common ground. The basic idea is to represent the (unnormalized) thermal density operator in a form like

\mathrm{e}^{-\beta \hat H} = \sum_n c_n |\psi_n\rangle\langle \psi_n |

where the c_n are some coefficients, and the ψ_n form the ensemble of wave functions that represent the density operator. Without loss of generality, we can require the wavefunctions to be normalized, since the normalization can be absorbed in the coefficients c_n. All operations on the density operator that we do in a quantum simulation fall into two categories:

  1. We propagate the density operator (typically in real time and for a closed system).
  2. We calculate an expectation value by multiplying the density operator with some operator and calculating the trace of the result.

Put together, every quantity derived from the density operator that we may ever be interested in can be written as

\langle \hat A \rangle (t) = \mathrm{Tr}\Bigl( \hat A \hat U(t, t_0) \mathrm{e}^{-\beta \hat H} \hat U^\dagger(t, t_0) \Bigr),

where A is some operator, and U is some time-evolution operator from time t_0 to time t. (Note that more complex expressions are possible, but they do not change the following arguments.)

Now, let us transform this equation a little. We take some orthonormal basis for the trace, use the resolution of identity and other bracket rules, and obtain

\langle \hat A \rangle (t) &=& \mathrm{Tr}\Bigl( \hat A \hat U(t, t_0) \mathrm{e}^{-\beta \hat H} \hat U^\dagger(t, t_0) \Bigr) \\
&=& \sum_k \langle k | \hat A \hat U(t,t_0) \sum_n c_n |\psi_n\rangle \langle \psi_n | \hat U^\dagger(t,t_0) |k\rangle \\
&=& \sum_n c_n \langle \psi_n| \hat U^\dagger(t, t_0) \sum_k |k\rangle\langle k | \hat A \hat U(t,t_0) |\psi_n \rangle \\
&=& \sum_n c_n \langle \psi_n| \hat U^\dagger(t, t_0) \hat A \hat U(t,t_0) |\psi_n \rangle \\
&=& \sum_n c_n \langle \psi_n(t) | \hat A |\psi_n(t) \rangle \qquad\text{where} \qquad |\psi_n(t)\rangle = \hat U(t, t_0) |\psi_n\rangle

This suggests a simple scheme for replacing the density operator by a wavefunction ensemble (sketched in code after the list):

  • Choose a set of wavefunctions (with coefficients c_n) to represent the density operator
  • Propagate each such wavefunction in time as required
  • Calculate the expectation value of operator A for each wavefunction in the ensemble
  • Sum up the results, weighted by the coefficients c_n
  • Divide the result by the norm of the density operator as discussed next
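
In code, the scheme could look like the following NumPy sketch; the operator A, the Hamiltonian H, and the ensemble (coefficients and state vectors) are placeholders, and the denominator anticipates the normalization discussed right below:

    import numpy as np
    from scipy.linalg import expm

    def ensemble_expectation(A, H, t, coeffs, psis):
        """<A>(t) from an ensemble of coefficients c_n and state vectors psi_n."""
        U = expm(-1j * H * t)                          # real-time propagator (hbar = 1)
        num = sum(c * np.vdot(U @ psi, A @ (U @ psi))  # c_n <psi_n(t)|A|psi_n(t)>
                  for c, psi in zip(coeffs, psis))
        return (num / sum(coeffs)).real                # divide by Tr(e^{-beta H}) = sum
                                                       # of c_n; .real assumes Hermitian A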

One particular expectation value that we need to calculate is the normalization factor as

\mathrm{Tr}\Bigl( \mathrm{e}^{-\beta \hat H} \Bigr) = \sum_n c_n \langle \psi_n | \psi_n \rangle = \sum_n c_n.

Now we only need to find a suitable wavefunction representation of the density operator.

Using Hamiltonian eigenstates

One trivial representation of the thermal density operator is given by the energy eigenstates, as already written a few sections above:

\mathrm{e}^{-\beta \hat H} = \sum_i \mathrm{e}^{-\beta E_i} |i\rangle\langle i|

The coefficients are the exponentials, and the wave functions are the energy eigenstates. Hence, all you have to do is obtain the eigenstates, usually from a diagonalization of the Hamiltonian, and follow the procedure outlined in the previous section. The C++ Wavepacket demo MolVibration/OH/Thermal demonstrates this feature.
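
A hypothetical helper for setting up this ensemble could look as follows (plain NumPy, not the demo code); it exploits the exponential drop of the coefficients by discarding eigenstates whose relative Boltzmann weight falls below a tolerance:

    import numpy as np

    def eigenstate_ensemble(H, beta, tol=1e-6):
        """Coefficients and eigenvectors with non-negligible Boltzmann weight."""
        E, V = np.linalg.eigh(H)
        w = np.exp(-beta * (E - E[0]))       # weights relative to the ground state
        keep = np.nonzero(w > tol)[0]        # exponential drop-off => few states
        return w[keep], [V[:, i] for i in keep]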

A few pros and cons with regard to this representation:

  • Exponential convergence: Note that the coefficients are negative exponentials whose value drops with increasing energy and decreasing temperature. If we reasonably assume that the contribution of the individual eigenstates to some expectation value grows subexponentially, we are guaranteed to converge with only the lowest few excitations for sufficiently low temperatures. Even better, we can rather easily estimate how many states we need. For example, an OH bond has typical excitation energies of several hundred meV. At a temperature of 1,000 K, β is around 1 / (100 meV), so we need at most the lowest excited vibrational state for convergence, and all higher excited states have a negligible contribution.
  • Need to diagonalize the Hamiltonian: A major drawback of setting up the wavefunction ensemble from Hamiltonian eigenstates is the need to get the eigenstates from somewhere. Typically, this requires a diagonalization of the Hamiltonian, which is a rather slow operation. It scales with N³, similar to the use of a density operator.
  • Poor convergence for large systems and high temperatures: As the system size and the temperature grow, more and more eigenstates need to be included for convergence. An increasing system size generally means that there are more eigenstates at a given energy, while an increasing temperature (smaller β) slows down the convergence.

Altogether we can note that the use of eigenstates of the Hamiltonian is a conservative, but systematic way of creating the wavefunction ensemble. It gives us hard guarantees on the convergence, but these are only useful for sufficiently low temperatures and sufficiently small systems.

For higher temperatures or larger systems we need a less systematic "shotgun approach" that samples the relevant state space more aggressively. The most prominent one is the random-wavefunction approach discussed next.

Random thermal wavefunctions

The basic idea of this approach is to sample the state space randomly, but with a strong bias towards "allowed" states whose energy is on the order of the thermal excitation energy 1/β or less.

This approach has been used, for example, with Kosloff's optimized spin bath approach (see here) or with MCTDH (see here and here). If you plan to apply random thermal wavefunctions to MCTDH, see also our own work for caveats.

A minimal primer on random numbers and such

For completeness, let us start with the notation to be used for the random wavefunctions. We denote a random number, wavefunction, vector etc. by a tilde over the symbol, such as \tilde x. Note that a random number is not really a number; in essence, it is a probability distribution. To get a real number out of a random number, we need to form an expectation value, denoted as \mathrm{E}(\tilde x).

For example, we could have a random number \tilde x that is uniformly distributed over the interval [0,1]. Then the expectation value \mathrm{E}(\tilde x) would be 0.5, the expectation value \mathrm{E}(\tilde x^2) would be 1/3, and so on. For all but the simplest cases, calculating the exact expectation value is a major effort, and we instead approximate it by sampling: We draw n numbers randomly from the given probability distribution, and approximate the expectation value by the mean of these numbers.
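
As a tiny illustration of such sampling (plain NumPy, with an arbitrary sample size):

    import numpy as np

    rng = np.random.default_rng(42)
    x = rng.uniform(0.0, 1.0, size=100_000)  # draw n numbers from the distribution
    print(x.mean())                          # ~0.5,   approximates E(x)
    print((x**2).mean())                     # ~0.333, approximates E(x^2) = 1/3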

For more information, see a good book or at least the Wikipedia article on probability theory; the notation described here will be applied in the following.

Theory of random thermal wavefunctions

The theory is largely identical to the general wavefunction ensemble approach, but with expectation values over random wavefunctions instead of a summation over multiple states. We start by representing the unit operator with a random wavefunction

\hat 1 = \mathrm{E}\Bigl( |\tilde \psi\rangle\langle \tilde \psi| \Bigr)

We will come to the construction of such a random wavefunction in a moment. With the help of this representation, we can write the unnormalized thermal density operator as

\mathrm{e}^{-\beta \hat H} = \mathrm{e}^{-\beta \hat H / 2} \, \hat 1 \, \mathrm{e}^{-\beta \hat H / 2} = \mathrm{E}\Bigl( \mathrm{e}^{-\beta \hat H / 2} |\tilde \psi\rangle\langle \tilde \psi| \mathrm{e}^{-\beta \hat H / 2} \Bigr)

Now we can use that taking the expectation value commutes with taking scalar products, and repeat the derivation from the general wavefunction ensemble approach to finally obtain

\langle \hat A \rangle (t) = \frac{\mathrm{E}\Bigl( \langle \tilde \psi(t) | \hat A | \tilde \psi(t) \rangle \Bigr)}{\mathrm{E}\Bigl( \langle \tilde \psi_\beta | \tilde \psi_\beta \rangle \Bigr)} \qquad \mathrm{with} \quad |\tilde \psi_\beta\rangle = \mathrm{e}^{-\beta \hat H / 2} |\tilde \psi\rangle \quad \mathrm{and} \quad |\tilde \psi(t)\rangle = \hat U(t, t_0) |\tilde \psi_\beta\rangle

The general procedure, with slight deviations from the one above, is then (a code sketch follows the list):

  1. Generate a sample of the random wavefunctions (for the construction, see below)
  2. Propagate each wavefunction by β/2 in imaginary time using the system Hamiltonian
  3. Calculate the norm of each resulting wavefunction and calculate the mean of all these norms
  4. Propagate each wavefunction in real time as required
  5. Calculate the expectation value of operator A for each wavefunction in the ensemble and calculate the mean of all these expectation values.
  6. Divide the mean expectation value from 5. by the mean norm from 3.
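
Put together, a minimal NumPy sketch of the whole procedure might read as follows; it is a stand-in for the actual Wavepacket machinery, with hbar = 1 and matrix exponentials in place of proper propagators:

    import numpy as np
    from scipy.linalg import expm

    def thermal_expectation(A, H, t, beta, n_samples=10, seed=0):
        """<A>(t) at inverse temperature beta from random +-1 wavefunctions."""
        rng = np.random.default_rng(seed)
        U_im = expm(-0.5 * beta * H)         # step 2: propagate by beta/2 in imaginary time
        U_re = expm(-1j * H * t)             # step 4: real-time propagation
        norms, expects = [], []
        for _ in range(n_samples):
            psi = rng.choice([-1.0, 1.0], size=H.shape[0])  # step 1: random +-1 coefficients
            psi = U_im @ psi
            norms.append(np.vdot(psi, psi).real)            # step 3: norm after relaxation
            psi = U_re @ psi
            expects.append(np.vdot(psi, A @ psi).real)      # step 5: <A> per wavefunction
        return np.mean(expects) / np.mean(norms)            # step 6: ratio of the means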

See the corresponding C++ Wavepacket demo for a simple working example.

Building the random wavefunctions

So the question is how to construct a random wavefunction with the property

\mathrm{E}\Bigl( |\tilde \psi\rangle\langle \tilde \psi| \Bigr) = \hat 1

For this, we first pick an arbitrary orthonormal basis \{ |i\rangle \} and expand our random wave function in this basis. We end up with a set of random coefficients whose distribution we need to determine. We get

|\tilde \psi\rangle = \sum_i \tilde c_i |i\rangle \qquad \rightarrow \qquad \mathrm{E}\Bigl( |\tilde \psi\rangle\langle \tilde \psi| \Bigr) = \sum_{i,j} \mathrm{E}\bigl( \tilde c_i \tilde c_j^* \bigr) \, |i\rangle\langle j| = \hat 1,

from which follows the requirement for the coefficients as

\mathrm{E}\bigl( \tilde c_i \tilde c_j^* \bigr) = \delta_{ij}.

By far the simplest way to set up such coefficients is to choose each coefficient randomly as +1 or -1 with equal probability.

To assemble the wavefunction from these coefficients, note that whenever you do numerical simulations (well, within Wavepacket at least), you always represent a wavefunction as a vector in an orthogonal basis. Thus, just combine your random coefficients into a Matlab vector (Matlab version) or CTensor (C++ version), and you are done. For the Matlab version, you only need to observe that the DVR representation is not normalized, so you should ideally set up the initial wave function in the FBR and transform back.
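
The requirement on the coefficients is easy to verify numerically; the following snippet averages the outer product of random ±1 coefficient vectors and recovers (approximately) the unit matrix:

    import numpy as np

    rng = np.random.default_rng(1)
    N, n_samples = 8, 20_000
    acc = np.zeros((N, N))
    for _ in range(n_samples):
        c = rng.choice([-1.0, 1.0], size=N)
        acc += np.outer(c, c)                # the coefficients are real here
    print(acc / n_samples)                   # ~ unit matrix, E(c_i c_j) = delta_ij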

Advantages and Disadvantages

So what are the good and bad things about random thermal wavefunctions?

  • Fast convergence in many cases: In many practical situations, the method converges extremely fast, using as few as 10 wavefunctions.
  • Convergence for different conditions: The method works well for low temperatures, because the exp(-β H/2) imaginary-time propagation effectively filters out high-energy states. However, it also works at high temperatures, because the whole state space is sampled randomly. The method also often converges well for large systems. The latter feature can be qualitatively understood for weakly interacting degrees of freedom: here, we essentially average over each degree of freedom separately.
  • Convergence not guaranteed: In general, you are not guaranteed that the method works. You can give some heuristic arguments, but in contrast to the Hamiltonian eigenstates, there is no general "proof" that this expansion converges.
  • Poor lower bound for convergence: In fact, the lower bound is stochastic convergence. If you approximate an expectation value by the mean of a sample of N values, you can derive a rule of thumb from the central limit theorem: the relative error scales approximately as the square root of 1/N (see the small sampling experiment below). This result also means that you cannot go to arbitrarily high accuracy; at some point you end up with, say, 5% error and simply cannot do better without substantially more wavefunctions, which in turn negates the computational advantage of using wavefunctions in the first place.
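
The stochastic error floor from the last item is easy to see in a toy experiment; the snippet below estimates the expectation value E(x) = 0.5 from the primer above with growing sample sizes:

    import numpy as np

    rng = np.random.default_rng(7)
    for n in (100, 10_000, 1_000_000):
        err = abs(rng.uniform(size=n).mean() - 0.5)
        print(n, err)                        # the error shrinks roughly like 1/sqrt(n)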

Related

Wiki: Numerics.Main
