Hi all,
(We tried to create this topic via the email option, but it hasn't made it to the web so far, so we are trying our chances with the web platform instead. Sorry for any duplication.)
My colleagues and I have run several total energy calculations of simple structures in the past using elk-1.4.18 (see http://arxiv.org/abs/1404.3015).
We wanted to rerun some of these using the same input, but this time with version elk-2.3.22. To be precise, we modified the input slightly, explicitly specifying the variables whose default values changed between versions, as far as we could follow from the outputs of both cases. The resulting total energy, for a test case of Ag fcc, differs on the order of tens of Hartree between the two versions.
Clearly we are missing something in our way of modernizing the old input?
Thank you very much for your help and suggestions in advance,
Feel free to ask for any further information that is relevant,
cheers,
Emine Kucukbenli, theos, epfl,
for the authors of http://arxiv.org/abs/1404.3015 .
PS: We use the same species file, which is modified from the elk-1.4.18 one.
We checked that the structure is interpreted in the same way, in terms of the lattice parameters and the symmetries found.
Here is the input for elk-1.4.18 with comments on how it was adapted to elk-2.3.22 for some significant keywords:
tasks
0

xctype
2

avec
0 0.5 0.5
0.5 0 0.5
0.5 0.5 0

scale
7.5792

sppath
'./'

atoms
1 : nspecies
'Ag.in' : spfname
1 : natoms
0.0 0.0 0.0 0.0 0.0 0.0 : atposl, bfcmt

rgkmax
10.5

!commented out for elk-2.3.22 as keyword assignment changed
isgkmax
-2

lmaxapw
8

lmaxvr
7

!added for elk-2.3.22 as new default is different
!lmaxinr
!2

fracinr
0.0

gmaxvr
22.0000

epsengy
0.00001

lradstp
2

nempty
10

ngridk
12 12 12

stype
0

swidth
0.005

vkloff
0.5 0.5 0.5

mixtype
2

beta0
0.05
You might want to go through the release notes provided on the elk.sourceforge.net page. For one thing, I believe the meaning of "nempty" has changed between 1.4.x and 2.3.x.
Also note that species files may vary with program version, so if you carried them over from your original project directory (I saw an sppath ./ in your input), you may want to compare with the currently distributed Ag.in.
A non-scientific approach to the problem might be to retry the calculation with a few intermediate versions (which should still be available on sourceforge) to see where the big jump in total energy occurs. (If it is that bad, chances are that it will also show up with less demanding (default) values for gmaxvr, rgkmax, etc., speeding up the test.)
Hi Martin,
Many thanks for the 'version bisection' suggestion, very useful indeed. Here is what we find for simple Ag fcc with two different rgkmax values (10.5 and 11, respectively).
version   Tot_Energy_1 (rgkmax=10.5)   Tot_Energy_2 (rgkmax=11)
-----------------------------------------------------------------
1.4.18    -5312.44808573               -5312.44814831
2.1.25    -5312.44811528               -5312.44817857
2.2.10    -5312.44811649               -5312.44817927
2.3.16    -5312.44811642               -5312.44818090
2.3.22    -5342.62697647               -5342.66682698
So the jump appears in the latest version.
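For anyone repeating this kind of version bisection, the comparison can be done mechanically. The following is only a minimal sketch: the energies are copied from the table above, while the function name `first_jump` and the 1e-3 Ha threshold are arbitrary choices made for illustration and are not part of Elk.

```python
# Total energies (Ha) copied from the table above, as
# (rgkmax=10.5, rgkmax=11) pairs, in release order.
ENERGIES = {
    "1.4.18": (-5312.44808573, -5312.44814831),
    "2.1.25": (-5312.44811528, -5312.44817857),
    "2.2.10": (-5312.44811649, -5312.44817927),
    "2.3.16": (-5312.44811642, -5312.44818090),
    "2.3.22": (-5342.62697647, -5342.66682698),
}


def first_jump(table, threshold=1e-3):
    """Return (previous, current, deltas) for the first pair of
    consecutive versions whose energies differ by more than
    `threshold` Ha in any column, or None if there is no such pair.

    The threshold is a hypothetical tolerance, not an Elk setting.
    """
    versions = list(table)  # dicts keep insertion order
    for prev, cur in zip(versions, versions[1:]):
        deltas = [abs(b - a) for a, b in zip(table[prev], table[cur])]
        if max(deltas) > threshold:
            return prev, cur, deltas
    return None


result = first_jump(ENERGIES)
print(result[0], "->", result[1])
```

With the numbers above this points at the step from 2.3.16 to 2.3.22, where both columns move by roughly 30 Ha, while all earlier steps stay below 1e-4 Ha.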
From the release notes I cannot guess what might be the source of this. Perhaps something wrong in our compilation? All the versions have been compiled with the same make.inc which looks like the following:
MAKE = make
F90 = mpiifort
F90_OPTS = -O3 -ip -unroll -no-prec-div
F77 = mpiifort
F77_OPTS = -O3 -ip -unroll -no-prec-div
AR = ar
LIB_SYS = -L/opt/software/intel/13.0.1/impi/4.1.0/lib64/ -lmpi
LIB_LPK = lapack.a blas.a
LIB_FFT = fftlib.a
SRC_MPI =
SRC_libxc = libxcifc_stub.f90
SRC_FFT = zfftifc.f90
and the following environment is loaded: "module load intel/13.0.1 intelmpi/4.1.0", which shows the Intel compiler and Intel MPI versions.
Detailed information about the cluster can be found here:
http://hpc-dit.epfl.ch/clusters/bellatrix.php
Thank you very much for all your help and suggestions.
emine
Last edit: Emine Kucukbenli 2014-08-22
The only remotely connected thing in the release notes would seem to be the re-enabling of the "real Hamiltonian" for centrosymmetric systems. (Which does not raise my hopes that I would be able to find any associated bug in the code.)
I'll see if I can reproduce your results with different hardware and compiler setups over the weekend.
I think your use of fracinr is causing the problem. I used a simple input file for Ag fcc with both 2.3.16 and 2.3.22.
With fracinr commented out, I get convergence without any errors on both versions, and similar total energies. With fracinr=0.0 the older code still converges, but 2.3.22 gives linearization energy and total charge errors.
This should also be visible in the output file of the program run, if your cluster produces one.
! fracinr
! 0.0
2.3.16: total energy : -5312.43899370
2.3.22: total energy : -5312.43898842
fracinr
0.0
2.3.16: total energy : -5312.43898052
2.3.22: total energy : -5255.91455060
Last edit: Eike Schwier 2014-08-23
Hi Martin, Eike,
Yes indeed, I have repeated the calculations with fracinr=0.0, 0.01, 0.02, etc., and only the one with 0.0 seems to blow the energy up. For the moment I can afford to run calculations with values other than 0.0, but if anyone wants to look further into the issue I would be happy to provide test cases.
Many thanks for your help,
best
emine
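A scan like the one described above can be prepared by generating one input per fracinr value. This is a rough sketch only: the helper `with_fracinr` and the abbreviated `BASE_INPUT` are made up for illustration, and the code assumes the value sits on the line directly after the `fracinr` block name, as in the input posted earlier in the thread.

```python
# Abbreviated, hypothetical stand-in for the full elk.in from this thread.
BASE_INPUT = """lmaxvr
7
fracinr
0.0
gmaxvr
22.0000
"""


def with_fracinr(base_input, value):
    """Return a copy of `base_input` with the fracinr value replaced.

    Assumes the data line follows the block name immediately.
    If no fracinr block is present, one is appended at the end.
    """
    lines = base_input.splitlines()
    out = []
    replaced = False
    i = 0
    while i < len(lines):
        if lines[i].strip() == "fracinr" and i + 1 < len(lines):
            out.append(lines[i])
            out.append(f" {value}")
            i += 2  # skip the old value line
            replaced = True
        else:
            out.append(lines[i])
            i += 1
    if not replaced:
        out += ["", "fracinr", f" {value}"]
    return "\n".join(out) + "\n"


# One input text per scanned value; writing them to separate run
# directories and launching Elk is left out on purpose.
variants = {v: with_fracinr(BASE_INPUT, v) for v in (0.0, 0.01, 0.02)}
print(variants[0.01])
```

Keeping everything else in the input identical makes it easier to attribute any change in total energy to fracinr alone.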
Dear All,
I can confirm the bug in version 2.3.22 which occurs when 'fracinr=0.0'. This has already been fixed in the current development version.
As a temporary workaround, you can keep fracinr finite and set lmaxinr=lmaxvr.
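To catch the problematic setting before submitting a run, an input file can be checked with a few lines of Python. This is a hedged sketch, not an Elk tool: `read_scalar_block`, `fortran_float` and `fracinr_warnings` are hypothetical helpers, and the parsing only assumes that a block name stands alone on its line, that its value is on the next non-comment line, and that comments start with "!".

```python
def read_scalar_block(text, name):
    """Return the first data token of block `name`, or None.

    Commented-out blocks such as '!lmaxinr' are ignored because the
    stripped line must equal the block name exactly.
    """
    lines = [ln.strip() for ln in text.splitlines()]
    for i, ln in enumerate(lines):
        if ln == name:
            for data in lines[i + 1:]:
                if data and not data.startswith("!"):
                    return data.split()[0]
    return None


def fortran_float(token):
    """Parse a number that may use a Fortran 'd' exponent (e.g. 0.d0)."""
    return float(token.lower().replace("d", "e"))


def fracinr_warnings(text):
    """Return notes if the input sets fracinr to exactly zero."""
    notes = []
    fracinr = read_scalar_block(text, "fracinr")
    if fracinr is None or fortran_float(fracinr) != 0.0:
        return notes
    notes.append("fracinr is 0.0, which triggers the bug reported for Elk 2.3.22")
    lmaxvr = read_scalar_block(text, "lmaxvr")
    if lmaxvr is not None:
        notes.append(f"suggested workaround: finite fracinr and lmaxinr = {lmaxvr}")
    return notes


sample = "lmaxvr\n7\n!lmaxinr\n!2\nfracinr\n0.0\ngmaxvr\n22.0000\n"
for note in fracinr_warnings(sample):
    print(note)
```

The check only reports; it does not rewrite the input, so the choice of a finite fracinr value stays with the user.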
I ran the following simplified elk.in:
... and obtained -5312.45393744 Ha with Elk 1.4.18 and -5312.45400182 Ha with 2.3.22. The difference, 0.00006438 Ha, is mainly because a parameter in the Poisson solver was changed between versions.
Regards,
Kay.
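As a quick check of the arithmetic in the post above, the quoted difference can be reproduced and put on a more familiar scale. This is a small sketch only; the Hartree-to-eV factor is the usual approximate value and is not taken from the thread.

```python
from decimal import Decimal

# Total energies quoted in the post above (Ha).
e_1_4_18 = Decimal("-5312.45393744")
e_2_3_22 = Decimal("-5312.45400182")

# Decimal avoids binary rounding noise in the last digits.
diff_ha = e_1_4_18 - e_2_3_22

# Approximate conversion factor (assumption, not from the thread).
HARTREE_TO_EV = Decimal("27.211386")
diff_mev = diff_ha * HARTREE_TO_EV * 1000

print(diff_ha, "Ha")
print(round(diff_mev, 2), "meV")
```

The difference comes out as the 0.00006438 Ha stated above, which is a little under 2 meV.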