Numerical aspects of DFT
Jussi Enkovaara
CSC – IT Center for Science

Outline
● Real-space grids
● Density mixing
● Iterative solution of eigenproblem
● Parallelization

Basis sets
● Expand the wave functions in a set of basis functions
● Turns the (continuous) Kohn-Sham equations into discrete matrix equations
● Criteria for a good basis set
  ― accuracy
  ― numerical efficiency

Real-space grids
● Wave functions, electron densities, and potentials are represented on grids
● Single parameter: the grid spacing h
● Accuracy of the calculation can be improved systematically by decreasing the grid spacing
● Works only with smooth wave functions
  ― pseudopotential or PAW approximation is needed

Boundary conditions
● Real-space description allows flexible boundary conditions
● Zero boundary conditions (finite systems)
  ― potential, charge density and wave functions are zero at the cell boundaries
  ― possible to treat charged systems
● Periodic boundary conditions (bulk systems)
  ― potential and charge are periodic
  ― wave functions obey Bloch boundary conditions (k-points)
● Boundary conditions can be mixed
  ― periodic in one dimension (wires)
  ― periodic in two dimensions (surfaces)

Finite differences
● Both the Poisson equation and the kinetic energy contain the Laplacian, which can be approximated with finite differences
● Accuracy depends on the order N of the stencil
● Sparse matrix, storage not needed
● Cost of operating on a wave function is proportional to the number of grid points

Hamiltonian in real space
● Denoting grid points by G, the real-space discretization of the PAW Hamiltonian is

  H_GG' = -1/2 L_GG' + v_eff(G) δ_GG' + Σ_a Σ_ij p_i^a(G) ΔH_ij^a p_j^a(G')

  where L_GG' is the finite-difference Laplacian
● Sparse matrix with large dimension, thus iterative techniques requiring only the application of H to wave functions are used

Multigrid method
● General framework for solving differential equations with a hierarchy of discretizations
● The Poisson equation ∇²v_H(r) = -4πρ(r) can be solved iteratively with simple relaxation methods
● Problem: critical slowing down for long wavelength components of the error
  ― solution: treat the long wavelength components on a coarser grid
  ― recursive restriction leads to the V-cycle
● Each multigrid level reduces the error components related to its grid spacing

Egg-box effect
● Integrals involving localized functions centered on an atom and functions spanning the whole simulation cell, such as <p_i^a|ψ_n>, are evaluated as sums over grid points
● Then, due to high-frequency components in the localized function, the integral depends on the position of the atom with respect to the grid
● Problem: localization in real space means delocalization in Fourier space and vice versa
● Fourier filtering is used for optimizing the localization both in real and reciprocal space

Real-space method
● Flexible boundary conditions
● Systematic convergence
● Uniform resolution – empty space is expensive
● Total energy is not variational with respect to the grid spacing
● Egg-box effect
● Efficient multigrid and iterative methods
● Good parallelization prospects
● Real-space program packages
  ― GPAW, Octopus, Parsec

Achieving self-consistency

Self-consistent cycle revisited
● In practice, the new electron density is not used directly when calculating the effective potential for the next iteration
● The new density is mixed with the old density

Density mixing
● Simplest approach: direct mixing, ρ_in^(i+1) = α ρ_out^i + (1 - α) ρ_in^i
● Typically, a small mixing parameter α is needed for the mixing to be stable
● Convergence can be slow
● In spin-polarized systems, the charge density and the magnetization density can be mixed separately
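To make the role of the mixing parameter concrete, below is a small self-contained sketch of direct mixing applied to a toy fixed-point problem. The map F is only an illustrative surrogate for the Kohn-Sham step that produces an output density from an input density; it is not a real DFT solver.

  import numpy as np

  # Toy fixed-point problem standing in for the SCF cycle:
  # F maps an input "density" to an output "density".
  def F(rho):
      return 4.0 * np.exp(-rho)

  rho_in = np.zeros(10)
  alpha = 0.3                          # mixing parameter
  for iteration in range(500):
      rho_out = F(rho_in)
      residual = rho_out - rho_in
      if np.abs(residual).max() < 1e-10:
          break
      # Direct mixing: rho_in <- alpha * rho_out + (1 - alpha) * rho_in
      rho_in = rho_in + alpha * residual
  # With alpha = 1.0 (no mixing) this particular iteration only oscillates
  print iteration, np.abs(residual).max()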
Pulay mixing
● Mixing can be improved by using information from previous iterations
● The Pulay method assumes that the residual R[ρ_in] = ρ_out[ρ_in] - ρ_in depends linearly on the input density
● The new input density is taken as a linear combination of the previous densities, ρ_in^(new) = Σ_i α_i ρ_in^i, so that the corresponding residual is R^(new) ≈ Σ_i α_i R^i
● The optimum set of α_i is found by minimizing the norm of the residual under the constraint Σ_i α_i = 1
● Minimization of the residual leads to

  α_i = Σ_j (A⁻¹)_ji / Σ_jk (A⁻¹)_jk,  where A_ij = <R^i|R^j>

● The original Pulay method is used only for obtaining the new density
● A weight factor can be included for covering the solution space more efficiently
● Parameters of Pulay mixing
  ― number of old densities to use
  ― the weight factor

Charge sloshing
● A small variation in the input density can cause a large change in the output density
  ― a problem especially in large metallic systems
● Convergence of the SCF cycle can be slowed down considerably
● Charge sloshing is typically caused by long-range changes in the density
● Filtering out long-range changes in the mixing can reduce the effect of charge sloshing

Reducing charge sloshing
● The long wavelength changes can be filtered out by using a special metric when calculating the residuals
● In reciprocal space one can use a diagonal metric which puts extra weight on the long wavelength (small q) components
● In real space this metric is non-local; however, it can be approximated by a short-ranged operator
● The coefficients c_i of this operator are chosen using a weight factor w

Broyden mixing
● The SCF cycle can be interpreted as a non-linear problem F(ρ) = ρ_out[ρ] - ρ = 0
● Newton method: ρ^(new) = ρ - J⁻¹ F(ρ), where J is the Jacobian of F
● Broyden methods build successively an approximation to the inverse Jacobian J⁻¹ using information from previous iterations
● Updates can be performed without explicit storage of the (approximate) inverse Jacobian

Density mixing
● Density mixing is needed for the SCF cycle to converge
● Different mixing schemes exist
● Charge sloshing can be avoided by using a suitable metric
● Optimum mixing parameters are highly system dependent

Iterative solution of eigenproblem
● Determination of the effective potential from the input charge density is computationally simple
  ― the Poisson equation can be tricky with some basis sets
● The major task in an SCF calculation is the eigenproblem
● Well-established (dense) matrix diagonalization algorithms and libraries (LAPACK) can be used when the basis size is modest
● If the basis is very large, or the basis results in a sparse matrix, iterative techniques may be advantageous

Iterative solution of eigenproblem
● All iterative methods are based on a trial eigenfunction which is updated in each iteration
● An approximation for the eigenvalue is obtained from the Rayleigh quotient ε = <ψ|H|ψ> / <ψ|ψ>
● Updates are normally based on the residual R = Hψ - εψ
● Iterative methods differ in how the residual is used
● The full Hamiltonian matrix is not needed
  ― only the application of the matrix to a trial eigenfunction is required
  ― only O(N²) operations with sparse matrices

Preconditioning
● The optimum update would be δψ = -(H - ε)⁻¹ R
● The residuals can be preconditioned by suitable approximations to this matrix inverse
● Preconditioned residuals can be obtained by approximately solving a Poisson-like equation for the correction
● With a plane-wave basis, a diagonal Teter preconditioner is often used

Residual minimization
● One possibility for updating the trial wave function is ψ' = ψ + λ PR, where PR is the preconditioned residual
● The step length λ is found by minimizing the new residual
● The final update is then performed with the optimized step length
● Wave functions are updated band by band

Davidson method
● In the simple Davidson method one first calculates the preconditioned residuals for each state
● One then constructs a basis from the wave functions and the preconditioned residuals
● The Hamiltonian is diagonalized in this basis, and the Nel lowest eigenvectors are used as new trial wave functions

Conjugate gradient method
● The conjugate gradient algorithm can be used for minimizing the expectation value <ψ|H|ψ> / <ψ|ψ>
● By enforcing the updates to keep the new trial wave functions orthonormal, one can find the Nel lowest eigenstates
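The ingredients above (Rayleigh quotient, residual, preconditioning) can be illustrated with a small self-contained NumPy sketch that finds the lowest eigenpair of a random symmetric test matrix. This is not GPAW code; the diagonal (Jacobi-type) preconditioner merely stands in for the approximate inverses discussed above.

  import numpy as np

  # Find the lowest eigenpair of a symmetric test matrix using only
  # matrix-vector products H*psi, the Rayleigh quotient as the eigenvalue
  # estimate, and a simple diagonal preconditioner for the residual.
  n = 200
  np.random.seed(0)
  A = np.random.rand(n, n)
  H = 0.5 * (A + A.T) + 5.0 * np.diag(np.arange(1, n + 1))  # diagonally dominant

  psi = np.random.rand(n)
  psi /= np.linalg.norm(psi)
  for iteration in range(500):
      Hpsi = H.dot(psi)                 # only the action of H is needed
      eps = psi.dot(Hpsi)               # Rayleigh quotient (psi is normalized)
      residual = Hpsi - eps * psi
      if np.linalg.norm(residual) < 1e-8:
          break
      psi -= residual / np.diag(H)      # preconditioned residual update
      psi /= np.linalg.norm(psi)

  print iteration, eps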
Subspace diagonalization
● Both the residual minimization and conjugate gradient algorithms determine only an arbitrary linear combination of the Nel lowest eigenfunctions
● The Hamiltonian can be diagonalized in the subspace of the trial wave functions
● The Nel lowest eigenfunctions are obtained from the subspace rotation ψ'_n = Σ_m U_mn ψ_m, where U diagonalizes the subspace Hamiltonian <ψ_n|H|ψ_m>

Orthonormalization
● After residual minimization the trial wave functions are not orthonormal
● Numerically efficient scheme for orthonormalization:
  ― calculate the overlap matrix S_nm = <ψ_n|ψ_m> and determine its Cholesky decomposition S = LL†
  ― orthonormal wave functions are obtained by multiplying with the inverse Cholesky factor, ψ'_n = Σ_m (L⁻¹)_nm ψ_m
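For concreteness, this orthonormalization scheme can be written in a few lines of NumPy. The sketch below is illustrative only: grid integration weights and the PAW overlap operator are ignored, and the states are stored as rows of the array psi.

  import numpy as np

  # Orthonormalize a set of vectors via the overlap matrix and its
  # Cholesky factor, following the scheme described above.
  nbands, npoints = 4, 50
  np.random.seed(1)
  psi = np.random.rand(nbands, npoints)     # non-orthonormal trial vectors

  S = psi.dot(psi.T)                        # overlap matrix S_nm = <psi_n|psi_m>
  L = np.linalg.cholesky(S)                 # S = L L^T
  psi_ortho = np.linalg.solve(L, psi)       # multiply by L^-1 from the left

  print np.allclose(psi_ortho.dot(psi_ortho.T), np.eye(nbands))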
Computational scaling of DFT calculation
● Poisson equation: O(Nb) – O(Nb log Nb)
● XC potential
  ― semi-local functionals: O(Nb)
  ― EXX: O(Ne⁴ Nb)
● Direct solution of the eigenproblem
  ― construction of the Hamiltonian: O(Nb²)
  ― diagonalization of the Hamiltonian: O(Nb³)
● Iterative solution of the eigenproblem
  ― applying the Hamiltonian: O(Nel Nb) – O(Nel Nb log Nb)
  ― subspace diagonalization: O(Nel³)
  ― orthonormalization: O(Nel³)

Summary
● Density mixing is needed in the SCF cycle
● Iterative diagonalization schemes can be efficient with sparse matrices
● Computational scaling of the DFT algorithm is O(N³)
  ― direct eigensolvers: diagonalization dominates in large systems
  ― iterative solvers: orthonormalization dominates in large systems

Solving DFT equations in parallel

Parallel calculations
● Speedup in modern supercomputers is based on a large number of CPUs
● In order to exploit the available computing power, parallel computing is needed
● Parallel computing allows one to
  ― solve problems faster
  ― solve bigger problems

Parallelization prospects in DFT
● Basis functions
● k-points and spin
● Electronic states
● Additional trivial parallelizations
  ― e.g. different atomic configurations or unit cells

Parallelization: k-points and spin
● Spin and k-points can be treated equivalently
● Trivial parallelization
● Limited scalability
  ― k-points only in (small) periodic systems
  ― spin only in magnetic systems

Parallelization: basis
● Depends largely on the basis used
● Plane waves: parallel FFT is challenging
● Real-space grids: efficient domain decomposition
  [Figure: simulation cell divided between processes P1–P4; the finite-difference Laplacian stencil and the PAW augmentation spheres extend over domain boundaries]

Parallelization: electronic states
● Orthonormalizations are complicated
  ― communication of all wave functions to all processes
  ― can be implemented as a pipeline with only nearest-neighbour communication

Parallel scalability in practice
● Feasible number of CPUs depends heavily on the studied system (number of atoms/electrons)
● The largest ground-state DFT calculations typically use a few thousand CPU cores
  [Figure: parallel scaling of GPAW in real-space mode; 561 Au atoms, 6200 electrons; Blue Gene P, Argonne]

Summary
● Parallelization is needed for fully exploiting modern computers
● DFT equations offer several parallelization levels
  ― k-points, spin, basis, electronic states
● The optimum number of CPUs depends on the studied system

Overview of GPAW

GPAW
● Implementation of the projector augmented wave method on uniform real-space grids, atomic orbital basis, and plane waves
  ― density-functional theory and time-dependent DFT
● Open source software licensed under GPL
  ― 20–30 developers in Europe and in the USA
● wiki.fysik.dtu.dk/gpaw
● J. J. Mortensen et al., Phys. Rev. B 71, 035109 (2005)
● J. Enkovaara et al., J. Phys. Condens. Matter 22, 253202 (2010)

GPAW features
● PAW method: accurate description over the whole periodic table
● Total energies, forces, structural optimization, analysis of electronic structure
● Excited states, optical spectra
  ― non-adiabatic electron-ion dynamics
● Wide range of XC potentials (thanks to libxc!)
  ― LDAs, GGAs, meta-GGAs, hybrids, DFT+U, vdW
● Electron transport
● GW approximation, Bethe–Salpeter equation

GPAW features
● Simple but flexible Python scripting interface via the Atomic Simulation Environment
● Runs on a wide variety of computer architectures
● Efficient parallelization, system sizes up to thousands of electrons
● Modular design helps implementing new features

Atomic Simulation Environment
● ASE is a Python package for building atomic structures
  ― structure optimization and molecular dynamics
  ― analysis and visualization
● ASE relies on external software (calculators) which provide total energies, forces, etc.
  ― GPAW, Abinit, Siesta, Vasp, Castep, ...
  [Figure: ASE passes atomic positions to the calculator, which returns energies, forces, wave functions and densities]
● Input files are Python scripts
  ― calculations are run as “python input.py”
  ― simple format, no knowledge of Python required
  ― knowledge of Python enables great flexibility
● Simple graphical user interface
● wiki.fysik.dtu.dk/ase

Setting up the atoms
● Specifying atomic positions directly

  from ase.all import *  # Setup the atomic simulation environment
  from math import sqrt

  d0 = 1.10
  x = d0 / sqrt(3)
  atoms = Atoms('CH4',
                positions=[(0.0, 0.0, 0.0),  # C
                           (x, x, x),        # H1
                           (-x, -x, x),      # H2
                           (-x, x, -x),      # H3
                           (x, -x, -x)])     # H4
  view(atoms)

● Reading atomic positions from a file

  ...
  atoms = read('CH4.xyz')
  view(atoms)

  ― several file formats are supported

Setting up the unit cell
● By default, the simulation cell of an Atoms object has zero boundary conditions and an edge length of 1 Å
● The unit cell can be set when constructing the Atoms

  atoms = Atoms(...,  # positions must now be in absolute coordinates
                cell=(1., 2., 3.),
                pbc=True)  # or pbc=(True, True, True)

  or later on

  atoms = Atoms(...)  # positions in relative coordinates
  atoms.set_cell((2.5, 2.5, 2.5), scale_atoms=True)
  atoms.set_pbc(True)  # or atoms.set_pbc((True, True, True))

  atoms = ...
  atoms.center(vacuum=3.5)  # finite system, 3.5 Å of empty space around the atoms

  atoms = ...
  atoms.set_pbc((False, True, True))  # surface slab
  atoms.center(axis=0, vacuum=3.5)    # 3.5 Å of empty space in the x-direction

Units in ASE
● Length: Å
● Energy: eV
● Easy conversion between units

  from ase.units import Bohr, Hartree
  a = a0 * Bohr      # a0 in a.u., a in Å
  E = E0 * Hartree   # E0 in Hartree, E in eV

  ― also Rydberg, kcal, nm, ...

Pre-defined molecules and structures
● Database of small molecules (G2-1 and G2-2 sets)

  from ase.structure import molecule
  mol = molecule('C6H6')  # coordinates from an MP2 calculation
  mol.center(3.5)         # molecule() returns a unit cell of 1 Å

● Bulk structures of elemental materials

  from ase.lattice import bulk
  atoms = bulk('Si')                   # primitive (2-atom) unit cell with exp. lattice constant
  atoms_conv = bulk('Si', cubic=True)  # cubic 8-atom unit cell
  atoms_my_a = bulk('Si', a=5.4)       # user-specified lattice constant
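These builders combine naturally with the file I/O mentioned earlier ("several file formats are supported"). A minimal sketch; the filename H2O.xyz is just an example:

  from ase.structure import molecule
  from ase.io import read, write

  # Take a molecule from the G2 database, add vacuum around it and
  # store it in a file; the format is recognized from the suffix.
  mol = molecule('H2O')
  mol.center(vacuum=3.5)
  write('H2O.xyz', mol)

  # ... and read it back later
  mol2 = read('H2O.xyz')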
Supercells and surfaces
● Existing Atoms objects can be “repeated” and individual atoms removed

  from ase.lattice import bulk
  atoms = bulk('Si', cubic=True)        # cubic 8-atom unit cell
  supercell = atoms.repeat((4, 4, 4))   # 512-atom supercell
  del supercell[0]                      # remove the first atom, e.g. to create a vacancy

● Utilities for working with surfaces

  from ase.lattice.surface import fcc111, add_adsorbate
  slab = fcc111('Cu', size=(3, 3, 5))   # 5 layers of a 3x3 Cu(111) surface
  # add an O atom 2.5 Å above the surface in the 'bridge' site
  add_adsorbate(slab, 'O', 2.5, position='bridge')

Performing a calculation
● In order to do a calculation, one has to define a calculator object and attach it to the Atoms

  from ase.structure import molecule  # Setup the atomic simulation environment
  from gpaw import GPAW               # Setup GPAW

  atoms = molecule('CH4')
  atoms.center(3.5)
  calc = GPAW()                 # Use default parameters
  atoms.set_calculator(calc)
  atoms.get_potential_energy()  # Calculate the total energy

● Specifying calculator parameters

  ...
  calc = GPAW(h=0.18, nbands=6,   # 6 bands and a grid spacing of 0.18 Å
              kpts=(4, 4, 4),     # 4x4x4 Monkhorst-Pack k-mesh
              xc='PBE',           # PBE exchange-correlation
              txt='out.txt')      # print text output to a file
  ...

● See wiki.fysik.dtu.dk/gpaw/documentation/manual.html for all parameters

Performing a calculation
● Serial calculations and analysis can be carried out with the normal Python interpreter

  [jenkovaa@flamingo ~]$ python input.py

● Parallel calculations use the gpaw-python executable

  #!/bin/bash
  #SBATCH -J gpaw_test
  #SBATCH -t 00:30:00
  #SBATCH -p parallel
  #SBATCH -n 16

  module load gpaw-env/0.10.0
  srun gpaw-python input.py

Structural optimization

  from ase.all import *         # Setup the atomic simulation environment
  from ase.optimize import BFGS
  from gpaw import GPAW         # Setup GPAW

  atoms = ...
  calc = GPAW(...)
  atoms.set_calculator(calc)
  opt = BFGS(atoms, trajectory='file.traj')  # define an optimizer
  opt.run(fmax=0.05)  # optimize until all forces are smaller than 0.05 eV/Å

● See wiki.fysik.dtu.dk/ase/ase/optimize.html for supported optimizers
● The “best” optimizer is case-dependent

Simple Python scripting

  atoms = ...
  calc = GPAW(...)
  atoms.set_calculator(calc)
  # Check convergence with respect to the grid spacing
  for h in [0.35, 0.30, 0.25, 0.20, 0.18]:
      txtfile = 'test_h' + str(h) + '.txt'
      calc.set(h=h, txt=txtfile)
      e = atoms.get_potential_energy()
      print h, e

  import numpy as np
  atoms = ...
  calc = GPAW(...)
  atoms.set_calculator(calc)
  # Lattice constant for different XC functionals
  for xc in ['LDA', 'PBE']:
      for a in np.linspace(3.8, 4.3, 5):
          txtfile = 'test_xc_' + xc + '_a' + str(a) + '.txt'
          atoms.set_cell((a, a, a), scale_atoms=True)
          calc.set(xc=xc, txt=txtfile)
          e = atoms.get_potential_energy()

Saving and restarting
● Saving the full state of a calculation: .gpw files (or .hdf5 files)

  ...
  calc = GPAW(...)
  atoms.set_calculator(calc)
  atoms.get_potential_energy()  # Calculate the total energy
  calc.write('myfile.gpw')      # Atomic positions, densities, calculator parameters
  ...
  calc.write('myfile.gpw', mode='all')   # Save also wave functions (larger files)
  ...
  calc.write('myfile.hdf5', mode='all')  # If GPAW is built with HDF5 support

● Restarting

  from ase.all import *     # Setup the atomic simulation environment
  from gpaw import restart  # Setup GPAW

  atoms, calc = restart('file.gpw')
  e0 = atoms.get_potential_energy()  # no calculation needed
  calc.set(h=0.20)
  e1 = atoms.get_potential_energy()  # calculate the total energy with the new grid
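A restarted calculator can also be used directly for analysis. A minimal sketch, assuming the file myfile.gpw from above was written with mode='all':

  from gpaw import restart

  # Restart from a previously written .gpw file and inspect stored results
  # without repeating the SCF cycle.
  atoms, calc = restart('myfile.gpw')
  print atoms.get_potential_energy()          # total energy, no calculation needed
  print calc.get_eigenvalues(kpt=0, spin=0)   # Kohn-Sham eigenvalues (eV)
  density = calc.get_pseudo_density()         # pseudo electron density on the grid
  print density.shape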
Saving and restarting
● Trajectories: atomic positions, energies, forces

  ...
  calc = GPAW(...)
  atoms.set_calculator(calc)
  traj = PickleTrajectory('file.traj', 'w', atoms)  # define a trajectory file
  for a in np.linspace(3.8, 4.3, 5):
      txtfile = 'test_xc_' + xc + '_a' + str(a) + '.txt'
      atoms.set_cell((a, a, a), scale_atoms=True)
      atoms.get_potential_energy()
      traj.write()  # write the cell and energy to the trajectory file

● Reading atomic positions

  from ase.all import *   # Setup the atomic simulation environment
  from gpaw import GPAW   # Setup GPAW

  atoms = read('file.traj')      # read the last image
  first = read('file.traj', 0)   # first image
  calc = GPAW(...)
  atoms.set_calculator(calc)     # a calculator has to be attached

Simple graphical interface (ase-gui)
● A trajectory can be investigated with the ase-gui tool

  [jenkovaa@flamingo ~]$ ase-gui file.traj

● Investigate how the total energy, forces, bond lengths etc. vary during the simulation
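The same information can also be extracted with a script instead of ase-gui. A minimal sketch, assuming file.traj was written in the lattice-constant loop above; the parabolic fit is only a rough way to locate the energy minimum.

  import numpy as np
  from ase.io.trajectory import PickleTrajectory

  # Collect the stored energies and cell volumes from the trajectory.
  volumes = []
  energies = []
  for atoms in PickleTrajectory('file.traj'):
      volumes.append(atoms.get_volume())
      energies.append(atoms.get_potential_energy())

  # Rough estimate of the equilibrium volume from a parabolic fit
  c2, c1, c0 = np.polyfit(volumes, energies, 2)
  print 'Equilibrium volume: %.2f Angstrom^3' % (-c1 / (2 * c2))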