Effects of Basis Set Size: The Molecular Case

3.1. Effects of Basis Set Size: The Molecular Case#

In this exercise you will compute the equilibrium energy of H\(_2\) at the Hartree-Fock level using different basis sets, i.e. the 6-31G and 6-311G sets (with which you are familiar from last week), as well as Dunning’s correlation-consistent aug-cc-pVTZ basis. Correlation-consistent basis sets were defined such that systematic improvement over total energies and molecular properties is possible.

First again we again import the required modules:

import psi4
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import sys 

sys.path.append("..")
from helpers import *

then we set the maximum ressources that can be used

psi4.set_memory('2 GB')
psi4.set_num_threads(4)

next we define our \(H_2\) geometry, setting the H molecules at equilibrium distance of each other, i.e. 0.7414Å. We are using XYZ coordinates for the psi4.geometry, but as you saw last week also Z-matrix geometries can be used here.

h2 = psi4.geometry("""
0 1
H 0.0 0.0 0.0
H 0.0 0.0 0.7414
""")

We can have a look at out molecule using the helper function drawXYZ, to double-check everything looks fine

drawXYZ(h2)

Edit the code cell below to compute the energy (using psi4.energy() funciton) for the different basis sets selected

basissets = ['6-31G', '6-311G', 'aug-cc-pvtz'] # these are the basis sets we are going to use

psi4.set_options({'reference':'UHF'}) # We are using Unrestriced Hartee Fock

for basis in basissets:
    psi4.core.set_output_file(f'{basis}-output.log', False) # save in seperate log files
    E = None    # EDIT HERE
                # we want to perform a single point energy calculation once per basis set

    print(basis, E)

Exercise 1

Include a table of the the calculated energies using the three different basis sets

Lets have a look at one of the outputs (e.g for the 6-31G basis). We can use the shell command !grep -options SEARCHTERM FILENAME for this. (Remember: The ! denotes that this is not a python command but a bash program.)

As the SCF procedure is iterative we can search for the word iter in the output. By including -A 1 -B 3 we get 1 line after the last find and 3 lines before the first find, which contain some helpful context to make sense of the results.

!grep -A 1 -B 3 iter 6-31G-output.log

Exercise 2

What is the meaning of the Delta E column? What is DIIS? (Hint: to get more details, it may help to check the HF page on psi4 manual)

Next, let’s plot the SCF convergence for the three tested basis sets. You can use a little helper function called psi4_read_scf which will read in the table shown above and return it to you as pandas dataframe that you can easily plot.

# Read in scf table like so:

# Syntax of psi4_read_scf function:
nameOfDataFrame = psi4_read_scf('nameOfOutputFile.log')

# to view the table you can simply type the name of the DataFrame, i.e.:
nameOfDataFrame

For the plotting we will use matplotlib. We will also include in the plot the exact value. As explained in the introduction, this was obtained with a post-Hartree Fock calculation, and its value for the H\(_2\) molecule is \(E_{exact} = -1.174474{}\) a.u.

In particular, we suggest you to produce two plots, one with the full set of points, and the other one excluding the first SCF iteration. This can help you to appreciate the differences between the three basis sets. In the cell code below, we already set both subplots. Edit the code in order to plot the results for the three basis sets.

# Subplot 1: full data and exact value reference
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14,7)) # create figure with two subplots
fig.tight_layout(pad=7.0) # add spacing between supblots 

# we use the table index and the Total Energy column for plotting
# repeat the following command once per basis set to have all on the same plot
ax1.plot(nameOfDataFrame.index, nameOfDataFrame['Total Energy'], '.-', label='label') 

ax1.set_xlabel('SCF Iteration') # set axis labels and title
ax1.set_ylabel('E [a.u]')
ax1.set_title('SCF convergence')

ax1.axhline(-1.174474, label='Exact value', linestyle='--', color='gray') # horizontal line for the exact energy

ax1.legend() # show legend

# Subplot 2: excluding first SCF point to focus on the difference along SCF cycles
# you can exclud the first point of an array just by [1:]. This will select all point from 1 (so excluding 0) to the end
# example: array DATA --> DATA[1:]
# repeat the following command once per basis set to have all on the same plot
ax2.plot(nameOfDataFrame.index[1:], nameOfDataFrame['Total Energy'][1:], '.-', label='label') 

ax2.set_xlabel('SCF Iteration') # set axis labels
ax2.set_ylabel('E [a.u]')
ax2.set_title('SCF convergence - restricted range')

ax2.legend() # show legend
plt.show()

Exercise 3

How does the basis set size influence the SCF convergence behaviour? Include in your report the plot of the energy convergence along SCF iterations.