Gaussian03 On coffee: Newbie Guide

Clare Din and James C. Ianni, Department of Chemistry at University of Pennsylvania, June 2005
din@sas.upenn.edu | jianni@sas.upenn.edu

Prerequisites

This tutorial is intended for users who wish to submit Gaussian jobs on coffee.chem (hereafter referred to simply as "coffee"). It details everything you need to know to run a successful Gaussian job on our Linux high-performance computing cluster. This guide will not teach you the fundamentals or theory of using Gaussian. For that, refer to the Gaussian 03 User Manual or Foresman and Frisch's Exploring Chemistry with Electronic Structure Methods.

Before you submit any Gaussian jobs, you need to know the following:

Initial Setup for Gaussian

Add the following lines to your .bashrc file:

umask 027
export PATH=${PATH}:"./"
export g03root=/data/apps/g03src
export PATH=${PATH}:"${g03root}/g03/linda7.1/intel-linux2.4/bin"
export LINDA_PATH="${g03root}/g03/linda7.1/intel-linux2.4/bin"
 . ${g03root}/g03/bsd/g03.profile

Edit .bash_profile and add the following lines:

umask 027
export PATH=${PATH}:"./"
export g03root=/data/apps/g03src
export PATH=${PATH}:"${g03root}/g03/linda7.1/intel-linux2.4/bin"
export LINDA_PATH="${g03root}/g03/linda7.1/intel-linux2.4/bin"
 . ${g03root}/g03/bsd/g03.profile

Make sure you're in the bash shell (this should be your default shell), then copy the myg03 directory into your home directory. Because myg03 is a directory, cp needs the -r option:

[pdiddy@coffee pdiddy]$ cp -r /4sysadmin/myg03 ~pdiddy/myg03

Go into myg03 and run copyall.ksh, which sets up the initial .bashrc environment on all the compute nodes.

[pdiddy@coffee myg03]$ . $HOME/myg03/bash_profile.g03
[pdiddy@coffee myg03]$ ksh copyall.ksh


I get "permission denied" errors for one or more nodes! What happened? »

If you get "permission denied" errors for any of the nodes, please contact coffeeadmins@chem.upenn.edu. They will check your /data/home/ group directory for any problems.

How many times do I run the initial setup procedure? »

Just once. After that, all you have to do each time you want to run a Gaussian job is source bash_profile.ksh in your myg03 subdirectory (see the Submitting a Gaussian Job on Coffee section).

Moving Your Gaussian Job from a Macintosh or PC to Coffee

You can create your Gaussian job files using a plain text editor, but this might make you go insane. Instead, use Chem3D or GaussView. Your Gaussian job file usually ends with .GJF or .COM, but could be called anything you want so long as the extension is a period followed by three characters. The period between the filename and the extension is mandatory. Example: molecule.GJF

sftp your job files to coffee.

If you create the .GJF file on an MS-DOS or Windows machine, you must run dos2unix on the file to convert it to the proper format. Example:

[pdiddy@coffee pdiddy]$ dos2unix cmpd200.GJF
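
If you are not sure whether a job file came from a Windows machine, you can test it for carriage returns before converting. The following is only a sketch: has_crlf and convert_if_needed are hypothetical helper names, not part of myg03, and the sketch assumes dos2unix is on your PATH.

```shell
# has_crlf FILE: succeed (exit 0) if FILE contains at least one carriage
# return, which means it still has DOS/Windows line endings.
has_crlf() {
    # printf '\r' produces a literal carriage return to search for.
    grep -q "$(printf '\r')" "$1"
}

# convert_if_needed FILE: run dos2unix only when carriage returns are present.
convert_if_needed() {
    if has_crlf "$1"; then
        dos2unix "$1"
    else
        echo "$1 already has Unix line endings"
    fi
}
```

For example, convert_if_needed cmpd200.GJF leaves a file alone if it is already in Unix format.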


Does GaussView run on Coffee? »

No. Run GaussView on your Macintosh. Coffee should not be used for interactive graphical user interfacing, only batch processing. Running graphic-intensive applications on coffee will bog down the user experience for others.

Do we have Gaussian 03 for Windows PCs? »

No, we currently have licenses for Gaussian 03 for Macintosh and Gaussian 03 for Linux. Gaussian 03 for Windows is a separate license for $3,000.

Submitting a Gaussian Job on Coffee

[pdiddy@coffee pdiddy]$ cd myg03
[pdiddy@coffee myg03]$ source bash_profile.ksh

Do not copy files into your myg03 directory! Instead, copy the files to your home directory (or any subdirectory in your home directory). Remember, do not copy files into your myg03 directory!

For molecules under 65 atoms, the heading of your .GJF file must look like:

%MEM=980MB
%NPROCS=1
%NPROCLINDA=8
%RWF=1,1GB,2,1GB,3,1GB,4,1GB,5,1GB,6,1GB,7,1GB
%chk=cmp200.chk
MAXDISK=8GB
# BLYP/6-311++G(2d,2p)/AUTO FOPT FREQ  Test

On the last line, you can change BLYP to whatever theory you wish to use. There are several hundred theories available, and each theory has several hundred basis sets and wavefunctions. You should know this. Read Aeleen Frisch's 300-page Gaussian book.

Change %chk to the name of the molecule (if the molecule is called h2so4.GJF, then the checkpoint file should be called h2so4.chk). Checkpoint files can be huge. If you don't need a checkpoint file, delete this line.
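
A mismatched checkpoint name is easy to miss, so a quick check before submitting can help. This is a minimal sketch, assuming the job file has Unix line endings and a single %chk line; chk_name and chk_matches_job are hypothetical helper names, not part of myg03.

```shell
# chk_name FILE: print the checkpoint file name from the %chk line, if any.
chk_name() {
    # Match "%chk=" in any letter case and print what follows the "=".
    sed -n 's/^%[Cc][Hh][Kk]=//p' "$1" | head -n 1
}

# chk_matches_job FILE: succeed if %chk is <job name>.chk,
# e.g. h2so4.GJF should contain %chk=h2so4.chk.
chk_matches_job() {
    g03_job=$(basename "$1")
    g03_job=${g03_job%.*}        # strip the extension: h2so4.GJF -> h2so4
    [ "$(chk_name "$1")" = "$g03_job.chk" ]
}
```

For example: chk_matches_job h2so4.GJF || echo "check the %chk line"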

Leave %NPROCS=1. Anything else and Gaussian won't run efficiently. %NPROCLINDA is the number of processors you want to run your job on. If you change this, then you must also change the following line in qsubg03_replace.ksh:

#PBS -l nodes=8:ppn=1

Do not change ppn! Change the nodes value only. Example:

#PBS -l nodes=4:ppn=1
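
Since the two values have to agree, you may want to compare them before submitting. The sketch below assumes the job file contains a %NPROCLINDA=N line and the PBS script contains a line of the form "#PBS -l nodes=N:ppn=1", as shown above. The function names are hypothetical and are not part of myg03.

```shell
# linda_workers JOBFILE: print the number given on the %NPROCLINDA line.
linda_workers() {
    sed -n 's/^%[Nn][Pp][Rr][Oo][Cc][Ll][Ii][Nn][Dd][Aa]=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# pbs_nodes SCRIPT: print the nodes value from the "#PBS -l nodes=N:ppn=1" line.
pbs_nodes() {
    sed -n 's/^#PBS -l nodes=\([0-9][0-9]*\):ppn=.*/\1/p' "$1" | head -n 1
}

# nodes_match JOBFILE SCRIPT: succeed only if both values exist and are equal.
nodes_match() {
    [ -n "$(linda_workers "$1")" ] &&
        [ "$(linda_workers "$1")" = "$(pbs_nodes "$2")" ]
}
```

For example: nodes_match cmpd200.GJF $HOME/myg03/qsubg03_replace.ksh || echo "NPROCLINDA and nodes differ"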

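The default requested time limit is 96 hours, or 4 days. You may need to change this, but you should probably run your job with this value first. Note that the script requests it through the PBS cput resource, which limits CPU time rather than wall-clock time. The line to change, if needed, is: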

#PBS -l cput=096:00:00

If you have more than 65 atoms, add this to your Gaussian theory line: INT=FMMNAtoms=254, like so:

%MEM=940MB
%NProcs=1
%RWF=1,1GB,2,1GB,3,1GB,4,1GB,5,1GB,6,1GB,7,1GB,8,1GB,9,1GB,A,1GB
%NPROCLINDA=2
MAXDISK=6GB
#P BLYP/6-311++G(2d,2p)/AUTO FOPT FREQ POP=FULL INT=FMMNAtoms=254 Test

For MP2 jobs, the %RWF line splits your Gaussian swap space into several segments (one per file listed), which are deleted after your job finishes. If this line is left out, most MP2 jobs will run out of space and fail.

If you do not have a checkpoint file, use:

[pdiddy@coffee pdiddy]$ ksh $HOME/myg03/psubby2.ksh cmpd200.GJF

If you do have a checkpoint file (for example, you want to continue a job that was checkpointed earlier), use:

[pdiddy@coffee pdiddy]$ ksh $HOME/myg03/psubby.ksh cmpd200.GJF

You will be asked: "Is there a checkpoint to copy?" Say "y" for yes. psubby will try to look for a related file. You can highlight your checkpoint file or type its name in.
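
If you often forget which of the two scripts to use, the choice can be sketched as a small helper. This assumes the checkpoint file sits next to the job file and is named after it (cmpd200.GJF and cmpd200.chk); pick_submit_script is a hypothetical name, not part of myg03, and it only prints the script name rather than submitting anything.

```shell
# pick_submit_script JOBFILE: print psubby.ksh if a matching checkpoint
# file exists next to the job file, otherwise print psubby2.ksh.
pick_submit_script() {
    g03_chk="${1%.*}.chk"        # cmpd200.GJF -> cmpd200.chk
    if [ -f "$g03_chk" ]; then
        echo "psubby.ksh"
    else
        echo "psubby2.ksh"
    fi
}
```

For example: ksh $HOME/myg03/$(pick_submit_script cmpd200.GJF) cmpd200.GJF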

Ignore qwhatsrunning03.txt and qwhatsrunning.txt.

After you run one of these commands, a directory will be created with the name of your molecule (e.g. cmpd200). Inside the molecule directory is a .tar.gz file (e.g. cmpd200.tar.gz) which gets copied over to the compute nodes assigned to your job. The job runs on all the compute nodes.

When the job finishes or terminates because of lack of time, all of the associated Gaussian files are placed in a .tar file. You must untar this file to obtain your Gaussian log file and any necessary checkpoint files. The tar file will look something like:

G03cmpd200.ksh_14359.coffee_out.tar

The cmpd200 is, of course, the name of your molecule. The number (14359 here) is the job ID that the scheduler assigned to your job. To extract the contents of this tar file, use:

[pdiddy@coffee cmpd200]$ tar xvf *tar

If you run out of time, Gaussian will checkpoint your job (whether you use psubby.ksh or psubby2.ksh).

The log file, in plain text format, will look like cmpd200.log. This file contains your results.

The checkpoint file, in binary format, will look like cmpd200.chk. You can't read this, but you can use mformchk.ksh to convert this file into a formatted checkpoint file that Chem3D or GaussView can read. Example:

[pdiddy@coffee cmpd200]$ ksh mformchk.ksh cmpd200.chk


Untarring the .tar file gives me files that exist in myg03! What happened? »

This is a normal result. Among the myg03 files are the log file and the checkpoint file.

My log file is really long. Where are the results? »

It depends on the theory and basis set you are running. You will need to sift through the contents of the log file for the results. More on that in the section, What Does a Successful Gaussian Job Look Like?

What Does a Successful Gaussian Job Look Like?

Successful Gaussian jobs have the following characteristics: in the log file, at the very end, you should see:

LIFE IS A CONTINUAL (some quote, could be long)
Job cpu time: 0 days 0 hours 21 minutes 29.0 seconds.
File lengths (MBytes): RWF= 990 Int = 0 D2E = 0 Chk = 258 Scr =1
Normal termination of Gaussian 03 at Wed Jun 1 2005.
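
Rather than paging to the end of a long log file, you can test its last few lines directly. This is a minimal sketch, assuming the termination message appears within the last five lines as in the sample above; terminated_normally is a hypothetical helper name.

```shell
# terminated_normally LOG: succeed if the last five lines of LOG contain
# the "Normal termination of Gaussian" message.
terminated_normally() {
    tail -n 5 "$1" | grep -q 'Normal termination of Gaussian'
}
```

For example: terminated_normally cmpd200.log && echo "job finished normally"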

Where are the results?

It depends. A typical log file is laid out like this:

header
citations
the original job file you submitted (a reparsed version of .GJF which contains the converging properties of the molecule)
results of calculations
footer

Most users (99% of users) will do geometry optimizations. Gaussian is finished when you see the following message in the log file:

Optimization completed.
-- Stationary point found.

Grep your log file for this.

For the subsequent frequency analysis, which you will use to verify your stationary point, a successfully optimized geometry is denoted by NImag=0 (number of imaginary frequencies equals zero). Grep for this. As a Gaussian user, you should be familiar with this.
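
Both checks can be sketched as shell functions. One assumption to note: the summary block near the end of a log file may be wrapped across lines, so the sketch removes spaces and newlines before searching for NImag in case the text is split. The function names are hypothetical, and grep -o is assumed to be available (it is in GNU grep).

```shell
# optimization_converged LOG: succeed if the stationary-point message is present.
optimization_converged() {
    grep -q 'Stationary point found' "$1"
}

# nimag LOG: print the last NImag value found in LOG.
# Spaces and newlines are deleted first so a wrapped "NImag=0" still matches.
nimag() {
    tr -d ' \n' < "$1" | grep -o 'NImag=[0-9][0-9]*' | tail -n 1 | cut -d= -f2
}

# is_minimum LOG: succeed if the optimization converged and NImag is 0.
is_minimum() {
    optimization_converged "$1" && [ "$(nimag "$1")" = "0" ]
}
```

For example: is_minimum cmpd200.log || echo "not a verified minimum"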

I see the following error in my Gaussian log file. What happened?

died without ever signing in
/data/apps/g03src/g03/linda-exe/l302.exel: error while loading shared libraries:
 util.so: cannot open shared object file: No such file or directory
subprocess pid = 16104 has exited. status = 0x0000, id = 0, state = 13. command
was /data/apps/g03src/g03/linda7.1/intel-linux2.4/bin/linda_rsh coffeecompute13
-r rsh /data/apps/g03src/g03/linda-exe/l302.exel 128450560 cmp200.chk 1 /data/ho
me/staff/pdiddy/14610.coffee/Gau-16099.int 0 1.rwf,134217728,2.rwf,134217728,3.r
wf,134217728,4.rwf,134217728,5.rwf,134217728,6.rwf,134217728,7.rwf,134217728 1 /
data/home/staff/pdiddy/14610.coffee/Gau-16099.d2e 0 /data/home/staff/pdiddy/1461
0.coffee/Gau-16099.scr 0 /data/home/staff/pdiddy/14610.coffee/Gau-16098.inp 0 ju
nk.out 0 +LARGS 0 coffeecompute13 10.12.0.11 46381 7 1 /data/home/staff/pdiddy/1
4610.coffee
died without ever signing in
Sign in timed out after 0 worker connections.
Did not reach minimum (7), shutting down.

Please review the Initial Setup for Gaussian section. Perhaps you did not add the required lines to your .bash_profile.
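
To see which setup lines are missing, you could compare a startup file against the list from the Initial Setup for Gaussian section. The following is a sketch only: missing_g03_lines is a hypothetical helper name, and it does a plain substring search, so a line that is present but commented out will still count as found.

```shell
# missing_g03_lines FILE: print each required setup line that FILE lacks.
# Prints nothing if all six lines are present.
missing_g03_lines() {
    while IFS= read -r g03_line; do
        # -F treats the line as a fixed string; -- ends option parsing.
        grep -qF -- "$g03_line" "$1" || echo "missing: $g03_line"
    done <<'EOF'
umask 027
export PATH=${PATH}:"./"
export g03root=/data/apps/g03src
export PATH=${PATH}:"${g03root}/g03/linda7.1/intel-linux2.4/bin"
export LINDA_PATH="${g03root}/g03/linda7.1/intel-linux2.4/bin"
. ${g03root}/g03/bsd/g03.profile
EOF
}
```

For example, run missing_g03_lines ~/.bash_profile and then missing_g03_lines ~/.bashrc.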