Creating a PBS script
A PBS script is a text file that contains the information that PBS needs to set up the job, followed by the commands to be executed. In the PBS script, the lines beginning with “#PBS” are PBS directives that specify the resource requirements and various other attributes of the job. Note that the directives must come first in the script as any directives after the first executable statement are ignored. Since the R program we will be running doesn't require any interaction, we want to submit the job to the batch queue. In this example, the job we will be running has the following requirements:
- the job will need 1 node, 10 processors and 64 GB of RAM
- the job will not require more than 48 hours to complete
- the name of the job will be 'myscript', with the output written to myscript.out and errors written to myscript.err
- we want email notifications to be sent to your_email@abc.edu when the job starts, stops, or aborts (substitute your actual email address)
The PBS directives for the above requirements are outlined below, along with a brief explanation of what each one does.

PBS Directive | What it does
#PBS -q batch | Submits the job to the batch queue
#PBS -l nodes=1:ppn=10 | Requests 1 node and 10 processors per node
#PBS -l mem=64gb | Requests 64 GB of RAM
#PBS -l walltime=48:00:00 | Sets the maximum walltime for the job to 48 hours
#PBS -N myscript | Sets the name of the job as displayed by qstat
#PBS -o myscript.out | Sends standard output to myscript.out
#PBS -e myscript.err | Sends standard error to myscript.err
#PBS -j oe | Merges the output and error files; both streams are intermixed and written as standard output
#PBS -m abe | Sends email on job abort, begin, and end
#PBS -M my_email@abc.edu | Specifies the email address to which mail should be sent
#PBS -S /bin/$shell | Sets the shell used to execute your script; if left out, it defaults to your normal login shell. Typical values for the $shell argument are /bin/bash, /bin/tcsh, /bin/csh or /bin/sh
#PBS -V | Exports all environment variables in the qsub command environment to the batch job environment
Once we have specified the PBS
directives in our job submission script, we will want to add the commands to
set up the environment and launch our script. We start out by changing to the
PBS working directory (i.e., the directory from which we will be submitting our
job, which is also the directory where our script is located). After that, we
load any modules our script will need and call the script.
cd $PBS_O_WORKDIR/
module load R/3.5.1
Rscript --vanilla myscript.R
Putting it all Together
Now that we have the basics, we must
put them into a job submission script. In this example, we would save the
following content in a file called myscript.pbs. For your actual job submission
script, you will want to use your favorite text editor to make any necessary
changes to the script (e.g., substitute all of the 'myscript' occurrences with
the actual name of your script, put in your actual email address, etc.) and
save it as your_script_name.pbs. (Note: you can call it anything you want but
it makes it easier to keep track of things if the name of your job submission
script matches the name of the R script that it launches.)
#PBS -S /bin/bash
#PBS -q batch
#PBS -l nodes=1:ppn=10
#PBS -l mem=64gb
#PBS -l walltime=48:00:00
#PBS -N myscript
#PBS -o myscript.out
#PBS -e myscript.err
#PBS -m abe
#PBS -M my_email@umn.edu
cd $PBS_O_WORKDIR/
module load R/3.5.1
Rscript --vanilla myscript.R
Submitting the Job
Now that we have written our PBS
job submission script, we are ready to submit the job to the cluster. To do
that, we use the 'qsub' command and give it the name of the .pbs script. Since
the script contains the PBS directives for everything our job needs, we
don't need to specify any other command line options.
[gaurav@login2 ~]$ qsub myscript.pbs
Checking the Status of Your Job
Once you have submitted your
job, you will want to check the status using the qstat command.
[gaurav@login2 ~]$ qstat -a
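Two other standard PBS commands are often useful at this point: qstat accepts a -u flag to list only one user's jobs, and qdel cancels a job by its job ID. (The username and job ID below are illustrative; use your own username and the ID that qsub printed when you submitted.)

```
[gaurav@login2 ~]$ qstat -u gaurav
[gaurav@login2 ~]$ qdel 12345
```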
Example Gaussian PBS script
An example
single core Gaussian PBS script is below:
#!/bin/bash
#PBS -P PANDORA
#PBS -l select=1:ncpus=1:mem=4gb
#PBS -l walltime=4:00:00

module load gaussian
cd $PBS_O_WORKDIR
export GAUSS_SCRDIR=/scratch/PANDORA/abcd1234/g16_scratch
mkdir $GAUSS_SCRDIR
g16 < input.com > output.log 2>&1
rm -rvf $GAUSS_SCRDIR
Replace the project name (PANDORA) and the example UniKey (abcd1234)
with your own project name and UniKey. The input file in this example is input.com.
You must provide an input file before submitting your job.
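For reference, a minimal input.com might look like the following. This is purely illustrative (a small geometry optimization of water); your route section, title, charge, multiplicity, and geometry will differ. Note that Gaussian input files must end with a blank line.

```
%chk=water.chk
#P HF/6-31G(d) opt

Water geometry optimization

0 1
O    0.000000    0.000000    0.000000
H    0.000000    0.757000    0.587000
H    0.000000   -0.757000    0.587000

```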
Parallel Gaussian jobs
Gaussian can use multiple cores/CPUs. The below example PBS script
automatically sets %nprocshared equal to the number of CPUs requested
in the PBS script. Therefore, do not use the %nprocshared directive
in your Gaussian input file if you set GAUSS_PDEF in your PBS script.
#!/bin/bash
#PBS -P PANDORA
#PBS -l select=1:ncpus=4:mem=4gb
#PBS -l walltime=4:00:00

cd $PBS_O_WORKDIR
module load gaussian
export GAUSS_SCRDIR=/scratch/SASTEST/skol2049/gau_scr
mkdir $GAUSS_SCRDIR
export GAUSS_PDEF=$NCPUS
g16 < blhs.com > blhs.log 2>&1
rm -rvf $GAUSS_SCRDIR
Artemis can only run Gaussian jobs on a single node. Linda workers are not available. Therefore, you can request a maximum of one chunk in the select PBS directive. Artemis’s nodes have a maximum of 24, 32 or 64 cores, depending on the node.
Example PBS script for Python
#PBS -q batch
#PBS -l nodes=1:ppn=10
#PBS -l mem=64gb
#PBS -l walltime=48:00:00
#PBS -N myscript
#PBS -o myscript.out
#PBS -e myscript.err
#PBS -m abe
#PBS -M my_email@umn.edu
cd $PBS_O_WORKDIR/
module load python/conda/3.6
python myscript.py
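As with the R example, the job script assumes myscript.py runs without any interaction. A minimal placeholder for myscript.py is sketched below (purely illustrative; substitute your own analysis code).

```python
# myscript.py -- illustrative placeholder for a non-interactive analysis script.
# Anything printed to stdout is captured in myscript.out via the #PBS -o directive.

def main():
    # Stand-in "analysis": sum the integers 1 through 100
    total = sum(range(1, 101))
    print(f"Sum of 1..100 = {total}")

if __name__ == "__main__":
    main()
```

Because the job runs unattended, write all results to stdout or to files rather than expecting any prompts or user input.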