Batch Jobs: Long-Running Computations
If a program has to run for a few hours or more, it should be prepared as a batch job and submitted to a cluster queue. This is the only feasible, efficient way for a relatively large number of users on campus to share a large computing resource like the ACF cluster.
Here is the gist of it:
The user needs to prepare the long-running program (say, a script written in R, Mplus, Stata, or SAS) and a "submission script". The submission script is the program that we use to ask the cluster scheduler to find some available compute nodes and send the job to those nodes. The simplest kind of chore is to launch a single, long-running job. Some jobs, however, are more interesting because they are parallel, meaning they divide up their work among several compute nodes and collect the results when they are finished. Parallel computing is the essence of high performance computing. We use it to run computer simulations or do massively parallel computations.
The big jobs we have been running fall into two groups.
- Lots of component jobs that run separately can be dispatched across many compute nodes by separate scripts. A simulation exercise may require thousands of repetitions, but they are separate from each other. We may write a shell script that creates hundreds or thousands of separate programs and program submission scripts. A job that can be split into many completely separate parts is said to be embarrassingly parallel. It's embarrassing because it is so easy.
- A job is truly parallel (that is, not embarrassing) if there is a main program that has computations done on several "threads." It assigns separate calculations to many compute nodes or cores and these threads in some sense need to communicate with each other. This kind of program is more difficult to prepare because one has to be cautious about making sure the different nodes are aware of what they ought to do, but it is also the most rewarding kind. If a master program is used to initiate all of the separate pieces, the results may be more believable to some computer scientists.
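The first kind of job can be sketched as a short generator script. This is a minimal illustration, not one of the CRMDA examples: the names NREPS, OUTDIR, sim-N.R, and sub-N.sh are hypothetical, the R code is a stand-in for a real simulation, and the `cd "$PBS_O_WORKDIR"` line assumes a Torque/PBS resource manager that sets that variable.

```shell
#!/bin/sh
# Sketch: write one R program and one submission script per repetition.
# NREPS, OUTDIR, and all file names are hypothetical choices.
NREPS=${NREPS:-5}
OUTDIR=${OUTDIR:-sim-jobs}
mkdir -p "$OUTDIR"

i=1
while [ "$i" -le "$NREPS" ]; do
    # Each R program differs only in its random seed and output file.
    cat > "$OUTDIR/sim-$i.R" <<EOF
set.seed($i)
result <- mean(rnorm(1000))
saveRDS(result, file = "result-$i.rds")
EOF

    # Matching submission script; the scheduler reads the #MSUB lines.
    cat > "$OUTDIR/sub-$i.sh" <<EOF
#!/bin/sh
#MSUB -N sim$i
#MSUB -l nodes=1:ppn=1
# Assumes Torque/PBS sets PBS_O_WORKDIR to the submission directory.
cd "\$PBS_O_WORKDIR"
R --vanilla -f sim-$i.R
EOF
    i=$((i + 1))
done

echo "Wrote $NREPS program/submission pairs in $OUTDIR"
```

Each generated submission script could then be handed to the scheduler one at a time, for example by running msub on every sub-N.sh file from inside the output directory.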
Two Vital Elements
- A submission script
- A program to be submitted by the submission script.
The CRMDA keeps a collection of working submission and program pairs, which can be viewed at https://gitlab.crmda.ku.edu/crmda/hpcexample.
Here is an example submission script, which is found in example 50 in our collection. This one is intended to submit just one long-running R program.
The symbol "#MSUB" marks a declaration that the scheduler is supposed to notice. The job's name is "Rsimple"; that is how we can spot it while it is running. This is a one-core job that uses only one processor, so we request exactly that amount. The -M argument is your email address, and -m "bea" means to email you when the job begins (b), when it ends successfully (e), or if it fails or aborts (a).
|sub-serial.sh: The Submission for HPC Example 50|
R --vanilla -f r-serial.R
As one can see, there is a "boilerplate-ish" feeling in this script; about the only thing the user would worry about is the walltime allowed. If we choose a number that is too small, the job will be canceled by the scheduler before it is done. If we ask for a lot of time, the scheduler may make us wait until the cluster is less full of other jobs.
There is a separate file, "r-serial.R", in the same directory as the submission script.
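For readers who want to see how the declarations described above might fit together, here is a hedged sketch that writes a similar script to a file named sub-serial-sketch.sh. It is not a copy of the real sub-serial.sh in Example 50: the walltime value, the email address, the file name, and the `cd "$PBS_O_WORKDIR"` line (which assumes a Torque/PBS resource manager) are all placeholders chosen for illustration.

```shell
# Sketch only: write a submission script resembling the one described above.
# The walltime, email address, and file name are placeholders, not the
# contents of the real sub-serial.sh.
cat > sub-serial-sketch.sh <<'EOF'
#!/bin/sh
#MSUB -N Rsimple
#MSUB -l nodes=1:ppn=1
#MSUB -l walltime=02:00:00
#MSUB -M your_name@example.edu
#MSUB -m bea

# Start in the directory the job was submitted from
# (assumes Torque/PBS, which sets PBS_O_WORKDIR).
cd "$PBS_O_WORKDIR"

R --vanilla -f r-serial.R
EOF
```

The walltime line is the one most users would edit. Please check the real Example 50 for the settings that are known to work on ACF.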
Submit A Job
To submit the batch job, run this command:
$ msub sub-serial.sh
7499366.fusion
The second line is the scheduler's reply: 7499366 is the job number, and fusion is the system where the scheduler runs.
The job runs in the "background". While it runs, we can log off ACF entirely and it will keep going.
When the job finishes, it creates two files:
1. Output file: Rsimple.o7499366
2. Error file: Rsimple.e7499366
If everything went well, the error file might be empty, or it might have a harmless comment or warning. Of course, as is usually the case with R, we might have asked the program to create some graphics or data files, and they should be available as well.
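Checking those two files can be scripted. The function below is a small sketch; check_job_files is a hypothetical helper (it is not part of msub or the scheduler), and it assumes the files are named with the job name, then ".o" or ".e", then the job number, in the current directory.

```shell
# Sketch: report on the output and error files of a finished job.
# check_job_files is a hypothetical helper, not a scheduler command.
check_job_files() {
    name=$1
    jobid=${2%%.*}      # strip the host part: "7499366.fusion" -> "7499366"
    out="$name.o$jobid"
    err="$name.e$jobid"

    if [ ! -f "$out" ]; then
        echo "$out not found; the job may still be queued or running"
        return 1
    fi
    if [ -s "$err" ]; then
        # -s is true only when the file exists and is not empty.
        echo "$err is not empty; review it:"
        cat "$err"
    else
        echo "$err is empty or missing; no errors were reported"
    fi
}
```

After defining the function in a shell session, one would call it with the job name and the ID that msub printed, for example `check_job_files Rsimple 7499366.fusion`.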
At the current time, we do not believe that most of the other msub options will be relevant for people who create interactive sessions. But users who are curious can always read the helpful information in the man page:
$ man msub
qstat, qdel: Check and Delete Batch Jobs
Did the job run yet? Is somebody else running too many jobs and clogging up the queue?
Check cluster status with qstat
To check the status of the job, we run the command "qstat". While Rsimple is waiting to run, we see the following.
| Job ID | Name | User | Time Use | S | Queue |
The column header S means "Status". The capital Q for the job Rsimple means it is waiting to run. It is waiting to run in the queue named default.
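When the queue is long, it can help to count jobs by their status letter. The sketch below is illustrative: count_by_status is a hypothetical helper, and it assumes the listing has the six columns named above, preceded by two header lines (the column names and a row of dashes), with the status in the second-to-last column.

```shell
# Sketch: summarize a qstat listing by the S (status) column.
# count_by_status is a hypothetical helper that reads the listing on
# standard input. It assumes two header lines and at least six columns.
count_by_status() {
    awk 'NR > 2 && NF >= 6 { count[$(NF - 1)]++ }
         END { for (s in count) print s, count[s] }' | sort
}
```

It would be used as `qstat | count_by_status`, which prints one line per status letter followed by the number of jobs in that state.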
Remove requests with qdel
If you decide you need to kill a job, run "qdel" with the job number.
$ qdel 7499370
That should terminate Rsimple.
To delete several jobs, you can use just one command, such as
$ qdel 710 711 712 713 714
That becomes tedious if you need to remove hundreds of jobs you piled onto the queue by mistake.
We asked if there is a way to speed up the removal of a lot of jobs. The ITTC support staff offered a helpful answer:
for i in $(seq 1 1000); do qdel $i; done
That runs qdel on every job number from 1 to 1000. Change the two numbers to match the range of job numbers you want to remove.
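The same loop can be wrapped in a function with a "dry run" switch, so the commands can be inspected before anything is deleted. This is a sketch under stated assumptions: qdel_range and the DRY_RUN variable are hypothetical names introduced here, not scheduler features.

```shell
# Sketch: run qdel over a range of job numbers, with an optional dry run.
# qdel_range and DRY_RUN are hypothetical names, not part of the scheduler.
qdel_range() {
    first=$1
    last=$2
    for i in $(seq "$first" "$last"); do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "qdel $i"      # only show what would be run
        else
            qdel "$i"
        fi
    done
}
```

For example, `DRY_RUN=1 qdel_range 710 714` prints the five qdel commands without running them, and `qdel_range 710 714` actually runs them. If the installation provides the Torque command qselect, then `qselect -u $USER | xargs qdel` may be another way to remove all of one's own jobs, but we have not tested that on ACF.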