Everybody should start here. Find a terminal program. On Linux or Macintosh systems, terminals are provided (on Mac, look under Utilities). In our experience, these terminals are SSH-aware (compatible with KU security requirements). For Windows, there is a free terminal program called Putty (http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html). Windows Putty offers a graphical menu interface to launch a session. From the command line, a session can be initiated by running
$ ssh <username>@hpc.crc.ku.edu
If the user's login name on the workstation is the same as the login on the cluster, then it is not necessary to supply a user name, as we demonstrate here:
|Windows Putty||Linux or Mac|
When the connection is established, the user receives information about the recent changes to the computing cluster:
When you are finished, end the session by typing
If you are not finished, keep going to the next step.
Configure Your Shell
If this is the first time you've accessed the cluster, you'll need to complete this step. To use the CRMDA software and tools on the cluster, your shell (the program behind the terminal prompt you are dropped at) must be configured to use our environment. That means adding a command to the files the shell reads every time it starts: ".bashrc" and ".bash_profile" in your home directory. Open them in an editor like so:
If you have X11 forwarding enabled (default):
$ gedit ~/.bashrc
$ gedit ~/.bash_profile
From the command line:
$ vim ~/.bashrc
$ vim ~/.bash_profile
To the end of each of the above files, add the following line:
Now, log out and re-log in to the cluster, and you should have access to CRMDA-maintained software and tools.
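Adding the line by hand works, but it is easy to end up with the same line twice after a second visit to this page. Here is a minimal sketch of one way to append a line only when it is not already present. The helper name append_once is hypothetical (it is not a CRMDA-provided tool), and the quoted placeholder stands in for the actual line from the instructions above.

```shell
# append_once FILE LINE
# Append LINE to FILE unless an identical line is already there.
# (append_once is a hypothetical helper, not a CRMDA-provided tool.)
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -x: match whole lines only, -F: treat LINE as a fixed string
    if ! grep -qxF -e "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example use, with a placeholder standing in for the real line:
#   append_once ~/.bashrc       "<line from the instructions above>"
#   append_once ~/.bash_profile "<line from the instructions above>"
```

Running the function a second time with the same arguments leaves the file unchanged.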
Go from the Login Node to a Compute Node
Don't plan to do much work on the login nodes. They don't hold all of the software you want, and the administrators do not want them to be bogged down by calculations. Instead, go from the login node to a compute node. The command to ask for 1 node with 1 processor (core) on that node would be
$ srun --nodes=1 --ntasks-per-node=1 --x11 --pty bash
This asks for graphics X11 forwarding (--x11; more about that later). Memory can also be requested, for example with --mem=2G. To ask for several cores on 1 node (to test a multicore project), run
$ srun --nodes=1 --ntasks-per-node=5 --mem=2G --x11 --pty bash
Interactive jobs can be run on any partition. By default, they go to the nodes owned by the user's "group" (which, in our case is 'crmda'). The default partition is displayed at login in the user message and it can also be retrieved by running
. If you wish to run on a node that is not in your owner group, you will then need to specify the partition. The 'sixhour' partition is a popular alternative:
$ srun --nodes=1 --ntasks-per-node=1 --x11 --partition=sixhour --pty bash
One can request a particular node, "g001", like this (assuming the node g001 is available to the sixhour partition):
$ srun --nodelist=g001 --ntasks-per-node=1 --x11 --partition=sixhour --pty bash
See the CRC documentation (http://crc.ku.edu/using-hpc#Submitting).
getnode: a shortcut
Most of the time, researchers want an interactive session in order to edit files and do small-sized computations. We want them to use only the minimum size of the computing resource. A simple shortcut was created to ask for just a single core on a randomly chosen node.
If you want X11 forwarding, use:
Or, you can disable X11 forwarding like so:
$ getnode -xX or $ getnode --no-x11
The getnode command has more options that allow the user to specify more cores or nodes. For example:
$ getnode -N=2 -n=5 -X
Information on these options can be found in getnode's help output, by running:
$ getnode --help
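To give a sense of what a shortcut of this kind does, here is a minimal sketch of a wrapper that assembles an interactive Slurm request. It is not the source of getnode. The function name node_request is hypothetical, and the sketch assumes interactive sessions are started with Slurm's srun. It prints the command rather than running it, so it is safe to try anywhere.

```shell
# node_request [CORES] [PARTITION]
# Print (not run) an srun command asking for CORES cores on one node.
# node_request is a hypothetical name; the real getnode may work differently.
node_request() {
    cores=${1:-1}        # default: a single core
    partition=${2:-}     # default: let Slurm pick the user's default partition
    cmd="srun --nodes=1 --ntasks-per-node=$cores --x11"
    if [ -n "$partition" ]; then
        cmd="$cmd --partition=$partition"
    fi
    printf '%s\n' "$cmd --pty bash"
}

node_request            # one core, default partition
node_request 5 sixhour  # five cores on the sixhour partition
```

Keeping the default at one core matches the goal of asking for only the minimum computing resource unless the user says otherwise.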
What is X11 Forwarding?
X11 can transfer display windows from the compute node to the user workstation. It does not show a "whole desktop", just individual windows.
This requires that the user workstation have an X11 display server. Linux workstations generally include an X11 server, while Mac and Windows systems do not. Installing an X11 server on Mac is fairly easy (see http://crmda.ku.edu/mac-admin-tips). On Windows, this is more difficult. If we have the networking setup working right, you can log in on hpc.crc.ku.edu, and from there you go to a compute node ("getnode -X"), and then, magically, the programs you launch on the compute node are displayed as windows on your workstation.
For more about X11, see Graphical Programs on an XServer. It includes some movies that demonstrate the use of an X11 server on an MS Windows system.
On an X11 enabled workstation, a user will open a terminal, and then launch an SSH session with the -X flag:
$ ssh -X <username>@hpc.crc.ku.edu
Mac users: $ ssh -Y <username>@hpc.crc.ku.edu
This enables the forwarding of the X11 display from hpc onto the user's computer. It is something of a "tunnel". Once you log into hpc.crc.ku.edu, then you can open a session in a compute node, and then the programs you start there will display on your workstation.
Users log in at hpc.crc.ku.edu, and then run:
and then they are "in" a compute node where they can launch programs that will display their windows on the user's computer.
After reaching a compute node, any GUI program that is launched will be "forwarded" back to the user's workstation. If the editor Emacs is available, for example, run this:
$ emacs &
The & at the end of the command "frees the terminal" to run more commands.
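If a GUI program refuses to start, a common cause is that X11 forwarding did not make it all the way to the current session. One quick check, sketched here under the assumption that the session sets the standard DISPLAY variable when forwarding is active (as SSH does), is to look at that variable before launching anything. The function name x11_ready is hypothetical.

```shell
# x11_ready: succeed only if the DISPLAY variable is set and non-empty.
# (x11_ready is a hypothetical helper name.)
x11_ready() {
    [ -n "${DISPLAY:-}" ]
}

if x11_ready; then
    echo "X11 forwarding looks active (DISPLAY=$DISPLAY)"
else
    echo "DISPLAY is not set; reconnect with ssh -X (or ssh -Y on Mac)"
fi
```

A non-empty DISPLAY does not guarantee that the X11 server on the workstation is running, but an empty one means no window can be forwarded.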
Installing X11 On Mac and Windows
Mac users need to install Xquartz. We have some Mac-admin-tips in these pages which demonstrate. (We are following instructions on the Apple website, which points at http://xquartz.macosforge.org).
We have exerted quite a bit of effort testing out various X11 Display Servers that can be added on a Windows computer. In the end, this is still a difficult thing and most Windows users need quite a bit of practice to become comfortable with it. In case you want to try, the program we have used is called
After Connecting, what Next?
The cluster nodes don't have much software. Users have to ask for the software they intend to use. This is done through a "module" framework. It is necessary for users to customize their interactive sessions to access the CRMDA modules.
To access CRMDA modules, tell your shell to load the base set of software we recommend automatically when you login by adding the following line to your "$HOME/.bashrc" AND "$HOME/.bash_profile" files:
By default, this will load a default set of software that includes Emacs, OpenMPI, Java, R, Mplus, SAS, and Anaconda. If you need custom versions of the software we recommend, or if you need additional software, you can add these customizations below the "crmda_env" line. For example, if you need to use STATA:
module load stata
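Because ".bashrc" is also read in situations where the module command may not be defined, some users prefer to guard their extra module lines. The following is a minimal sketch of such a guard. The helper name try_module is hypothetical, and whether the guard is needed at all on this cluster is an assumption.

```shell
# try_module NAME
# Load module NAME if the "module" command exists in this shell;
# otherwise print a note on stderr and return non-zero.
# (try_module is a hypothetical helper name.)
try_module() {
    if command -v module >/dev/null 2>&1; then
        module load "$1"
    else
        echo "module command not available; skipped $1" >&2
        return 1
    fi
}

# Example use below the environment line in ~/.bashrc:
#   try_module stata
```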
Is that Everything?
The final bit of information is about where users are supposed to keep files. This is a new policy in the new CRC cluster.
User home folders, the default login folders, are not the correct place to store programs and data. Instead, each user is afforded a directory referred to as $WORK, a shell environment variable that points to the user's individual folder in the area set aside for the user's group. For the CRMDA, the $WORK folder is "/panfs/pfs.local/work/crmda/<username>" and users who want to go to their folder in $WORK can simply run
$ cd $WORK
Observe that when I first log in, my working directory is printed as "/home/pauljohn":
$ pwd
/home/pauljohn
But if I change to the $WORK directory, I see
$ cd $WORK
$ pwd
/panfs/pfs.local/work/crmda/pauljohn
Some users ask, "How can it be that I log in on different compute nodes but I see all the same files in my HOME or the WORK directory?" This is the magic of NFS, or the "network file system." All compute nodes have their own hard disk storage (note that "/tmp" is within each node), but folders like /home, $WORK, and $SCRATCH are not "inside" the compute node; they are on a shared disk drive.
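To see the difference between node-local and shared storage for yourself, you can ask which file system holds each directory. The sketch below uses the standard df command; the helper name fs_of is hypothetical, and the exact device or export names it prints will depend on how the cluster is set up.

```shell
# fs_of DIR: print the file system (local device or network export) holding DIR.
# (fs_of is a hypothetical helper name.)
fs_of() {
    # df -P prints one header line, then one line for the file system
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# Report on each directory that is defined and exists in this session.
for dir in "${HOME:-}" "${WORK:-}" /tmp; do
    if [ -n "$dir" ] && [ -d "$dir" ]; then
        printf '%s is on %s\n' "$dir" "$(fs_of "$dir")"
    fi
done
```

On a compute node, a shared folder would be expected to report the same network source on every node, while /tmp would report something local to that node.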