External Storage: An Introduction

Storage Limits

As of July 7, 2015, we allocate 20GB of storage for each user account.

Users who are employees of or affiliated with CRMDA may be granted access to additional resources. There are several folders under /crmda that may become accessible to users with various types of accounts. Many, but not all, user accounts will be created with read and write access to the folder /crmda/users. Only CRMDA employees will have access to /crmda/projects, /crmda/procedures, and /crmda/archive.

The Basic Idea

Users want to edit files on their own computers and transfer them onto the cluster so they can run them. Programs generate output that has to be downloaded.

There are two primary ways of thinking about the transfer process.

1. A "file transfer program" uses a secure copy protocol (scp) to move files.

2. A remote file server can be "mounted" so it appears "as if" it is attached as a disk drive on the user's computer. (In CRMDA, the remote share "/crmda" appears as drive R:\, for example.)
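The first approach can also be used without a GUI. Here is a minimal sketch of what the scp commands might look like. The account name "username" and the names "analysis.R" and "output" are placeholders I made up, and the script only prints the commands for review rather than running them.

```shell
#!/bin/sh
# Hypothetical values: replace with your own account and file names.
user="username"
host="transfer.hpc.crc.ku.edu"
file="analysis.R"

# Upload one file into the home directory on the cluster.
upload_cmd="scp $file $user@$host:/home/$user/"

# Download a whole results folder; -r copies directories recursively.
download_cmd="scp -r $user@$host:/home/$user/output ./output"

# Print the commands for review; paste one into a terminal to run it.
echo "$upload_cmd"
echo "$download_cmd"
```

scp will ask for your password each time unless you have set up SSH keys.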

The user's home directory (same as "folder"), as is customary in Linux systems, is /home/username. All users, whether they are staff or affiliates of CRMDA, have these home folders.


Directories Worth Knowing

/home/username

/crmda

This is the shared CRMDA storage area. Access to its subfolders depends on the type of account, as described under Storage Limits above.

Within each home folder, there are some directories that are automatically created. One is /home/username/data. That's a separate place that is optimized for accessing large files.

These storage devices exist behind a firewall and the staff has created a rubric through which users can access the storage. This note attempts to explain the many ways in which this can be done. As usual, I suggest you use what you like, and try to remember the other methods are available.

There are two protocols through which these shares can be accessed.

1. Secure Shell protocol (SSH)

The user's files under /home/username are accessible only via the SSH protocol. Below you will find help in both Windows and Linux systems. 


From Windows

Use a GUI file transfer program.

The only fully workable MS Windows program for interacting with ACF is WinSCP (the portable version works fine). At the current time, FileZilla is not able to change ownership of files and folders, but otherwise it works well, and hopefully that capability will be added.


To see my crmda folder, for example, I start WinSCP and enter:

  Host: transfer.hpc.crc.ku.edu
  Username: pauljohn
  Password: *****
  Port: 22

It is important to choose the SCP connection option. That's the one that allows us to change ownership of files.

That will open your home directory on the HPC.

Try it and see!

This will transfer files without formally "mapping" the drive as a device in your system. It is faster, and makes "automatic synchronization" possible with a file transfer program like Unison (which exists on all platforms, as far as I know) or rsync.

A program like rsync has options that allow it to bring all the files on crmda "up to date" without doing a lot of extra work. It is not necessary to copy a whole directory if only one file is changed.
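To see that behavior for yourself, here is a small local experiment, offered as a sketch rather than a recipe. Two temporary folders stand in for the laptop copy and the cluster copy, and the file names are invented for the demonstration. It assumes rsync is installed.

```shell
#!/bin/sh
# Two throwaway folders stand in for the local copy and the cluster copy.
src=$(mktemp -d)
dst=$(mktemp -d)
echo "first draft" > "$src/report.txt"
echo "data" > "$src/data.csv"

# First sync copies everything. The trailing slash on "$src/" means
# "the contents of src", not the folder itself.
rsync -a "$src/" "$dst/"

# Change one file, then sync again. The -i option itemizes what is
# actually transferred, so only the changed file should be listed.
echo "second draft, longer" > "$src/report.txt"
changed=$(rsync -ai "$src/" "$dst/")
echo "$changed"
```

For a real transfer, the destination would be something like username@transfer.hpc.crc.ku.edu:/home/username/project/ (a made-up path). Adding --dry-run shows what would be copied without copying anything.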

The program Unison is nice because it can compare two folders and make a two-way exchange to bring both of them up to date. I use this to keep my course folders synced on several computers.


From Linux

A GUI file transfer program

Free GUIs like FileZilla or gftp will work exactly the same way in Linux as in Windows.

The server that hosts the storage is "transfer.hpc.crc.ku.edu". The reason to prefer transfer.hpc.crc.ku.edu is that it is a restricted purpose access point that is usually not crowded with users.

Use Gnome File Manager Nautilus to view & access files

If you are using a Gnome desktop, you have access to nautilus, the file manager. It is the thing that shows your home folder's files when you choose "Places" in the top menu bar. You can also open nautilus by opening a terminal and typing

$ nautilus

By default, nautilus does not open a "URL bar" where you can type. But it does have a little icon that looks like a pencil and if you click that, a URL bar opens up. Even more recently, they have hidden that, but under the right side icon that looks like 3 dashes there is a menu item called "Enter location" that will make it possible to type an address. Recently, this syntax worked:

ssh://pauljohn@transfer.hpc.crc.ku.edu:22/home/pauljohn

That uses the "secure shell file system access" method, which I note below can also be accessed from the command line as "sshfs".

A Linux desktop system can access the files in a more direct way. I would just as soon mount it as an SSH file system with this command:

$ mkdir testmount

$ sshfs transfer.hpc.crc.ku.edu:/home/pauljohn testmount

That causes contents of /home/pauljohn to appear on my local system as a folder.

I get that exact same kind of access to my user home partition with

$ mkdir myhome

$ sshfs transfer.hpc.crc.ku.edu: myhome

If for some reason this fails, try to mount a share on hpc.crc.ku.edu.  It makes the same storage available, but transfer.hpc.crc.ku.edu is usually faster.

When finished, to disconnect, it is necessary to remember the somewhat more complicated

$ fusermount -u myhome

$ fusermount -u testmount
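If you are not sure whether a folder is still mounted, a small helper can check before disconnecting. This is only a sketch. It assumes the mountpoint utility (part of util-linux on most Linux systems) is installed, and the function name safe_unmount is my own invention.

```shell
#!/bin/sh
# safe_unmount is a hypothetical helper name, not part of sshfs.
safe_unmount() {
    if mountpoint -q "$1"; then
        # The folder is an active mount, so disconnect it.
        fusermount -u "$1"
    else
        echo "$1 is not mounted"
    fi
}

# Example with an ordinary folder that has nothing mounted on it.
demo=$(mktemp -d)
safe_unmount "$demo"
```

With a real mount, "safe_unmount testmount" would run the same fusermount command shown above.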

Control Ownership And Permission

The instructions on Linux systems have a section on file ownership. I'd suggest you start by reviewing that material, in particular Linux File Permissions. That will explain how the permissions look within the HPC Linux system.

At the current time, the MS Windows Explorer (the common file manager) cannot set ownership or permissions on mounted HPC shares. It is necessary to use other means to change them.

Basically, files have "owners", and rights are assigned separately to the owner, to a "group", and to "others". The owner should have rights to read, write, and execute. The group should default to "crmda", but that can be re-assigned. (It is necessary to ask the system administrators to create officially named user groups before they can be used.) If you want other users to be able to read your materials, "others" need "read" permission on files and both "read" and "execute" permission on directories. The "execute" bit is required for directory browsing.

As of Spring, 2011, the only program for MS Windows that can assign ownership and adjust permissions is WinSCP, and that must be started in "SCP" mode.
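From a Linux command line, those permissions can be set with chmod. The following is a sketch on a throwaway folder, so nothing real is modified; the names "shared" and "notes.txt" are made up for the example.

```shell
#!/bin/sh
# Work in a temporary folder so no real files are touched.
work=$(mktemp -d)
mkdir "$work/shared"
echo "results" > "$work/shared/notes.txt"

# Directory: owner rwx, group and others r-x.
# Read plus execute is what allows other users to browse it.
chmod 755 "$work/shared"

# File: owner rw-, group and others r-- (read only).
chmod 644 "$work/shared/notes.txt"

# The first column of the listing shows the resulting permissions.
ls -ld "$work/shared" "$work/shared/notes.txt"
```

The three digits stand for owner, group, and others, in that order, with read counting 4, write 2, and execute 1.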

In order to change owner or group information, it may be most convenient to log into ACF, navigate to the correct location, and run a command like this to change the ownership on a particular file:

$ chown myUserName:myGroupName fileName

That can also be applied to a directory, recursively:

$ chown -R myUserName:myGroupName dirName

These commands change "fileName" or "dirName" so that it is owned by "myUserName" and its group is "myGroupName". The -R option applies the change recursively to everything inside the directory.
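If you want to practice before touching real project files, here is a sketch that runs the recursive form on a temporary folder, using your own user name and primary group (looked up with id), since an ordinary user cannot give files away to someone else. The folder and file names are placeholders.

```shell
#!/bin/sh
# Practice on a throwaway folder.
work=$(mktemp -d)
mkdir "$work/dirName"
echo "x" > "$work/dirName/fileName"

# Look up your own user name and primary group.
me=$(id -un)
grp=$(id -gn)

# Set owner and group on the folder and everything inside it.
chown -R "$me:$grp" "$work/dirName"

# Columns 3 and 4 of the long listing show the owner and the group.
ls -lR "$work/dirName"
```

On the cluster, the group would be "crmda" or another group the administrators have created, rather than your primary group.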

