Talapas Essentials: The Structure of Talapas
RACS
The Talapas (from Chinook for coyote) cluster is managed by Research Advanced Computing Services or RACS. RACS administers the hardware, software, PIRGS, and other key services for Talapas. Troubleshooting Talapas? Need to request software for your team? The best way to reach RACS is through their customer portal.
RACS Resources:
- Talapas Knowledge Base/Documentation
- How to Login to Talapas
- Talapas Quick Start Guide
- Talapas Customer Portal
A University Supercomputer
Talapas enables programming tasks that could not be computed without more CPU cores, GPU cores, or memory than are available on consumer hardware. It also accelerates research at UO by offloading repetitive, highly parallel computational jobs from researchers’ devices, allowing scientists to focus on more important tasks.
Talapas is a heterogenous cluster consisting of hundreds of individual computers called nodes. It is made up of login nodes, compute nodes, and private “condo” nodes. These nodes are much more powerful than a personal computer! Each compute node can have up to 128 CPU cores. Some specialized nodes for large memory jobs have up to 4TB of RAM.
Talapas is a living, growing computational ecosystem. New software is added upon request, CPU and GPU hardware is periodically upgraded, and new computing nodes are added as research groups buy special nodes for their needs.
Getting to Talapas: Login Nodes
The four login nodes are shared by hundreds of Talapas users simultaneously. The login nodes share the same filesystem, but having multiple login nodes adds redundancy and fewer points of failure to the Talapas ecosystem. They are intended for loading data, transferring large datasets from the internet or the cloud to the Talapas filesystem, preparing software environments, and connecting to IDEs.Unlike other nodes in the cluster, login nodes are open to connections from the broader internet.
The CPU cores and memory on login nodes are not for doing computational work.
There’s a detailed tutorial for connecting to login nodes here.
What is login.talapas.uoregon.edu
?
Talapas has a load balancer at login.talapas.uoregon.edu
that distributes users as evenly as possible among the four entry or “login” nodes.
If you connect to login.talapas.uoregon.edu
through your terminal, you will be routed to one of the four login nodes – login1, login2, login3, or login4 – based on how many people are currently connected to each node.
If connecting directly to a given login node doesn’t work, try another. For example, try login1 if login2 times out. If you can’t reach any of the login nodes, please open a ticket with RACS.
Compute Nodes
The login and compute nodes share the same filesystem, but all non-trivial work occurs on compute nodes.
The compute nodes on Talapas are grouped into partitions based on what resources they have, how long those resources can be used, and (in the case of condo nodes) which users can access them. Each node
The Talapas Filesystem
Talapas uses a networked filesystem called GPFS to make code, input files, and other crucial pieces of data available across all nodes in the cluster.
All users have 256GB of storage available to them in their home directory at /home/yourDuckID
. No other users have access to your home directory.
You can check your home directory path by using the following Bash command.
echo $HOME
If you have access to Talapas, you are also part of a PIRG. Your research group’s data should live in /projects/PIRG_NAME
. Unless extra storage has been negotiated, PIRG project directories have a maximum of 2TB of storage.
Because the file structure for PIRGs recently changed, some PIRGS may have a slightly different stucture in their /projects/PIRG_NAME
folder.
You can also explore the filesystem and even upload files of up to 10GB in the Talapas Files app.
New PIRGs
Joined Talapas recently? This implementation assumes all files and folders within a PIRG are shared among all members. That means any files stored in /projects/PIRG
are, by default, readable by members of that PIRG.
For example, all members of racs_training
can access files and folders in the /projects/racs_training
directory. This is why everyone was added to a single, temporary PIRG for the purpose of sharing files among all members of the workshop.
Sharing Files in Legacy PIRGs
Older PIRGS were created with the same storage limit but a different permissions structure. This implementation had problems for collaboration among lab members, as labs have members that leave to work at other institutions.
Each user had their own folder within their PIRG with files their labmates couldn’t see at /projects/PIRG/DuckID
While all members could access data shared in /projects/PIRG/shared
Shared data was reserved for the folder /projects/PIRG/shared
, but it wasn’t uncommon for lab members to want to share files and folders from their /projects/PIRG/DuckID
folders with fellow lab members.
I am a PI and have a legacy PIRG. Can you fix our my group’s permissions now that members have left?
Yes, to fix the file structure and permissions within your projects/PIRG
directory, open a ticket with RACS. RACS can help retrofit your PIRG into a flatter file structure with simpler permissions. When your PIRG is modified, you can choose to keep the existing file and permissions structure or have the new file permissions scheme implemented.
Symlinks
If you do an ll
or long-listing command on Talapas, you will see special folders in your home directory that begin with @
.
ll
...
lrwxrwxrwx. 1 root root 26 May 28 2024 library_it -> /projects/library_it/emwin
lrwxrwxrwx. 1 root root 23 Sep 5 16:22 racs_training -> /projects/racs_training
These folders aren’t actually in your home directory. Symlinks or symbolic links are references between different locations in the filesystem. If you cd
into the /home/racs_training
symlink, you will be redirected or linked to /projects/racs_training
.
These pointers are added for your convenience so that you can move files and code from your home directory to your project directory.
Some new PIRGs do not have this symlink in place. Do not worry, you can still access the same folder through /projects/PIRG_NAME
. PIs can request that the symlink be added to their lab members’ project directories.
Which Folder Should I Use for What?
/home/yourDuckID
- code, testing instances, personal work
/projects/yourPIRG
- datasets, project data, code you want to share with other members of your PIRG
Transferring Files to Talapas
There are a variety of ways to transfer files to and from Talapas based on your use case.
- To transfer files to Talapas from your web browser you can use the Talapas file browser. There’s 10GB limit on what you can upload at a time.
- To transfer files from the command line, use the
scp
command. - To transfer datasets to Talapas from the internet, use the
wget http:/<filepath>
command on a login node. Remember, compute nodes are firewalled. - For complex or large-scale transfers, try Globus or FTP tools like Filezilla.
Software: Modules on Talapas
Software on Talapas is controlled through lmod modules.
You don’t need to worry too much lmod’s implementation details to use modules, especially if the software you need is already available in the module catalogue.
To run software from within a Slurm job, you’ll need to load the appopriate modules.
For example, let’s load Python.
module load python3
Tab</kdb> to autocomplete is available to you when searching through the list of modules.
To see the modules you currently have loaded, run module list
.
module list
Currently Loaded Modules:
1) miniconda-t2/20230523
2) python3/3.11.4
Now that Python has been loaded, you can access it from your path at python3
. This version of Python has been added to your PATH through lmod.
which python3
/packages/miniconda-t2/20230523/envs/python-3.11.4/bin/python
python3 --version
Python 3.10.13
To remove a module, use the module unload
command followed by the module name.
module unload python3/3.11.4
Or get rid of ALL modules with module purge
.
module purge
Now, you can see that there are no modules loaded.
module list
Modules and PATH
The lmod system works by modifying your PATH variable.
The PATH variable defines the shell’s search path for executables: the list of directories that the shell looks in for runnable programs when you type in a program name without specifying what directory it is in.
When you type a command, the shell checks each directory in the PATH variable in turn, looking for a program with the requested name in that directory. As soon as it finds a match, it stops searching and runs the program.
For example, loading Python added /packages/miniconda-t2/20230523/envs/python-3.10.13
to my path. This is because miniconda is a dependency of the Python package.
module load python3/3.10.13
echo $PATH
/packages/miniconda-t2/20230523/envs/python-3.10.13/bin:/packages/miniconda-t2/20230523/condabin:/packages/miniconda-t2/20230523/bin:/packages/miniconda-t2/20230523/envs/python-3.11.4/bin:/packages/miniconda-t2/20230523/condabin:/home/emwin/.local/bin:/home/emwin/bin:/gpfs/t2/slurm/apps/current/bin:/gpfs/t2/slurm/apps/current/sbin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/dell/srvadmin/sbin
Talapas also supports compiled languages like C and C++. Compilers like gcc and aocc are available as modules.
R and RStudio
R can be loaded as a module.
module load R/4.4.2
To use RStudio with the Talapas filesystem, load the rstudio/base module and launch the GUI application with the command rstudio
.
module load rstudio/base
rstudio
Browsing Modules
Want a more user friendly list of modules available?
Try module spider [keyword]
. Below, I’ll search for the neuroscience related software, FSL.
module spider fsl
-----------------------------------------------------------------------------------------------------------------
fsl:
-----------------------------------------------------------------------------------------------------------------
Versions:
fsl/5.0.9
fsl/5.0.10
fsl/6.0.1
fsl/6.0.7
fsl/6.0.7.9
Other possible modules matches:
FSL FSLeyes fsleyes fslpy
-----------------------------------------------------------------------------------------------------------------
To find other possible module matches execute:
$ module -r spider '.*fsl.*'
Alternatively, you can use the module avail
command to get a full list without relying on the spider search mechanism.
module avail
---------- /packages/modulefiles/t2/modulefiles/mpi/gcc/13.1.0 --------------
mpich/4.1.1 (L) openmpi/4.1.6
--------------------- /packages/modulefiles/t2/modulefiles ---------------------
AOCL/4.2.0
Geneious
MRIConvert/2.1.0
Mathematica/11.3
Mathematica/12.0 (D)
NonLinLoc/20221102
OpenDX/4.4.4
R/3.4.2-lcni
R/4.3.2
R/4.3.3
R/4.4.2 (D)
RECON/1.08
RFdiffusion1/RFdiffusion1
RepeatMasker/4.0.7racs1
RepeatModeler/1.0.10
RepeatScout/1.0.5
adapterremoval/2.1.7
adapterremoval/2.3.3
You can scroll through the list produced by module avail
using the arrow keys. Press Q to quit.
Reproducibility with Modules
Always use complete module names in your batch jobs and scripts when possible.
For example, module load fsl/6.0.7.9
is preferred to module load fsl
because the default version will change over time. The default version could be out of date.
You should always know which packages and which versions your code relies upon, as it will make it easier for other scientists to reproduce your code.
Talapas Partitions: Where Do I Run My Jobs
Here is a summary of the primary partitions on Talapas so you can decide where to schedule your jobs at a glance.
Partition | Max Job Time | GPUs | Description | CPU Type |
---|---|---|---|---|
compute | 24 hrs | no | default partition, appropriate for most users | AMD |
compute_intel | 24 hrs | no | for software that requires Intel processors, older computers | Intel |
computelong | 2 wks | no | default partition for jobs that take longer than 24 hours | AMD |
computelong_intel | 2 wks | no | default partition for jobs that take longer than 24 hours | Intel |
gpu | 24 hrs | yes | for shorter jobs that requires GPUs | AMD |
gpulong | 2 wks | yes | partition for GPU jobs that take longer than 24 hours | AMD |
interactive | 12 hrs | no | partition for interactive srun jobs, OnDemand apps Talapas Desktop, JupyterLab | AMD |
interactivegpu | 8 hrs | yes | GPU partition for interactive srun jobs, OnDemand apps Talapas Desktop, JupyterLab | AMD |
memory | 24 hrs | no | for memory-intensive jobs that require up to 4TB of RAM | AMD |
memorylong | 2 wks | no | for memory-intensive jobs that require up to 4TB of RAM of a long duration | AMD |
preempt | 1 wk | yes | special “partition” that appropriates nodes in other partitions | Various |
Partition Status: sinfo
Want to know the current status and time limits of all the partitions on Talapas? The command sinfo
displays all available partitions that you can schedule jobs on. (That means condo owners will see a slightly different list.)
sinfo
Each partition has one line for each state in order to list the number of nodes in each of the following states: mix, alloc, and idle.
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
compute up 1-00:00:00 6 drain n[0112-0117]
compute up 1-00:00:00 36 mix n[0111,0118-0135,0180-0196]
compute_intel up 1-00:00:00 20 mix n[0049-0058,0073-0082]
compute_intel up 1-00:00:00 12 alloc n[0083-0090,0092-0093,0105,0107]
compute_intel up 1-00:00:00 19 idle n[0059-0072,0091,0094-0096,0106]
computelong up 14-00:00:0 30 mix n[0119-0134,0136,0180-0192]
computelong_intel up 14-00:00:0 20 mix n[0049-0058,0073-0082]
computelong_intel up 14-00:00:0 12 alloc n[0083-0090,0092-0093,0105,0107]
computelong_intel up 14-00:00:0 19 idle n[0059-0072,0091,0094-0096,0106]
gpu up 1-00:00:00 22 mix n[0149-0150,0152-0160,0162-0169,0171-0172,0301]
gpu up 1-00:00:00 1 alloc n0151
gpulong up 14-00:00:0 17 mix n[0150,0152-0153,0155-0157,0162-0169,0171-0172,0301]
interactive up 12:00:00 2 mix n[0209-0210]
interactive up 12:00:00 9 idle n[0211-0212,0302,0310-0313,0398-0399]
interactivegpu up 8:00:00 1 mix n0161
memory up 1-00:00:00 8 mix n[0142,0372-0374,0376-0379]
memory up 1-00:00:00 6 alloc n[0141,0143-0146,0375]
memory up 1-00:00:00 2 idle n[0147-0148]
memorylong up 14-00:00:0 5 mix n[0142,0372,0374,0376,0378]
memorylong up 14-00:00:0 2 alloc n[0144,0146]
memorylong up 14-00:00:0 1 idle n0148
preempt up 7-00:00:00 6 drain n[0112-0117]
preempt up 7-00:00:00 158 mix n[0037-0046,0049-0058,0073-0082,0109-0111,0118-0136,0142,0149-0150,0152-0169,0171-0175,0180-0189,0191-0197,0209-0210,0221,0224,0230-0242,0244-0247,0262,0265-0270,0301,0303,0314-0316,0336,0349-0351,0363-0365,0372-0374,0376-0380,0385-0388,0390-0396,0997-1000]
preempt up 7-00:00:00 56 alloc n[0083-0090,0092-0093,0105,0107,0141,0143-0146,0151,0201-0204,0223,0225-0226,0229,0248-0249,0254-0261,0263-0264,0317-0326,0346,0348,0359-0362,0375,0389]
preempt up 7-00:00:00 96 idle n[0059-0072,0091,0094-0096,0106,0147-0148,0176-0179,0205-0208,0211-0220,0222,0227-0228,0250-0253,0302,0304-0313,0327-0335,0337-0345,0347,0352-0358,0366-0371,0381-0384,0397-0399]
You can interpret the results from sinfo
as follows. Nodes are grouped by partition and state.
- The AVAIL column represents the status of partition.
- The TIMELIMIT column represents max job time in days.
1-00:00:00
is 24 hours. - The NODES column indicates how many nodes are in the partition have a given state.
- The STATE column lists the node states, ie. are the nodes
alloc
allocated, idle, or in a mixture of idle and allocated states. - The NODELIST column lists the node in each of the possible states within a partition. Each node can be a member of one or more partitions.
To learn more, see the Slurm documentation for sinfo
.
If your PIRG has purchased condo nodes, you will see additional nodes in the list returned by sinfo
.
The preempt
partition is a special partition that allows users to take advantage of additional computational resources in a low-priority queue.
Do not run critical jobs on preempt, as there’s always a risk of having your job cancelled.
On any other partition, your job will run until it either finishes, meets the time limit you requested, or exceeds the resources you requested.
Slurm: The Talapas Scheduler
Slurm is the job scheduling software used on the Talapas. While Talapas has scheduling policies, partitions, and PIRGs that are specific to UO, Slurm is used for job scheduling on high-performance computing clusters around the world.
To schedule jobs on Talapas, you must give Slurm a partition where the job must run and an account (PIRG) associated with the job.
Slurm manages a queue of jobs that determines which node(s) on a partition your job will run.
Scheduling Simple Jobs with Slurm
To practice with Slurm tasks, connect to a Talapas login node. For this exercise, feel free to use the Talapas OnDemand shell.
Batch Scheduling with sbatch
Batch scripts in Slurm are configured through special comments prefixed with #SBATCH
.
All batch jobs should have #!/bin/bash
on the first line followed by #SBATCH
options in any order. It doesn’t matter what order you specify your #SBATCH
options in as long you specify them one per line.
#!/bin/bash
#SBATCH --partition=compute
#SBATCH --account=racs_training
This set of comments represent the minimum required options for a Slurm job of Talapas:
- a valid Talapas partition
- an account (PIRG)
nano first.sbatch
Inside nano
, enter the following lines. When you’re finished, use Ctrl+O and Ctrl+X to write out to the first.sbatch
file and then exit nano
.
#!/bin/bash
#SBATCH --partition=compute
#SBATCH --account=racs_training
echo "Hello!"
Required Slurm Job Elements
#!/bin/bash
on the first line--partition=[a valid partition]
--account==[your PIRG]
All other parameters like mem
, ntasks
, and --cpus-per-task
have default values by partition. The default value for job memory as configured through the --mem-per-cpu
is 4GB per CPU. For single core jobs, that means single-core jobs start with 4GB RAM unless you specify otherwise.
Let’s run our minimum viable job by passing it to the sbatch
command.
sbatch first.sbatch
You will get a response with a (unique) job number when your job is submitted successfully.
Submitted batch job 34704033
Check your job’s status in the queue using the squeue
command. The --me
flag is a helpful trick if you don’t want to type -u [yourDuckID]
each time.
squeue --me
With a job this simple, it’s probably already finished.
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
If you see an empty queue like this, go ahead and check your most recent finished jobs with sacct
.
sacct
34704033 first.sba+ compute racs_trai+ 1 COMPLETED 0:0
34704033.ba+ batch racs_trai+ 1 COMPLETED 0:0
34704033.ex+ extern racs_trai+ 1 COMPLETED 0:0
This job doesn’t have a specified output and error log file name, so it uses the slurm defaults: slurm-[jobid].out
. Doing an ls, we see a file that was created with the default parameters at slurm-[jobid].out
. You can see where a #SBATCH --job-name
might be more helpful in the debugging process.
Let’s check the contents of the output log. If it worked as intended, we should the results of the echo
command from first.sbatch
.
cat slurm-34704033.out
Hello!
We will look at Slurm and several associated commands in detail in the next session!
Shared Resource Etiquette
- Be conscientious about your use of shared storage in the
/projects/[yourPIRG]
folder. - Close out your jobs when you’re done!
- Book your interactive jobs for as long as you need, but not longer.
- You will not be warned when time is about to run out when running interactive jobs or the Talapas Desktop app. Track your own time conscientiously.