Thursday, October 17, 2013

User Fencing Tools (UFT) on github

I just published a set of scripts, programs, config file examples, etc that I wrote for use at BYU but should be useful to other HPC sites.  I couldn't think of a better name for it, so I called it the User Fencing Tools (UFT).  It is available in our github repo at https://github.com/BYUHPC/uft.

The tools are used to control users on HPC login nodes and compute nodes in various ways.  The tools make use of cgroups, namespaces, and cputime limits to ensure that users don't negatively affect each others' work.  We limit memory, CPU, disk, and cputime for users.

UFT also has examples for how to control ssh-launched processes on compute nodes.  You can account for those with Torque but can't control them (just like normal).  SLURM will have accounting and resource enforcement for these in 13.12 (Dec. 2013).