The Tufts High Performance Compute (HPC) cluster delivers 35,845,920 cpu hours and 59,427,840 gpu hours of free compute time per year to the user community.

Teraflops: 60+ (60+ trillion floating point operations per second)
CPU: 4000 cores
GPU: 6784 cores
Interconnect: 40GB low latency ethernet

For additional information, please contact Research Technology Services at tts-research@tufts.edu



Q. I want to run a large number of R scripts submitted to the cluster, but there are too many to submit individually.

A.  There are several ways to do this. If the scripts all use similar resources (memory, cores, time) and aren't dependent on each other, a bash script that submits the jobs works well. Here's an example. The same approach works for other applications that can process scripts (e.g. Matlab); just make the appropriate changes.

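First, create a file listing the R scripts you want to run, one filename per line. Make sure every line, including the last, ends with a newline; otherwise the read loop skips the last entry. The R scripts should be in the directory from which you invoke the bash script.

E.g. myfiles.lst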

RT_SET_1.R
RT_SET_2.R
RT_SET_3.R
RT_SET_4.R
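
If the scripts share a naming pattern, the list can be generated rather than typed. The following is a minimal sketch, assuming the scripts match the pattern RT_SET_*.R in the current directory; the function name make_list is hypothetical and not part of the original recipe.

```shell
# make_list: write one matching R script name per line to a list file.
# Assumption: the scripts match RT_SET_*.R in the current directory.
make_list() {
    listfile="${1:-myfiles.lst}"
    # Expand the pattern into the function's positional parameters.
    set -- RT_SET_*.R
    # If nothing matches, the pattern stays literal; stop rather than list it.
    [ -e "$1" ] || { echo "no RT_SET_*.R files found" >&2; return 1; }
    # printf ends every line, including the last, with a newline.
    printf '%s\n' "$@" > "$listfile"
}
```

Calling make_list with no argument writes myfiles.lst. Names are listed in the shell's sort order, so RT_SET_10.R would come before RT_SET_2.R.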

Here is an example bash script that turns myfiles.lst into a series of sbatch commands. Edit the opts line to request the resources you need, and edit the outs line if necessary. Leave the echo in at first to see what would be submitted, then remove it, leaving the sbatch command: sbatch $opts $outs --wrap="R --no-save < $filenm" . Note the double quotes around the --wrap argument: inside single quotes the shell would not expand $filenm.

E.g. runBatchR.sh

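#!/bin/bash
# Load R here; by default sbatch passes the current environment to the job.
module load R/3.2.2

# Resources requested for every job: partition, cores, memory (MB), wall time.
opts="-p batch -c 8 --mem=10000 --time=10:00:00"
# Read one R script name per line from standard input.
while read -r filenm; do
      outs="--output=$filenm.out --error=$filenm.err --mail-type=ALL --mail-user=$USER"
      # Dry run: print the command. To submit, replace the echo line with:
      #   sbatch $opts $outs --wrap="R --no-save < $filenm"
      echo "sbatch $opts $outs --wrap='R --no-save < $filenm'"
      sleep 1
done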

The while read loop reads standard input one line at a time, copying each line into the filenm variable.
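
The echo-then-submit workflow can also be folded into a single switch. The following is a minimal sketch, not the script above: submit_list and DRY_RUN are hypothetical names, the mail options are left out for brevity, and blank lines in the list are skipped.

```shell
# submit_list: read script names from standard input and print (or run)
# one sbatch command per name. DRY_RUN is a hypothetical switch that
# defaults to 1 (print only).
submit_list() {
    opts="-p batch -c 8 --mem=10000 --time=10:00:00"
    while IFS= read -r filenm; do
        # Skip blank lines so a stray empty line does not submit a job.
        [ -n "$filenm" ] || continue
        outs="--output=$filenm.out --error=$filenm.err"
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "sbatch $opts $outs --wrap=\"R --no-save < $filenm\""
        else
            # $opts and $outs are unquoted on purpose so they split into
            # separate options; /dev/null keeps sbatch away from the list.
            sbatch $opts $outs --wrap="R --no-save < $filenm" < /dev/null
        fi
    done
}
```

With this sketch, submit_list < myfiles.lst only prints the commands, while DRY_RUN=0 submit_list < myfiles.lst would submit them. File names containing spaces are not handled.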

To run this, use the cat command to pipe (pass) the list of R scripts to the bash script. With the echo line still in place, this prints the commands that would be submitted:

cat myfiles.lst | sh runBatchR.sh

sbatch -p batch -c 8 --mem=10000 --time=10:00:00 --output=RT_SET_1.R.out --error=RT_SET_1.R.err --mail-type=ALL --mail-user=dlapoi01 --wrap='R --no-save < RT_SET_1.R'
sbatch -p batch -c 8 --mem=10000 --time=10:00:00 --output=RT_SET_2.R.out --error=RT_SET_2.R.err --mail-type=ALL --mail-user=dlapoi01 --wrap='R --no-save < RT_SET_2.R'
sbatch -p batch -c 8 --mem=10000 --time=10:00:00 --output=RT_SET_3.R.out --error=RT_SET_3.R.err --mail-type=ALL --mail-user=dlapoi01 --wrap='R --no-save < RT_SET_3.R'
sbatch -p batch -c 8 --mem=10000 --time=10:00:00 --output=RT_SET_4.R.out --error=RT_SET_4.R.err --mail-type=ALL --mail-user=dlapoi01 --wrap='R --no-save < RT_SET_4.R'
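
Once the jobs have finished, the same list can be reused to see which runs wrote anything to their error file. The following is a minimal sketch, assuming the .err naming from the outs line above; report_errs is a hypothetical name. A non-empty .err file does not necessarily mean the job failed, since R also sends warnings and messages there, so treat the output as a list of files worth a look.

```shell
# report_errs: read script names from standard input and print those
# whose .err file exists and is non-empty.
report_errs() {
    while IFS= read -r filenm; do
        [ -n "$filenm" ] || continue
        # -s is true only for a file that exists and has size > 0.
        if [ -s "$filenm.err" ]; then
            echo "$filenm"
        fi
    done
}
```

Usage would be report_errs < myfiles.lst, run from the directory that holds the output files.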

Q. How do I install python modules locally?

A.  First, load the module for the Python version (e.g. 2.7) that you intend to use. That way the system-managed modules are loaded.

 module load python/2.7.6

Then create a local directory tree to store the modules (~/lib/python2.7/site-packages). Append a modified PYTHONPATH to your .bash_profile and source it. This is only needed the first time.

mkdir -p ~/lib/python2.7/site-packages
echo 'export PYTHONPATH="$PYTHONPATH:$HOME/lib/python2.7/site-packages/"' >> ~/.bash_profile
source ~/.bash_profile
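
Because the export line should be appended only once, a small guard can make the step safe to repeat. The following is a minimal sketch; add_pythonpath is a hypothetical name, and the line it writes is single-quoted so that $PYTHONPATH and $HOME are stored literally and expanded each time the profile is sourced.

```shell
# add_pythonpath: append the export line to a profile file unless an
# identical line is already there.
add_pythonpath() {
    profile="${1:-$HOME/.bash_profile}"
    line='export PYTHONPATH="$PYTHONPATH:$HOME/lib/python2.7/site-packages/"'
    # -F treats the line as a fixed string, -x requires a whole-line match.
    if ! grep -qxF "$line" "$profile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$profile"
    fi
}
```

Running add_pythonpath a second time leaves the profile unchanged.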

Now you can use pip or setup.py to install Python modules locally.

pip install --user <PACKAGE>

If there is a requirements file:

pip install --user -r requirements.txt

Or, after a Python package has been downloaded and unpacked, run its setup.py from the unpacked directory. Be sure to read the package's installation instructions first.

python setup.py install --user

 
