Q. I want to run a large number of R scripts submitted to the cluster, but there are too many to submit individually.
A. There are several ways to do this. If the scripts all use similar resources (memory, cores, time) and aren't dependent on each other, a bash script that submits the jobs works well. Here's an example. Note that this also works for other applications (e.g. Matlab) that can process scripts; just make the appropriate changes.
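First create a file listing the R scripts that you want to run, one filename per line. Make sure every line, including the last, ends with a newline; otherwise the read in the loop skips the final entry. The R scripts should be in the directory where you invoke the submit script.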
Eg. myfiles.lst
RT_SET_1.R
RT_SET_2.R
RT_SET_3.R
RT_SET_4.R
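One way to build such a list, sketched here on the assumption that the scripts share the RT_SET_ prefix used in this example, is to let the shell expand a glob and write one name per line. The scratch directory and the empty placeholder files are there only so the snippet runs on its own; in practice you would run just the printf line in your own script directory.

```shell
# Work in a scratch directory with placeholder scripts (illustration only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch RT_SET_1.R RT_SET_2.R RT_SET_3.R RT_SET_4.R

# Write one filename per line; printf ends every line, including the last, with a newline.
printf '%s\n' RT_SET_*.R > myfiles.lst

# Show the resulting list.
cat myfiles.lst
```

Using printf rather than an editor avoids the missing final newline that some editors leave behind.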
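Here is an example bash script that turns myfiles.lst into a series of sbatch commands. Edit the opts line to request the resources you need; the outs line can be edited if needed. Leave the echo in at first to see what would be submitted, then remove the echo and its surrounding double quotes, leaving the sbatch command: sbatch $opts $outs --wrap="R --no-save < $filenm" . Once the echo is gone, the --wrap argument must be in double quotes so that the shell expands $filenm; inside single quotes the job would receive the literal text $filenm.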
Eg. runBatchR.sh
#!/bin/bash
module load R/3.2.2
opts="-p batch -c 8 --mem=10000 --time=10:00:00"
while read filenm; do
outs="--output=$filenm.out --error=$filenm.err --mail-type=ALL --mail-user=$USER"
echo "sbatch $opts $outs --wrap='R --no-save < $filenm'"
sleep 1
done
The 'while read filenm; do' line iterates over each line passed to it, copying the line into the filenm variable.
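To see the loop mechanics in isolation, here is a small sketch that feeds two example names to the same kind of loop and only prints what would be derived from each line. The IFS= and -r additions are not in the script above; they are a common precaution so that leading spaces and backslashes in a line are kept as-is. The function name show_names is made up for this illustration.

```shell
# Hypothetical helper: read filenames from standard input, one per line.
show_names() {
    while IFS= read -r filenm; do
        # Skip empty lines so a stray blank line does not produce a bogus job.
        [ -z "$filenm" ] && continue
        echo "script=$filenm output=$filenm.out error=$filenm.err"
    done
}

# Feed two example names through the loop.
printf 'RT_SET_1.R\nRT_SET_2.R\n' | show_names
```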
To run this, use the cat command to pipe (pass) the list of R scripts to the bash script. Here is what would get submitted.
cat myfiles.lst | bash runBatchR.sh
sbatch -p batch -c 8 --mem=10000 --time=10:00:00 --output=RT_SET_1.R.out --error=RT_SET_1.R.err --mail-type=ALL --mail-user=dlapoi01 --wrap='R --no-save < RT_SET_1.R'
sbatch -p batch -c 8 --mem=10000 --time=10:00:00 --output=RT_SET_2.R.out --error=RT_SET_2.R.err --mail-type=ALL --mail-user=dlapoi01 --wrap='R --no-save < RT_SET_2.R'
sbatch -p batch -c 8 --mem=10000 --time=10:00:00 --output=RT_SET_3.R.out --error=RT_SET_3.R.err --mail-type=ALL --mail-user=dlapoi01 --wrap='R --no-save < RT_SET_3.R'
sbatch -p batch -c 8 --mem=10000 --time=10:00:00 --output=RT_SET_4.R.out --error=RT_SET_4.R.err --mail-type=ALL --mail-user=dlapoi01 --wrap='R --no-save < RT_SET_4.R'
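Because the listed scripts must be in the directory where the submit script is invoked, it can help to check the list before submitting anything. The sketch below is an optional pre-flight check, not part of the script above; the function name check_list is hypothetical. It prints every listed name that is not a regular file in the current directory and returns a non-zero status if any are missing.

```shell
# Hypothetical helper: report names in the list file ($1) that do not exist here.
check_list() {
    missing=0
    while IFS= read -r filenm; do
        # Ignore blank lines in the list.
        [ -z "$filenm" ] && continue
        if [ ! -f "$filenm" ]; then
            echo "missing: $filenm"
            missing=1
        fi
    done < "$1"
    # 0 when every listed file exists, 1 otherwise.
    return "$missing"
}
```

Running check_list myfiles.lst in the script directory before piping the list to the submit script catches typos in the list while nothing has been queued yet.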
Q. How do I install python modules locally?
A. First, load the module for the Python version (e.g. 2.7) that you intend to use. That way the system-managed modules are available.
module load python/2.7.6
Then create a local directory tree to store the modules (~/lib/python2.7/site-packages). Append a modified PYTHONPATH to your .bash_profile and source it. This is only needed the first time.
echo 'export PYTHONPATH="$PYTHONPATH:$HOME/lib/python2.7/site-packages/"' >> ~/.bash_profile
source ~/.bash_profile
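The directory-creation step described above has no command shown. A minimal sketch, assuming Python 2.7 and the path used in this answer, could look like the following; run it before installing anything.

```shell
# Path taken from the answer above; adjust the version to match the loaded module.
site_dir="$HOME/lib/python2.7/site-packages"

# -p creates the parent directories too and is harmless if the tree already exists.
mkdir -p "$site_dir"

# Confirm the directory is there before pointing PYTHONPATH at it.
ls -d "$site_dir"
```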
Now you can use pip or setup.py to install Python modules locally.
pip install --user <PACKAGE>
If there is a requirements file:
pip install --user -r requirements.txt
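Or, after a Python package is downloaded and unpacked, install it from its source directory. Be sure to read the package's installation instructions first. The --prefix option directs the install into the ~/lib/python2.7/site-packages tree; without it, setup.py tries to write to the system directories.
python setup.py install --prefix=$HOME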