The Tufts High Performance Compute (HPC) cluster delivers 35,845,920 CPU hours and 59,427,840 GPU hours of free compute time per year to the user community.

  • Teraflops: 60+ (60+ trillion floating point operations per second)
  • CPU: 4,000 cores
  • GPU: 6,784 cores
  • Interconnect: 40 Gb low-latency Ethernet
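These annual totals appear to correspond to core count multiplied by the 8,760 hours in a year; for example, on the GPU side:

    6,784 GPU cores x 24 hours/day x 365 days/year = 59,427,840 GPU hours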

For additional information, please contact Research Technology Services at tts-research@tufts.edu



2014 Cluster Upgrade Information

Cluster Changes Overview

  • Compute nodes are composed of Intel-based Cisco and IBM servers
  • Slurm replaces LSF as the job scheduling and load management software
  • ssh logins to compute nodes are restricted
  • Cross-mounts of local-disk temporary storage are removed
  • The Slurm interactive partition is limited to 2 nodes for increased performance and quality of service (see the example after this list)
  • New login and file transfer nodes provide better user service
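For example, a session on the Slurm interactive partition can be requested with srun; the partition name follows the description above, and the shell and any additional resource flags are illustrative rather than confirmed site defaults:

    # request an interactive shell on the interactive partition (2-node limit applies)
    srun -p interactive --pty bash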

Every 3 years, the Tufts high-performance computing cluster environment is partially refreshed to offer increased capacity. We did so in 2008 and 2011, and again in 2014. We have had 4 generations of IBM hardware and will soon welcome a new generation of Cisco UCS hardware. Our oldest IBM hardware has been retired and our newest Cisco hardware has been installed. We will integrate the current IBM cluster hardware into the new Cisco environment during the Fall 2014 semester. There will be two migrations of IBM hardware, roughly 5 weeks apart. As a result of these migrations, cluster6.uit.tufts.edu will have fewer nodes. Concurrent access to the old and new clusters will be an option, allowing for a smoother, non-disruptive fall transition and time for testing.

By late Dec. 2014, the transition to the new cluster production environment will provide a total of 2680 cores (1680 IBM and 1000 Cisco), compared to 2110 cores during the 2013-2014 academic year. Due to a significantly different and newer architecture, the 1000 Cisco cores are considerably more powerful, and network performance between nodes is also improved by the new hardware. In practical terms, this refresh represents the largest increase in compute power to date for the Tufts HPC community.

As part of this hardware refresh, we will also be replacing the LSF load management software with Slurm on the new cluster. We made this decision as have many other high-performance supercomputing centers worldwide. The benefit to Tufts users is an experience built on a common understanding and implementation strategy for scaling HPC jobs.
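As an illustration of the change, a minimal Slurm batch script and its submission might look like the following sketch; the job name, resource values, and program name are placeholders rather than recommended settings:

    #!/bin/bash
    #SBATCH --job-name=myjob          # name shown in squeue output
    #SBATCH --ntasks=1                # number of tasks (cores) requested
    #SBATCH --time=01:00:00           # wall-clock limit, HH:MM:SS
    #SBATCH --output=myjob.%j.out     # output file; %j expands to the job ID

    ./my_program                      # replace with the actual executable

The script is then submitted with "sbatch myjob.sh", and its status can be checked with squeue.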

Slurm replaces LSF

See the wiki page about Slurm for Tufts LSF users.
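As a rough quick reference (see that page for full details), common LSF commands map onto Slurm equivalents approximately as follows:

    bsub < job.sh    ->  sbatch job.sh      # submit a batch job
    bjobs            ->  squeue -u $USER    # list your jobs
    bkill <jobid>    ->  scancel <jobid>    # cancel a job
    bqueues          ->  sinfo              # show queues / partitions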


User-facing access nodes for new cluster services

There are two new nodes for interfacing with the new cluster:

  • login.cluster.tufts.edu
  • xfer.cluster.tufts.edu

The login node functions in the same fashion as the old cluster head node, cluster6.uit.tufts.edu. This includes compiling, Slurm job submissions, editing, etc. However, the need to transfer data into or out of the cluster is separated from the login node. A second node, xfer.cluster.tufts.edu, is provided for file transfers. This ensures a better quality of service for login node users and minimizes storage-related logistics. Access to xfer.cluster.tufts.edu is via scp, sftp, rsync, and desktop file transfer programs such as WinSCP, FileZilla, etc. One may also ssh into the node to initiate transfers as needed from either the head node, login.cluster.tufts.edu, or a desktop. Node xfer.cluster.tufts.edu is a file-transfer-only service and not another head node; access to Slurm, compilers, etc. is unavailable.
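For example, typical logins and transfers might look like the following; usernames and paths are placeholders:

    # log in to the head node for compiling, editing, and job submission
    ssh username@login.cluster.tufts.edu

    # copy a local file to your cluster home directory via the transfer node
    scp data.tar.gz username@xfer.cluster.tufts.edu:~/

    # synchronize a results directory from the cluster back to the desktop
    rsync -av username@xfer.cluster.tufts.edu:~/results/ ./results/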
