May 2014 Announcement

Every three years, the Tufts high-performance computing cluster environment is partially refreshed to offer increased capacity. We did so in 2008 and 2011, and we have just completed the competitive bid for our 2014 refresh purchase. As you may know, we currently have 4 generations of IBM hardware, and we are proud to announce that we will soon welcome a new generation of Cisco UCS hardware.

Here is how we are planning to retire our oldest IBM hardware, install our newest Cisco hardware, and integrate both IBM and Cisco environments while allowing for a smooth and non-disruptive transition.

First, we will retire the oldest IBM hardware on June 15, 2014. This represents a removal of ~500 cores, or about 25% of the current cluster capacity. We have adjusted LSF queues to help accommodate the decreased compute capacity while maintaining a similar quality of service from June 15 through August 2014. Below, we refer to this IBM cluster with its oldest nodes retired as the “current cluster”.

The new Cisco hardware will be delivered in the second half of June, then installed and configured over the summer. By mid-September, this second cluster, with its 1000 new Cisco cores, will be made available in production to our users and will form the foundation of our new Tufts high-performance computing environment. Below, we refer to this Cisco cluster as the “new cluster”.

The current cluster and the new cluster will be available concurrently during the 2014 fall semester. This will allow for a smooth user transition with ample time for testing. Finally, on January 1, 2015, the current cluster hardware will transition to the new cluster production environment, for a total of 2680 cores (1680 IBM and 1000 Cisco) as compared to 2110 today. Because of their significantly different and newer architecture, the 1000 Cisco cores deliver considerably more than twice the computing power of the 500 six-year-old cores being retired. In practical terms, this refresh represents the largest increase in compute power for the Tufts HPC community.
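
For readers who want to check the core counts quoted above, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the constant and function names are made up for this example, and it assumes that 2110 and 1680 are exact counts of IBM cores before and after the June 15 retirement. Under that assumption the figures imply about 430 retired cores, which the announcement rounds to ~500.

```python
# Core counts quoted in the announcement (names are illustrative only).
CORES_TODAY = 2110       # cores in the cluster before the June 15 retirement
IBM_CORES_KEPT = 1680    # IBM cores that move to the new production environment
CISCO_CORES = 1000       # cores in the new Cisco UCS cluster


def capacity_summary():
    """Derive totals and differences from the three quoted figures."""
    retired = CORES_TODAY - IBM_CORES_KEPT      # assumes both counts are exact
    total_after = IBM_CORES_KEPT + CISCO_CORES  # combined cluster on Jan 1, 2015
    net_gain = total_after - CORES_TODAY
    return {
        "retired_ibm_cores": retired,
        "total_after": total_after,
        "net_gain": net_gain,
        "net_gain_share": net_gain / CORES_TODAY,
    }


if __name__ == "__main__":
    for name, value in capacity_summary().items():
        print(f"{name}: {value}")
```

Core counts alone understate the change, since the announcement notes that the Cisco cores are individually more powerful than the IBM cores they replace.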

As part of this hardware refresh, we will also be replacing our LSF scheduler with slurm on our new cluster. We made this decision, as did many other high-performance computing centers across the US, because slurm is considerably cheaper than LSF, offers similar features, and is being rapidly improved by a large user community to support newer HPC paradigms such as accessing cloud resources.

To allow for a smooth transition to the new slurm scheduler, LSF will remain on the current cluster until December 31, 2014, at which time it will also be retired and replaced by slurm. This transition period of more than three months, from mid-September through December, during which both LSF and slurm are available, is meant to let users switch easily from the former to the latter. Workshops, tip sheets and training will be organized during the transition period to make sure no Tufts cluster user is left behind.

Update Announcement 2011

TTS is pleased to announce the completion of the Tufts research cluster upgrade project. The new High-Performance Computing (HPC) research environment is in production, bringing more than 1000 cores of computing power to the Tufts community. Over the past three years, TTS has observed an increasing demand for our HPC research cluster. In anticipation of the ongoing need for additional resources, we began a project to increase resources more than threefold.

...