Argon Launch Issues

The Argon HPC system launched to investors at 10 AM on February 21, 2017. The system opens to all campus users on March 14, 2017.

 

As previously communicated, we have had many challenges with the Lenovo hardware, and unfortunately not all of them are resolved at this time (~5% of nodes are still affected). We are doing our best to manage this so that users don’t experience problems, but you may notice the effects in three ways:

  • Uneven performance in MPI applications – We are still tracking down the remaining issues with the high-speed Omni-Path fabric, so some large-scale simulations may experience performance variations. Most of these issues are now resolved, but if you see uneven performance, please report it.
  • Reduced UI queue node counts – Because a number of nodes are still out of service, the UI queues will have less capacity for now. We have prioritized filling investor queues.
  • Nodes temporarily out of service – In some cases there were too many problem nodes of a certain type for the UI queue to absorb. In these cases, we will have to take some nodes out of service after the system enters production in order to fix components (mostly cabling) that currently have temporary solutions in place. We will work proactively with affected investors.