7:33 PM: Maintenance work on the Argon HPC system has completed, and users may log in and use the cluster as normal. Thank you for your patience.
3:55 PM March 10, 2017: Neon was brought back online, and queued jobs have begun running.
2:45 PM March 10, 2017: At 2:30 PM the data center housing the Neon HPC system lost power, which shut down the compute nodes. We are assessing the situation and working to restore service.
The next scheduled maintenance window for the HPC systems is March 10th, 2017 8AM - March 11th, 2017 8AM.
During this time the Argon system will be offline for maintenance. All jobs running at the beginning of the maintenance will be killed. The Neon system will not be taken offline but may be affected as indicated below.
The following work is currently scheduled:
Neon System - A network outage is possible between 8AM and 11AM on March 10th while network maintenance is performed, but there are no plans to stop jobs on the Neon system at this time.
* Work on outstanding issues with the Argon Omnipath high-speed network.
* Replace failed hardware and cabling on the Argon system.
* Upgrade switch firmware and reboot.
* Implement monitoring improvements.