October maintenance window closed
The queues on Rackham and Snowy will be stopped because we have planned maintenance on the project storage system Crex. We will upgrade the controller OS and Lustre (packaged by DDN ExaScaler), which will provide long-awaited bug fixes for several issues we have run into previously. The upgrade requires us to unmount the filesystem, which means that the project directories will be unavailable during the upgrade phase. Based on previous upgrades, we expect the maintenance on Crex to take at least one working day, and Rackham and Snowy to return to production on Thursday at the earliest, assuming the upgrades proceed with no major issues.
The queues on Bianca and Irma will not be stopped.
We will replace a faulty switch in our core network. Your connection to UPPMAX may be interrupted several times during the maintenance day as we install and configure the new switch.
Any disturbances on the accessible systems while the service window remains open should be assumed to be related to the service window, so please wait before contacting support. If the problems remain after the service window has been closed, you are of course welcome to contact support at support@uppmax.uu.se.
You can follow our progress on this page throughout the day.
Update 2020-10-08 17:15
After successful testing of the filesystem, we have now started the queues on Rackham and Snowy. The clusters are back in production and Crex seems stable so far. The installation of the new switch went without issues. The maintenance window is now officially closed. If you experience any problems, please report them to support@uppmax.uu.se or via our support form on the web page https://www.uppmax.uu.se/support/.
Update 2020-10-08 15:00
We are close to completing the maintenance of the storage system Crex and are now testing the filesystem to make sure it works properly. The last stage will be the replacement of the faulty core network switch. Next update at 17:00.
Update 2020-10-08 12:00
The upgrade on Crex has been installed and we are continuing to test the filesystem. Next update at 15:00.
Update 2020-10-08 09:00
We are continuing to work on Crex and on the UPPMAX Cloud. Next update at 12:00.
Update 2020-10-07 17:00
The work on Crex is ongoing and the project directories will remain inaccessible for the rest of the day. The work was delayed because faulty hardware had to be replaced before the upgrade could continue. Maintenance of the Bianca system, excluding the bianca-sftp service, has been completed. Upgrades of the UPPMAX Cloud are ongoing. This is the final update for today. Next update at 09:00 tomorrow.
Update 2020-10-07 15:00
The cluster Irma is now in production. Rackham and Snowy have been updated but are still out of production due to the maintenance of the storage system Crex. Our work on Bianca continues. Next update at 17:00.
Update 2020-10-07 12:00
Everything is going as planned so far. Next update at 15:00.
Update 2020-10-07 09:00
Service window begins. Next update at 12:00.