Maintenance planned for January 10th closed
The UPPMAX cloud will not be stopped. No Slurm queues will be stopped. All systems will receive system updates and security fixes. Due to the end of the year holidays...
Updated:
UPPMAX support during the holidays closed
Dear users, Please be aware that our offices will be minimally staffed during weeks 52 (2023) to 1 (2024). While our support team will do their best to address your...
Updated:
Rackham/Crex - Cannot send after transport endpoint shutdown closed
We have received reports of /proj filesystem issues originating from Rackham. The problem may manifest as a ‘Cannot send after transport endpoint shutdown’ error when attempting to list files: ls: cannot...
Updated:
December maintenance planned for December 6th closed
The UPPMAX cloud will not be stopped. No Slurm queues will be stopped. All systems will receive system updates and security fixes. Any disturbances on the accessible systems while the...
Updated:
Problem starting computation nodes on Bianca closed
From around Tuesday 2023-11-28 11:00 to Wednesday 2023-11-29 11:00 there was a problem creating computation nodes on Bianca. No new nodes were started, so jobs submitted on Bianca were stopped...
Updated:
Cooling failure closed
We had a significant loss of the central cooling starting after 3 PM. Akademiska hus were notified at 4 PM. At this point, we don’t know how bad it will...
Updated:
Lutra still offline closed
The Lutra offload file system is currently offline due to an issue on one of its constituent servers. When power came back, that server showed internal file system inconsistencies, which...
Updated:
Power failure on October 29 closed
Large parts of Uppsala saw a power loss at 14.45. Our battery backups for critical infrastructure did not recover as expected. We will first bring this infrastructure back and then...
Updated:
Cooling failure closed
We had a sudden cooling outage from about 3 AM to 4 AM. About 600 compute nodes had an emergency shutdown and many jobs were unfortunately lost. Update 2023-10-28 07:00 Bianca is now...
Updated:
November maintenance planned for November 8th closed
The UPPMAX cloud will not be stopped. (update: earlier post incorrectly stated that the cloud would be stopped) No Slurm queues will be stopped. All systems will receive system updates...
Updated:
Issue with ThinLinc on Bianca closed
We’re aware of and addressing a non-working ThinLinc on Bianca. Users will not be able to use https://bianca.uppmax.uu.se until this issue is fixed. We apologize for any inconvenience caused. Best regards,...
Updated:
Total loss of datacenter cooling closed
We have a sudden cooling outage. Akademiska hus has been notified, but since cooling is not returning we’re executing an emergency shutdown. Update 2023-09-12 15:00 Bianca is back in production....
Updated:
UPPMAX Cloud returning to production soon closed
We are working on returning the cloud to production following last week’s major upgrade of the OpenStack platform. We have recognized an issue with the cloud image store which we...
Updated:
September maintenance planned for September 6th closed
The UPPMAX cloud will be stopped. No Slurm queues will be stopped. All systems will receive system updates and security fixes. The EAST-1 and UPPMAX cloud region will be unavailable...
Updated:
Issues loading conda module closed
We’re aware of and addressing missing Conda modules, which result in ‘No such file or directory’ errors on Rackham and Snowy. Best regards, UPPMAX Support Team
Updated:
Issues loading modules - poor performance closed
We’re addressing a surge in load for the home- and software storage system (Domus) attached to the Rackham and Snowy clusters. The high load causes increased latency which affects module...
Updated:
System node reboots and memory errors on Miarka closed
On the evening of August 18, miarka-q, which manages Slurm, reported several severe memory errors, causing reboots and unreliable operation. During the efforts to troubleshoot this, the memory configuration of...
Updated:
Longer wait times to get access to Bianca closed
We are currently seeing longer than usual login times to Bianca for newly created login nodes. This is caused by one central server running out of resources earlier today. We...
Updated:
OpenSSH vulnerability CVE-2023-38408 closed
We wish to inform you about an important security vulnerability discovered in OpenSSH during the summer. The vulnerability is CVE-2023-38408. Red Hat describes this issue as: A vulnerability was found...
Updated:
Issues with automatic creation of login nodes on Bianca closed
We see some issues with automatic creation of login nodes on Bianca. If your login node is down and needs to be started up when you log in for the first...
Updated: