Snowy was unable to schedule jobs between Friday and Saturday 24-25th of April closed
The Snowy Slurm master was unable to schedule jobs between 22:00 on Friday 24th of April and 11:30 on Saturday 25th of April. This issue was caused by insufficient resources on...
Updated:
Bianca and 'Transport endpoint not connected' closed
We are investigating an issue with Bianca where filesystem operations result in ‘Transport endpoint not connected’. This is a sign that the storage system Castor is having issues talking to...
Updated:
Snowy and issues with job submission between 13:44-17:30 closed
The Snowy Slurm master was unable to schedule jobs between 13:44 and 17:30 on Wednesday 8th of April. This issue was, again, caused by insufficient resources on the Slurm master...
Updated:
UPPMAX Cloud east-1.cloud.snic.se unreachable closed
There is currently an issue reaching https://east-1.cloud.snic.se. We are investigating. Update 15:00: The cloud should now be available again.
Updated:
Hidden files in the Bianca wharf (UPDATE: bianca-sftp.uppmax.uu.se back online) closed
We have received reports and seen cases where files appear hidden in the wharf. We are working on correcting this issue and advise users to report any missing file issues...
Updated:
Issues with ThinLinc on Rackham closed
We are seeing issues with frozen ThinLinc sessions on the GUI node in Rackham (rackham.uppmax.uu.se). As we have made no updates or changes to our ThinLinc configuration, we believe these...
Updated:
Snowy was unable to schedule jobs between March 26th 23:22 and 27th 11:20 closed
The Snowy Slurm master was unable to schedule jobs between 23:22 on Thursday 26th of March and 11:20 on Friday 27th of March. This issue was caused by insufficient resources on...
Updated:
April maintenance window closed
The service on April 1st will begin at 09:00 CET. The queues on Irma and Rackham will not be stopped. The queues on Bianca will be stopped as we update the...
Updated:
Rackham nodes r97-r120 unavailable due to broken Infiniband switch closed
The Rackham compute nodes r97, r98, …, r120 are unavailable due to a broken Infiniband switch. We have contacted the vendor and expect to receive a replacement switch later this...
Updated:
The very fat node (s229) in Snowy is unavailable closed
The very fat (4 TB) node s229 in Snowy has unfortunately broken down. Hardware issues prevent it from booting. The -C veryfat feature in Slurm is thus no longer working. To our...
Updated:
March maintenance window closed
The service on March 4th will begin at 09:00 CET. The queues on Irma, Rackham and Snowy will not be stopped. The queues on Bianca will be stopped as we...
Updated:
UPPMAX Cloud region unavailable closed
There is currently a problem connecting to the UPPMAX Cloud. We are investigating. Update 2020-02-10 12:30: The cloud is now available again. It was during a system update that a load balancer...
Updated:
Long queue times in Bianca closed
There are unexpectedly long queue times in Bianca. We are investigating. Update 2020-02-10 12:45: This problem has been solved; it was related to a system update. The queues should now be running...
Updated:
Intermittent I/O-errors on Rackham and Snowy closed
The project storage system that attaches to Rackham and Snowy is unfortunately still having problems. The problem for most users will result in degraded performance when reading and writing from...
Updated:
February maintenance window closed
The service on February 5th will begin at 09:00 CET. To assist in solving the issue with having reached the maximum number of available inodes we may need to stop...
Updated:
Please remove unneeded data in your project directory closed
During the weekend of 18-19th of January the project storage system on Rackham ran out of inodes, which prevents new files from being created inside the /proj directories. We have cleaned...
Updated:
Lower performance when running jobs on Rackham and Snowy closed
We have received reports from several users that jobs in some cases run much slower today than a few months ago. We have started an investigation to see if any...
Updated:
Slurm jobs on Bianca are slow to start closed
There is currently an issue with the system that starts and allocates compute nodes to project clusters on Bianca. Slurm jobs may take longer than usual to start even though...
Updated:
January maintenance window closed
The service on January 8th will begin at 09:00 CET. The queues on Rackham and Snowy will be stopped. Irma, Bianca, Grus, Castor and the UPPMAX Cloud will receive the...
Updated:
Limited access to data for some projects in Bianca closed
There is currently an issue with the storage system (“Castor”) which attaches to Bianca. The problem appears to have started on December 21. A part of the storage system is...
Updated: