Security update and reboot closed

Due to a recent security issue in the Linux operating system, UPPMAX has decided to apply the newly released security fixes immediately. After applying the fix, we need to reboot...


January maintenance window closed

The service on Wednesday 9th of January begins at 09:00 and affects all systems. All systems will get bug and security updates. We will also perform a minor reorganization of...


Cooling issues in the UPPMAX compute room closed

We have an ongoing issue with cooling in the server hall. All systems have been closed down to prevent hardware damage. Final ticket report: All systems are up and running...


UPPMAX Account Request may take longer to process closed

UPPMAX is working on updating infrastructure services that affect the creation of new accounts. We will temporarily disable creation of new accounts at various times during the update. We expect...


Quota issues for some sllstore projects on crex (rackham/snowy) closed

Some sllstore projects ended up with incorrect quotas because of how data is handled when communicated to/from SUPR. Fixed data (and resulting quota) is being rolled out and this issue...


UPPMAX cloud has experienced an error causing the system to be temporarily unavailable closed

OpenStack gathers metrics via the internal ceilometer service and stores the logs. It seems that the container where the logs are stored became full, not allowing ceilometer to write any...

Slow Slurm on Rackham closed

Slurm (the workload manager we use to schedule jobs) does not always like it when it has too many jobs to keep track of. This has happened quite a few...


Temporary problem with SSH keys closed

Due to a configuration error, our login servers temporarily had incorrect SSH host keys. If you tried to connect during this time, you may have got a warning like: $ ssh rackham.uppmax.uu.se...

Issues with running singularity containers with restricted permissions from /home closed

It seems singularity currently does not allow running containers from /home that have restricted access permissions, meaning the complete path to access the container file must have execute permissions for...


"No space left on device" on Rackham and Snowy closed

There is currently a problem writing data to the project storage system on Rackham and Snowy. The error message is “No space left on device”. The problem has affected jobs...

Issues with singularity since the latest maintenance window closed

Final ticket report: We believed this was fixed by the latest build installed. Update 2018-11-14 15:26: A build that should contain fixes for the issue container creation failed: mount error:...


Problems logging into Bianca closed

We have seen a problem with Bianca that prevented users from logging into the login node. We believe the problem is fixed. If you still have problems logging in, you are...


Permission errors on bianca/castor closed

Final ticket report: Issues seem to be solved by a workaround. Depending on one’s history on UPPMAX and previous project memberships, it’s possible one may see permission issues after the recent maintenance...


Castor issues on compute nodes closed

Final ticket report: This was fixed by building new base images for the virtual machines. We’re still working on figuring out what happened. For some reason, compute nodes already started...


November maintenance window closed

The service on Wednesday 7th of November begins at 09:00. The service window affects all systems to varying degrees. For Irma and Bianca the queues will be stopped. For all...


Issue with castor (filesystem for bianca) closed

We’re currently seeing an issue on one of the nodes providing castor (the file system for bianca). This may lead to failed reads/writes and possibly crashed jobs. We’re working to...


/sw/data temporarily invisible on crex closed

/sw/data was unavailable under that name on crex (the storage system for rackham) for a while today. Fixed at 19:00.

My VASP jobs trigger SIGSEGV closed

We have several reports of VASP jobs killed by SIGSEGV. The problem is believed to be caused by a recently added security fix to limit a certain memory space of...


There is currently a problem logging into Bianca closed

There is currently a problem logging into Bianca. The login nodes that get started during your login for some reason fail to become fully operational. This results in either broken...


Quota issues for some projects on crex (rackham) closed

A few projects on crex were subjected to the wrong quota recently. We have worked around the issue and are fixing the underlying data in SUPR.