Problem with Dis open

Summary: We have problems with Dis. We are working on it.

We are seeing problems on Dis (Swedish Science Cloud) when creating virtual machines. They seem to be caused by poor performance in some parts of the storage system, which in turn affects all storage. We are tuning some OpenStack settings to allow longer timeouts for certain operations, but this does not fix the underlying poor performance.

Update 2025-06-27 12:00

We have increased a timeout, and things now work better. We were most likely close to this timeout already, and once the storage came under heavier load we started hitting it much more often.

Update 2025-06-30 09:00

We are again having problems in Dis (Swedish Science Cloud) when starting new VMs. They are related to the storage system.

Update 2025-08-29 16:00

Dis has been working fine so far, but Alburnus, the storage system that Dis uses, is slower than we had hoped. We have adjusted the placement groups to improve performance somewhat, but we still see slowness under load. So Dis is working, but a bit slower than we had hoped.

We still have some ideas for things we would like to try. One of them is to migrate the storage system from erasure coding to replication to get more IOPS. However, that may also require migrating the images between pools, which would cause some downtime and require quite a bit of work, so we would like to avoid it if possible.