Lutra still offline (closed)

The Lutra offload file system is currently offline due to an issue on one of its constituent servers.

When power came back, that server showed internal file system inconsistencies, which left parts of the Lutra file system inaccessible, with files seemingly missing. This was most likely caused by the unexpected behavior of our power redundancy during the power outage. We’re working on getting that server back online. We currently don’t expect data loss beyond data that was being written at the moment of the outage.

Currently, no parts of Lutra are accessible on the compute clusters.

Update: 2023-11-22 15:00

Lutra is back in production. We will contact projects affected by the storage failure.

Update: 2023-11-20 09:00

We apologize for the delay in providing updates. For nearly two weeks, we worked on building the infrastructure needed to back up the affected file system, which holds close to 270 TB of data. On November 18-19, we repaired the file system sufficiently to restore accessibility. After the repair, however, we identified 129 lost objects, that is, files that were recovered but with corrupted metadata. As a result, we will most likely need to perform a partial restore from our backups. We are still analyzing the lost objects, but most of the required work is likely already done. We anticipate that Lutra will soon be available again.