Status with Crex (UPDATE: Rackham and Snowy /proj and /sw/data available again) closed
The project storage system Crex (/proj and /sw/data) for Rackham and Snowy
is currently unavailable as we are investigating an issue with metadata. The
issue is related to the work done during the maintenance window on June 1st.
Since June 3 we have been working with DDN (and DDN with Whamcloud) to have
this issue resolved as soon as possible.
The problem: After the migration, two symbolic links in the /crex/data directory
were unexpectedly broken. The metadata is expected to be identical before and
after the migration, so before we proceed we need to find out why these symbolic
links are broken: whether they were already broken before the migration or
whether something has happened to them. We have not discovered any other broken
links so far, but the majority of the data has not yet been checked, due to the
large number (>1 billion) of files.
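For users who want to check their own project directories for broken symbolic
links, a minimal sketch in Python follows. It is not the procedure used by the
system administrators; the function name find_broken_symlinks and the example
path are hypothetical. Walking a large tree generates many metadata requests, so
it is best run on a limited subdirectory.

```python
import os


def find_broken_symlinks(root):
    """Yield paths of symbolic links under root whose target does not exist."""
    # followlinks=False: do not descend into directories reached via symlinks.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        # os.walk reports symlinks to directories in dirnames and all other
        # symlinks (including broken ones) in filenames, so check both lists.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows the link
            # and returns False when the target is missing.
            if os.path.islink(path) and not os.path.exists(path):
                yield path


if __name__ == "__main__":
    # Hypothetical project path, used only for illustration.
    for link in find_broken_symlinks("/proj/myproject"):
        print(link, "->", os.readlink(link))
```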
Rolling back the migration is possible. However, the old metadata target is at
98% inode utilization, which means that relatively few new regular files can be
created. If we continue to allow Rackham and Snowy jobs to run at 98%
utilization, it will not be long before no more files can be created inside
/proj, and we will be forced to move (or ask users to remove) files. The high
inode utilization and the move to supported hardware are the reasons we have
been working extensively with Crex during the spring.
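To illustrate what inode utilization means, the sketch below computes it from
the counters that the operating system reports for a mounted file system. It is
an assumption-laden example: the helper names are hypothetical, and on a Lustre
file system such as Crex the numbers a client sees may be aggregated over
several metadata targets, so they need not match the per-target figure quoted
above.

```python
import os


def utilization(total_inodes, free_inodes):
    """Return the fraction of inodes in use, e.g. 0.98 for 98%."""
    return (total_inodes - free_inodes) / total_inodes


def inode_utilization(path):
    """Return inode utilization for the file system containing path.

    Returns None if the file system does not report an inode count.
    """
    st = os.statvfs(path)  # f_files: total inodes, f_ffree: free inodes
    if st.f_files == 0:
        return None
    return utilization(st.f_files, st.f_ffree)


if __name__ == "__main__":
    used = inode_utilization("/")
    if used is None:
        print("No inode information available")
    else:
        print(f"Inode utilization: {used:.1%}")
```

With 100 inodes in total and 2 free, utilization(100, 2) gives 0.98, which is
the situation described above: only a small share of the inodes remains for new
files.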
Update 2022-06-08 11:00
Crex is fully available. We had to revert to the previous metadata targets. Queues are running again on Rackham and Snowy.
One of the goals of the operation we reverted was to get more inodes, which makes it possible to store more files. We are currently back at the old number of inodes. We will manually rebalance, but if you can reduce the number of files, either by removing files or by archiving many files into a single file, that is appreciated.
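As an illustration of archiving many files into a single file, here is a minimal sketch using Python's standard tarfile module. The function names and paths are hypothetical, and the command-line tar tool does the same job. Remove the original files only after you have verified the archive.

```python
import os
import tarfile


def archive_directory(src_dir, archive_path):
    """Pack src_dir and everything below it into one gzip-compressed tar file."""
    # Store paths relative to the directory name, not the absolute path.
    top_name = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src_dir, arcname=top_name)
    return archive_path


def count_archived_files(archive_path):
    """Return the number of regular files stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sum(1 for member in tar.getmembers() if member.isfile())


if __name__ == "__main__":
    # Hypothetical paths, used only for illustration.
    out = archive_directory("/proj/myproject/old_results", "old_results.tar.gz")
    print(out, "contains", count_archived_files(out), "files")
```

The resulting archive occupies a single inode regardless of how many files it contains, so packing directories with many small files frees the most inodes.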
Update 2022-06-07 19:00
/crex is now available on the login nodes. Queues are expected to be released
soon.
Update 2022-06-07 17:00
We are in the process of reverting to the previous metadata targets.