Well, hello again. I have just learned that the host that recently had both NVMe drives fail upon drive replacement now has new problems: the filesystem reports permanent data errors affecting the databases of both the Matrix server and the Telegram bridge.

I have just rented a new machine and am about to restore the database snapshot of July 26, just in case. All the troubleshooting of recent days was very exhausting; however, I will try to do this, or at least prepare it, within the upcoming hours.

Update

After a rescan, the errors have gone away; however, the drives logged errors too. The question now is whether the data integrity can be trusted.

Status August 1st

Well ... good question ... optimizations were made last night, the restore was successful and ... we are back to debugging outgoing federation :(


The new hardware will also be a bit more powerful ... and yes, I have not forgotten that I wanted to update that database. It's just that I was busy debugging federation problems.

References

  • federation issues after restore: https://github.com/matrix-org/synapse/issues/16025
  • why we had to restore initially: https://text.tchncs.de/tchncs/about-the-matrix-incident-on-july-26-2023