[NCUC E-team] Tuesday crash

Tapani Tarvainen ncuc at tapani.tarvainen.info
Wed Aug 2 11:37:58 CEST 2017


Dear all,

I still don't know what actually caused the crash of our server yesterday,
but here's a brief summary of the recovery actions:

The machine crashed so badly it could not be booted even from the control console.

Following discussion with the e-team on Skype, I, with Brenden's help,
created another virtual machine, mounted old disks to it, analysed
them a little and concluded repairing the old system would take longer
than reinstallation, which would also bring all software up to date.
(It would not have been possible to reinstall the same old versions, as
Gandi no longer offers those old Ubuntu images for new VMs.)

So recovery went as usual when a machine is destroyed: new one built,
software installed, content data restored from backup, DNS entries
changed to point to the new machine. Nothing critical was lost,
although some emails sent to some lists at the time may have been
lost.

As all major software components were newer versions than before,
some extra work was required to make old configurations and backup
data work (in particular Apache and PHP configuration and MySQL database
restoration to the new version).

Total downtime until all key functionality was restored was about three hours.

The old machine's disks have been retained for the time being, for a possible
more thorough post-mortem analysis or recovery of useful data.

(I'll be offline most of the rest of today and will return tomorrow to
deal with whatever remaining glitches may yet be found.)

-- 
Tapani Tarvainen

