So, things are getting almost back to normal now after the extended email outage. All users have a service, and for a small number we're recovering some mail from our backups. In the meantime we have sent those users a list of the mail that should have been delivered to them, which they seem to have found useful. As in all such incidents, communication is vital. In this case we were without one of our major tools: email! All internal customers were pointed to our service status web page, which we updated frequently, and the hit statistics showed that people were using it: from 10 hits on Monday to 39,000 on Wednesday!
We had a link on our home page to an information page for external users, and also kept in touch with key departmental contacts by phone. In general our customers have been very understanding, and the number of complaints has been relatively small.
The next step is the incident review, where we look at two things: what went wrong technically and how we can minimise the risk of it happening again, and how we managed the incident and what lessons we can learn.