Yesterday we had a good meeting looking at progress on the different aspects of ITIL we're implementing. I've already posted about change management; yesterday we looked at our service catalogue, which is going well, and at incident and problem management.
In terms of incident management, we first have to define what an incident is. We had been using a definition of any significant service failure, whether planned or unplanned, but as planned ones should be covered by the change management processes, we're changing the definition to unplanned only. And then, what counts as an incident - any service failure reported to the helpdesk? If Dr X rings to say his printer's not working, is that an incident? It's a service failure to him. So, if you define everything as an incident, you have to have a good method of determining the level of an incident. This is where you need to take account of SLAs, OLAs (operational level agreements), impact assessments, the calendar of activities etc. Who decides on the level of an incident, and how much can be automated by our helpdesk software?
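As a rough illustration of how much of that levelling could be automated, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the three-point impact and urgency scales, the way a priority is derived from them (a common ITIL-style convention, not something we've agreed), the `busy_period` flag standing in for the calendar of activities, and all of the names. It is not what our helpdesk software actually does.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-point scale used for both impact and urgency."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def incident_priority(impact: Level, urgency: Level, busy_period: bool = False) -> int:
    """Return a priority from 1 (deal with first) to 5 (deal with last).

    busy_period is an assumed stand-in for the calendar of activities:
    during a busy period, urgency is bumped up one step.
    """
    if busy_period and urgency < Level.HIGH:
        urgency = Level(urgency + 1)
    # impact + urgency runs from 2 to 6; turn that into 5 down to 1.
    return 7 - (impact + urgency)


# Dr X's printer: low impact (one person), medium urgency.
print(incident_priority(Level.LOW, Level.MEDIUM))                    # 4
# The same call during a busy period in the calendar.
print(incident_priority(Level.LOW, Level.MEDIUM, busy_period=True))  # 3
```

The useful part of a scheme like this is that the helpdesk only has to answer two simple questions, and the level falls out of a table that can be agreed in advance against the SLAs and OLAs.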
Of course, the purpose of incident management is to restore normal service as soon as possible, and to minimise the adverse effect on business operations. Incidents have to be identified, logged, categorised, prioritised, diagnosed, resolved and closed. All of this, apart from the actual fixing, is under the control of the helpdesk. The incident is then reviewed.
Communicate -- Fix -- Communicate
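The incident stages listed above could be tracked with something as simple as the sketch below. It assumes the stages always happen in the order given and that none can be skipped, which is a simplification, and the `Stage` and `Incident` names are made up for the example rather than taken from any helpdesk product.

```python
from enum import Enum


class Stage(Enum):
    """Incident stages in the order described in the post."""
    IDENTIFIED = 1
    LOGGED = 2
    CATEGORISED = 3
    PRIORITISED = 4
    DIAGNOSED = 5
    RESOLVED = 6
    CLOSED = 7
    REVIEWED = 8


class Incident:
    """Hypothetical incident record that can only move forward one stage at a time."""

    def __init__(self, summary: str) -> None:
        self.summary = summary
        self.stage = Stage.IDENTIFIED
        self.history = [Stage.IDENTIFIED]

    def advance(self) -> Stage:
        """Move to the next stage, refusing to go past the review."""
        if self.stage is Stage.REVIEWED:
            raise ValueError("incident has already been reviewed")
        self.stage = Stage(self.stage.value + 1)
        self.history.append(self.stage)
        return self.stage


incident = Incident("Printer not working")
incident.advance()
print(incident.stage.name)  # LOGGED
```

Keeping the history alongside the current stage means the review at the end has a record of what happened and in what order.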
We're also drawing up procedures for problem management, the purpose of which is to reduce recurring incidents. These follow a similar cycle of:
- Investigation and diagnosis