Tuesday, 7 April 2009

Is there a problem?

How do users find out about interruptions to services, whether scheduled or, more importantly, unscheduled and therefore a surprise to all of us? We're trying to come up with different ways, and our service status page has proved useful in keeping people informed about service downtime and scheduled maintenance as well as unexpected outages. We're hoping it will reduce the number of calls to the Helpdesk when there are outages, as people learn to look at it before phoning. We also have a Twitter presence as an experiment to see how useful that is, and we'll be reviewing it soon. First signs are positive. Our Helpdesk now has a self-service function, accessed through the portal, where you can use a knowledge base, log calls, and track the progress of any open calls. This isn't being particularly well used at the moment - we need to build up the knowledge base and advertise it more. Or perhaps people prefer the human touch? They certainly get that, with over 35,000 phone calls, emails and visits to the Helpdesk last year.

Email has always been the traditional way of communicating (until the network/email system goes down of course), but there's always a fine line between sending too much information out, to too many people, and not enough. Our incident procedure is always being refined and reviewed, and it's interesting how the definition of a major incident has changed. Gone are the days when loss of the VLE for a few hours was noticed only by the enthusiastic few, or when we could get away with any service being down before 0800, after 1800 and at weekends. The expectation is now very clearly for 24*7 services, and providing them without 24*7 staff is a challenge. Our response is to try to invest in reliability and resilience and build services which don't fall over often and don't require huge amounts of downtime, although with increasing complexity and interdependencies that is a real holy grail.

Hosted services, outsourcing and shared services are another way of spreading the risk - although you never get rid of it completely!

3 comments:

Jam said...

I am subscribed to the CiCS Twitter, and find it useful for knowing what is going on (the service page is also good, but not something I can be automatically notified about when it changes).

Andrew said...

Hosted services also introduce legal risks that you don't need to worry about so much with "in-house" services.

Forgotten to pay that e-mail bill? Hey presto - service cut off!

A big company buys up your hosted service and shuts them down (cough)Microsoft(cough)Yahoo(cough)Zimbra(cough)!

Having said all that, I still think hosted services are the way to go. Any monkey can provide a reliable e-mail service these days, but with a team with as many talented people as CiCS has, just imagine how many more useful specialised services they would be able to provide if only they had the time.

And when are the YHMAN unis going to save more money by combining a data centre or two? There are probably some great facilities going cheap at the moment!

Markuos said...

Just had a quick thought (untried and untested).

Various sources of information could be aggregated: the CiCS Twitter feeds, the IC Twitter feed, RSS from the service page and PC availability, together with useful, relevant and immediate information pulled from the Helpdesk knowledge base if there is a problem.

This could be aggregated using something like FriendFeed http://friendfeed.com/ or an online portal (say Netvibes or Pageflakes). The URL can be provided to all users for them to access directly and the aggregation displayed on the plasma screens in the IC and elsewhere.
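For anyone who would rather script the aggregation than rely on a hosted tool, here is a minimal sketch of the idea in Python, using only the standard library. It assumes each source exposes an RSS 2.0 feed with a pubDate on every item; the feed contents, source labels and function names below are made up for illustration and are not real CiCS feeds.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feeds, standing in for the service page and a Twitter feed.
STATUS_RSS = """<rss version="2.0"><channel><title>Service status</title>
<item><title>Example: email maintenance scheduled</title>
<pubDate>Tue, 07 Apr 2009 08:00:00 +0000</pubDate></item>
<item><title>Example: VLE unavailable</title>
<pubDate>Tue, 07 Apr 2009 09:00:00 +0000</pubDate></item>
</channel></rss>"""

TWITTER_RSS = """<rss version="2.0"><channel><title>Twitter</title>
<item><title>Example: VLE back up</title>
<pubDate>Tue, 07 Apr 2009 09:30:00 +0000</pubDate></item>
</channel></rss>"""


def parse_rss(xml_text, source):
    """Return (published, source, title) tuples from one RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        pub = item.findtext("pubDate")
        if pub is None:
            continue  # skip entries with no date; they cannot be ordered
        items.append((parsedate_to_datetime(pub), source, title))
    return items


def aggregate(feeds):
    """Merge several feeds into one list, newest entry first.

    `feeds` maps a source label to the RSS XML text of that feed.
    """
    merged = []
    for source, xml_text in feeds.items():
        merged.extend(parse_rss(xml_text, source))
    return sorted(merged, key=lambda entry: entry[0], reverse=True)


if __name__ == "__main__":
    for published, source, title in aggregate(
        {"status": STATUS_RSS, "twitter": TWITTER_RSS}
    ):
        print(published.isoformat(), source, title)
```

In practice the XML text would be fetched from each feed URL (for example with urllib.request.urlopen) rather than embedded, and the merged list rendered as a page for the plasma screens. The sketch also assumes every date carries a timezone offset; mixing dates with and without one would make the sort fail.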