Wednesday, 21 November 2007

Shared research data

Recently, one of the organisations we're a member of, RUGIT (the Russell Group of IT Directors), has been involved in an Invitation to Tender for a study into the feasibility of establishing a shared digital research data service for UK Universities. This is a joint proposal from RUGIT and CURL (the Consortium of Research Libraries). It’s being funded by HEFCE as part of its Shared Services programme – “shared services” being used to describe a model of providing services in a combined or collaborative way to improve efficiency and reduce costs.

In this case, the outcome could be a business case for a service shared among all UK Universities, with input from the IT departments and Libraries, and bringing huge benefits to researchers.

The amount of data being produced by research is enormous, especially with the advent of grid computing and e-science. Areas such as meteorology, aeronautics and particle physics are obvious producers of large data sets, but all disciplines, including the social sciences and the arts and humanities, are already producing large volumes of data of all kinds: complex data used in climate modelling, aerodynamics, molecular modelling and bioinformatics; video and image archives used in archaeology, anthropology and drama; and very large data sets used in particle physics.

The intention is not just to set up a large shared data storage facility - this would be useful, but on its own would add little to the research process. What is being proposed is a facility to manage the whole data life cycle - including creation, selection, retrieval and preservation. This will allow researchers to access previously generated data sets, to undertake new analyses and to annotate existing data.

It’s a very exciting project, and one on which I’ll keep you posted.

Martin said...

Hi CiCS people. Chris is right that this is an exciting project. Coincidentally, I’m just on my way back from a meeting of the Project Management Group, where we’ve been discussing the appointment of the consultants, and of the half-time post of Project Manager, which we think we’ll need because of the scale of the work to be done. For those interested in this area, a bit of history might be useful.

The idea for this study grew indirectly out of the report by the Office of Science & Innovation (now called the Science & Innovation Group, and part of the DIUS) called “Developing the UK’s e-infrastructure for science and innovation”.

OSI was charged with scoping the e-infrastructure by the Treasury, which published the key “Science and Innovation Investment Framework 2004-2014”.

A number of CURL and RUGIT colleagues were involved in the resulting OSI e-infrastructure working group, and the six sub-groups that it set up. The outcome was a very useful overview of the requirement for development of the national e-infrastructure. But it was light on detail, and in particular, it didn’t attempt to assign responsibility for the next stages or to estimate the scale of investment required. We hope that the new study will take this work further, at least as far as the data curation and management element is concerned, and that it will help with building the case for investment with key stakeholders such as RCUK, the British Library, and JISC.

It’s quite likely that what will emerge will be a mixed economy of nationally-managed services for large scale data, particularly the datasets emerging from e-science projects, with local data curation for small-scale data collections linked to the publications associated with them – most of which will hopefully be harvested by our ePrints server White Rose Research Online.

This is an area where the Library and CiCS can usefully work together in the future to support our researchers.

Coming soon (with reference to Chris’s provocative recent post on the Amazon ebook reader): why the Information Commons will still have paper books long after the desktop PC has gone the way of the electric typewriter!

Martin Lewis
Director of Library Services & University Librarian