It's the end of your data as you know it

Matthew Broersma ZDNet.co.uk

Published: 23 Apr 2007 15:49 BST

...that underlies any long-term storage project. The British Library, for instance, found that storage-industry concepts such as ILM were quite unsuited to the type of archive it is establishing with DOM. ILM establishes practices for migrating data from fast, high-performance storage to lower-performance media as the value and use of the information decreases, but "this view of storage is at odds with our own view", wrote the British Library's Richard Masters in a white paper. That's because the Library doesn't judge the value of its objects, and doesn't intend ever to delete them.
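As a rough illustration of the mismatch, an ILM-style policy can be reduced to a rule that picks a storage tier from how recently an object was used, whereas the Library's stated position amounts to a rule that ignores usage altogether. The sketch below is hypothetical: the tier names, the 30-day and 365-day thresholds and both function names are assumptions made for the example, not details taken from the white paper.

```python
from datetime import date


def ilm_tier(last_accessed: date, today: date) -> str:
    """Hypothetical ILM rule: demote an object as it goes unused.

    The thresholds and tier names are invented for illustration.
    """
    idle_days = (today - last_accessed).days
    if idle_days < 30:
        return "fast"
    if idle_days < 365:
        return "nearline"
    return "low-cost"


def archive_tier(last_accessed: date, today: date) -> str:
    """Rule matching the Library's stated view: usage is never used to
    rank an object, so every object gets the same treatment."""
    return "archive"
```

Under the first rule an object's placement keeps changing as interest in it fades; under the second, the access history is simply not an input.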

British Library
The Library has gone through several attempts at building a long-term digital archive since the late 1990s, including calling in IBM to build a complete system from its specifications — the approach used by the KB, although on a smaller scale. None of the projects came to fruition.

"Then we realised as an organisation that the big-bang approach was never going to work. Nobody knows what the requirements are," says Masters. "That's why we are building DOM in a component fashion and learning as we go. That way we don't have a huge risk — we aren't building an expensive application that doesn't meet our needs."

"If there's one thing that's certain, it's that digital records will keep increasing. They aren't going away."

Richard Masters, programme manager, British Library's Digital Object Management scheme

The library's key requirements are for its digital objects to be available forever, but at a very low, though undetermined, rate of access. That means the system has to be durable, flexible and affordable to maintain, but doesn't have to offer the high speed required by enterprise storage systems.

The initial system is built at two redundant sites, each growing to about 300 terabytes, using commodity magnetic disk drives based on the relatively new Serial ATA standard. That means the hardware is independent of any one vendor; the library plans simply to replace drives with newer ones as they reach the end of their warranty. The initial tender went to VSPL, which proposed a solution using JetStor disk arrays. The software layer is designed to be independent of the technical properties of the physical storage itself.

Aside from the two main sites, there will also be a third "dark archive", designed as a way back from total failure of the two main sites. The details of this are still being worked out, but the idea is for it to be in a completely separate repository using a totally different technology.
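One way to picture a software layer that is independent of the physical storage is as a small interface that every backend implements, with the archive writing each object to both main sites and to the dark archive, and reading from whichever copy is reachable. The following is a minimal sketch under those assumptions only. The class and method names (`StorageBackend`, `InMemorySite`, `Archive`, `store`, `retrieve`) are hypothetical, and the in-memory backend merely stands in for real disk arrays; none of this describes the Library's actual software.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical interface: the archive layer only ever calls put/get,
    so the hardware behind a backend can be swapped without changing it."""

    @abstractmethod
    def put(self, object_id: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, object_id: str) -> bytes: ...


class InMemorySite(StorageBackend):
    """Stand-in for one site's storage; keeps objects in a dict."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.online = True
        self._objects = {}

    def put(self, object_id: str, data: bytes) -> None:
        if not self.online:
            raise OSError(f"{self.name} is offline")
        self._objects[object_id] = data

    def get(self, object_id: str) -> bytes:
        if not self.online:
            raise OSError(f"{self.name} is offline")
        return self._objects[object_id]


class Archive:
    """Writes every object to all backends; reads from the first that answers."""

    def __init__(self, sites, dark_archive) -> None:
        # Main sites are tried first, the dark archive last.
        self._backends = [*sites, dark_archive]

    def store(self, object_id: str, data: bytes) -> None:
        for backend in self._backends:
            backend.put(object_id, data)

    def retrieve(self, object_id: str) -> bytes:
        for backend in self._backends:
            try:
                return backend.get(object_id)
            except (OSError, KeyError):
                continue  # site down or copy missing: try the next backend
        raise LookupError(f"no reachable copy of {object_id}")
```

In this sketch, an object stored while all three backends are online can still be retrieved after both main sites are marked offline, because the final backend in the list plays the role of the dark archive.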

The British Library's choices in some key areas underscore the degree to which the field is divided over best practices. For instance, the library has decided that the migration approach — translating from old formats to new formats — is most appropriate for its archive. "Emulation versus migration is one of those religious wars in the archives community," Masters concedes.

Work is also under way to turn Microsoft's Office Open XML file formats into open standards, an effort that has brought it into conflict with supporters of ODF and with those who believe Office Open XML will extend Microsoft's control over the creation of documents. "There are billions of Word documents out there, and, if those were opened up, it would be a huge resource," says Masters.

He argues that the only thing organisations really know about digital preservation at the moment is how little they know. "It's a learning curve. We've put together the best thing we can for now, and we'll run with it for a time and accept we're going to make changes," he says. "Openness is important on this — we've got to learn from our experiences and share that with others. Experience is the only thing that will get us moving forward on this."

While few companies currently have to deal with the issues the library is tackling now, they are likely to have to do so at some point in the future, adds Masters.

"This will become mainstream. The technologies we are developing may end up being built into some storage products as standard. A lot of tools will be made available through the work that's going on now," he says. "If there's one thing that's certain, it's that digital records will keep increasing. They aren't going away."
