It's the end of your data as you know it

Matthew Broersma ZDNet.co.uk

Published: 23 Apr 2007 15:49 BST

...translation can be difficult and expensive, and may deliver something substantially different from the original — as with old static databases that had to be redesigned to fit the relational database model.

In effect, preserving digital objects by migrating them to current formats isn't really preserving them at all, Rothenberg argues — it could be compared to translating a poem into a different language, and then destroying the original.

"Translation is attractive because it avoids the need to retain knowledge of the text's original language, yet few scholars would praise their ancestors for taking this approach," Rothenberg wrote. "Not only does each translation lose information, but translation makes it impossible to determine whether information has been lost, because the original is discarded."
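
Rothenberg's point about undetectable loss can be illustrated with a minimal sketch. It assumes a hypothetical migration from Unicode text to plain ASCII as a stand-in for any format translation; the function name and the sample strings are invented for illustration and are not from Rothenberg's work.

```python
def migrate_to_ascii(text: str) -> str:
    """Hypothetical 'migration' to a narrower format: plain ASCII.

    Characters the target format cannot represent are replaced with '?'.
    """
    return text.encode("ascii", errors="replace").decode("ascii")


# Two different originals (assumed sample data, not from the article).
original_a = "Is it a caf\u00e9?"  # ends with e-acute, then a question mark
original_b = "Is it a caf??"       # genuinely contains two question marks

migrated_a = migrate_to_ascii(original_a)
migrated_b = migrate_to_ascii(original_b)

# Both originals collapse to the same migrated text, so the migrated
# copy alone cannot reveal which original it came from.
print(original_a == original_b)   # the originals differ
print(migrated_a == migrated_b)   # the migrated copies do not
```

Once the originals are discarded, nothing in the migrated text distinguishes a substituted character from a real question mark, which is the situation Rothenberg describes.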

An alternative is emulation, in which the hardware, operating system and applications needed to view an original document are all simulated using current technology. Rothenberg favours this approach, which was pioneered in practice by fans of obsolete video games in the 1990s. Emulation has its own complications, but at least it is a way of keeping documents accessible in their original state.

Physical degradation
Beyond the obsolescence of file formats, applications, operating systems and hardware, there is the more basic problem that storage media physically degrade or become obsolete themselves.

Not only does each translation lose information, but translation makes it impossible to determine whether information has been lost

Jeff Rothenberg, RAND Corporation computer scientist

How long will various media types last? There's considerable controversy around the issue, with Kodak claiming in one report that its writeable CDs would last 217 years under certain conditions, while others observe that such media start to degrade after only a couple of years. Rothenberg estimates that optical media have a practical physical lifetime of five to 59 years, digital tape two to 30 years and magnetic disk five to 10 years.

There's just one problem with such estimates, though — they're all academic, because, with the fast pace of change in the IT industry, any given medium will be obsolete in about five years. Even if it continues to function, modern hardware may not be able to read its contents or even connect to it.

"Digital information lasts forever — or five years, whichever comes first," Rothenberg quipped.

That means any organisation that wants to keep its data accessible will have to look forward to an unbroken chain of migrations within a time cycle short enough to prevent the media from becoming physically unreadable or obsolete before they are copied. "A single break in this chain can render digital information inaccessible — short of heroic effort," Rothenberg wrote.
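
The arithmetic behind that migration cycle can be sketched as follows. The lifetime ranges and the five-year obsolescence figure are the ones quoted in this article; the function name and the choice to plan around the pessimistic end of each range are assumptions made for illustration, not a published method.

```python
# Physical lifetime ranges in years, as quoted in the article.
PHYSICAL_LIFETIME_YEARS = {
    "optical media": (5, 59),
    "digital tape": (2, 30),
    "magnetic disk": (5, 10),
}

# The article's figure: any given medium is obsolete in about five years.
OBSOLESCENCE_YEARS = 5


def safe_migration_window(medium: str, pessimistic: bool = True) -> int:
    """Longest gap in years between copies that keeps the chain unbroken.

    The window is bounded by whichever comes first: physical decay or
    obsolescence. By default the low end of the lifetime range is used.
    """
    low, high = PHYSICAL_LIFETIME_YEARS[medium]
    physical = low if pessimistic else high
    return min(physical, OBSOLESCENCE_YEARS)


for medium in PHYSICAL_LIFETIME_YEARS:
    years = safe_migration_window(medium)
    print(f"{medium}: migrate within {years} years")
```

Even on the most optimistic physical estimates, the window never exceeds the five-year obsolescence figure, which is why the lifetime estimates are described above as academic.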

Taskforces
Things look quite different from the point of view of the archivists who deal with questions of preservation on a practical level. The daunting prospect of future paradigm shifts, for instance, is nothing new — archivists and records managers are trained with the understanding that future generations may well disagree with their choices about what to keep and what not to keep, and how objects are preserved. "As a records manager, you have to accept that whatever you do will be wrong," says Anna Riggs, an archivist with Birmingham City Council.

A number of institutions are now putting long-term digital preservation programmes into place, including the British Library, the Library of Congress, the National Library of the Netherlands (the KB) and the California Digital Library, among others.

Other organisations are working on infrastructure and standards designed to back up such programmes. The EU-funded Planets (Preservation and Long-term Access through Networked Services) project, for instance, is co-ordinating European national libraries and archives, research institutions and IT companies to address digital preservation issues. The Digital Preservation Coalition is doing similar work at a UK level. Meanwhile the Storage Networking Industry Association (SNIA) has established the 100 Year Archive Task Force, which is aiming to come up with best practices for long-term data retention.

The SNIA is also working with the storage industry on the Extensible Access Method (XAM), which is expected to produce interfaces between applications and storage systems. Those interfaces would co-ordinate metadata to stabilise interoperability, storage transparency and automation for what's known as information lifecycle management (ILM), sometimes called data lifecycle management.

This all sounds very organised, but it masks the absolute uncertainty...
