Preserving digital documents for the long-term
Abstract
As more and more documents are created and archived digitally, their preservation is becoming critical. A digital document is a sequence of bits that must be interpreted, so the interpretation process must be archived with the document. That process may be simple or very complex, and various methods have been proposed, depending on the required functionality. This paper reviews the most popular methods, with particular emphasis on the use of a "Universal Virtual Computer" (UVC) [1, 2]. The UVC method consists of archiving, together with the data, a program P that dynamically converts the internal format into an easily readable structure. This structure identifies the various elements (much as XML does), down to a point where the rest of the interpretation process becomes much easier to define. In the future, an interpreter for the UVC will enable the execution of P on any computer. The paper focuses primarily on archiving printable documents, which require both content and presentation to be preserved; a short section covers more general documents. The paper also refers to an existing prototype and describes how it was used in several proof-of-concept projects.
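As a rough illustration of the idea of archiving a decoding program alongside the data, the following minimal sketch may help. It is not the UVC architecture: the three-instruction set (`READ_LEN`, `EMIT_TEXT`, `EMIT_UINT`), the byte layout, and the names `run`, `program_p`, and `archived_data` are all assumptions made up for this example. It only shows how a small interpreter, written once for a future machine, could execute an archived program P and return tagged elements.

```python
# Toy illustration only: this instruction set and byte layout are
# hypothetical and are NOT the UVC described in the paper.

def run(program, data):
    """Interpret `program` over the archived bytes `data`.

    Returns a list of (tag, value) pairs: a logical, tagged view
    of the content, independent of the internal byte layout.
    """
    pos = 0      # read cursor into the archived bit stream
    length = 0   # single register holding the last length read
    out = []
    for op, *args in program:
        if op == "READ_LEN":
            # Read a one-byte length prefix into the register.
            length = data[pos]
            pos += 1
        elif op == "EMIT_TEXT":
            # Emit the next `length` bytes as ASCII text under a tag.
            (tag,) = args
            out.append((tag, data[pos:pos + length].decode("ascii")))
            pos += length
        elif op == "EMIT_UINT":
            # Emit the next `width` bytes as a big-endian unsigned integer.
            tag, width = args
            out.append((tag, int.from_bytes(data[pos:pos + width], "big")))
            pos += width
        else:
            raise ValueError(f"unknown instruction: {op}")
    return out


# Archived together: the raw bytes and the program P that decodes them.
archived_data = bytes([5]) + b"Hello" + (12).to_bytes(2, "big")
program_p = [
    ("READ_LEN",),
    ("EMIT_TEXT", "title"),
    ("EMIT_UINT", "pages", 2),
]

elements = run(program_p, archived_data)
```

In this sketch only `run` would need to be rewritten for a future machine; `program_p` and `archived_data` stay in the archive unchanged, which is the property the method relies on.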