User-Driven Digital Preservation

This prototype demonstrates how a mixture of preservation actions can be smoothly integrated into the search infrastructure of the UK Web Archive, by acting as an 'access helper' for end users.

This page picks a few specific examples of difficult or interesting cases, and allows you to inspect what the system knows about those formats. You can also view transformed versions of those resources (combining format conversion and emulation techniques), as illustrated by the thumbnail images shown here.

The examples provide links to our experimental search interface so you can look for more instances of each format. Every result found via that front-end contains a link back to this access helper service.

By allowing our visitors to decide between the available options, the real needs of our designated communities can be expressed directly to us and so taken into account as part of the preservation planning process.

Interject was developed at the UK Web Archive as part of the SCAPE Project. It is open source and is developed on GitHub.

The SCAPE project is co-funded by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137).

Spectrum tape images

The traces of ZX Spectrum software that can be found in our web archive are in the form of tape images (.TAP, .TZX) or raw system RAM snapshots (.Z80, .SNA). These formats can only be accessed via emulation, but this approach can also be used to generate screenshots.
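As a rough illustration of how these four extensions might be told apart before handing a file to an emulator, here is a minimal Python sketch. It is not taken from the Interject code base. The function name and return shape are hypothetical, and the only content checks it makes are ones that can be stated with confidence: TZX files start with the signature `ZXTape!` followed by byte 0x1A, and a 48K .SNA snapshot is a 27-byte header plus 48 KiB of RAM. .TAP and .Z80 files have no fixed signature, so they are classified by extension alone.

```python
from pathlib import Path

# TZX tape images begin with this 8-byte signature.
TZX_SIGNATURE = b"ZXTape!\x1a"
# A 48K .SNA snapshot is a 27-byte register header plus 48 KiB of RAM.
SNA_48K_SIZE = 27 + 48 * 1024

# Hypothetical mapping from extension to the two kinds named in the text.
SPECTRUM_KINDS = {
    ".tap": "tape image",
    ".tzx": "tape image",
    ".z80": "RAM snapshot",
    ".sna": "RAM snapshot",
}


def describe_spectrum_file(name, data):
    """Return (kind, check) for a candidate ZX Spectrum file.

    check is True or False when a content check was possible,
    and None when the format offers nothing simple to verify.
    """
    ext = Path(name).suffix.lower()
    kind = SPECTRUM_KINDS.get(ext, "unknown")
    if ext == ".tzx":
        return kind, data.startswith(TZX_SIGNATURE)
    if ext == ".sna" and len(data) == SNA_48K_SIZE:
        return kind, True
    # .tap, .z80, other .sna sizes (e.g. 128K variants) and unknown files.
    return kind, None
```

A file that passes these checks still needs an emulator to be rendered; the sketch only decides which files are worth sending to one.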

Some examples are: Lostcave.z80 (shown above), ATICATAC, Gobbleman, Wheelie.

Search for more examples: .TAP, .TZX, .Z80, .SNA.

Virtual Reality Markup Language

VRML 1 and VRML97 were early web-based formats for 3D environments. They have been superseded by the X3D format, but unfortunately these three formats are not backward compatible with each other. Worse still, VRML 1 was based on a very different underlying information model from that of the later formats, and is poorly supported by currently available tools.
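Because the two VRML generations need different handling, it helps to tell them apart before choosing a conversion route. Both declare their version in a header comment on the first line: `#VRML V1.0 ascii` for VRML 1 and `#VRML V2.0 utf8` for VRML97. The Python sketch below checks for those headers. The function name is hypothetical, it is not how Interject itself identifies formats, and it assumes the file is uncompressed text (gzip-compressed .wrl files would need unpacking first).

```python
def sniff_vrml_version(first_line):
    """Classify a .wrl file by its header comment.

    Returns "VRML 1", "VRML97" or "unknown".
    """
    header = first_line.strip()
    if header.startswith("#VRML V1.0 ascii"):
        return "VRML 1"
    if header.startswith("#VRML V2.0 utf8"):
        return "VRML97"
    # Anything else, including X3D (which is not a .wrl header at all).
    return "unknown"


# Example use on the first line of a file's text:
print(sniff_vrml_version("#VRML V1.0 ascii\n"))
```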

Some examples are: penguin1.wrl (VRML 1), 03fig10.wrl (VRML 1), tut32c.wrl (VRML 97).

Search for more examples...

X-Windows image formats

The X BitMap image format was the first image format on the web. However, despite this early and important role, it is not widely supported today and so usually requires format conversion for access.
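To give a sense of what such a conversion involves: an XBM file is a fragment of C source that defines a width, a height and an array of bytes, with the least significant bit of each byte holding the leftmost pixel and each row padded to a whole number of bytes. The Python sketch below decodes that text into rows of 0/1 pixels, which could then be written out in a modern format. It is an illustration only, not the conversion Interject performs. The function name is hypothetical, and it assumes the common X11 layout with `unsigned char` data rather than the older variant that stores 16-bit values.

```python
import re


def parse_xbm(text):
    """Decode X11-style XBM source text into rows of 0/1 pixels."""
    width = int(re.search(r"#define\s+\w*width\s+(\d+)", text).group(1))
    height = int(re.search(r"#define\s+\w*height\s+(\d+)", text).group(1))

    # Only look for byte values inside the array initialiser.
    body = text[text.index("{"):]
    data = [int(tok, 16) for tok in re.findall(r"0[xX][0-9a-fA-F]+", body)]

    bytes_per_row = (width + 7) // 8  # rows are padded to whole bytes
    rows = []
    for y in range(height):
        row = data[y * bytes_per_row:(y + 1) * bytes_per_row]
        # Bit 0 of each byte is the leftmost of its eight pixels.
        rows.append([(row[x // 8] >> (x % 8)) & 1 for x in range(width)])
    return rows


# A made-up 3x2 bitmap, not one of the archived examples.
sample = """#define t_width 3
#define t_height 2
static unsigned char t_bits[] = {
   0x05, 0x02 };
"""
for row in parse_xbm(sample):
    print("".join("#" if bit else "." for bit in row))
```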

Some examples are: image.xbm (shown above), _8917_tex2html_wrap1129.xbm, separator.xbm.

Search for more XBM example files...

The X PixMap format is the colour version of the X BitMap format outlined above, and suffers from the same accessibility issues.

Some examples are: xterm-linux.xpm (shown above), tube.xpm, guitar.xpm.

Search for more XPM example files...

Characterisation Examples

Although this framework is primarily aimed at enabling access to resources, it can also be used as a characterisation system. This allows users to explore what information we can extract from individual resources (in terms of formats, metadata and full text) via a relatively simple web interface.

Some examples are: aghsecondannaulreport.pdf, semwebpl.doc, heritag.doc, cv.docx.

Search for more examples: PDFs, DOCs, DOCXs.

Warning! This is a research prototype for a web archive access helper service, and may be taken down at any time.