
Multimedia Repositories and the LIBERATION Project

H. Maurer, Graz University of Technology and The University of Auckland

Abstract

In this paper we describe what we mean by multimedia repositories, what we believe are the fundamental requirements of such repositories and what the TAP-Project LIBERATION is all about. We then explain the main features of the WWW server Hyperwave and show why Hyperwave must be seen as the obvious basis of any sizeable multimedia repository.


1. Introduction

Multimedia is often defined as combining traditional "media" such as text, graphics, pictures, animation sequences, video- and audioclips in digital form. This is the truth, but only half of it: digital multimedia also must allow the integration of new elements such as 3D models and 3D scenes ("virtual reality") including suitable viewing and navigation paradigms, interactive components in all media, and such down-to-earth but often overlooked elements as cartographic material, electronic courseware, multilinguality, etc.

Any multimedia repository of the future must permit the usage of multimedia material in this general sense. It must also provide powerful tools for the administration of such data (i.e., it requires an integrated database and search engine), must allow the hyperlinking of arbitrary data types (hence it has to store links in a separate database rather than as parts of the documents), has to permit the definition of arbitrary views of the same database depending on the user or user group, and has to support ease of customisation. It is this latter aspect that is particularly important in the Telematics Application Program (TAP) LIBERATION (an acronym standing for: LIBraries. Electronic Remote Access To Information Over Networks). LIBERATION is a project in the Libraries Sector: its aim is to experiment with a substantial digital library that allows users to combine (without copying) arbitrary pieces of multimedia material into coherent views of their own. Not only is this what is needed if digital libraries are ever to be accepted as a tool for co-operation and teaching, it is also essential to allow publishers to install charging mechanisms that are fair to users, authors and publishers by using the "transclusion principle" already propagated by Ted Nelson [Nelson 1987].

Those of the readers who know Hyperwave [Maurer 1996] will realise at once that the only system currently supporting all necessary features is Hyperwave:

  • it does support all types of multimedia mentioned including multilinguality;
  • its integrated database and search facilities allow the administration of the material at issue even when using a very large number of documents;
  • it has an external link database and access authorisation mechanisms that make it possible to design arbitrary multiple views, to customise the system for different user groups and to create hyperlinks in arbitrary documents;
  • finally, its advanced annotation facilities give a further degree of personalisation and the basis for co-operative work so essential for emerging digital libraries.

In the remainder of this paper we present the main features of Hyperwave as they relate to LIBERATION. For a first pilot of a digital library based on Hyperwave see www.iicm.edu/electronic_library. For a substantial further body of multimedia material on a Hyperwave WWW server see www.aeiou.at. Note further that both MONZ (the Museum Of New Zealand) and the NLNZ (the National Library of New Zealand), the largest institutions of their kind in New Zealand, have implemented their multimedia repositories based on the European WWW system Hyperwave for the very reasons elaborated. LIBERATION will help to evaluate interfaces and techniques to further improve the usability of large multimedia repositories ("digital libraries") as they are emerging world-wide.

2. Hyperwave

We first discuss some general aspects of distributed hypermedia systems. We use Hyperwave as an example, a system developed by a group of researchers and developers in Graz/Austria. Hyperwave is one of the first advanced WWW systems and can best be described as a "distributed database system that is WWW transparent", as a "WWW oriented document management system" or as an "advanced WWW server with integrated database facilities". Hyperwave is an evolutionary step in the development of WWW: while it uses the WWW data presentation format HTML 3.0 as one option and understands the WWW protocol HTTP, it also supports other data formats and protocols. However, Hyperwave differs from ordinary WWW servers in three particularly crucial ways: in its data structuring concept, in how it deals with links, and in that it is a distributed database. That is, Hyperwave servers, even if physically separate, can act as if they were a single system, and documents in Hyperwave can be manipulated with database functionality such as attributes, search functions, etc.

It turns out that when dealing with distributed hypermedia systems one of the crucial issues is how to deal with links that cross server boundaries. The term "surface links" has been aptly coined in the literature to distinguish such links from "inter-database links". Surface links present a major challenge when automatic maintenance of link consistency is desired.

Not surprisingly, the issue of how to deal with links plays a major role also in other situations. On the one hand, links have been considered as the "defining property of hypermedia systems" while on the other hand links have also been termed "constructs that are as undesirable as goto's in programming languages", and hence should be replaced by structuring techniques as much as possible. However, a proper link concept also provides many benefits for diverse applications such as e.g. customising a database to achieve a "personal view" of the data, for computer conferencing, computer based co-operation and version control.

Thus, any successful implementation of distributed hypermedia systems has to be based on a clean yet powerful model of the concept of links. One such concept that is the basis of a fully functioning system is discussed in the sequel. It is also shown that data structuring and database techniques are a necessity in large distributed hypermedia systems.

The remainder of this paper is structured as follows. After this introductory section we explain the general need for more advanced WWW systems and how such needs are handled in Hyperwave in Section 3. We concentrate on two vital aspects (links and data structuring) in Section 4 and on the distributed database aspect in Section 5. Important references are collected in the final Section 6.

3. The Need for Advanced WWW Systems

In this section we explain that the navigational and structural tools currently available on the Internet are not sufficient to fully exploit the tremendous power of the largest information and communication resource mankind has ever had. Most current hypermedia systems and the most widely spread WWW servers do not have enough functionality to provide the power that is needed.

The steady growth of the Internet [Fenn et al 1994] has made resource discovery and structuring tools increasingly important. Historically first was the introduction of various directory servers, Archie [Deutsch 1992] being probably the first and most prominent. As an alternative to having servers that need constant updating, WAIS [Kahle et al 1992] introduced a powerful search engine permitting full-text searches of large databases and returning a ranked list, the ranking based on a heuristic approach. Although directory servers like Archie (to locate a database) and search engines like WAIS (to locate a desired item in that database) alleviated the problem of finding information in the Internet somewhat, it soon became apparent that other techniques would also be necessary.

The most important such technique is to emphasise the a-priori organisation of information, rather than try to search for information in a universe of completely different databases. Two efforts in this direction have proved particularly successful: Gopher, originally developed at the University of Minnesota [Alberti et al 1992] and the World-Wide Web (WWW, W3, or "The Web" for short) originally developed at CERN in Geneva [Berners-Lee et al 1992 and 1994].

In both cases information is stored in a simple structured fashion on servers, and can be accessed via clients, with clients available for most major hardware platforms. Some 100,000 Gopher and WWW servers are currently reachable in the Internet, albeit most of them with little more than a token presentation of the institution running the server: the average number of WWW pages per WWW server has been calculated to be only about 100!

Information in Gopher is structured in a hierarchical fashion using menus, an access technique which, though simple to use, has many well-known weaknesses. Information in WWW is structured in documents; documents are linked together according to the hypertext-paradigm (see [Conklin 1987], [Tomek et al 1991] and [Koegel-Buford 1994] for a general and thorough discussion of hypertext and hypermedia). "Anchors" within documents are associated with "links" leading to other documents. Although many stand-alone systems using the hypertext-metaphor have emerged since the introduction of HyperCard on the Mac in 1987, WWW can claim to be the first wide-spread hypertext system whose component servers are accessible via the Internet. Indeed, WWW is not just a hypertext system but a hypermedia system, i.e., documents can comprise text, images, and audio- and film clips.

Ordinary WWW servers are easy to install and clients are available on all major platforms. Much software is free and sources are sometimes available. The node-link technique for navigating and finding information is quite appealing at least for small to medium amounts of data, and the mix of media makes the use of WWW aesthetically pleasing. All this has contributed to the proliferation of WWW. Indeed there is no doubt that WWW is not only the first widespread hypermedia system available through the Internet, but that WWW has actually replaced some earlier more traditional information systems. The success of WWW, the number of WWW proponents and freaks, and its publicity even in non-scientific publications may create the impression that WWW, as it now exists, is the solution for most information needs and defines the dominating technology for the foreseeable future.

The reality is different, however. Whilst WWW is undoubtedly a big step forward compared to pre-WWW times, experience shows that much functionality required for sizeable applications is missing from ordinary WWW. In this sense, ordinary WWW should be considered a first generation networked hypermedia system. More advanced hypermedia systems are required to cope with the problems currently being encountered on the Web. Just to give one example, while pure node-link navigation is satisfactory in small systems it tends to lead to confusion and disorientation, if not chaos, when applied to large amounts of data [Conklin 1987]. For substantial applications, some additional structuring and searching facilities are clearly required. That links may actually be more harmful than useful has been already pointed out in [Van Dam 1988] and elaborated in [Maurer et al 1994a]. Similarly, the necessity to keep links separate from rather than embedded in documents as is the case in most current WWW systems has already been demonstrated in the pioneering work on Intermedia at Brown University [Haan et al 1992]. Hyperwave [Maurer 1996] and Microcosm [Hall et al 1996] are the only two major systems that support this important feature at the moment.

In what follows, we concentrate on features that are desirable in advanced WWW information systems. We compare the features found in first generation WWW systems with those found in the more advanced WWW system, Hyperwave. This is not to belittle early WWW systems or to glorify the advanced WWW system Hyperwave, but rather to clarify why certain facilities are needed. We also briefly look at communicational and co-operational features that will have to be integrated in hypermedia systems if they are to be successful: such features are currently scarcely supported by any hypermedia system. They are often dealt with in the context of computer supported co-operative work, rather than hypermedia. Although we look in a little more detail at Hyperwave in this section and how it allows a smooth transition from first generation WWW systems to second generation versions like Hyperwave, we consider two of the most important aspects of Hyperwave in a separate Section 4.

Information in a hypermedia system is usually stored in "chunks". Chunks consist of individual documents which may themselves consist of various types of "media". Typically, a document may be a piece of text containing a picture. Each document may contain links leading to (parts of) other documents in the same or in different chunks. Typical hypertext navigation through the information space is based on these links: the user follows a sequence of links until all relevant information has hopefully been encountered.

In WWW, a chunk consists of a single document. Documents consist of textual information and may include pictures and the (source) anchors of links. Pictures and links are an integral part of the document. Pictures are thus placed in fixed locations within the text ("inline images"). Anchors can be attached to textual information and inline images, but not to parts of images. Links may lead to audio or video clips which can be activated. The textual component of a document is stored in so-called HTML format, a derivative of SGML.

In Hyperwave the setting is considerably more general: chunks, called "clusters" in Hyperwave terminology consist of a number of documents. A typical cluster may, for example, consist of five documents: a piece of text (potentially with inline images), a second piece of text (for example in another language, or a different version of the same text, or an entirely different text), a third piece of text (the same text in a third language perhaps), an image and a film clip. Anchors can be attached to textual information, to parts of images, and even to regions in a film clip. Links are not part of the document but are stored in a separate database. They are both typed and bi-directional: they can be followed forward (as on any WWW server) but also backwards.

The support for multiple pieces of text within a cluster allows Hyperwave to handle multiple languages in a very natural way. It also elegantly handles the case where a document comes in two versions: e.g., a more technical (or advanced) one and one more suitable for the novice reader. The same applies to different versions for different bandwidths, for different drivers, etc.

Text can be stored in Hyperwave in a variety of formats including PostScript and PDF. In addition to the "usual" types of documents found in any modern hypermedia system, Hyperwave also supports arbitrary other types of information including 3D objects and scenes [Pichler et al 1995].

One of the most crucial differences between simple WWW and Hyperwave is the treatment of links. In simple WWW systems links are unidirectional, have no type and are embedded in documents. In Hyperwave they are bi-directional, can have types and are stored in a link database separate from the actual documents. This difference is very significant, hence we dedicate a good part of the separate Section 4 to it.

Navigation in ordinary WWW is performed solely using the hypertext paradigm of anchors and links. It has become a well accepted fact that structuring large amounts of data using only hyperlinks in a way that users don't get "lost in hyperspace" is difficult, to say the least. Ordinary WWW databases are large, flat networks of chunks of data and resemble an impenetrable maze more than well-structured information. Indeed every simple WWW server acknowledges this fact tacitly, by offering pages that look like menus in a hierarchically structured database: items are listed in an orderly fashion, each with an anchor leading to a subchapter (subdirectory). If links in WWW had types, such links could be distinguished from others. But as it is, all links look the same: whether they are "continue" links, "hierarchical" links, "referential" links, "footnote" links, or whatever else.

In Hyperwave not only can links have a type, links are also by no means the only way to access information. Clusters of documents can be grouped into collections, and collections again into collections in a pseudo-hierarchical fashion. We use the term "pseudo-hierarchical" since, technically speaking, the collection structure is not a tree, but a DAG. That is, one collection can have more than one parent: an impressionist picture X may belong to the collection "Impressionist Art", as well as to the collection "Pictures by Manet", as well as to the collection "Museum of Modern Art". The collection "hierarchy" is a powerful way of introducing structure into the database. Indeed many links can be avoided this way [Maurer et al 1994a], making the system much more transparent for the user and allowing a more modular approach to systems creation and maintenance. Collections, clusters and documents have titles and attributes. These may be used in Boolean queries to find documents of current interest. Finally, Hyperwave provides sophisticated full-text search facilities. Most importantly, the scope of any such search can be defined to be the union of arbitrary collections, even if the collections reside on different servers. We will return to this important aspect of Hyperwave as a distributed database in Section 5. The concept of collections has one other very significant advantage: it allows insertion and deletion of documents into a Hyperwave database without any link adjustment, a luxury unknown in ordinary WWW systems. We return to this aspect in the second half of Section 4.
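To make the collection concept concrete, the following minimal Python sketch models collections as a DAG and restricts an attribute query to the union of several collections. It is an illustration only and not Hyperwave's implementation: the class `CollectionStore`, its method names and the attribute names `artist` and `kind` are assumptions introduced here.

```python
from collections import defaultdict


class CollectionStore:
    """Toy model of a collection DAG (hypothetical, not Hyperwave's data model)."""

    def __init__(self):
        self.children = defaultdict(set)  # collection name -> member names
        self.documents = {}               # document name -> attribute dict

    def add_document(self, name, **attributes):
        self.documents[name] = attributes

    def add_member(self, collection, member):
        # A member may be added to several collections, so the structure
        # is a DAG rather than a tree. No links are created or adjusted.
        self.children[collection].add(member)

    def documents_in(self, collection):
        """Return all documents reachable from `collection`."""
        found, seen, stack = set(), set(), [collection]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if node in self.documents:
                found.add(node)
            stack.extend(self.children.get(node, ()))
        return found

    def search(self, scope, **wanted):
        """Attribute query restricted to the union of the collections in `scope`."""
        candidates = set()
        for collection in scope:
            candidates |= self.documents_in(collection)
        return sorted(
            name for name in candidates
            if all(self.documents[name].get(k) == v for k, v in wanted.items())
        )


store = CollectionStore()
store.add_document("picture_x", artist="Manet", kind="picture")
store.add_document("essay_y", artist="Manet", kind="text")
for parent in ("Impressionist Art", "Pictures by Manet", "Museum of Modern Art"):
    store.add_member(parent, "picture_x")  # one document, three parents
store.add_member("Museum of Modern Art", "essay_y")

hits = store.search(["Impressionist Art", "Pictures by Manet"], kind="picture")
```

In this sketch, adding or removing a member only touches the membership table, which mirrors the point that insertion and deletion need no link adjustment.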

Note that some WWW servers also permit full-text searches. However, no full-text search engine is part of "standard" WWW. Thus, the functionality of full text search is bolted "on top" of most WWW servers: adding functionality on top of WWW leads to the "Balkanisation", the fragmentation of WWW, since different sites will implement missing functionality in different ways. Thus, to stick to the example of the full text search engine, the fuzzy search employed by organisation X may yield entirely different results from the fuzzy search employed by organisation Y, much to the bewilderment of users. Actually, the situation concerning searches on most WWW servers is even more serious: since documents in such WWW servers do not have attributes, no search is possible on attributes; even if such a search or a full text search is artificially implemented, it is not possible to allow users to define the scope for the search, due to the lack of structure in most WWW databases. Hence full-text searches on most WWW servers always work in a fixed, designated part of the database residing on one particular server.

Hyperwave provides various types of access rights and the definition of arbitrarily overlapping user groups. Hyperwave is also a genuine distributed database: servers (independent of geographical location) can be grouped into collections, with the hyperroot at the very "top". Thus, users can define the scope of searches by defining arbitrary sets of collections on arbitrary servers. We will return to this in more detail in the separate Section 5. Note further that proper authorisation schemes allow different groups to work with the same server without fear of interfering with each other's data.

First generation WWW systems have traditionally been seen mainly as (simple) information systems. Most applications currently visible support this view: very often WWW servers offer some pleasantly designed general information on the server-institution, but only rarely does the information go much deeper. If it does, usually a "hybrid" system is used, WWW with some add-ons or a database in the background using the scripting interface of WWW.

It is our belief that hypermedia systems acting as simple information systems, where someone inputs information to be read by other users, do not offer much potential: they will disappear into obscurity sooner rather than later. To ensure the success of a hypermedia system, it must allow users also to act as authors, allow them to change the database, create new entries for themselves or other users, create a personal view of the database as they need it, and, above all, allow the system to be used also for communication and co-operation.

First generation hypermedia systems like WWW almost entirely lack support for such features. Emerging more advanced hypermedia systems are bound to incorporate more and more features of the kind mentioned; Hyperwave provides a start.

Hyperwave supports annotations (with user-definable access rights): Hyperwave annotations become part of the database, i.e., are also available when working with other clients, or from another user account or machine. Annotations can themselves be annotated; the network of annotations can be graphically displayed using local map functions. Thus, the annotation mechanism can be used as the basis of (asynchronous) computer-conferencing, and has been successfully employed in this fashion.

We believe that many of the features discussed in the area of computer supported co-operative work (for a compact survey see [Dewan 1993]) will eventually be incorporated into advanced hypermedia systems.

As has become clear from the above discussion, first generation WWW hypermedia systems do not have enough functionality to serve as a solid and unified basis for substantial multi-user multimedia repositories with a strong communicational component.

Hyperwave is a first attempt to offer much more basic functionality while continuing to work as a WWW system: every WWW client can be used to access every Hyperwave server.

4. Linking and Data Structuring

The WWW system Hyperwave differs in its design significantly from ordinary WWW systems by adding much functionality for both users and information providers.

In this section we discuss two important points in more detail: the link concept and the notion of collections.

As has been mentioned earlier, ordinary WWW uses unidirectional links that are embedded in the documents and cannot be typed. In Hyperwave WWW links are bi-directional, can have a type and attributes and are handled not as part of the document but in a separate database. These extensions of the link concept have far-reaching consequences. We mention a few of the most important ones in what follows.

When viewing a document it is clearly always necessary to know which other documents can be reached from it: this is after all what links are all about, and such "out-links" are of course supported in all WWW systems including Hyperwave. However, it is equally important to know which documents are pointing to the current one, i.e., to know the "in-links"; this is what bi-directional links are all about. There are many reasons why knowledge of the "in-links" is also important:

First, it tells users (and information providers) who is pointing to a particular document, which is valuable as information in its own right: the context of the document becomes clearer, its "popularity" can be determined, etc. To be specific, suppose you are interested in impressionist art and you have located a picture in the (virtual) gallery of the Museum of Modern Art in Vienna; by examining all "in-links" you are bound to find valuable information about the painting that would have escaped your attention otherwise.

Second, having bi-directional links makes it possible to generate a graphic representation of the "local surroundings" of the current document (the "local map" of Harmony, see [Fenn et al 1994]).

This local map will show you an iconized version of your current document, plus all those that (via one or more steps) lead to it or can be reached from it: this is a very powerful support for navigation and helps to avoid the "lost in hyperspace" syndrome mentioned earlier.
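The idea of a local map can be sketched in a few lines of Python. The sketch assumes a link store kept outside the documents and indexed in both directions; `LinkDatabase`, `add_link` and `local_map` are hypothetical names chosen for illustration and say nothing about how Harmony or Hyperwave actually compute the map.

```python
from collections import defaultdict, deque


class LinkDatabase:
    """Hypothetical external link store indexed in both directions."""

    def __init__(self):
        self.out_links = defaultdict(set)  # source -> targets
        self.in_links = defaultdict(set)   # target -> sources

    def add_link(self, source, target):
        # Every link is recorded twice so it can be followed backwards too.
        self.out_links[source].add(target)
        self.in_links[target].add(source)

    def local_map(self, document, max_steps=1):
        """Documents leading to `document` or reachable from it within max_steps."""

        def reach(index):
            seen, queue = set(), deque([(document, 0)])
            while queue:
                node, depth = queue.popleft()
                if depth == max_steps:
                    continue
                for neighbour in index.get(node, ()):
                    if neighbour != document and neighbour not in seen:
                        seen.add(neighbour)
                        queue.append((neighbour, depth + 1))
            return seen

        return {
            "leads_here": reach(self.in_links),
            "reachable": reach(self.out_links),
        }


links = LinkDatabase()
links.add_link("A", "X")
links.add_link("B", "A")
links.add_link("X", "C")

one_step = links.local_map("X", max_steps=1)
two_steps = links.local_map("X", max_steps=2)
```

With unidirectional, embedded links only the "reachable" half of this map could be computed without scanning every document on every server.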

Third, and still more important, it is the only way towards better link maintenance and towards avoiding "dangling links", a phenomenon that is a problem on all large WWW servers. Again, to be specific, suppose an important document X is pointed to by documents A_1, A_2, ..., A_n. If X is removed and only unidirectional links are used, there is no way to know that A_1, A_2, ..., A_n point to X, particularly if the A_i's reside on different servers. Hence the links in all A_i's, when activated by users, will yield the very frustrating message "document cannot be found" or similar, a phenomenon well known to every WWW user. However, if bi-directional links are used, then the removal of X can at least result in notifying the owners of the documents A_1, A_2, ..., A_n that X has disappeared, so that the links in the A_i's can be adjusted. Indeed, if the links in A_1, A_2, ..., A_n are not embedded within the A_i's, then rather than just notifying the owners of the A_i's, the links in the A_i's leading to X can be deactivated automatically.

This, then, is a first step towards automatic link maintenance, a feature imperative for large systems and addressed in Hyperwave for the first time: to be able to carry it out it is necessary to have bi-directional, not-embedded links and to be able to assign rights and attributes to links that differ from those assigned to the document at issue. Typically, you may permit automatic link deletion of a link to X in a document A_i you have authored although you are unlikely to permit any change of the contents of A_i. This is one reason why keeping links separate from the documents is very important! Automatic link maintenance does not stop at deleting links that are no longer valid, it can also be used to automatically generate links. Suppose you have a contribution about St. Stephen's Cathedral in Vienna and you have activated automatic link generation for documents of type picture. Then, in a properly working hypermedia system, when a picture of St. Stephen's Cathedral is inserted with "Title: St. Stephen's Cathedral; Type: Picture" an automatic link can be generated from your essay about the cathedral to that picture ... again, of course, only if non-embedded links are available.
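Both halves of automatic link maintenance described above, deactivating links to a removed document and generating links when a matching document is inserted, can be illustrated with a small Python sketch. It assumes an external link store in which each link has its own owner and state; `LinkStore`, `remove_document`, `enable_auto_links` and `insert_document` are hypothetical names, and matching on exact title and type is a simplification made for this example.

```python
from dataclasses import dataclass


@dataclass
class Link:
    source: str
    target: str
    owner: str
    active: bool = True  # link state is independent of the document's content


class LinkStore:
    """Hypothetical external link database; links live outside the documents."""

    def __init__(self):
        self.links = []
        self.auto_rules = []  # (source document, title, document type, owner)

    def add_link(self, source, target, owner):
        self.links.append(Link(source, target, owner))

    def remove_document(self, removed):
        """Deactivate all links pointing at `removed`; return owners to notify."""
        owners = set()
        for link in self.links:
            if link.target == removed and link.active:
                link.active = False
                owners.add(link.owner)
        return sorted(owners)

    def enable_auto_links(self, source, title, doc_type, owner):
        self.auto_rules.append((source, title, doc_type, owner))

    def insert_document(self, name, title, doc_type):
        """Create a link for every rule matching the new document's attributes."""
        created = []
        for source, wanted_title, wanted_type, owner in self.auto_rules:
            if (title, doc_type) == (wanted_title, wanted_type):
                self.add_link(source, name, owner)
                created.append((source, name))
        return created


link_store = LinkStore()
link_store.add_link("A_1", "X", owner="alice")
link_store.add_link("A_2", "X", owner="bob")
link_store.add_link("A_2", "Y", owner="bob")
notified = link_store.remove_document("X")

link_store.enable_auto_links(
    "cathedral_essay", "St. Stephen's Cathedral", "Picture", owner="alice"
)
generated = link_store.insert_document(
    "photo_17", "St. Stephen's Cathedral", "Picture"
)
```

Neither operation edits the text of A_1, A_2 or the essay, which is only possible because the links are not embedded in those documents.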

Links are usually attached to pieces of text or to pictures. However, one may also want to attach them to parts of a picture, to a moving object in a movie, a 3D object in a 3D scene, etc. In all such cases it is clearly impossible to embed the link information in the document (how would you embed such information in an MPEG film without destroying the MPEG coding?). I.e., links in "unorthodox documents" are only feasible if non-embedded links are used.

Another reason why it is of paramount importance to keep links and documents separate, to be able to differentiate between the rights of links and documents, and to add attributes to the links ("typed links") is that only in this way can hypermedia systems be "personalised" and "customised". Suppose teachers want to prepare multimedia presentations based on existing material for a particular class without (e.g., for copyright reasons) being allowed to copy the material. The teachers can connect various bits and pieces of information using links with the name of the class at issue as attribute. Students of the class are identified as such when they log in and hence only the links generated for that class become visible. Thus, the concept of being allowed to add "private" links to arbitrary documents, combined with link filtering based on some link attribute, makes it possible to customise a hypermedia system arbitrarily without copying any information. This is basically the original "transclusion" idea of Ted Nelson, see [Nelson 1987]. See [Maurer et al 1995] and [Lennon et al 1994] for applications to teaching support, and [Calude et al 1994], [Marchionini et al 1995] and [Maurer et al 1994 b] for applications to electronic journals and digital libraries. For electronic journals see also in particular the beautiful "classic" papers by Odlyzko, [Odlyzko 1994] and [Odlyzko 1996].

To be able to type links offers a tremendous additional potential: it is possible to display different links differently (indicating that one is a footnote, the other one a reference, the next one a link back to a table of contents), to filter out links (e.g., to only show links created by certain authors within a certain time period), or to perform even more complex computations based on the link types.
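Link filtering by attribute and by type, as described above, can be sketched as follows. The record `TypedLink`, the attribute `audience` and the function `visible_links` are assumptions made for this illustration; Hyperwave's actual link attributes and access checks are not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TypedLink:
    source: str
    target: str
    link_type: str                   # e.g. "footnote", "reference", "contents"
    audience: Optional[str] = None   # None means visible to everyone


def visible_links(links, user_groups, link_types=None):
    """Keep links the user may see, optionally restricted to some link types."""
    return [
        link for link in links
        if (link.audience is None or link.audience in user_groups)
        and (link_types is None or link.link_type in link_types)
    ]


links = [
    TypedLink("intro", "glossary", "footnote"),
    TypedLink("intro", "film_clip", "reference", audience="class 4b"),
    TypedLink("intro", "quiz", "reference", audience="class 5a"),
]

for_4b = visible_links(links, {"class 4b"})
footnotes_only = visible_links(links, {"class 4b"}, link_types={"footnote"})
```

The document "intro" is stored once; each class sees a different view of it purely because a different subset of links is shown.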

Hyperwave allows annotations, another feature that requires non-embedded, typed links: an arbitrary document can be attached as annotation to another one by you (even if you have no editing privileges for either of the two documents) and the link to the document attached is shown as annotation. This can be generalised to computer conferencing where link types can be used to show that for a particular thesis a certain number of counter-examples, supporting arguments, generalisations, etc. have been proposed.

Typed links are also convenient tools for version control! Summarising, the original link concept of WWW works well for small amounts of data (say 50 documents) but just does not support larger amounts of data, multi-person co-operation, customisation and other desirable features: more powerful additional features as available in Hyperwave are essential.

There is still another twist to links: without additional structure, using links in a large system becomes very confusing, much like "spaghetti-programming" in first generation high level programming languages such as FORTRAN or Basic. It is desirable to organise and structure information in a way going beyond having a "flat database with myriads of links". In Hyperwave one crucial concept in this direction is the notion of a collection mentioned earlier. Documents are gathered into collections, collections may belong to other collections, etc. with a DAG-like structure.

Not only does this allow material to be structured better so that it is easier to find; not only does it allow the scope for searches to be defined, or material to be packaged on a CD-ROM, or printed out, or whatever. It is also a powerful tool to replace links to some extent. To be specific, suppose we have defined a collection "pictures of Graz". Once this has been done, adding a picture to this collection does not require the creation of any links (nor would the removal of a picture necessitate adjustment of links): when the collection is accessed the titles of all pictures are shown and the relevant ones can be selected. If the collection has the attribute sequence, one can automatically step forwards and backwards in the sequence of pictures, again requiring no links at all.
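A collection with the sequence attribute can be pictured as an ordered membership list from which "next" and "previous" are derived on demand. The following Python sketch is a hypothetical illustration (`SequenceCollection`, `step_forward` and `step_back` are names invented here) and does not reflect Hyperwave's internal representation.

```python
class SequenceCollection:
    """Hypothetical ordered collection; navigation is derived, not stored as links."""

    def __init__(self, title):
        self.title = title
        self.members = []

    def add(self, document):
        self.members.append(document)

    def remove(self, document):
        self.members.remove(document)

    def step_forward(self, document):
        position = self.members.index(document)
        if position + 1 < len(self.members):
            return self.members[position + 1]
        return None  # already at the last member

    def step_back(self, document):
        position = self.members.index(document)
        return self.members[position - 1] if position > 0 else None


graz = SequenceCollection("pictures of Graz")
for picture in ("clock_tower", "town_hall", "opera_house"):
    graz.add(picture)

before_removal = graz.step_forward("clock_tower")
graz.remove("town_hall")  # no "next" or "previous" links need repair
after_removal = graz.step_forward("clock_tower")
```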

It may be worth mentioning that this concept is used particularly extensively in HM-Card [Maurer et al 1996] and makes administration of large amounts of material much easier. To quote Dieter Fellner from the University of Bonn, Germany: "The collection concept alone is enough reason to choose Hyperwave as WWW server".

In this section we have just mentioned two of the important advanced concepts of Hyperwave, features that go beyond ordinary WWW.

It is our contention that ordinary WWW and HTML should be seen as "thin interface layers" (and this is how Hyperwave treats them) but that more powerful tools must be used as further underpinning (and this is what Hyperwave does). For detailed information see the "Hyperwave" book [Maurer 1996]. An electronic version of this book is available under http://www.iicm.edu/hgbook. Additional information on Hyperwave can also be found in [Andrews et al 1995], [Kappe et al 1993a,b and 1994 a,b].

5. (Distributed) Database Facilities

In ordinary WWW servers HTML pages or other documents are stored as such, without any "meta-information", i.e. without any information about the documents. In contrast, in Hyperwave every document has some standard attributes and potentially further ones defined by the user. Standard attributes include author, creation date, date when the data is to be made public, expiration date, keywords, etc.

Note that such attributes are invaluable for searching and for administration. Typically, if a document concerns an event on a particular date, the document should be removed (and links pointing to it deactivated) after this date. In ordinary WWW systems this has to be done manually. This requires much effort and tends to leave behind many obsolete documents whose removal has been forgotten. It also adds to the number of links whose activation results in a message such as "object not found": when the document was removed, the deactivation of some link in some other document was forgotten, since such links are invisible due to the unidirectionality of links in ordinary WWW systems. In Hyperwave such a document would be entered with an appropriate expiration date. The removal of the document and the deactivation of links pointing to it (or at least the notification of the owners of links pointing to it) can then be handled by Hyperwave without manual intervention.
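The following sketch shows why an expiration attribute together with a separately stored link table makes this clean-up mechanical. All names (`Doc`, `Server`, `purge_expired`) are hypothetical and chosen for the illustration; they do not describe how Hyperwave is implemented.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional, Set, Tuple


@dataclass
class Doc:
    name: str
    owner: str
    expires: Optional[date] = None  # None means the document never expires


class Server:
    def __init__(self):
        self.docs: Dict[str, Doc] = {}
        # Links are (source, target) pairs kept apart from the documents,
        # so all links pointing to a given target can be found.
        self.links: Set[Tuple[str, str]] = set()

    def add(self, doc):
        self.docs[doc.name] = doc

    def link(self, source, target):
        self.links.add((source, target))

    def purge_expired(self, today):
        """Remove documents whose expiration date has passed.

        Returns (owner, source, target) triples naming the owners of
        surviving documents whose links were deactivated.
        """
        expired = {n for n, d in self.docs.items()
                   if d.expires is not None and d.expires < today}
        notify = [(self.docs[s].owner, s, t)
                  for s, t in sorted(self.links)
                  if t in expired and s not in expired]
        self.links = {(s, t) for s, t in self.links
                      if s not in expired and t not in expired}
        for name in expired:
            del self.docs[name]
        return notify


srv = Server()
srv.add(Doc("concert", owner="anna", expires=date(1997, 5, 1)))
srv.add(Doc("news", owner="bob"))
srv.link("news", "concert")

on_the_day = srv.purge_expired(date(1997, 5, 1))     # nothing has expired yet
day_after = srv.purge_expired(date(1997, 5, 2))      # "concert" is removed
```

Because the link from "news" to "concert" is visible from the target's side, the owner of "news" can be notified at the moment "concert" disappears.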

As a further example, consider a server that wants to present a new joke every day. In an ordinary WWW server some person has to replace the current joke by the new one manually (even on Sundays and holidays!), and if possible at 00:00 hours, so that early risers find a new joke and won't be disappointed. Using Hyperwave, an arbitrary set of jokes can be entered into the system, successive jokes having successive opening and expiration dates. The system will do the rest!
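The joke example amounts to giving each document a visibility window. A minimal sketch, assuming half-open windows of one day each (the function names `schedule` and `visible` are invented for this illustration):

```python
from datetime import date, timedelta


def schedule(jokes, start):
    """Give the i-th joke a one-day window: it opens on start + i days and expires the next day."""
    return [(start + timedelta(days=i), start + timedelta(days=i + 1), text)
            for i, text in enumerate(jokes)]


def visible(entries, today):
    """Return the texts whose window [opens, expires) contains today."""
    return [text for opens, expires, text in entries if opens <= today < expires]


entries = schedule(["joke A", "joke B", "joke C"], start=date(1997, 1, 4))
sunday = visible(entries, date(1997, 1, 5))       # second day of the schedule
afterwards = visible(entries, date(1997, 1, 7))   # all windows have closed
```

No action is needed at midnight: which joke is shown follows from the dates alone.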

Of course creation/modification dates are also valuable in other contexts. It may be a good idea for the Webmaster to have the system show all documents that have not been modified for (say) 2 years: the likelihood that such documents are no longer of interest is very high. In ordinary WWW systems it is impossible even to determine whether a document has been left unmodified for a long time. The author attribute of a document is also helpful for administrative purposes. Consider again a specific example: it is becoming more and more usual for the presentation of an organisation to include a presentation of its members, including some WWW pages edited by those members themselves. In that case it might be wise to check once in a while whether the server contains information on persons no longer associated with the organisation. Again, this is virtually impossible with ordinary WWW servers, but is easy to handle with Hyperwave.
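Both administrative checks reduce to simple filters once the attributes exist. The records and field names below are hypothetical sample data, and "two years" is approximated as 730 days:

```python
from datetime import date, timedelta

# Hypothetical attribute records; the field names are illustrative only.
docs = [
    {"title": "Staff page: Huber", "author": "huber", "modified": date(1996, 11, 3)},
    {"title": "Staff page: Meier", "author": "meier", "modified": date(1994, 2, 20)},
    {"title": "Directions", "author": "admin", "modified": date(1993, 6, 1)},
]


def not_modified_since(docs, today, days=730):
    """Titles of documents whose last modification is more than `days` days ago."""
    cutoff = today - timedelta(days=days)
    return [d["title"] for d in docs if d["modified"] < cutoff]


def by_unknown_authors(docs, current_members):
    """Titles of documents whose author is not among the current members."""
    return [d["title"] for d in docs if d["author"] not in current_members]


stale = not_modified_since(docs, today=date(1997, 1, 15))
orphaned = by_unknown_authors(docs, current_members={"huber", "admin"})
```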

Thus, document (and collection) attributes in Hyperwave are one more feature supporting system administration. Remember that structuring information (see Section 4) was another such helpful feature, and turned out to be convenient not only for the system administrator but also for users: it allowed users, e.g., to search for information within certain scopes (unions of collections), eliminating many useless "hits" when applying a search engine. The same holds for attributes: they can be used for searching. Hence searching within a certain scope and, e.g., only for documents created after a certain date, or by a certain author, allows much more specific queries than would otherwise be feasible.

As has been mentioned before, Hyperwave is a distributed database system in the sense that the databases in physically different servers can be seen as one logical database. Users may e.g. decide to perform a full-text search for a word like "butterfly" in three collections that reside on separate servers (and potentially in completely different corners of the world), and indeed restrict the search to documents that have been created after a certain date...all this with a single command!
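A scoped search that combines full text with attribute restrictions can be sketched as follows. The data layout (server name, then collection name, then a list of records), the sample data and the `search` function are assumptions for the illustration; a real distributed system would of course send the query to the servers rather than hold all data in one dictionary.

```python
from datetime import date

# Hypothetical layout: server name -> collection name -> list of attribute records.
servers = {
    "graz": {"biology": [
        {"title": "Alpine insects", "author": "anna", "created": date(1996, 3, 1),
         "text": "The butterfly population of Styria"},
        {"title": "Old survey", "author": "bob", "created": date(1993, 5, 9),
         "text": "A butterfly count from 1993"},
    ]},
    "auckland": {"nature": [
        {"title": "Forest walk", "author": "carol", "created": date(1996, 8, 12),
         "text": "Ferns and birds but no insects"},
        {"title": "Garden notes", "author": "carol", "created": date(1997, 1, 2),
         "text": "Every butterfly in the garden"},
    ]},
}


def search(servers, scope, word, created_after=None, author=None):
    """Full-text search for `word` in the (server, collection) pairs of `scope`.

    Optional attribute restrictions narrow the result further.
    """
    hits = []
    for server, collection in scope:
        for doc in servers[server][collection]:
            if word.lower() not in doc["text"].lower().split():
                continue
            if created_after is not None and doc["created"] <= created_after:
                continue
            if author is not None and doc["author"] != author:
                continue
            hits.append((server, doc["title"]))
    return hits


scope = [("graz", "biology"), ("auckland", "nature")]
recent = search(servers, scope, "butterfly", created_after=date(1995, 1, 1))
```

The scope keeps unrelated collections out of the result, and the date restriction drops the 1993 survey although it contains the search word.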

Hyperwave supports link consistency within a server: the highlighting that indicates a link from document X to document Y is automatically deactivated if document Y is removed. This is very valuable in its own right, yet becomes particularly interesting since Hyperwave preserves link consistency beyond the boundaries of servers: if document Y is removed on one server, all (Hyperwave) servers within a "tribe" that have links pointing to Y are notified of this fact (and the link highlighting is deactivated). To keep the maintenance of such "surface links" (links from a document in one server to a document in another server) from causing undue traffic on the net, an ingenious algorithm ("p-flood", see [Kappe 1995]) is used: it propagates the information to only some servers "in the vicinity", which propagate the information to others, etc. The problem that the algorithm has to address is that all servers must be notified "fast", even if many of the servers are currently "down" and unable to receive, let alone propagate, information.
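The flavour of such a propagation scheme can be conveyed by a small simulation. This is a simplified sketch written for this text, not the p-flood algorithm of [Kappe 1995] itself: the ring arrangement, the parameter `p`, the round-based timing and the `down_until` table are all assumptions of the illustration.

```python
import random


def flood(n, origin, p=1.5, down_until=None, seed=0, max_rounds=1000):
    """Simulate spreading one update among n servers arranged in a ring.

    In each round every informed, running server sends the update to its
    ring successor and, with probability p - 1, to one further server chosen
    at random. `down_until` maps a server to the first round in which it is
    running again. Returns the number of rounds until all servers are informed.
    """
    rng = random.Random(seed)
    down_until = down_until or {}
    informed = {origin}
    for rnd in range(1, max_rounds + 1):
        def up(server):
            return down_until.get(server, 0) <= rnd

        received = set()
        for s in sorted(informed):
            if not up(s):
                continue  # a server that is down cannot propagate
            targets = [(s + 1) % n]
            if rng.random() < p - 1:
                targets.append(rng.randrange(n))
            # A server that is down cannot receive; the sender retries next round.
            received.update(t for t in targets if up(t))
        informed |= received
        if len(informed) == n:
            return rnd
    raise RuntimeError("update did not reach all servers")


ring_only = flood(5, origin=0, p=1.0)                       # successor messages only
with_outage = flood(5, origin=0, p=1.0, down_until={2: 4})  # server 2 is down for three rounds
with_shortcuts = flood(20, origin=0, p=1.5, seed=1)         # extra random messages
```

With `p = 1.0` the update simply travels round the ring and waits in front of a server that is down; the additional random messages for `p > 1` can only shorten the time until every server is informed, at the price of somewhat more traffic.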

Summarising, much of the power of Hyperwave, a power that will be required by all future distributed hypermedia systems, lies in the fact that a high degree of data coherence can be assured across physical server boundaries.

It should also be clear, however, that Hyperwave is just a first step in a necessary direction for large multimedia repositories as in LIBERATION, and that further steps will be unavoidable. Only the actual use of very large WWW databases will show what new features or concepts will be needed. We want to state very explicitly, however, that we are convinced that at this point in time a number of assumptions are made about WWW services that will not be true for ever.

6. References

  • [Alberti et al 1992] Alberti, B., Anklesaria, F., Lindner, P., McCahill, M., Torrey, D.: Internet Gopher Protocol: A Distributed Document Search and Retrieval Protocol; FTP from boombox.micro.umn.edu, directory pub/gopher/gopher_protocol.
  • [Andrews et al 1995] Andrews, K., Kappe, F., Maurer, H., Schmaranz, K.: On Second Generation Hypermedia Systems; Proc. ED-MEDIA'95, Graz (June 1995), 75-80. See also J.UCS 0,0 (1994), 127-136 at http://www.iicm.edu/jucs.
  • [Berners-Lee et al 1992] Berners-Lee, T., Cailliau, R., Groff, J., Pollermann, B.: WorldWideWeb: The Information Universe; Electronic Networking:Research, Applications and Policy 1,2 (1992),52-58.
  • [Berners-Lee et al 1994] Berners-Lee, T., Cailliau, R., Luotonen, A., Nielsen, H.F., Secret, A.: The World-Wide Web; C.ACM 37,8 (1994), 76-82.
  • [Calude et al 1994] Calude, C., Maurer, H., Salomaa, A.: J.UCS: The Journal of Universal Computer Science; J.UCS 0,0 (1994) 109-117 at http://www.iicm.edu/jucs.
  • [Conklin 1987] Conklin, E.J.: Hypertext: an Introduction and Survey; IEEE Computer 20 (1987), 17-41.
  • [Deutsch 1992] Deutsch, P.: Resource Discovery in an Internet Environment-the Archie Approach; Electronic Networking: Research, Applications and Policy 1,2 (1992), 45-51.
  • [Dewan 1993] Dewan, P.: A Survey of Applications of CSCW Including Some in Educational Settings; Proc. ED-MEDIA'93, Orlando (1993), 147-152.
  • [Fenn et al 1994] Fenn, B., Maurer, H.: Harmony on an Expanding Net; ACM Interactions 1,3 (1994), 26-38.
  • [Haan et al 1992] Haan, B.J., Kahn, P., Riley, V.A., Coombs, J.H., Meyrowitz, N.K.: IRIS Hypermedia Services; Communications of the ACM 35,1 (1992), 36-51.
  • [Hall et al 1996] Hall, W., Davis, H., Hutchings, G.: Rethinking Hypermedia; Kluwer Academic Pub., Boston/London (1996).
  • [Kahle et al 1992] Kahle, B., Morris, H., Davis, F., Tiene, K., Hart, C., Palmer, R.: Wide Area Information Servers: An Executive Information System for Unstructured Files; Electronic Networking: Research, Applications and Policy 1,2 (1992), 59-68.
  • [Kappe 1995] Kappe, F.: A Scalable Architecture for Maintaining Referential Integrity in Distributed Information Systems; J.UCS 1,2 (1995), 84-104.
  • [Kappe et al 1993a] Kappe, F., Maurer, H., Scherbakov, N.: Hyper-G - a Universal Hypermedia System; Journal of Educational Multimedia and Hypermedia 2,1 (1993), 39-66.
  • [Kappe et al 1993b] Kappe, F., Maurer, H.: Hyper-G: A Large Universal Hypermedia System and Some Spin-offs; IIG Report 364, Graz, Austria (1993); also appeared as electronic version, anonymous FTP siggraph.org, in publications/May-93-online/Kappe.Maurer.
  • [Kappe et al 1994a] Kappe, F., Maurer, H.: From Hypertext to Active Communication/Information Systems; J.MCA 17,4 (1994), 333-344.
  • [Kappe et al 1994b] Kappe, F., Andrews, K., Faschingbauer, J., Gaisbauer, M., Maurer, H., Pichler, M., Schipflinger, J.: Hyper-G: A New Tool for Distributed Multimedia; Proc. Conf. on Open Hypermedia Systems, Honolulu (1994), 209-214.
  • [Koegel-Buford 1994] Koegel-Buford, J.: Multimedia Systems; ACM Press, SIGGRAPH Series (1994).
  • [Lennon et al 1994] Lennon, J., Maurer, H.: Lecturing Technology: A Future With Hypermedia; Educational Technology 34,4 (1994), 5-14.
  • [Marchionini et al 1995] Marchionini, G., Maurer, H.: The Role of Digital Libraries in Teaching and Learning; Communications of the ACM 38,4 (April 1995), 67-75.
  • [Maurer 1996] Maurer, H.: Hyperwave: The Next Generation Web Solution; Addison Wesley Pub.Co., UK (1996).
  • [Maurer et al 1994a] Maurer, H., Philpott, A, Scherbakov, N.: Hypermedia Systems Without Links; Journal of Microcomputer Applications 17, 4 (1994), 321-332.
  • [Maurer et al 1994b] Maurer, H., Schmaranz, K.: J.UCS -- The Next Generation in Electronic Journal Publishing; Proc. Electronic Publ. Conference, London (November 1994), in: Computer Networks for Research in Europe 26, Supplement 2-3 (1994), 63-69.
  • [Maurer et al 1995] Maurer, H., Lennon, J.: Digital Libraries as Learning and Teaching Support; Proc. ICCE'95, Singapore (December 1995).
  • [Maurer et al 1996] Maurer, H., Scherbakov, N.: Multimedia Authoring for Presentation and Education: The Official Guide to HM-Card; Addison Wesley Pub.Co., Germany (1996).
  • [Nelson 1987] Nelson, T.H.: Literary machines; Edition 87.1, 702 South Michigan, South Bend, IN 46618, USA (1987)
  • [Odlyzko 1994] Odlyzko, A.M.: Tragic Loss or Good Riddance? The Impending Demise of Traditional Scholarly Journals; to appear in: Electronic Publishing Confronts Academia: The Agenda for the Year 2000 (Peek, R.P., Newby, G.B., eds.); MIT Press (1995); see also J.UCS 0,0 (1994), 3-53 at http://www.iicm.edu/jucs.
  • [Odlyzko 1996] Odlyzko, A.M.: On the Road to Electronic Publishing; Euromath Bulletin (1996); see also J.UCS 2,11 (Nov. 1996) at http://www.iicm.edu/jucs.
  • [Pichler et al 1995] Pichler, M., Orasche, G., Andrews, K., Grossmann, E., McCahill, M.: VRweb: A Multi-System VRML Viewer; Proc. VRML 95, San Diego, CA (December 1995),77-85.
  • [Tomek et al 1991] Tomek, I., Khan, S., Muldner, T., Nassar, M., Navak, G., Proszynski, P.: Hypermedia-Introduction and Survey; Journal of Microcomputer Applications 14,2 (1991), 63-100.
  • [Van Dam 1988] Van Dam, A.: Hypertext'87 Keynote Address; Communications of the ACM 31,7 (1988), 887-895.