Washington, D.C.
October 18-20, 1995
Donald J. Waters, Associate University Librarian
Yale University
Robert Kagan, a Harvard psychologist, has recently written a wonderful book that I would commend to you for a variety of reasons. It is entitled In Over Our Heads: The Mental Demands of Modern Life. In this book, Kagan retells the following tale, which some of you have no doubt heard (Kagan 1994: 271). It is a story of a “mother getting breakfast ready for her son on a school day,” and it goes like this:
Hearing nothing indicating that he was up and getting dressed, [the mother] went to [her son’s] room, only to find him in bed. “Are you okay?” she asked. “I’m okay,” he replied, “but I’m not going to school today!” Being a modern mother, she decided to engage him in conversation. “Well, then,” she demanded, “you give me three good reasons why you aren’t going to school.” “Okay,” said her son. “I don’t like school. The teachers don’t like me. And I’m afraid of the kids.” “Okay,” said his mother, “now I’m going to give you three good reasons why you are going to school. Number one, I’m your mother and I say that school is important. Number two, you’re forty-five years old. And number three, you’re the principal of the school!”
The problems and prospects of archiving digital information have made many of us feel like the school principal in this story: better to stay home in familiar surroundings, with the comfort of a warm bed and Mom fixing breakfast in the kitchen, than to face what seems like a terrifyingly uncertain, expensive and time-consuming effort. The Commission on Preservation and Access and the Research Libraries Group (RLG) created the Task Force on the Archiving of Digital Information to help relieve the growing anxiety about digital archiving. The Task Force sponsors asked that it frame digital archiving as a set of problems and tasks and that it suggest an orderly, perhaps even manageable, approach to their resolution.
To achieve these goals, the Commission and RLG composed the Task Force of members with a breadth of experience from a broad range of disciplines and backgrounds, including many from the research library community. The Task Force sponsors also asked that the group seek wide input from other specialists and interested parties by issuing a draft report, distributing it widely, and inviting comment before composing a final report. We are now in the comment phase, which ends October 31, 1995. I invite ARL as an organization to comment on the report. I also appeal to each of you individually to engage the substance of the draft report, if you have not already done so, to encourage your home institution to do so in some form, and to help us with comments, criticisms, and suggestions.
To stimulate your attention to the issue of digital archiving, I will, in the brief remarks that follow, attempt to cast the work of the Task Force in terms of the theme of the hour: How can we realize economic benefits through inter-institutional agreements? I assume that you all are most interested in economic and other benefits that could accrue in the nexus of activities that ARL has defined under the general rubric of “scholarly communication.” I hope to develop here the argument that inter-institutional agreements regarding digital archiving will generate economy if and only if they are directed at each of at least three different dimensions of the system of scholarly communication:
• First, we need to forge — or renew — agreements about the centrality of archiving in the process of scholarly communication.
• Second, we need to affirm the utility of a systematic approach to the development of digital archiving.
• Third, we need to set the mechanics of digital archiving in motion as a pervasive and trusted foundation for cultural discourse that includes scholarly communication.
Any discourse about economy, about the efficient management of scarce resources toward valued ends, is ultimately a discourse about values. Agreements about digital archiving that generate economic value must of course be able to answer the central question: Of what value or good is archiving and why should any scarce resources be pushed its way? This is a difficult question about purpose that may immediately open questions about and prompt defenses of particular forms of organization for archiving. In considering the answer, however, we must separate issues of purpose and function from those of organization.
I note in passing here that the Task Force report consistently equates long-term preservation with archiving, and identifies digital archives, rather than digital libraries, as the unit of activity for the long-term preservation of digital materials. I maintain this usage here. We all know that many libraries frequently assume responsibility for the long-term preservation of the record of knowledge, but we have come to designate those that exercise such responsibility as a matter of course with special semantic markers, as in the phrase “research library.” Moreover, although we now refer to “digital libraries,” discussion of such entities to date has made almost no reference to the long-term value of the content, nor to the mechanisms that might be employed to preserve such value over time. Rather than use the semantically marked phrase that Peter Graham (1995) has suggested, namely the “digital research library,” we have adopted the simpler designation of “digital archive.”
In answer to the question about the value of archiving, the Task Force report opens by invoking the principle that culture — any culture — depends on the quality of its record of knowledge. If that record is defective, as it will be if urgent attention is not given widely to the preservation of information in digital form, then the quality of the culture is also at risk (Task Force 1995: 1-2). This “culture at risk” argument for the preservation of digital information may be sufficient for the Task Force report. However, it does not provide a sufficiently strong and compelling case about the economic motives that might drive actors, like ARL member libraries, to invest aggressively in the preservation of digital information.
The stronger case of economic motive requires us to identify the principles underlying a knowledge economy and to demonstrate the place of archiving among them. The basic principle that enables us to regard the knowledge economy as a construct separate from other kinds of economy is the notion that the pursuit of knowledge is its own end. As I craft the stronger case of economic motive for your review, I turn for help to the work of the great Yale religious historian, Jaroslav Pelikan, who has produced one of the most eloquent recent defenses of the pursuit of knowledge as its own end, rather than for the utility it provides.
In The Idea of the University: A Reexamination (1992), Pelikan critically examines the principle of knowledge as its own end and argues that it provides the rationale for education generally, and for the university in particular. Moreover, according to Pelikan, the principle of knowledge as its own end is merely one of a more comprehensive set of first principles that he calls the “intellectual virtues.” These virtues are essential for the pursuit of knowledge as its own end, and include principles of free inquiry and intellectual honesty, an obligation to convey the results of research, and an affirmation of the continuity of the intellectual life, upon which each generation builds and to which it contributes in turn (32-56). Building on this set of first principles, Pelikan argues that the advancement of knowledge through research, the transmission of knowledge through teaching, the diffusion of knowledge through publishing, and the preservation of knowledge in scholarly collections are the four legs supporting any table made for the pursuit of knowledge; they particularly support the table that has come to be known as the research university (16-17, 78-133).
Invoking the 19th-century phrasing of John Henry Newman, Pelikan goes on to suggest that support for teaching, research, and publication constitutes the “endowment of living [genius],” while efforts to preserve, or archive, knowledge by organizations like libraries, museums and archives represent “the embalming of dead genius” (110). Lest the connotations of these phrases give you pause, note that Pelikan is careful to distinguish embalming from entombing, and his use of “embalming” is a colorful synonym for preservation and archiving, which he takes to include all of the means necessary to make knowledge accessible to present and future generations. Moreover, he vigorously argues that “new knowledge has repeatedly come through confronting the old, in the process of which both old and new have been transformed” (120). The two motives at work in what we today call the process of scholarly communication (embalming and endowment of genius, the looking backward in preservation and the looking forward in research, teaching and publication) thus are inextricably linked and flow from the principle that the pursuit of knowledge is its own end; preserved work from past generations is a necessary foundation for present and future work, which in turn defines the accessibility of the preserved work.
If we accept Pelikan’s argument that knowledge is its own end and that the broadly defined function of preserving or archiving the record of knowledge is essential to the scholarly communication process, then where is the archiving function in the calculus of the emerging knowledge economy? A story that we in ARL seem to be constructing about scholarly communication from the point of view of research libraries is that the service we provide of preserving knowledge is increasingly held hostage by a tangled web of external factors and agents. The story lends itself to apocalyptic tones. It focuses on an outmoded tenure process that is dependent on research and teaching in increasingly narrow fields of specialization and is coupled to a system of publication governed by an oligarchy of avaricious publishers intent on maintaining profit levels by controlling pricing and gutting the copyright regulations of provisions that might limit the compensation the publishers receive for the intellectual property they control. Given a set of problems framed in this way, the solutions we have invented include sweeping reform of the outmoded tenure process, take-back-the-night approaches to copyright, large-scale cuts in acquisitions budgets on university campuses, and the metamorphoses of scholars and/or libraries into entrepreneurial publishers eager to compete with the big guys. There are many useful themes and innovations embodied in these solutions. Most of them, however, stray far from the touchstone principle of institutional interest among research libraries in preserving knowledge for future generations of scholars.
Can we instead generate an hypothesis about the current state of scholarly communication that frames the problems directly — or at least more directly — in terms of preservation? I believe that we can. Let us imagine that the core problem in the scholarly communication process, for at least a subset of scholarly disciplines, is that the conventional published record simply does not adequately capture the intellectual action. The real action occurs elsewhere: in on-line databases, on-line exchanges of pre-prints, listservs, and so on. Conventional publication in these disciplines adds little value to the work that has already been disseminated in other channels; rather, it is a redundant process, undertaken to generate, in effect, a certified archival record of the work. Because the audience paying attention to the field has already seen and absorbed the work in on-line versions, the printed publication channel grows increasingly narrow, consisting primarily of libraries that serve as archival institutions. Because of the narrow market, costs and prices consequently rise on the supply side. On the demand side, libraries respond by cutting titles from their collections.
There is clearly little logic or economy in a process whereby scholars use printed publications to establish an archival record only to find that the institutions responsible for ensuring that the archive endures for future generations cannot afford to purchase the publications. Framed in this way, the problems in the scholarly communication system are archival problems, and a focus on tenure, the mechanics of print publication, electronic versions of print publications, and institutional retention of copyright is looking for solutions in all the wrong places, or at least not in some of the right places. A focused archival solution might aim instead to capture the real intellectual activity from the on-line places wherever it is now naturally occurring and to ensure that such activity is housed in certified, durable and readily accessible archives. Where there is redundancy between print and electronic form, as there increasingly is in disciplines such as mathematics and physics where pre-print markets flourish, might not such a solution save scholars, publishers, libraries, and universities the trouble and expense of writing, publishing, collecting, and financing in conventional print forms merely to establish an archival record? Given a digital archive system on which they can depend and which provides real, tangible economic benefits, scholars might be moved to change not only the way that they conduct scholarship but also the mechanisms, such as tenure review, by which they measure the quality of that work.
If all these hypotheses are plausible, then do we not also need to say bluntly that our own unwillingness or inability, as archival institutions, to provide a trustworthy archival record of substantially changed and changing intellectual activity is itself a critical barrier to the rehabilitation and renewal of a viable (read: affordable) system of scholarly communication? The process of coming to terms with each other, with our academic colleagues and with publishers about the investment we must make in the system of scholarly communication and the savings that we must extract from that system is essentially a coming to terms about the centrality of archiving — the embalming of dead genius — in the pursuit of knowledge. But these understandings and agreements cannot be achieved immediately. And this brings me to my second point: that we need to affirm the utility of a systematic approach to the development of digital archiving.
As we contemplate the archiving of digital information, we have to understand that we are not seeking to fine-tune some technical variables of a system that is already long in place. While the goals are ultimately the same, we are not placing brittle books under a microfilm camera in a well-defined process. Instead, we are faced with what the Task Force report calls “a grander problem of organizing ourselves over time and as a society to maneuver effectively in a digital landscape” (Task Force 1995: 4). The effort to meet the cultural and economic imperatives of digital preservation requires us to build, almost from scratch, a system of infrastructure for moving the record of knowledge naturally and confidently into the future. The systematic approach, on which I believe we need to agree in order to build this infrastructure, has at least two dimensions: the elements of the system, and the manner in which we interact to deploy those elements and construct the system and subsystems for digital archiving.
The various elements of a system for archiving digital information — the kinds of information, the stakeholders and the operational functions — are discussed at length in the Task Force report. The discussion there is not perfect, nor have we identified all the factors that one might judge relevant. We would welcome your assessment of our judgments. However, it is perhaps less important that we have all the factors perfectly in hand than that we adopt a systematic process to ensure that over time we formulate and then confirm or disconfirm hypotheses about the interrelation of those elements and, in so doing, that we measurably improve our archival capabilities for digital information.
I also want to emphasize the manner in which we interact to deploy these elements and to construct the system and subsystems for digital archiving. We must, on the one hand, make a commitment to a complex iteration and reiteration of exploration, development and solution as the relevant factors and their interrelationships emerge and become clearer and more tractable. On the other hand, the manner of our interaction in a systematic approach to digital archiving must result in a complex division of labor. And this brings me to my third and final point: that our agreements to divide the labor as formal partners, as informal allies, or even as competitors, must soon substantially set in motion the mechanics of digital archives as a pervasive and trusted foundation for cultural discourse that includes scholarly communication.
Most of the Task Force recommendations for setting the mechanics in motion invite substantial inter-institutional action. I draw your attention to three of these recommendations. They each illustrate a different form of interaction and they each yield a different kind of economic benefit.
First, the Task Force calls for certified digital archives. In itself, certification yields no direct economic benefit. Yet the process of certification is meant to create an overall climate of value and of trust about the prospects of preserving digital information. Repositories claiming to be digital archives in a changing and uncertain environment must be able to prove that they are who they say they are, and that they can deliver on the preservation promise. The call for individuals and organizations to agree to collaborate in the design and implementation of standards, criteria and mechanisms for certification, and for prospective digital archives to submit to the certification process, is a summons for the wider community to affirm the values — at least in the abstract — of digital preservation and, ultimately, of the pursuit of knowledge as its own end.
The Task Force also emphasizes the need for a fail-safe mechanism in digital archives. Such a mechanism will enable a certified archival repository to exercise an aggressive rescue function to save digital information that it judges to be culturally significant and which is endangered in its current repository. We may not know enough about the use of digital information to reach consensus about what fair use of it is, but we do know that one of the greatest dangers to its long life is the ease with which it can be abandoned or destroyed. If concerted action is needed in the intellectual property arena to protect the rights necessary to support teaching and research, then let us focus at least some of that action on the development of the legal framework needed to support a fail-safe mechanism for digital archives. The economic benefit of such action is, of course, not in the dollars it directly generates or saves, but in the environment it creates for archival institutions to do their job and to realize the value of preserved work for future generations.
Finally, I call attention to the Task Force recommendation for a cooperative venture to preserve the documents, discourse, software products, and other digital information objects that serve to record the early digital age. Because the objects in this focal area are at such risk of loss, the project could provide a useful means of exploring the actual operation of archival fail-safe mechanisms. Moreover, conceived as a cooperative venture among multiple participating archives, the project would provide a necessary testbed for developing an on-line system of linked but distributed archives. One of the biggest unknowns in the digital environment is the full impact of distributed computing over electronic networks. However, as the Task Force report suggests in the section on costs and finances, and as Dr. Bowen of the Mellon Foundation has asserted earlier in his discussion of the JSTOR project, one of the greatest hopes for reducing costs in the scholarly communication process is the prospect of achieving economies of scale in the storage and distribution of electronic information over electronic networks. We need to verify these expectations of economic benefit in actual experience with a range of materials.
I conclude by observing that the notions of archives and archiving today have much currency and import, even outside the context in which we have been discussing them here. Just a week ago, on October 8th, 1995, in the New York Times Magazine, William Safire devoted his “On Language” column to the topic of kids’ slang. He advised that “if you want to stay on the generational offensive, when your offspring use the clichéd ‘gimme a break,’ you can top that expression of sympathetic disbelief with ‘jump back’ and the ever-popular riposte ‘whatever.’” However, he noted that some expressions, such as “I’m outta here” or “I’m history,” are now very much dated. “I’m history,” Safire quotes from a forthcoming study of slang, is “a parting phrase modeled on an underworld expression referring to death, and it has both inspired and been replaced by the more trendy expression, ‘I’m archives’” (Safire 1995: 30).
With regard to the future of digital information in the scholarly communications process, I have no doubt that the expression “I’m archives” will apply truthfully to all the institutions represented in this room. The choice before us, both individually and collectively, is to decide in what sense it will apply.
Graham, Peter. 1995. “Requirements for the Digital Research Library.” College and Research Libraries 56(4): 331-339.
Kagan, Robert. 1994. In Over Our Heads: The Mental Demands of Modern Life. Cambridge, MA: Harvard University Press.
Pelikan, Jaroslav. 1992. The Idea of the University: A Reexamination. New Haven: Yale University Press.
Safire, William. 1995. “Kiduage.” The New York Times Magazine. October 8, 1995: 28, 30.
Task Force on the Archiving of Digital Information. 1995. Preserving Digital Information. Draft Report, Version 1.0. Washington, D.C.: Commission on Preservation and Access and Mountain View, CA: The Research Libraries Group, August 23, 1995.