Digital Library Research and Development

The Coalition for Networked Information Spring 1995 Task Force Meeting was held in Washington, D.C. on April 10-11. The theme of the meeting was "Digital Library Research and Development." Paul Evan Peters, Coalition Executive Director, opened the meeting with some comments on the "digital library," a phrase that has replaced "virtual library" as the term of choice for the ultimate result of the transition of scientific and scholarly communication and publication from a system geared primarily to producing, distributing, and using information in print and other analog formats to a system geared to network and other digital formats.

Peters stated that while most early digital libraries are being built to manage digitized versions of things that were already available in analog formats, e.g., books, periodicals, and sound and video recordings, he believes that over time, an increasing number of digital libraries will be built to manage "digital" rather than "digitized" information. The "information objects" managed by this emergent class of digital libraries will be much more like "experiences" than they will be like "things," and each reader will have a unique experience with each such object in an even more profound sense than is already the case.

The opening session featured representatives from the three federal agencies that are sponsoring a four-year, $24.4 million joint initiative on digital libraries. The project's focus is to dramatically advance the means to collect, store, and organize information in digital forms, and make it available for searching, retrieval, and processing via communication networks, all in user-friendly ways.

Stephen Griffin, Program Manager, National Science Foundation, provided an overview of the projects, which are a mix of experimental testbeds and prototypes.

NSF program goals are to:
- Advance fundamental research over a large set of interdisciplinary topics;
- Develop and demonstrate new digital library technologies through experimental testbeds and prototyping;
- Build new applications and services; and
- Establish community presence and influence by becoming the "premier" effort in digital libraries and through broad participation by a diverse set of client groups.

Griffin also identified five research areas that NSF feels are fundamental to the development of digital libraries:
- Capturing data of all forms (text, images, video, etc.) and information about that data (metadata);
- Categorizing, organizing, and combining large volumes of information in a variety of forms and formats;
- Developing software and algorithms for data exploration and manipulation and combining large volumes of various types of information;
- Developing tools, protocols, and procedures for advancing the utilization of networked knowledge bases distributed around the nation and around the world; and
- Studying the impact of these technologies on individuals, organizations, sectors, and society at large.

Nand Lal, Manager of the Digital Library Technology Project, Goddard Space Flight Center, noted that NASA has an interest in digital library technologies both as a developer of content and as a consumer of information. Satellites will be sending down 1/4 terabyte of information per day in the near future, which makes NASA interested in new technologies that will enable it to manage this data better. NASA's involvement in digital library research and development will benefit the agency in performing its engineering and science mission, and in its public access and outreach functions. NASA also feels that substantial advances in technology will be necessary to make the National Information Infrastructure (NII) a reality.

Lal stated that a digital library includes the functionality of a traditional library, but is more than simply a digitized version of the same. It is a collection of information resources and services (accessible via the NII) that allows a subscriber easy and timely access to useful information and knowledge at a reasonable cost. Lal concluded with what he sees as the management challenges of digital library development: the adoption of, and adherence to, appropriate standards; the establishment of metrics for user satisfaction; the demonstration of scalability; and performance. He stated that in a totally distributed environment with a large spectrum of users consulting a large spectrum of information content, these will be great challenges.

Glenn Ricart, Program Manager, Advanced Research Projects Agency (ARPA), currently on leave from the University of Maryland, College Park, described ARPA's working hand-in-hand with NSF and NASA on digital library initiatives as an outgrowth of the NREN legislation. ARPA's view is that in addition to having information technology and applications, we need an information technology enterprise for the emerging economy. The National Information Enterprise (NIE) is the ARPA program focus that combines ubiquitous networking with services that link to applications, particularly in national priority areas. ARPA's major emphases for digital library research and development are in service areas, e.g., authenticating and synchronizing large caches of information. ARPA is interested in specific projects that deal with the tough questions of copyright and electronic commerce.

Ricart identified a number of key issues that need to be addressed in the development of digital libraries: technologies for locating documents; developing shared, distributed, long-lived repositories; strategies for document translation and interchange; scalable registration/recordation; and rights management systems.

William Arms, Corporation for National Research Initiatives (CNRI), followed up on the agency officials' presentations by providing an overview of digital library technical issues and terminology as an outgrowth of work being conducted by CNRI through the Computer Science Technical Reports Project and the Digital Library Forum. He identified eight key points that need to be considered as digital libraries develop:
- The technical framework exists in a legal framework (digital library architectures must take into account such issues as intellectual property, obscenity, communications law);
- Architecture needs to separate aspects that depend upon content from those that do not (e.g., identifiers and security are characteristics that are independent of content; text and computer programs are dependent on content);
- Names and identifiers are basic to the digital library (and should include a location-independent name that is globally unique and persistent across time);
- Digital library objects are more than collections of bits (they have attachments to the content (bits) such as properties, transaction log, and signature);
- Repositories must look after the information they hold (by supplying handles, transaction records, and security);
- The digital library object that is used is different from the stored object (users receive the result of executing a program such as SGML or the result of an interaction with a database);
- Users want intellectual works, not digital objects (e.g., a "report" refers to groups of objects in a digital library); and
- Understanding of digital library concepts is hampered by terminology (terms such as "document" have such strong social, professional, legal, or technical connotations that they obstruct discussion in this environment).

Other plenary sessions included a panel on networked information discovery and retrieval and a talk by science fiction author Daniel Keys Moran.

Thirty Project Briefings showcased digital library programs and a wide variety of networked information projects and issues.

Many documents from the Spring 1995 Task Force Meeting and the full meeting report are available on the Coalition's Internet server. The URL of the Coalition's homepage is: http://www.cni.org/CNI.homepage.html. Via gopher, point your gopher client to gopher.cni.org 70. To access the CNI ftp archive, browse the directory /CNI/tf.meetings at ftp.cni.org.

-Joan Lippincott, Assistant Executive Director

----------------------------------------------

NSF-ARPA-NASA Digital Libraries Initiative

Cooperation and ongoing interaction among the participants of the NSF-ARPA-NASA Digital Libraries Initiative Projects is intended to have the following six four-year projects function within a single programmatic framework.

Principal Institution (amount): Carnegie Mellon University ($4.8 million)
Partners: Microsoft, DEC, Bell Atlantic, QED Communications, Open University, Fairfax VA County Schools
Project Focus: Digital video with focus on math and science
Internet site: http://fuzine.mt.cs.cmu.edu/im/im-proposal.html

Principal Institution (amount): University of California, Berkeley ($4 million)
Partners: Xerox, Resources Agency of California, California State Library, Sonoma County Library, San Diego Association of Governments, The Plumas Corp., Shasta County Office of Education, Hewlett Packard
Project Focus: Environmental information
Internet site: http://http.cs.berkeley.edu/~wilensky/proj-html.html

Principal Institution (amount): University of California, Santa Barbara ($4 million)
Partners: State University of New York-Buffalo, University of Maine, industrial partners
Project Focus: Geographical information, including images and maps
Internet site: http://alexandria.sdc.ucsb.edu

Principal Institution (amount): University of Illinois ($4 million)
Partners: National Center for Supercomputing Applications, University of Arizona, IEEE, APS, John Wiley & Sons, U.S. News and World Report
Project Focus: Engineering and science journals
Internet site: http://www.grainger.uiuc.edu/dli

Principal Institution (amount): University of Michigan ($4 million)
Partners: IBM, Elsevier Science, Apple Computer, Bellcore, UMI International, McGraw-Hill, Encyclopedia Britannica, Kodak
Project Focus: Multimedia with focus on earth and space science
Internet site: http://www.sils.umich.edu/UMDL/HomePage.html

Principal Institution (amount): Stanford University ($3.6 million)
Partners: Association for Computing Machinery, Bellcore, Dialog, EIT, Hewlett Packard, ITC, Interval Research, O'Reilly and Associates, WAIS Inc., NASA Ames, Xerox PARC
Project Focus: Technologies for a single, integrated virtual library
Internet site: http://www-diglib.stanford.edu

-------

ARL 180
A Bimonthly Newsletter of Research Library Issues and Actions
Association of Research Libraries
May 1995