SPARC

http://www.arl.org/sparc/meetings/ala05/John_Wilkin.shtml

How the development of online search tools and the mass digitization of collections might be changing the role of libraries and affecting people's access to information

2005 SPARC-ACRL Forum

John P. Wilkin
Associate University Librarian for Information Technology and for Technical and Access Services
University of Michigan

June 25, 2005

A few weeks ago, I was asked to be part of a small panel speaking to the Association of Research Libraries (ARL) directors about the transformative implications of mass digitization of collections, e.g., Google’s efforts and those of the Internet Archive. Much of what I’m going to say to you today is what I said at their annual meeting. What happened in that discussion surprised me, but let me put that aside until later.

The focus of much of the discussion in our community and in the press, subsequent to the December 15th announcement of the Google digitization effort, has been on the “what” and “how” of Google’s digitization, rather than the implications. Of course there has been that small copyright issue and whether Google even has the right to digitize our collections. But what remains important here is the array of possibilities that digitization of this sort opens up. Yes, with Google’s aid we intend to digitize everything in the Michigan collection, but that is not the goal. Digitizing the collection is en route to a set of bigger goals, to real transformative issues.

Of all the large social issues that the Google digitization deal raises, the ones that feel particularly exciting to me right now are the professional issues. Let’s just start with one example, information retrieval. Let’s be honest with ourselves: libraries and publishers have done an awful job with information retrieval. Whether it’s the large research libraries in the Digital Library Federation, our Integrated Library System (ILS) vendors, or publishers like ProQuest or Elsevier, the sort of access we’ve created has been both amateurish and insular, uninformed by deep engagement with users and the sorts of creative thinking that the problem demands.

At the very least, then, what this effort might do is facilitate a new era of specialization, where we stop trying to be jacks of all trades and masters of none, cede the “generalist” role to Google and begin to do a decent job with the sorts of things our specialized communities need—e.g., integrating information with the tools of research and scholarly communication. I won’t talk about Google Scholar today, but this impact is probably just as true of the sorts of efforts made possible by Google Scholar as it is of Google Print.
 

My Big Five

Shared Digital Library

But of all of the professionally related transformative implications, if I were asked to name the most important ones and could kick off efforts that I think are key to our professional vitality, it would be the five that I’m about to discuss with you today. The first is a bit different in that it’s another example of means rather than ends. It is the opportunity to create a cooperative “universal” digital library.

What would it mean to be able to put online (in one virtual place) a copy of everything, and to put it online in a way that ensured the long-term survivability of the content and, where possible, to provide access to the content digitally? The rights that Michigan has secured in the digital images we get back from Google will allow us to pursue that goal. That is, Michigan may put its copies of content online in a way that allows them to be collaboratively curated, collaboratively extended by adding more content, and collaboratively used in the service of ends consistent with our mission as librarians. Michigan is now beginning discussions on how that vision might be made real.

The scatter of print collections we’ve grown used to is largely a function of the limitations of that medium: print is bound by place, and so we’ve created great overlap between our print collections at the same time that we’ve tailored those collections to the needs of our constituencies. We can finally begin to pool our resources effectively to build and sustain a comprehensive digital collection. Even though the rights Michigan has in its digital files allow the creation of such a cooperative collection, in order to do that we need to define governance models that ensure that we can sustain this thing.

And though getting agreement on the technology involved in this cooperative is probably a lot easier than getting agreement on the governance model, for digital libraries today we only have a Tower of Babel, and so finding the right model too will present challenges for us. We need a technology that is sufficiently flexible to allow us to create services for a variety of types of audiences, and we need a way to manage a very complex mix of rights to access. This is an exciting opportunity that is worthy of discussion itself, but, as mentioned, it’s also one of those examples of “means” rather than ends.[1]
 

The Other Four

So, what are some of those ends? We’ve been dancing around several key issues for a few years, and I’d like to talk about four: storage of print, coordinating access to materials that are not widely held, libraries as publishers, and new models of libraries that take “place” into account.

Storage of print

We need to move toward more strategic approaches to collection storage. We’ve been devoted (sometimes blindly!) to storing the same unused volumes dozens, if not hundreds, of times over. The data assembled by Schonfeld and Lavoie in their study of OCLC holdings show that millions of titles are widely held, and we know that many are unused or little used.[2] We can be big fans of the book without being devoted to waste. We must begin to move toward a handful of well-designed and coordinated regional storage facilities.

By coordinating those regional storage facilities, we can keep only heavily used materials in local collections. We’ve let ourselves be held hostage to the insinuation that someone else requires us to engage in this kind of waste. It’s been the fault of our provosts or our library directors, requiring us to store everything regardless of its utility to our campuses. Not only do I think that the directors are letting go of that, but I think the provosts are going to wise up to how meaningless this is, particularly as the Google-converted content becomes more popular. And whether or not they want us to sustain that overlap and waste, it’s time for us to begin advocating for freeing up our facilities for more vital activities.

Coordinating access to materials that are not widely held

Similarly, let’s make sure that the materials that are not widely held can be easily accessed by our users. The same data from Schonfeld and Lavoie demonstrate that more than 30 percent of the unique titles are held by only one or two libraries scattered geographically. What we see emerging in their data is a complex picture of significant overlap and significant distribution of unique titles. When Schonfeld and Lavoie looked at a small sample of those titles, they found that very few were manuscripts: most are likely to be works not widely collected by research libraries, and perhaps many are technical reports or other forms of local publication.

Whatever they are, we’ve been making users play a game of “find the book.” The same coordinated storage strategy can include an effort to ensure that we digitize those uniquely held non-rare works for a shared digital repository and that we can provide reasonable access to them. Hell, if we can let go of volume counts as a measure of our success, making these materials more readily available by coordinating access to them is going to seem attractive to just about everyone. We would clearly need a different strategy for the rare materials, but imagine if that were the only hurdle left in front of us?

Libraries as publishers

Libraries are about connecting users with information, so let’s embrace our role as disseminators of information, as publishers. We’ve been schizophrenic about collections and service, acting for years as if service to collections were a luxury, and then as if our physical collections were passé and only services were sexy. Service and collections come together in our new role as publishers.

University presses have increasingly lost their vitality and they have made themselves largely irrelevant by ignoring new economic models and new forms of publishing.  Particularly in college and research libraries, let’s do what we need to do to bring to life those new forms of scholarly communication and make sure that the future of ‘publishing’ is an integral part of our worlds.

Of course this includes mounting institutional repositories, a significant step in this area, but let’s also give attention to helping shape new forms and venues for publishing. Converting our past with Google is only prelude to what we might be able to do with initiatives like SPARC, initiatives where we can shape a new future for publishing with new forms of publication and new economic models.

Library as “place”

Let’s capitalize on the opportunity to develop the library-as-place. Even as we help create a library without walls and see use taking place remotely, our libraries are alive with users. Consider the interesting developments at Michigan: with a collection budget of $16M, we spend more than 25 percent on remotely accessible electronic resources, and we have plenty of data confirming that our users are accessing library resources outside of the library, from their homes and offices and around the world. At the same time, our gate counts and circulation statistics have been robust and even increasing. Our physical library is alive with activity.

There is much excellent writing emerging about the paradox of library-as-place. The paradox shouldn’t be surprising: in colleges and universities, for example, the library is the hub of campus life. Let’s do more with our libraries than make them into book mausoleums. The information commons is not a collection of computers; it’s a multi-use locale for use and users. We must create flexible, adaptable spaces for group study, for the application of technology in research, for presentations, for consultation with librarians, and for work in isolation.

Conclusion

What "Big Idea" does all of this add up to? I hope that you can see a picture of libraries that are activist in bringing together users and older forms of publishing, are focused on current collection needs in the most strategic ways, are champions of new forms of scholarly communication, and have turned their facilities into places where users congregate to work with each other and receive informed assistance from librarians. Being effective stewards of our resources will require us to eliminate waste and engage with the emerging information-seeking and information-using behaviors of our users.

Of course there are the broad social issues that this raises, and I’m eager to see the deep thinkers (for example, our schools of information and library studies) tackle things like:

  • What it will mean to have such wide, efficient, democratizing access
  • How access will be a driver for a number of things, like development of infrastructure in developing countries
  • How conversion like this will exaggerate and resolve IP issues

I’d like to remind everyone that this is not about Google Print’s success, or even the phenomenon of Google’s announcement. This future has been staring us in the face for years. For example, bookmen like Terry Belanger have been arguing for well over a decade that we should move away from the old modes of storing collections. We should stop putting off defining our future and take charge of those transformations.

In ending, I'd like to come back to that ARL directors meeting. I was grateful for the warm reception my remarks were given, particularly because I thought some of those remarks might be controversial. To my surprise, several of the ARL directors stood up and vocally supported the idea of abandoning volume counts and moving toward setting up a handful of regional repositories, and many echoed my call to engage in scholarly communication. There may be less resistance in our administrations or with our clientele than many assume, but whether that’s the case or not, it’s time to start building the library future that we all know our communities need.

I want to end by expressing my appreciation for the opportunity to speak on the topic. My role in UM’s library covers what are essentially back-office operations: technology that underpins all of our work, but not the staff responsible for delivering services to users using that technology; and technical services that go into supporting the development of our collections and creating descriptive access to them, but not the work of mediating between users and collections, or even building those collections.

One of the changes that will become increasingly important is the connectedness of our roles. No public service takes place without technology, no technology gets created without a sense of the needs of users, and technical services don’t operate in isolation from the needs of users. All of us in the profession bring experience and knowledge to bear on these problems, and the problems are "owned" by the entire profession in ways that do not respect the sharp lines drawn between the back office and the services we provide. If we’re going to make this new world work, we need to coordinate our efforts in ways that share these key issues across those boundaries.
 


1. Lest anyone think that creating a single cooperative means creating a single point of vulnerability for our digital materials, let me remind you that there are dozens of models that include generous replication and distribution of mission-critical content.

2. Approximately 7 million titles are held by ten libraries; more than 2 million titles are held by 50 or more libraries; nearly 2 million titles are held by 100 or more libraries.

 

posted: July 01, 2005