Association of Research Libraries (ARL®)

http://www.arl.org/resources/pubs/symp2/rous.shtml


Scholarly Publishing on the Electronic Networks

Electronic Publishing: A Five-Year Plan

Bernard Rous

Associate Director of Electronic Publishing
Association for Computing Machinery

Background

Let me begin with a quick statement about the Association for Computing Machinery. For those who might not know, ACM is the society for computing and information technology professionals with a membership of about 80,000. Its primary purposes are to advance the arts and sciences of information processing and to promote the interchange of information about them both among professionals and to the public.

ACM is in the process of launching a major electronic publishing program. [Overhead #1] This morning, I am going to try to briefly outline ACM's Five Year Plan, keying on some features as they reflect the rationale behind it. Along the way, I will touch on some of the steps you might want to consider in managing the transition from a print-oriented to an electronic-oriented publishing operation and I will offer some cautionary tales based on experience so far.

To help contextualize this discussion, I will give an overview of the scope of our publishing operation and indicate the nature of the material ACM seeks to capture in forms appropriate for electronic access and delivery as well as print. [Overhead #2] As you can see, ACM is not a particularly large publisher -- the annual print output is about 40,000 pages. The chart has organized ACM's publications into major groups distinguished by the level of editorial work; the rigor of the peer review; and the method of production.

The key to understanding how costly and difficult the transition will be, from an operational viewpoint, lies in understanding the current mode of production and what is involved in changing it. Obviously, the conference proceedings and Special Interest Group newsletters will be costliest to integrate within the system since they are now entirely prepared by volunteer effort and turned over to staff as camera copy for printing and distribution.

To set the historical context for ACM's new electronic publishing venture, I've listed some of the electronic initiatives ACM has been involved with over the last decade. [Overhead #3] An important point is that these experimental projects have provided a wide variety of experience -- from producing our own hypertext and CD-ROM products for sale, to participating in, and monitoring the progress of, major research efforts by granting the rights to our publications. And it is through a history and practicum with these very difficult projects that an experience base has been built in ACM's staff -- experience regarding both technical matters and business issues. Such an experience base is critical to moving ahead with a full-scale program. (I might add that in some circles, ACM has had a reputation of being analogous to the cobbler's barefoot children. The fact is that while not in the forefront of electronic publishing, neither has ACM been idle. Other publishers have moved ahead. They have invested literally millions of dollars -- often in proprietary non-standard systems that now present some real problems.)

Important Lessons

Some of the earlier projects were real eye-openers in a number of key areas [Overhead #4], discussed below:

Why Take the Plunge?

After all this (and it is really only the tip of the iceberg), you might wonder why ACM is about to jump off the cliff in launching its full-scale electronic publishing program. Well, despite the above difficulties, ACM does firmly believe in the growing importance of electronic access to scientific and technical information. Some reasons for our moving ahead include:

For these reasons, ACM decided to somewhat reduce its costly one-at-a-time electronic initiatives and focus instead on a systematic approach that had a chance of being really cost-effective.

One of the key aspects of ACM's plan is that it seeks to integrate, as seamlessly as possible within the electronic system capabilities, all the steps in the creation of knowledge, from author origination to final archiving. [Overhead #5] Thus, when we use the term "electronic publishing," ACM is defining it both as the automation of production and as dissemination or access in electronic form.

Unfortunately, for information just to be in electronic form is not good enough -- not nearly good enough -- to take full advantage of the possibilities in the digital medium. After all, we print publishers have all had our publications in electronic form for many years now -- as digitized master typesetting tapes (that are fairly useless other than to reproduce the original print publication). And we have had desktop publishing systems, whose very success has undoubtedly set back the cause of electronic publishing for years. I say this because the fundamental approach of desktop publishing is at least simplistic if not altogether wrong; it is based on the page paradigm and is essentially an electronic tool to automate the production of the printed page -- cheaply, on the author's desk. Despite the advances offered by Adobe's Carousel -- now called Acrobat -- publishing now needs to move beyond the digital analogue of paper.

A Strategy

Therefore, ACM's program requires, first and foremost, the replacement of the technology infrastructure used to produce publications, so that print will now be enabled as a byproduct of the process that creates an electronic library.

I want to emphasize that ACM's planning assumes the ongoing importance of print, not only because print generates a known and fairly predictable revenue stream that the society uses to support its activities; but because of the inherent utility of the printed page over current (and possibly even future) screen display technology for certain types of ACM information and certain kinds of usage. Therefore, the system to be built must also have the capability of delivering good quality typeset files for centralized and/or distributed local printing. [Overhead #6]

At the same time that ACM assumes the ongoing importance of print, we do not believe in the page as the storage format standard. We have chosen SGML as the appropriate standard for the publications database.

The choice of SGML as a document standard is in some ways a hard one:

Nevertheless, we see SGML as a necessary choice, perhaps less as the interchange standard it was intended to be than as the means of gaining control over and managing an information resource. The fact that SGML is presentation neutral, which makes it hard (since all information must eventually be displayed), also gives it flexibility and allows it to support the kinds of products and services listed on the overhead. In addition, its flexibility should allow it to work with future technologies that may emerge.
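To make the "presentation neutral" point concrete, here is a minimal sketch in Python. It is only an illustration, not ACM's system: the element names (`article`, `title`, `author`, `para`) and both rendering functions are hypothetical, and well-formed XML stands in for SGML because Python's standard library has no SGML parser. The point it shows is that one structured source can feed more than one product.

```python
import xml.etree.ElementTree as ET

# Hypothetical article fragment. Well-formed XML stands in for SGML here;
# the element names are assumptions made for this sketch.
SOURCE = """
<article>
  <title>Electronic Publishing: A Five-Year Plan</title>
  <author>Bernard Rous</author>
  <para>Print is a byproduct of the electronic library.</para>
</article>
"""


def render_print(doc):
    """Lay the article out as a simple page: title, byline, body text."""
    lines = [doc.findtext("title").upper(), "by " + doc.findtext("author"), ""]
    lines += [para.text for para in doc.findall("para")]
    return "\n".join(lines)


def render_index_entry(doc):
    """Reduce the same article to a one-line catalogue entry."""
    return f"{doc.findtext('author')}: {doc.findtext('title')}"


doc = ET.fromstring(SOURCE.strip())
print(render_print(doc))
print(render_index_entry(doc))
```

Neither function changes the stored document; each one only decides how the marked-up structure is presented, which is why the same source can serve print and other products.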

Based on its overall Electronic Publishing Plan, ACM invited some 35 vendors to respond to a Request for Information. The 20 responses have been reassuring: there is a small but growing set of appropriate tools and systems to accomplish our goals. After evaluating initial vendor responses, we are now working on a Phased Implementation Plan. [Overhead #7] We plan to begin with about 8,000 annual pages of our primary journals and focus only on those tasks on the production side that are required to build the information resource.

All the other publications and the tasks on the distribution side to enable electronic peer review, electronic access and delivery, and optical disk product development will be taken up in later phases. The reason for this staging is that it is not only necessary to produce the resource first, but we also hope that the very murky business picture will be clearer when we're ready for Phase 2.

The various boxes [Overhead #8] represent the modules or functional areas of Phase 1 of the system. Let me state for the record that the names you see inside the boxes are not intended as exhaustive lists, or even as optimal lists, but merely as sample lists of vendors who may offer solutions for these functional areas. Having said that, what I'd really like to talk about are the criteria to consider in making a selection.

In considering the Capture and Conversion module, realize that this could be the most difficult part in setting up an SGML application, particularly if the publisher has (as does ACM) an international authoring community using a varied set of tools on different hardware systems in different operating environments. The good news is that there are many vendors and packages to help. The bad news is the same. There are so many because none of them can provide a really clean lossless conversion process. Some level of manual intervention will be necessary. And it is important to be sure that the DTDs are set up correctly at the start.
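The need for manual intervention can be illustrated with a small Python sketch. Everything in it is assumed for illustration: the style names, the element names, and the `convert` function are hypothetical and do not describe any vendor's product. It maps author-side paragraph styles to structural elements and sets aside whatever it cannot map, rather than guessing.

```python
# Hypothetical mapping from author-side style names to element names
# defined in a DTD. Both sides are assumptions made for this sketch.
STYLE_TO_ELEMENT = {
    "Title": "title",
    "Heading 1": "section-title",
    "Body Text": "para",
}


def convert(paragraphs):
    """Split (style, text) pairs into tagged output and a manual-review queue.

    Text is wrapped as-is; no escaping of markup characters is attempted.
    """
    tagged, needs_review = [], []
    for style, text in paragraphs:
        element = STYLE_TO_ELEMENT.get(style)
        if element is None:
            # Unknown style: leave it for a person to resolve.
            needs_review.append((style, text))
        else:
            tagged.append(f"<{element}>{text}</{element}>")
    return tagged, needs_review


manuscript = [("Title", "A Plan"), ("Sidebar", "Note"), ("Body Text", "Hello")]
tagged, needs_review = convert(manuscript)
print(tagged)
print(needs_review)
```

The more varied the authoring tools, the longer the review queue tends to grow, which is one reason getting the DTDs right at the start matters.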

The Editing Tool -- find out how tightly integrated it is with the database and composition modules; investigate carefully the costs for that integration if it is not packaged together.

The Composition Engine -- must accept structured standard files from the editing environment and from the publications database; and it must handle the specification requirements for type, design, and quality. The best thing to do is probably benchmark it with one of the existing publications and do a site visit at another application.

The Workflow Management function provides scheduling, tracking, dunning, reporting, and statistics on the objects in the system: basically, who has what document; what task they must perform; and the date when it should be complete. This can be costly to develop. Try to get a package for this that is, or can be, integrated with the document database.
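The core of such a tracking function (who has what document, what task they must perform, and when it is due) can be sketched as a small Python record. This is a hedged illustration only: the `Assignment` fields, the sample names, and the dates are all hypothetical, and a real package would sit on top of the document database.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Assignment:
    document: str  # identifier of the manuscript (hypothetical scheme)
    holder: str    # who currently has the document
    task: str      # what they must do with it
    due: date      # when the task should be complete


def overdue(assignments, today):
    """Return the assignments whose due date has already passed."""
    return [a for a in assignments if a.due < today]


def dunning_notice(assignment, today):
    """Build a one-line reminder for a late assignment."""
    days_late = (today - assignment.due).days
    return (f"{assignment.holder}: '{assignment.task}' on "
            f"{assignment.document} is {days_late} day(s) late")


# Sample data, invented for this sketch.
queue = [
    Assignment("MS-001", "Reviewer A", "peer review", date(1993, 3, 1)),
    Assignment("MS-002", "Copy editor B", "copy edit", date(1993, 3, 20)),
]
today = date(1993, 3, 10)
for late in overdue(queue, today):
    print(dunning_notice(late, today))
```

Scheduling, reporting, and statistics are further queries over the same records, which is why integration with the document database matters so much for cost.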

The Publications Database -- Try to specify requirements for it clearly and in detail. There may be substantial development cost for your application -- it is vital to get the correct functionality. One may have to choose between some well-known relational packages and some newer object-oriented databases that are relatively untested but may be better suited to SGML documents.

With all the packages selected, make sure there is decent support, that the platform is extensible and that there is an upgrade path. Remember that you'll probably have to replace the technology in three or four years anyway!

Questions and Cautions

In conclusion, as the distinction between what constitutes scholarly communication and what constitutes scholarly publication first blurs and then re-forms in the communities of cyberspace, publishers face a great uncertainty about the future shape of the scholarly publishing business.

Who will be the publishers of tomorrow? Will it be those who own and/or control the means of communication? Will it be the phone companies? Will it be the sysops? Or the academic computing service centers? What will become of the publishers of today when print no longer serves as the privileged mode of communicating and disseminating scholarly work? What will become of libraries, when distributed virtual knowledge bases render physical archive sites unnecessary? For that matter, what will become of the scholar herself, with the waning of the current recognition and reward system for intellectual achievement? Will the scholar become a self-publisher? Will the authority of authorship itself begin to wane with the deconstruction of linear text and the discursive monologues of St. Jerome? And with their replacement by chunk-sized information bytes in the quasi-anonymous, fiercely egalitarian world of the digital network?

I believe that publishers and librarians of today are in for a particularly rough time in the not too distant future. We will be pressed to demonstrate the values that we add to the knowledge chain -- in its creation, refinement, dissemination, and preservation.

I would hark back to Yuri Rubinsky's introductory talk and suggest that what should be hidden in the subterranean tunnels is the technology itself and that this is a challenge for the hardware and software developers. At the same time, there is an urgent task for the librarians and publishers to lay bare, to expose, and yes, to market, the best value added services they can provide in the world of digital information. If you are in the position of trying to think through or manage the transition from print to electronic publishing, I would keep several things in mind.

OVERHEAD ONE

ACM's Electronic Publishing Five-Year Plan

  1. Outline of the Plan: the approach and its rationale

  2. Practical Steps for planning a transition from print-oriented to electronic-oriented publication

  3. Potential Problems: warnings and admonitions


OVERHEAD TWO

Current Publishing Operation by Volume and Type

Total Annual Pages = 40,000 pages

Natural groupings:

Group                      Annual pages    Mode
The primary journals       7,700           Traditional
Reference (CR/Guide)       2,400           Electronic DB
ACM Press Books            2,400           AW, mixed
SIG Newsletters*           14,500          Camera Copy
Conference Proceedings     17,100          Camera Copy
Other**                    2,600           Mixed

* 6,500 of the 14,500 newsletter pages are actually conference proceedings.
** Includes The Graduate Assistantship Directory (GAD), The Administrative Directory, Self-Assessment Procedures, Roster, CALGO, Curricula, The ACM Publications Catalogue.


OVERHEAD THREE

Early Efforts...

OVERHEAD FOUR

...and their importance

OVERHEAD FIVE

Overall Goals

OVERHEAD SIX

Products and Services Goals

OVERHEAD SEVEN

Recommendations for Phase One Implementation

OVERHEAD EIGHT

Prime Vendor Candidates

IDI
ArborText
XyVision
Xerox
Interleaf
IBM
Datalogics?

1.1 Capture/Convert

IDI
Avalanche
Atlis
XyVision
SoftQuad
Xerox
Interleaf
IBM
OCLC
Agfa
OpenText

1.2 Write/Edit

Agfa
SoftQuad
ArborText
Interleaf
Xerox
(WordPerfect)

1.3 Compose/Present

Agfa
XyVision
Interleaf
ArborText
(Frame)
(Quark)
Xerox
IBM

1.4 Workflow Mgt/Track

RCP
IDI
Agfa
Interleaf
Xerox
ArborText

1.5 Text & Image Storage

RCP
IDI
Agfa
Interleaf
Xerox
ArborText