Bernard Rous
Associate Director of Electronic Publishing
Association for Computing Machinery
Background
Let me begin with a quick statement about the Association for Computing Machinery. For those who might not know, ACM is the society for computing and information technology professionals with a membership of about 80,000. Its primary purposes are to advance the arts and sciences of information processing and to promote the interchange of information about them both among professionals and to the public.
ACM is in the process of launching a major electronic publishing program. [Overhead #1] This morning, I am going to try to briefly outline ACM's Five Year Plan, keying on some features as they reflect the rationale behind it. Along the way, I will touch on some of the steps you might want to consider in managing the transition from a print-oriented to an electronic-oriented publishing operation and I will offer some cautionary tales based on experience so far.
To help contextualize this discussion, I will give an overview of the scope of our publishing operation and indicate the nature of the material ACM seeks to capture in forms appropriate for electronic access and delivery as well as print. [Overhead #2] As you can see, ACM is not a particularly large publisher -- the annual print output is about 40,000 pages. The chart has organized ACM's publications into major groups distinguished by the level of editorial work; the rigor of the peer review; and the method of production.
The key to understanding how costly and difficult the transition will be, from an operational viewpoint, lies in understanding the current mode of production and what is involved in changing it. Obviously, the conference proceedings and Special Interest Group newsletters will be the costliest to integrate within the system since they are now entirely prepared by volunteer effort and turned over to staff as camera copy for printing and distribution.
To set the historical context for ACM's new electronic publishing venture, I've listed some of the electronic initiatives ACM has been involved with over the last decade. [Overhead #3] An important point is that these experimental projects have provided a wide variety of experience -- from producing our own hypertext and CD-ROM products for sale, to participating in, and monitoring the progress of, major research efforts by granting the rights to our publications. And it is through a history and practicum with these very difficult projects that an experience base has been built in ACM's staff -- experience regarding both technical matters and business issues. Such an experience base is critical to moving ahead with a full-scale program. (I might add that in some circles, ACM has had a reputation of being analogous to the cobbler's barefoot children. The fact is that while not in the forefront of electronic publishing, neither has ACM been idle. Other publishers have moved ahead. They have invested literally millions of dollars -- often in proprietary non-standard systems that now present some real problems.)
Important Lessons
Some of the earlier projects were real eye-openers in a number of key areas [Overhead #4], discussed below:
While there are certain ways in which electronic publishing can make information more timely, it can also delay it, particularly if one is trying to provide added value that takes advantage of the digital medium. The ACM Hypertext Compendium, for example, was a collection of 120 papers woven together within a web of handcrafted hypertext links. Constructing these links was an enormous intellectual effort undertaken by the human editor of this small hypertext library. It took well over a year.
Currently, accessibility is not really an issue for print publishers. They provide access for a market of people who are sighted and literate in a particular language. Distributing electronic publications, particularly those bundled with searching and navigating tools and software for rendering displays, suddenly restricts access to particular platforms, and even to operating environments within those platforms, and further raises considerations of different monitor display dimensions. To overcome market fragmentation, costs are driven up for each additional platform implementation. One is likely to experience new and sometimes considerable costs in the area of data preparation.
As publishers move from selling printed content to licensing electronic access, the familiar ways of doing business and doing cost-price analyses change. This is one of the reasons why publishers tend to be hesitant about jumping into the electronic arena. There is a lot of revenue risk.
There are also new marketing problems. We were surprised to discover what in hindsight should have been obvious: Selling an entire library on a CD-ROM at a price point in the thousands of dollars is not the same as selling printed journal subscriptions. Significant marketing resources must be put into closing each individual sale. Demo disks are helpful and trial periods are often required.
In addition, the publisher must provide an entirely new service -- support for the user.
Why Take the Plunge?
After all this (and it is really only the tip of the iceberg), you might wonder why ACM is about to jump off the cliff in launching its full-scale electronic publishing program. Well, despite the above difficulties, ACM does firmly believe in the growing importance of electronic access to scientific and technical information. Some reasons for our moving ahead include:
Today, there are electronic document interchange standards that have not only been passed by international standards organizations, but, more important, are now being implemented in more and more publishing software.
The technology is also getting cheaper. Even though one cannot go out and buy a complete system that does everything one might wish, most of the pieces now exist and can be integrated with some development work and customization for one's specific application.
And, as I mentioned, there is an experience base with staff to facilitate the transition.
There is also new potential due to the phenomenal growth of network communications.
For these reasons, ACM decided to somewhat reduce its costly one-at-a-time electronic initiatives and focus instead on a systematic approach that had a chance of being really cost-effective.
One of the key aspects of ACM's plan is that it seeks to integrate, as seamlessly as possible within the electronic system capabilities, all the steps in the creation of knowledge, from author origination to final archiving. [Overhead #5] Thus, when we use the term "electronic publishing," ACM is defining it both as the automation of production and as dissemination or access in electronic form.
Unfortunately, for information just to be in electronic form is not good enough -- not nearly good enough -- to take full advantage of the possibilities in the digital medium. After all, we print publishers have all had our publications in electronic form for many years now -- as digitized master typesetting tapes (that are fairly useless other than to reproduce the original print publication). And we have had desktop publishing systems, whose very success has undoubtedly set back the cause of electronic publishing for years. I say this because the fundamental approach of desktop publishing is at least simplistic if not altogether wrong; it is based on the page paradigm and is essentially an electronic tool to automate the production of the printed page -- cheaply, on the author's desk. Despite the advances offered by Adobe's Carousel -- now called Acrobat -- publishing now needs to move beyond the digital analogue of paper.
A Strategy
Therefore, ACM's program requires, first and foremost, the replacement of the technology infrastructure used to produce publications so that print will now be enabled as a byproduct of the process that creates an electronic library.
I want to emphasize that ACM's planning assumes the ongoing importance of print, not only because print generates a known and fairly predictable revenue stream that the society uses to support its activities; but because of the inherent utility of the printed page over current (and possibly even future) screen display technology for certain types of ACM information and certain kinds of usage. Therefore, the system to be built must also have the capability of delivering good quality typeset files for centralized and/or distributed local printing. [Overhead #6]
At the same time that ACM assumes the ongoing importance of print, we do not believe in the page as the storage format standard. We have chosen SGML as the appropriate standard for the publications database.
The choice of SGML as a document standard is in some ways a hard one:
- First, because it costs a lot to build and operate an SGML application;
- Second, because good rendering and display technologies from this base are still limited;
- Third, because the standard is still a little rough in the area of mathematics and tables.
But nevertheless, we see SGML as a necessary choice, perhaps even less as an interchange standard as intended, than as the means of gaining control over and managing an information resource. The fact that SGML is presentation neutral, which makes it hard (since all information must eventually be displayed), also gives it flexibility and allows it to support the kinds of products and services listed on the overhead. In addition, its flexibility should allow it to work with future technologies that may emerge.
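To make "presentation neutral" a little more concrete, here is a minimal sketch, not ACM's actual system. It assumes a hypothetical article markup (the element names `article`, `title`, `author`, and `para` are invented for illustration) and uses well-formed XML as a stand-in for SGML, since Python's standard library has no SGML parser. The same stored structure is rendered once as plain text for a print pass and once as simple HTML for screen display.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical article markup (assumption). Well-formed XML stands in
# for SGML here; the stored file says what each piece IS, not how it looks.
SOURCE = """
<article>
  <title>Managing the Transition</title>
  <author>A. Author</author>
  <para>Print is a byproduct of the electronic library.</para>
  <para>Structure is stored; appearance is decided later.</para>
</article>
"""

def to_print_text(article):
    """Render the structure as plain text, e.g. as input to a print pass."""
    lines = [article.findtext("title").upper(),
             "by " + article.findtext("author"),
             ""]
    lines += [p.text for p in article.findall("para")]
    return "\n".join(lines)

def to_screen_html(article):
    """Render the very same structure as simple HTML for on-screen display."""
    parts = ["<h1>%s</h1>" % html.escape(article.findtext("title")),
             "<p><i>%s</i></p>" % html.escape(article.findtext("author"))]
    parts += ["<p>%s</p>" % html.escape(p.text) for p in article.findall("para")]
    return "\n".join(parts)

article = ET.fromstring(SOURCE.strip())
print(to_print_text(article))
print(to_screen_html(article))
```

Neither renderer changes the stored document, which is the point: a new product or display technology would mean a new rendering function, not a new archive.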
Based on its overall Electronic Publishing Plan, ACM invited some 35 vendors to respond to a Request for Information. The 20 responses have been reassuring: there is a small but growing set of appropriate tools and systems to accomplish our goals. After evaluating initial vendor responses, we are now working on a Phased Implementation Plan. [Overhead #7] We plan to begin with about 8,000 annual pages of our primary journals and focus only on those tasks on the production side required to build the information resource.
All the other publications and the tasks on the distribution side to enable electronic peer review, electronic access and delivery, and optical disk product development will be taken up in later phases. The reason for this staging is not only that it is necessary to produce the resource first, but also that we hope the very murky business picture will be clearer when we're ready for Phase 2.
The various boxes [Overhead #8] represent the modules or functional areas of Phase 1 of the system. Let me state for the record that the names you see inside the boxes are not intended as exhaustive lists, or even as optimal lists, but merely as sample lists of vendors who may offer solutions for these functional areas. Having said that, what I'd really like to talk about is the criteria to consider in making a selection.
The first thing to realize is that a single vendor is unlikely to be able to directly supply the best solutions for the entire system. One therefore needs to decide whether one needs a prime contractor to manage the entire project and to act as the systems integrator or whether one has the staff resources to undertake that role oneself.
Second, success or failure of the project may well depend on the prime vendor. A relationship of basic trust is critical. It might even outweigh the quality of the technology.
Third, one will likely need several vendors who must work together. One might allow the prime vendor to choose its partners, or the organization can make selections based on its own evaluation. But it is a tricky area because the forced marriage may not work out well, even if the technical solution is superior.
In considering the Capture and Conversion module, realize that this could be the most difficult part in setting up an SGML application, particularly if the publisher has (as does ACM) an international authoring community using a varied set of tools on different hardware systems in different operating environments. The good news is that there are many vendors and packages to help. The bad news is the same. There are so many because none of them can provide a really clean lossless conversion process. Some level of manual intervention will be necessary. And it is important to be sure that the DTDs are set up correctly at the start.
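As a rough illustration of why some manual intervention is unavoidable, consider the sketch below. It is not any vendor's converter: the style names, the element names, and the `convert` function are all hypothetical. It maps the paragraph styles an author's tool might emit onto DTD element names and sets aside anything it cannot map for a human editor.

```python
# Hypothetical mapping from author-tool style names to DTD element names.
# Both sides of this table are assumptions made for illustration.
STYLE_TO_ELEMENT = {
    "Title": "title",
    "Heading 1": "section-head",
    "Body Text": "para",
    "Reference": "bibitem",
}

def convert(paragraphs):
    """Split (style, text) pairs into converted elements and a manual queue."""
    converted, needs_review = [], []
    for style, text in paragraphs:
        element = STYLE_TO_ELEMENT.get(style)
        if element is None:
            # No clean mapping exists: leave it for a human editor.
            needs_review.append((style, text))
        else:
            converted.append("<%s>%s</%s>" % (element, text, element))
    return converted, needs_review

# A manuscript using one style the mapping has never seen.
manuscript = [
    ("Title", "A Sample Paper"),
    ("Body Text", "First paragraph."),
    ("My Fancy Equation", "E = mc^2"),
]
converted, needs_review = convert(manuscript)
print(converted)
print(needs_review)
```

With authors worldwide using different tools, the unmapped queue is where the cost accumulates, which is one reason to settle the DTDs before conversion begins.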
The Editing Tool -- find out how tightly integrated it is with the database and composition modules; investigate carefully the costs for that integration if it is not packaged together.
The Composition Engine -- must accept structured standard files from the editing environment and from the publications database; and it must handle the specification requirements for type, design, and quality. The best thing to do is probably benchmark it with one of the existing publications and do a site visit at another application.
The Workflow Management function provides scheduling, tracking, dunning, reporting, and statistics on the objects in the system: basically, who has what document; what task they must perform; and the date when it should be complete. This can be costly to develop. Try to get a package for this that is, or can be, integrated with the document database.
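The core of such a function can be sketched in a few lines. This is only an illustration under assumed names (`Assignment`, `overdue`, `dunning_notice`, and the sample queue are all hypothetical), not a description of any package mentioned here. Each record answers the three questions above, and a report picks out what is late.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Assignment:
    document: str       # which document
    holder: str         # who has it
    task: str           # what task they must perform
    due: date           # when it should be complete
    done: bool = False

def overdue(assignments, today):
    """Return unfinished assignments whose due date has passed."""
    return [a for a in assignments if not a.done and a.due < today]

def dunning_notice(a, today):
    """Format a reminder line for one late assignment."""
    days = (today - a.due).days
    return "%s: '%s' for %s is %d day(s) late" % (a.holder, a.task, a.document, days)

# Hypothetical sample data.
queue = [
    Assignment("Paper 17", "Reviewer A", "referee report", date(1993, 3, 1)),
    Assignment("Paper 18", "Copy Editor", "copyedit", date(1993, 3, 20)),
    Assignment("Paper 19", "Author B", "proof corrections", date(1993, 3, 5), done=True),
]
today = date(1993, 3, 10)
for a in overdue(queue, today):
    print(dunning_notice(a, today))
```

The development cost lies less in records like these than in tying them to the document database, so that a document's status changes when the document itself does.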
The Publications Database -- Try to specify requirements for it clearly and in detail. There may be substantial development cost for your application -- it is vital to get the correct functionality. One may have to choose between some well-known relational packages and some newer object-oriented databases that are relatively untested but may be better suited to SGML documents.
With all the packages selected, make sure there is decent support, that the platform is extensible and that there is an upgrade path. Remember that you'll probably have to replace the technology in three or four years anyway!
Questions and Cautions
In conclusion, as the distinction between what constitutes scholarly communication and what constitutes scholarly publication first blurs and then re-forms in the communities of cyberspace, publishers face a great uncertainty about the future shape of the scholarly publishing business.
Who will be the publishers of tomorrow? Will it be those who own and/or control the means of communication? Will it be the phone companies? Will it be the sysops? Or the academic computing service centers? What will become of the publishers of today when print no longer serves as the privileged mode of communicating and disseminating scholarly work? What will become of libraries, when distributed virtual knowledge bases render physical archive sites unnecessary? For that matter, what will become of the scholar herself, with the waning of the current recognition and reward system for intellectual achievement? Will the scholar become a self-publisher? Will the authority of authorship itself begin to wane with the deconstruction of linear text and the discursive monologues of St. Jerome? And with their replacement by chunk-sized information bytes in the quasi-anonymous, fiercely egalitarian world of the digital network?
I believe that publishers and librarians of today are in for a particularly rough time in the not too distant future. We will be pressed to demonstrate the values that we add to the knowledge chain -- in its creation, refinement, dissemination, and preservation.
I would hark back to Yuri Rubinsky's introductory talk and suggest that what should be hidden in the subterranean tunnels is the technology itself and that this is a challenge for the hardware and software developers. At the same time, there is an urgent task for the librarians and publishers to lay bare, to expose, and yes, to market, the best value added services they can provide in the world of digital information. If you are in the position of trying to think through or manage the transition from print to electronic publishing, I would keep several things in mind.
Why are you making the transition? Make sure you know why you are doing this and what specific advantages you hope to achieve. Remember that print is better in many cases and that there is a difference between "reading literature" on the one hand and "using information" on the other. Where do your publications fall on that continuum?
Make a firm commitment. Start laying the groundwork. Convince others. Build consensus. Getting organizational support may be more important than anything else. It can be effective to do small pilot projects to gain experience and build up staff knowledge and expertise, which are vital. Do not try to justify electronic publishing on economic grounds (or, at least, not solely on them, in the short run) because it is much easier to create electronic products than it is to build markets for them.
Expect trouble. I don't mean systems problems or software bugs, though you'll have those. I mean people trouble. There will be resistance, fear, envy, and turf battles. That is what happens when fundamental changes in a production system affect jobs, workflow, and hierarchies. Expect to spend time working this through; don't avoid it; it doesn't just go away; and do as much ahead of time as possible.
Analyze your existing DP or MIS facility carefully, both technologically and organizationally. Should you use them or build your own separate system? There are definite advantages and some hidden benefits to unifying the processing of all the organization's information. But it may also be that traditional DP departments don't understand publishing applications. There can be a culture clash here.
Finally, good luck. You'll need that. And a good consultant you can trust who will be able to tell you the difference between what the vendors say their technology can do and what it actually does.
OVERHEAD ONE
ACM's Electronic Publishing Five-Year Plan
Outline of the Plan: the approach and its rationale
Practical Steps for planning a transition from print-oriented to electronic-oriented publication
Potential Problems: warnings and admonitions
OVERHEAD TWO
Current Publishing Operation by Volume and Type
Total Annual Pages = 40,000 pages
Natural groupings:
| Group                  | Annual pages | Mode          |
| The primary journals   | 7,700        | Traditional   |
| Reference (CR/Guide)   | 2,400        | Electronic DB |
| ACM Press Books        | 2,400        | AW, mixed     |
| SIG Newsletters*       | 14,500       | Camera Copy   |
| Conference Proceedings | 17,100       | Camera Copy   |
| Other**                | 2,600        | Mixed         |
*6,500 of the 14,500 newsletter pages are actually conference proceedings.
**Includes The Graduate Assistantship Directory (GAD), The Administrative Directory, Self-Assessment Procedures, Roster, CALGO, Curricula, The ACM Publications Catalogue.
OVERHEAD THREE
Early Efforts...
- Hypertext on Hypertext
- The ACM Hypertext Compendium
- Computer Select
- Computing Archive
- Design Automation Library
- SIGGraph CD-ROM
- CALGO
- Online Database Files
- Resource Series
- Interactive Digital Video videotape
- Ei Document Delivery Service
- Projects: AT&T RightPages, CMU Mercury, VPI Envision
OVERHEAD FOUR
...and their importance
- The myth of timeliness
- Market fragmentation - The Platform Problem
- The myth of economy
- Standards - for distribution and access
- Pricing - no standard model and no easy way to develop one
- Pay per Drink
- Site License
- Subscription
- Connect Time
- New Partners and new Contractual arrangements
- Selling Content vs. Licensing Access
- Sales
- Easier to create products than to develop a market
- New resource requirements for big ticket items
- Support
- Copyright protection - conflicting goals
- Experience base
- real applications
- models and prototypes
- proof of concepts
- feasibility - technological, economic, staff resources
- Electronic Database Publishing System - CR Guide/Resources
OVERHEAD FIVE
Overall Goals
Create a publication database and electronic archive that can support a variety of member and market-responsive information products and services.
Accept and process documents in electronic form.
Centralize control of ACM's publication information resources, relying less heavily on vendors.
Improve the document interchange processes used by the ACM professional staff and key external resources, and
Anticipate and respond quickly to changing, evolving ACM missions, priorities, industry trends, member information needs, and resource constraints.
OVERHEAD SIX
Products and Services Goals
Production of print publications
On-demand publications
Article reprints
Derived publications - extracted and/or enhanced
Output to on-line service
Network distribution - e-mail, bbs, fax
CD-ROM production
Citation analysis
OVERHEAD SEVEN
Recommendations for Phase One Implementation
- Limit scope
  - subset of ACM publications
  - subset of tasks
- Focus on technology of infrastructure
  - rapid technology changes
  - evolving user needs and preferences
- Monitor all experiments - ACM's, commercial, and other non-profit societies
- Build the resource while business picture emerges
OVERHEAD EIGHT
Prime Vendor Candidates
IDI
ArborText
XyVision
Xerox
Interleaf
IBM
Datalogics?
1.1 Capture/Convert
IDI
Avalanche
Atlis
XyVision
SoftQuad
Xerox
Interleaf
IBM
OCLC
Agfa
OpenText
1.2 Write/Edit
Agfa
SoftQuad
ArborText
Interleaf
Xerox
(WordPerfect)
1.3 Compose/Present
Agfa
XyVision
Interleaf
ArborText
(Frame)
(Quark)
Xerox
IBM
1.4 Workflow Mgt/Track
RCP
IDI
Agfa
Interleaf
Xerox
ArborText
1.5 Text & Image Storage
RCP
IDI
Agfa
Interleaf
Xerox
ArborText