Association of Research Libraries (ARL®)

http://www.arl.org/resources/pubs/scat/4.shtml


Scholarly Communication and Technology

Online Books at Columbia

4. USE OF ONLINE BOOKS

4.1 Methodology For Studying Use of And Reactions to Various Formats

We laid out the evaluation methodology for this Project in our Analytical Principles and Design. This methodology, formulated in the first year of the Project, remains the working plan.


4.1.1 Measurement Plans

Analytical Principles and Design sets forth our plans in this area as follows:

Success of online books is in large part measured by the rate of adoption by the scholarly community and the extent to which they appear to be replacing print books in use. Data on the use of online books and circulation of print books are also available which will allow us to draw certain conclusions on how the various formats are being used.

A related component of our plan is to study the socio-technical environment in which the Columbia community functions, and the community's adoption of other forms of electronic communication and scholarly research. Our hypothesis is that the more familiar and comfortable Columbia scholars are with computing and electronic resources, the more likely they are to adopt online books. We summarize some of the early data on this socio-technical environment below. (Section 7 discusses this analysis further.)


4.1.2 Documentation Measures for Use of Online Books

Some of the key measures for documenting use of the online books are:


4.1.3 Documentation Measures for Reactions to Online Books

We are using a wide range of tools in trying to understand the factors that influence use of online books.

Table 1 summarizes our complex array of surveys and interviews.

Table 1. Types of Surveys
| Population | Method | Contact | Rate | Remarks |
|---|---|---|---|---|
| Users of Online Books | Online instrument | Passive | Low | |
| Users of Online Books | Online post-use survey | Passive | Very Low | |
| Users of paper alternatives | Response slips in books | Passive | Unknown | Levels of use not known |
| Users of course materials in either form | Interviews distributed in class | Active | High | |
| Users and non-users | Library & Campus-Wide surveys | Active | Moderate | No full active survey of the campus has been done |
| Discipline-specific potential users | Surveys & Interviews | Active | High | Thus far only conducted before books were online |
Note: Passive instruments are ones which the user must elect to encounter. Active instruments are distributed in some way, to the attention of the user. High response rates are in the range of 80-90 percent completion, with better than 60 percent usable.


4.2 Socio-Technical Environment

In our analytical construct, we posit that three sets of socio-technical environmental factors and their change over time will influence the adoption of online books by the Columbia community. These are external (U.S.), disciplinary, and Columbia-related factors. The first and the third of these are discussed below.


4.2.1 External Socio-Technical Environment

In tracking the external socio-technical environment that might affect adoption of online books by members of the Columbia community, we look at three primary measures:

  1. Attention to the Internet and related issues in the press, measured by New York Times articles;

  2. Trends for prices and technical specifications for personal computers, measured both by looking at recommendations for minimum computer standards offered by various writers and at the offerings of Gateway 2000; and

  3. Penetration of computers, modems, and Internet access into American homes, as reported by various market research companies.

Our findings to date are summarized below.


4.2.1.1 Media Coverage Of the Internet

We hypothesize that the more the media report on up-to-date personal computer systems and online resources, the more likely members of the Columbia community are to feel that these are important to their lives and scholarly work. The New York Times is our media proxy in tracking the number of stories that community members might have seen involving online-related topics over the past three years.

Table 2. New York Times Stories Involving Information Services
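| Descriptor Term | 1994 | 1995 | 1996 | 1994 - 1996 | Pct Chg. '94-'95 | Pct Chg. '95-'96 |
|---|---|---|---|---|---|---|
| Internet | 66 | 315 | 360 | 741 | 377% | 14% |
| Online Information Services | 0 | 161 | 140 | 301 | NA | -13% |
| World Wide Web | 0 | 112 | 106 | 218 | NA | -5% |
| Information Superhighway | 27 | 12 | 5 | 44 | -56% | -58% |
| Electronic Publishing | 30 | 29 | 24 | 83 | -3% | -17% |
| Computer Networks | 187 | 129 | 46 | 362 | -31% | -64% |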
Source: Periodical Abstracts, using so=New York Times, de=Descriptor Term here, and period=Year given here.

Discussions of the Internet soared from 1994 to 1995 and then stayed at a relatively even level of about one story a day. Online Information Services and World Wide Web went from not even being descriptor terms in Periodical Abstracts for 1994 to coverage at about half the rate of the Internet in general in the next two years. These terms seem to have supplanted Computer Networks, which was a significant term in 1994.
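The percent-change columns of Table 2 can be rechecked with a few lines of arithmetic. The sketch below is only an illustration: the story counts are copied from the table, while the function and variable names are our own and are not part of the Project's tooling.

```python
# Story counts per descriptor term, copied from Table 2 as (1994, 1995, 1996).
counts = {
    "Internet": (66, 315, 360),
    "Online Information Services": (0, 161, 140),
    "World Wide Web": (0, 112, 106),
    "Information Superhighway": (27, 12, 5),
    "Electronic Publishing": (30, 29, 24),
    "Computer Networks": (187, 129, 46),
}


def pct_change(old, new):
    """Return the percent change from old to new, rounded to a whole percent.

    Returns None when the base year is zero, which Table 2 reports as NA.
    """
    if old == 0:
        return None
    return round((new - old) / old * 100)


# For each term: (change 1994 to 1995, change 1995 to 1996).
changes = {
    term: (pct_change(y94, y95), pct_change(y95, y96))
    for term, (y94, y95, y96) in counts.items()
}

for term, (first, second) in changes.items():
    print(f"{term}: {first} / {second}")
```

A zero base year is reported as `None` rather than as an infinite increase, which mirrors the NA entries in the table.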


4.2.1.2 Personal Computer Specifications & Pricing Trends

Since the development of personal computers we have seen a continual growth in the quality of the systems on offer and a flat or declining price for the systems recommended for household purchase. In 1997 for the first time, manufacturers have introduced systems priced at around $1,000 that will allow a household to access the Internet smoothly if not with the speed and monitor performance of a system costing twice that much. In June 1997, Gateway 2000 was offering a family-oriented system for $1,499 that was significantly more powerful in almost every parameter than a system priced at $1,999 in May 1996.

Appendix 3 tracks the minimum recommended specifications for home computers given by various writers from May 1994 to April 1997. Summarizing these data by looking at three major factors (CPU, RAM, and hard drive capacity), we see dramatic increases over the past three years. In the earlier years, neither Pentium CPUs nor personal computer hard drives with capacity above 340 MB were even available.

Table 3. Minimum Recommended Specifications for Home Computers, May 1994 - April 1997

| | May 1994 (for student) | April 1995 | April 1996 | April 1997 |
|---|---|---|---|---|
| CPU | 486 | 486DX2/66 | 75 MHz Pentium | 166 MHz MMX Pentium |
| RAM | 4 MB | 8 MB | 8 MB | 16 MB |
| Hard Drive | 100 MB | 340 MB | 1 GB | 2 GB |
| Price Est. | $1,500 | $1,800 - $2,000 | $2,000 | Not given in the source |
Note: This is an extract from Appendix 3.

As one might expect given Gateway 2000's leading position in the family personal computer market, its offerings track these recommendations by journalists. As Appendix 4 shows, the personal computer capability available for about $2,000 has escalated since late 1994, our first data point. All of these computers are equipped with CD-ROMs, sound systems, and modems. Summarizing that appendix, we find that a $1,500 computer today is over twice as large and twice as fast as a $2,100 computer thirty months ago.

Table 4. Characteristics of a $2,000 Computer, December 1994 - June 1997

| | Dec. 1994 | April 1995 | May 1996 | May 1997 | June 1997 |
|---|---|---|---|---|---|
| CPU | 60 MHz Pentium | 60 MHz Pentium | 120 MHz Pentium | 200 MHz MMX Pentium | 166 MHz Pentium |
| RAM | 8 MB | 8 MB | 16 MB | 16 MB | 16 MB |
| Hard Drive | 540 MB | 540 MB | 850 MB | 1.6 GB | 1.2 GB |
| Price (+ shipping) | $2,099 | $2,099 | $1,999 | $2,064 | $1,499 |
Note: This is an extract from Appendix 4.
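The "twice as large and twice as fast" comparison can be checked from the first and last columns of Table 4. This is a rough sketch under two assumptions of ours: clock speed stands in for "fast", and 1.2 GB is treated as 1,200 MB. The dictionary keys are illustrative names, not fields from Appendix 4.

```python
# First and last data points from Table 4 (Gateway 2000 family systems).
dec_1994 = {"cpu_mhz": 60, "ram_mb": 8, "disk_mb": 540, "price_usd": 2099}
# Assumption: 1.2 GB is converted as 1,200 MB (decimal gigabytes).
jun_1997 = {"cpu_mhz": 166, "ram_mb": 16, "disk_mb": 1200, "price_usd": 1499}

# Ratio of the June 1997 value to the December 1994 value for each attribute.
ratios = {key: jun_1997[key] / dec_1994[key] for key in dec_1994}

for key, ratio in ratios.items():
    print(f"{key}: {ratio:.2f}x")
```

Under these assumptions, clock speed, memory, and disk capacity all at least double while the price ratio falls below 1, which is the pattern the text describes.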


4.2.1.3 Household Computer Penetration & Internet Access

Many market research reports estimate the penetration of computers and modems into U.S. households, access to and use of the Internet, and the like over the past few years. Unfortunately, the findings vary considerably for single points in time (see Appendix 5). Data from one source, Find/SVP, are summarized here.

Find/SVP's Emerging Technologies Research Group issued the results of its latest survey in early May 1997. The telephone survey, conducted from February to April 1997, included 1,000 adult current Internet users and 1,000 adult non-users. Its Web site (http://www.findsvp.com/) has a substantive summary of its results. The report also summarizes historical penetration data back to 1994 and makes projections through 2001 in a chart (at http://www.columbia.edu/cu/libraries/digital/texts/forecast/) that tracks PC Households, Modem Households, Internet Households, and Non-PC Internet Access Households (NetTV). According to that chart,

An early 1997 Baruch College-Harris Poll survey of 1,000 households found 21 percent of U.S. adults (40 million) using the Internet and/or the World Wide Web. This figure is half of all computer users and double the number using the Internet a year ago. An additional 12 percent of respondents use commercial online services.


4.2.2 Columbia Socio-Technical Environment

Columbia infrastructure, penetration of ready access to computing, and amount of time spent in online activities are among the Columbia socio-technical environmental factors that may affect adoption of online books.


4.2.2.1 Campus Infrastructure: February 1997

Columbia's campus infrastructure is similar to that of other universities in its components and in its constant expansion to meet community demand for access to email and other Internet services. Currently, a 10BaseT fiber optic campus network connects 65 buildings and a T3 line connects the campus to the Internet. Over 9,000 ports are connected to the network and over 20,000 computers are registered to community members. All fifteen undergraduate residence halls are pre-wired; the residence hall network has over 4,500 ports. Our modem pool is constantly growing to serve demand; 298 modems with SLIP/PPP support now handle over 52,000 calls in a typical week. Email servers managed over 442,000 email messages in 1996. The campus has 366 public workstations, kiosks, and lab computers; all are connected to the network.


4.2.2.2 Community Perceptions of Access To Computing Resources

"Is there a computer (in the library or elsewhere) attached to the campus network (directly or by modem) that you can use whenever you want?" is one of two constant questions on our various questionnaires. The most recent response to that question came in the Libraries' onsite user survey in March 1997.

Table 6. March 1997 In-Library Survey: Is there a computer (in the library or elsewhere) attached to the campus network (directly or by modem) that you can use whenever you want?

| Cohort | Sample Size | Responding YES |
|---|---|---|
| Faculty Member | 44 | 86% |
| Doctoral Student | 468 | 85% |
| Masters Student | 611 | 67% |
| Undergraduate | 1,065 | 87% |

In Fall 1995, we cooperated with the Office of the Provost in conducting a campus computing survey. The initial means of distributing this survey was an "opinion festival" in the rotunda of the main administration building. This festival was billed primarily as a food tasting; it attracted many students and few faculty members. The computing survey garnered 414 student responses - 125 graduate students and 289 undergraduate students spread fairly well across the four classes. To amplify the graduate student and faculty counts we did follow-up mailings - to a sample of 2,000 graduate students and all faculty members. Responses were modest in number and quite skewed by department, especially for the faculty survey, so these data are unlikely to be reliable.

The share of Columbia community members reporting ready access to a network-linked computer (the same question asked in the onsite library survey) by cohort is as follows.

Table 7. Fall 1995 Campus Survey: Is there a computer (in the library or elsewhere) attached to the campus network (directly or by modem) that you can use whenever you want?

| Cohort | Sample Size | Responding YES |
|---|---|---|
| Faculty Member | 143 | 90% |
| Graduate Student | 301 | 80% |
| Senior | 88 | 65% |
| Junior | 71 | 63% |
| Sophomore | 76 | 63% |
| Freshman | 54 | 78% |

With such small sample sizes for the undergraduate cohorts, there is no significant relationship between the shares reporting such computer access and level of study.
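One common way to test a relationship like this is a chi-square test of independence on the undergraduate rows of Table 7. The report does not say which test was used, so the sketch below is an assumption on our part. The YES counts are also reconstructed by rounding sample size times the reported share, so they may differ slightly from the raw survey counts.

```python
# (sample size, share responding YES) for each undergraduate class, from Table 7.
cohorts = {
    "Senior": (88, 0.65),
    "Junior": (71, 0.63),
    "Sophomore": (76, 0.63),
    "Freshman": (54, 0.78),
}

# Assumption: YES counts are rebuilt by rounding n * share; NO is the remainder.
observed = {}
for name, (n, share) in cohorts.items():
    yes = round(n * share)
    observed[name] = (yes, n - yes)

total_yes = sum(yes for yes, _ in observed.values())
total_no = sum(no for _, no in observed.values())
grand_total = total_yes + total_no

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total.
chi_square = 0.0
for yes, no in observed.values():
    row_total = yes + no
    for obs, col_total in ((yes, total_yes), (no, total_no)):
        expected = row_total * col_total / grand_total
        chi_square += (obs - expected) ** 2 / expected

# Degrees of freedom for a 4 x 2 table: (rows - 1) * (columns - 1).
dof = (len(observed) - 1) * (2 - 1)

# Standard chi-square critical value for 3 degrees of freedom at the 0.05 level.
CRITICAL_5PCT_DF3 = 7.815
significant = chi_square > CRITICAL_5PCT_DF3

print(f"chi-square = {chi_square:.2f}, dof = {dof}, significant at 5%: {significant}")
```

With the reconstructed counts, the statistic falls below the 5 percent critical value, which is consistent with the statement above.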

About 72 percent of undergraduates, 80 percent of graduate students, and 85 percent of faculty members responded Yes to the question "Do you have your own computer in your residence?" in this survey. That these values are higher than those for the access question may reflect that some of the students do not have modems or network cards in their computers or do not use them. Questions asking for details about the power of these computers and the degree to which they have communications hardware were not answered fully.


4.2.2.3 Community Use of Online Resources

A related question that we ask on all of our questionnaires regards time spent on online activities. For the 1996 and 1997 onsite library surveys, this was phrased as "On average this semester, how many hours per week do you spend in online activities (Email, Listservs & Newsgroups, CLIO Plus, Text, Image or Numeric Data Sources, Other WWWeb Uses)?" The respondent was instructed to write a value in the blank provided.

The following table gives a grouping of the distribution of the total responses to this question in 1997 in column 2, of the responses by those who claimed easy access to computers with online access in column 3, and of the responses by those who said that they did not have such access in column 4.

Table 8. March 1997 In-Library Survey: Weekly Hours on Online Activities by Access to Computers Linked to Campus Network, Winter 1997

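Percent of Respondents In Group

| Hours/Week | All (N=2,493) | W/Easy Access (N=1,853) | W/O Easy Access (N=428) |
|---|---|---|---|
| 0 | 2.9% | 1.4% | 5.6% |
| 1-3 | 46.4% | 45.2% | 49.8% |
| 4-6 | 23.0% | 23.8% | 21.7% |
| 7-9 | 6.1% | 6.1% | 7.5% |
| 10-12 | 11.7% | 12.4% | 9.3% |
| 13-15 | 3.2% | 3.8% | 1.2% |
| 16-18 | 0.4% | 0.4% | 0.2% |
| 19-21 | 3.7% | 4.0% | 2.8% |
| 22-28 | 0.6% | 0.4% | 1.2% |
| 29-35 | 1.2% | 1.6% | 0.2% |
| More than 35 | 0.8% | 0.9% | 0.5% |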

Even those who answered No to the previous question (that is, those who do not feel that they can use a computer attached to the campus network whenever they want) report spending substantial time on online activities each week (column 4 data). Among respondents who reported any such use, the mean was 5.8 hours per week; the greatest amount reported was 60 hours (8 respondents).
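A mean can be approximated from the grouped distribution alone by weighting each bin's midpoint by its share of respondents. The sketch below does this for the "All" column of Table 8. It is only an approximation: the reported mean was presumably computed from the raw written-in values, and the stand-in value of 40 hours for the open-ended "More than 35" bin is our assumption.

```python
# (low, high) hour bounds and percent of all respondents, from Table 8, "All" column.
bins = [
    ((0, 0), 2.9),
    ((1, 3), 46.4),
    ((4, 6), 23.0),
    ((7, 9), 6.1),
    ((10, 12), 11.7),
    ((13, 15), 3.2),
    ((16, 18), 0.4),
    ((19, 21), 3.7),
    ((22, 28), 0.6),
    ((29, 35), 1.2),
]
OPEN_BIN_SHARE = 0.8       # percent of respondents in "More than 35"
OPEN_BIN_MIDPOINT = 40.0   # assumption: stand-in value for the open-ended bin

# Weighted sum of bin midpoints (in percent-hours).
weighted_hours = sum((low + high) / 2 * share for (low, high), share in bins)
weighted_hours += OPEN_BIN_MIDPOINT * OPEN_BIN_SHARE

total_share = sum(share for _, share in bins) + OPEN_BIN_SHARE
zero_share = bins[0][1]

# Estimated mean over all respondents, and over those reporting any online use.
mean_all = weighted_hours / total_share
mean_users = weighted_hours / (total_share - zero_share)

print(f"estimated mean, all respondents: {mean_all:.1f} hours/week")
print(f"estimated mean, any online use:  {mean_users:.1f} hours/week")
```

The estimate is sensitive to the value chosen for the open-ended bin and to how respondents are spread within each bin, so it should be read as a plausibility check rather than a replacement for the reported figure.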

Another way to look at these data is to group the responses by Columbia status of the respondent. This is done below for the four major scholarly components of the community. The cohorts include only those individuals who provided status information. Time spent in online activities was quite consistent across cohorts within the Columbia community; differences among cohorts were not statistically significant.

Table 9. March 1997 In-Library Survey: Weekly Hours In Online Activities by Columbia Status, Winter 1997

Percent of Respondents In Group

| Hours/Week | Undergraduate Students (N=1,107) | Masters Students (N=649) | Doctoral Students (N=477) | Faculty Members (N=45) |
|---|---|---|---|---|
| 0 | 2.1% | 2.5% | 1.7% | 6.7% |
| 1-3 | 49.4% | 44.8% | 44.2% | 33.3% |
| 4-6 | 22.4% | 23.6% | 23.1% | 26.7% |
| 7-9 | 6.7% | 6.2% | 6.3% | 0.0% |
| 10-12 | 10.2% | 12.9% | 13.8% | 17.8% |
| 13-15 | 2.6% | 3.4% | 3.8% | 8.9% |
| 16-18 | 0.4% | 0.5% | 0.6% | 0.0% |
| 19-21 | 3.3% | 3.9% | 3.6% | 4.4% |
| 22-28 | 0.4% | 1.1% | 0.6% | 0.0% |
| 29-35 | 1.3% | 0.9% | 1.7% | 1.7% |
| More than 35 | 1.3% | 0.3% | 0.6% | 0.0% |
| Mean | 5.7 | 5.9 | 6.3 | 6.5 |

Differences in reporting make comparison with the 1996 results difficult, but it appears that average weekly hours online increased modestly from winter 1996 to winter 1997.


4.3 Findings On Use Of Books In Online Collection

At this point we will report on (1) trends in use of the CNet and CWeb books; (2) user location and cohort as suggested by host computer address; (3) distribution of use by day of week and time of day; (4) patterns of hits per Web session involving online books for two weeks' use and for the overall use of three social work titles; and (5) use of the online books by individuals from March 15 to May 31, 1997. Summarized below are findings in these areas for the various groups of books.


4.3.1 Reference Books

4.3.1.1 Total Use Over Time

Three reference works have been available online long enough to have generated substantial usage data. These are The Concise Columbia Electronic Encyclopedia, Columbia Granger's World of Poetry, and The Oxford English Dictionary. The three Garland titles have been online only since the turn of the year or later, so our usage data are very short term for these titles. All three are accessible both through CNet and CWeb.

As of the time of this writing, CWeb usage data extended only through March 14, 1997 on a monthly basis. With the exception of Columbia Granger's World of Poetry, usage (number of hits and unique users) from March 15 to May 31, 1997 was reported as a single number. No data are available for Granger's after March 14th. In the CWeb data reported below, the early March data are included with the newer data to give one value for the three-month period of March to May.


4.3.1.1.1 Concise Columbia Electronic Encyclopedia

The Concise Encyclopedia remains on the older CWIS-gopher platform CNet. Usage declined 84 percent over the past three years, from 1,551 sessions in April 1994 to 250 sessions in April 1997. Usage has declined most in the current academic year; 7,861 sessions were registered from September 1995 to May 1996 and 2,941 sessions (63% fewer) from September 1996 to May 1997.

Graph 1. Concise Columbia Electronic Encyclopedia Sessions, 1994 - 1997: CNet



Potential reasons for this steep decline include:

Columbia scholars seldom use the print copy of the Concise Encyclopedia, which resides behind the Reference desk. Its larger cousin, which is out in the public area, sees much greater use. We plan to put that longer, one volume CUP encyclopedia online on CWeb this year. Its use patterns will be instructive.


4.3.1.1.2 Columbia Granger's World of Poetry*

Columbia Granger's World of Poetry is available on both CNet and CWeb. The CNet version is a Lynx (non-graphical Web) formulation of the CWeb version. This resource, which became available to the community in online form in October 1994, locates a poem in an anthology by author, subject, title, first line, or keywords in its title or first line. In addition, it provides easy access to the 10,000 most often anthologized poems. As the following table shows, total usage declined from 1996 to 1997, by 49 percent from the first quarter of 1996 to the first quarter of 1997. Even so, the 4,289 hits for 1996 are considerable.

Reference librarians report no more than a handful of uses of the print version of Granger's each year; it is kept behind the main reference desk and lacks the database of poems. The CD-ROM version, which is kept in the Electronic Texts Service, has the same functionality as the online version; it is used once or twice a month on average.

Table 10. Columbia Granger's World of Poetry: Number of Hits by Month

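| Month | CNet 1994 | CNet 1995 | CNet 1996 | CNet 1997 | CWeb 1995 | CWeb 1996 | CWeb 1997 | Total 1995 | Total 1996 | Total 1997 | % Change '94 to '95 | % Change '95 to '96 | % Change '96 to '97 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Jan. | 0 | 222 | 91 | 18 | 0 | 466 | 150 | 222 | 557 | 168 | | 151% | -70% |
| Feb. | 0 | 204 | 137 | 31 | 0 | 282 | 312 | 204 | 419 | 343 | | 105% | -18% |
| Mar.* | 0 | 292 | 96 | 41 | 0 | 465 | 236 | 292 | 561 | 277 | | 92% | -51% |
| April | 0 | 199 | 73 | 34 | 0 | 278 | NA | 199 | 351 | NA | | 76% | NA |
| May | 0 | 134 | 35 | 17 | 682 | 199 | NA | 816 | 277 | NA | | -66% | NA |
| June | 0 | 81 | 30 | | 695 | 102 | NA | 776 | 239 | NA | | -69% | NA |
| July* | 0 | 80 | 71 | | 550 | 383 | | 630 | 464 | | | -26% | |
| Aug. | 0 | 78 | 53 | | 767 | 27 | | 845 | 83 | | | -90% | |
| Sept. | 0 | 76 | 58 | | 596 | 179 | | 672 | 238 | | | -65% | |
| Oct. | NA | 162 | 84 | | 863 | 262 | | 1,025 | 348 | | | -66% | |
| Nov. | 311 | 114 | 50 | | 800 | 413 | | 914 | 465 | | 194% | -49% | |
| Dec. | 207 | 68 | 28 | | 725 | 257 | | 793 | 287 | | 283% | -64% | |
| Total | NA | 1,710 | 806 | | 5,678 | 3,483 | | 6,758 | 4,289 | | NC | -37% | |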
Note: * July 1995 CNet hits are estimated. CWeb data for March 1997 are available through March 15 only; the March 1997 CWeb value shown is an estimate equal to twice the actual count. NA - Not available. NC - Not calculable.


4.3.1.1.3 The Oxford English Dictionary

At this time, The Oxford English Dictionary is the most heavily used reference work in our collection. As noted earlier, it is available on both CNet and CWeb, with the former format having greater functionality but being quite opaque. Users find the latter attractive and easy to use, but it only permits them to look up a definition or browse through the contents.

Usage of the CNet version dropped 59 percent from the fourth quarter of 1994 (2,856 hits) to the first quarter of 1997 (1,167 hits). The CWeb version attracted greater use than the CNet version from its first months. Total usage of the resource was greater with the two versions in place than with only CNet, by 55 percent in February 1997 versus February 1995.
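The two percentages quoted above follow directly from the monthly figures in Table 11. The sketch below reproduces the arithmetic; the monthly hit counts are copied from the table, and the variable names are ours.

```python
# Monthly CNet hits for the two quarters being compared, from Table 11.
cnet_q4_1994 = [1238, 975, 643]   # Oct., Nov., Dec. 1994
cnet_q1_1997 = [259, 434, 474]    # Jan., Feb., Mar. 1997

q4_1994_hits = sum(cnet_q4_1994)
q1_1997_hits = sum(cnet_q1_1997)

# Percent drop in CNet hits between the two quarters.
cnet_drop_pct = round((q4_1994_hits - q1_1997_hits) / q4_1994_hits * 100)

# February totals: CNet only in 1995, CNet plus CWeb in 1997.
feb_1995_total = 939
feb_1997_total = 434 + 1022

# Percent growth in total February use once both versions were in place.
feb_growth_pct = round((feb_1997_total - feb_1995_total) / feb_1995_total * 100)

print(q4_1994_hits, q1_1997_hits, cnet_drop_pct, feb_growth_pct)
```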

Table 11. Oxford English Dictionary: Number of Hits by Month

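| Month | CNet 1994 | CNet 1995 | CNet 1996 | CNet 1997 | CWeb 1996 | CWeb 1997 | Total 1996 | Total 1997 | % Change '94 to '95 | % Change '95 to '96 | % Change '96 to '97 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Jan. | 0 | 643 | 497 | 259 | 0 | 385 | 497 | 644 | | -23% | 30% |
| Feb. | 0 | 939 | 1,065 | 434 | 0 | 1,022 | 1,065 | 1,456 | | 13% | 37% |
| Mar.* | 0 | 847 | 683 | 474 | 0 | | 683 | | | -19% | |
| April | 0 | 791 | 752 | 372 | 0 | #919 | 752 | 2,065# | | -5% | #12% |
| May | 0 | 436 | 410 | 300 | 0 | | 410 | | | -6% | |
| June | 0 | 336 | 310 | | 0 | | 310 | | | -8% | |
| July* | 0 | 300 | 328 | | 0 | | 328 | | | 9% | |
| Aug. | NA | 299 | 282 | | 8 | | 282 | | | -6% | |
| Sept. | NA | 533 | 391 | | 570 | | 961 | | | 80% | |
| Oct. | 1,238 | 1,017 | 783 | | 647 | | 1,430 | | -18% | 41% | |
| Nov. | 975 | 795 | 335 | | 271 | | 606 | | -18% | -24% | |
| Dec. | 643 | 536 | 318 | | 337 | | 655 | | -17% | 22% | |
| Total | | 6,926 | 6,154 | NA | NA | NA | 8,069 | NA | | -11% | |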
Note: * July 1995 CNet usage is estimated, as the true value was unavailable.

# March - May 1997 hits; these data are somewhat under-counted as The OED was not included in the user-identified data set initially and as one form of bookmarked access was not included for the whole period. The OED became available on CNet in August 1994, but usage data are available back to October 1994 only.

Columbia College has a one semester Logic and Rhetoric course that is required of all its students (about 1,000 each year). Students in this course must complete an assignment involving the OED and are encouraged to use an online version. That assignment occurred in October 1996 and mid-February to early March 1997. In the period preceding mid-March 1997, almost 42 percent of the hits (1,531) on the CWeb OED came from computers in dormitory rooms, suggesting that students are using this resource. This conclusion is confirmed by the analysis of the data by user in the period beginning in mid-March; see section 4.3.4.

Observation and reshelving activity show that scholars frequently use the print copy. However, statistics on use are unavailable, as scholars have direct access to several sets in libraries around campus and have not been cooperative in recording use of volumes. In addition, scholars often own their own copies of the compact edition of The OED. Finally, some serious scholars use the CD-ROM version in the Libraries' Electronic Text Service, which allows refined searches with a search engine that is more attractive and user friendly than that in CNet.


4.3.1.1.4 Garland Reference Works

Garland's Chaucer Name Dictionary was added to the CWeb collection at the end of 1996. Native American Women was added in January 1997 and African American Women in February 1997. The first two were added to the CNet collection in February 1997 and the third in March 1997.

Table 12. Garland Reference Works: Number of Hits by Month, December 1, 1996 - May 31, 1997

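| Month | Chaucer Name Dictionary: CWeb | Chaucer Name Dictionary: CNet | Chaucer Name Dictionary: Total | African American Women: CWeb | African American Women: CNet | African American Women: Total | Native American Women: CWeb | Native American Women: CNet | Native American Women: Total |
|---|---|---|---|---|---|---|---|---|---|
| Dec. '96 | 28 | NA | 28 | NA | NA | NC | NA | NA | NC |
| Jan. '97 | 62 | NA | 62 | 8 | NA | 8 | 60 | NA | 60 |
| Feb. | 107 | 15 | 122 | 26 | NA | 26 | 107 | 11 | 118 |
| March | ND | 8 | NC | | 31 | | | 7 | |
| April | #72 | 7 | #90 | #90 | 10 | #139 | #63 | 4 | #77 |
| May | | 3 | | | 8 | | | 3 | |
| Total | 269 | 33 | 302 | 124 | 49 | 173 | 230 | 25 | 255 |

Note: # March - May 1997 hits. NA - Resource was not available. ND - Data are not available. NC - Not Calculable.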

CWeb is a far more popular means of access to these resources than CNet. Although Chaucer Name Dictionary and Native American Women were both available on CNet from February 3rd, their usage on CNet in February was only 10 to 15 percent of that on CWeb. The Libraries' print copies of these reference books are lightly used, so these hits signify substantial expansion of use of these books.


4.3.1.2 Host Computers for Reference Book Use

A user location analysis acts as a proxy for user cohort for the early use data. We have grouped host computers into the following ten categories.

The distribution of use of the five reference works supplied via CWeb across these categories is shown below. With the exception of the three Garland books, a very small share of the use of these reference works occurs on computers in the libraries; the Columbia community is taking advantage of out-of-library access to these resources. As noted earlier, a large share of the use of The OED occurs from students' on-campus residences (rhno host computers).

Table 13. Host Computers for Reference Book Use, May 1, 1996 - March 15, 1997 - Percent Distribution

| Host Computer Type | Granger's Poetry | OED | Garland Titles |
|---|---|---|---|
| cc | 1% | 8% | 2% |
| cul | 1% | 2% | 40% |
| cunix* | 64% | 16% | 36% |
| cupress | ** | ** | 6% |
| dialup | 6% | 13% | 0% |
| english | 0% | 0% | 0% |
| pols | 0% | 0% | 0% |
| rhno | 9% | 42% | 4% |
| sipa | 1% | 1% | 0% |
| ssw | 7% | 1% | 3% |
| other | 11% | 16% | 8% |

Notes: * In the later part of this period, a Cunix server was given as the host computer for all uses of Granger's.

** Less than .5%
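A percent distribution like the one in Table 13 could be produced by mapping each host computer address to a category and tallying the results. The sketch below is hypothetical: the category labels come from Table 13, but the matching rule (a dot-separated hostname label, ignoring trailing digits, equal to a category name) and the sample hostnames are our assumptions, not the Project's actual log-processing method.

```python
from collections import Counter

# Category labels taken from Table 13; anything unmatched falls into "other".
CATEGORIES = (
    "cc", "cul", "cunix", "cupress", "dialup",
    "english", "pols", "rhno", "sipa", "ssw",
)


def classify_host(hostname):
    """Return the first category matching a hostname label, else "other".

    Assumption: a label matches when, after stripping trailing digits,
    it equals a category name (so "cunix3" counts as "cunix").
    """
    for label in hostname.lower().split("."):
        stem = label.rstrip("0123456789")
        if stem in CATEGORIES:
            return stem
    return "other"


def percent_distribution(hostnames):
    """Return {category: whole-number percent of hits} for a list of hostnames."""
    counts = Counter(classify_host(name) for name in hostnames)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: round(100 * n / total) for category, n in counts.items()}


# Hypothetical hostnames, for illustration only.
sample_hits = [
    "room101.rhno.columbia.edu",
    "room202.rhno.columbia.edu",
    "ref1.cul.columbia.edu",
    "host.example.org",
]
print(percent_distribution(sample_hits))
```

Because the first matching label wins, a host such as `cunix3.cc.columbia.edu` would be counted under cunix rather than cc; a real analysis would need an explicit rule for such overlaps.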


