4. USE OF ONLINE BOOKS
4.1 Methodology For Studying Use of And Reactions to Various Formats
We laid out the evaluation methodology for this Project in our Analytical Principles and Design. This methodology, formulated in the first year of the Project, remains the working plan.
4.1.1 Measurement Plans
Analytical Principles and Design sets forth our plans in this area as follows:
Success of online books is in large part measured by the rate of adoption by the scholarly community and the extent to which they appear to be replacing print books in use. Data on the use of online books and circulation of print books are also available which will allow us to draw certain conclusions on how the various formats are being used.
A related component of our plan is to study the socio-technical environment in which the Columbia community functions and adoption of other forms of electronic communication and scholarly research under the hypothesis that the more Columbia scholars are familiar and comfortable with computing and electronic resources the more likely they are to adopt online books. We summarize some of the early data on this socio-technical environment below. (Section 7 discusses this analysis further.)
4.1.2 Documentation Measures for Use of Online Books
Some of the key measures for documenting use of the online books are:
4.1.3 Documentation Measures for Reactions to Online Books
We are using a wide range of tools in trying to understand the factors that influence use of online books.
| Table 1. Types of Surveys | ||||
| Population | Method | Contact | Rate | Remarks |
| Users of Online Books | Online instrument | Passive | Low | |
| Users of Online Books | Online post-use survey | Passive | Very Low | |
| Users of paper alternatives | Response slips in books | Passive | Unknown | Levels of use not known |
| Users of course materials in either form | Interviews distributed in class | Active | High | |
| Users and non-users | Library & Campus-Wide surveys | Active | Moderate | No full active survey of the campus has been done |
| Discipline-specific potential users | Surveys & Interviews | Active | High | Thus far only conducted before books were online |
| Note: Passive instruments are ones which the user must elect to encounter. Active instruments are distributed in some way, to the attention of the user. High response rates are in the range of 80-90 percent completion, with better than 60 percent usable. | ||||
4.2 Socio-Technical Environment
In our analytical construct, we posit that three sets of socio-technical environmental factors and their change over time will influence the adoption of online books by the Columbia community. These are external (U.S.), disciplinary, and Columbia-related factors. The first and the third of these are discussed below.
4.2.1 External Socio-Technical Environment
In tracking the external socio-technical environment that might affect adoption of online books by members of the Columbia community, we look at three primary measures:
Our findings to date are summarized below.
4.2.1.1 Media Coverage Of the Internet
We hypothesize that members of the Columbia community are more likely to feel that up-to-date personal computer systems and online resources are important to their lives and scholarly work the more the media that they see report on them. The New York Times is our media proxy in tracking the number of stories that community members might have seen involving online-related topics over the past three years.
| Table 2. New York Times Stories Involving Information Services | ||||||
| Descriptor Term | 1994 | 1995 | 1996 | 1994 - 1996 | Pct Chg. '94-'95 | Pct Chg. '95-'96 |
| Internet | 66 | 315 | 360 | 741 | 377% | 14% |
| Online Information Services | 0 | 161 | 140 | 301 | NA | -13% |
| World Wide Web | 0 | 112 | 106 | 218 | NA | -5% |
| Information Superhighway | 27 | 12 | 5 | 44 | -56% | -58% |
| Electronic Publishing | 30 | 29 | 24 | 83 | -3% | -17% |
| Computer Networks | 187 | 129 | 46 | 362 | -31% | -64% |
| Source: Periodical Abstracts, using so=New York Times, de=Descriptor Term here, and period=Year given here. | ||||||
Discussions of the Internet soared from 1994 to 1995 and then stayed at a relatively even level of about one story a day. Online Information Services and World Wide Web went from not even being descriptor terms in Periodical Abstracts for 1994 to coverage at about half the rate of the Internet in general in the next two years. These terms seem to have supplanted Computer Networks, which was a significant term in 1994.
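The percentage changes in Table 2 follow directly from the yearly story counts. As a minimal sketch (the function name `pct_change` and the dictionary layout are illustrative assumptions, not part of the Project's tooling), the calculation can be reproduced as follows. Terms with no 1994 stories yield no value, matching the table's NA entries.

```python
# Yearly New York Times story counts copied from Table 2 (1994, 1995, 1996).
STORY_COUNTS = {
    "Internet": (66, 315, 360),
    "Online Information Services": (0, 161, 140),
    "World Wide Web": (0, 112, 106),
    "Information Superhighway": (27, 12, 5),
    "Electronic Publishing": (30, 29, 24),
    "Computer Networks": (187, 129, 46),
}


def pct_change(old, new):
    """Return the percentage change from old to new, or None when old is 0."""
    if old == 0:
        return None  # reported as NA in Table 2
    return 100.0 * (new - old) / old


for term, (y1994, y1995, y1996) in STORY_COUNTS.items():
    first = pct_change(y1994, y1995)
    second = pct_change(y1995, y1996)
    first_text = "NA" if first is None else f"{first:.0f}%"
    second_text = "NA" if second is None else f"{second:.0f}%"
    print(f"{term}: total {y1994 + y1995 + y1996}, {first_text}, {second_text}")
```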
4.2.1.2 Personal Computer Specifications & Pricing Trends
Since the development of personal computers we have seen a continual growth in the quality of the systems on offer and a flat or declining price for the systems recommended for household purchase. In 1997 for the first time, manufacturers have introduced systems priced at around $1,000 that will allow a household to access the Internet smoothly if not with the speed and monitor performance of a system costing twice that much. In June 1997, Gateway 2000 was offering a family-oriented system for $1,499 that was significantly more powerful in almost every parameter than a system priced at $1,999 in May 1996.
Appendix 3 tracks the minimum recommended specifications for home computers given by various writers from May 1994 to April 1997. Summarizing these data by looking at three major factors (CPU, RAM, and hard drive capacity), we see dramatic increases over the past three years. In the earlier years, neither Pentium CPUs nor personal computer hard drives with capacity above 340 MB were even available.
| CPU | ||||
| RAM | ||||
| Hard Drive | ||||
| Price Est. | ||||
| Note: This is an extract from Appendix 3. | ||||
As one might expect given Gateway 2000's leading position in the family personal computer market, its offerings track these recommendations by journalists. As Appendix 4 shows, the personal computer capability available for about $2,000 has escalated since late 1994, our first data point. All of these computers are equipped with CD-ROMs, sound systems, and modems. Summarizing that appendix, we find that a $1,500 computer today is over twice as large and twice as fast as a $2,100 computer thirty months ago.
| CPU | |||||
| RAM | |||||
| Hard Drive | |||||
| Price (+ shipping) | |||||
| Note: This is an extract from Appendix 4. | |||||
4.2.1.3 Household Computer Penetration & Internet Access
Many market research reports estimate the penetration of computers and modems into U.S. households, access to and use of the Internet, and the like over the past few years. Unfortunately, the findings vary considerably for single points in time (see Appendix 5). Data from one source, Find/SVP, are summarized here.
Find/SVP's Emerging Technologies Research Group issued the results of its latest survey in early May 1997. The telephone survey, conducted from February to April 1997, included 1,000 adult current Internet users and 1,000 adult non-users. Its Web site (http://www.findsvp.com/) has a substantive summary of its results. The report also summarizes historical penetration data back to 1994 and makes projections through 2001 in a chart (at http://www.columbia.edu/cu/libraries/digital/texts/forecast/) that tracks PC Households, Modem Households, Internet Households, and Non-PC Internet Access Households (NetTV). According to that chart,
Table 5. U.S. Internet Households
| Year | ||
| 1994 | 3.1 | |
| 1995 | 6.2 | |
| 1996 | 14.7 | |
| 1997 | 21.9 | |
| 1998 | 28.0 | |
| 1999 | 33.0 | |
| 2000 | 36.5 | |
| 2001 | 40.0 |
An early 1997 Baruch College-Harris Poll survey of 1,000 households found 21 percent of U.S. adults (40 million) using the Internet and/or the World Wide Web. This figure is half of all computer users and double the number using the Internet a year ago. An additional 12 percent of respondents use commercial online services.
4.2.2 Columbia Socio-Technical Environment
Columbia infrastructure, penetration of ready access to computing, and amount of time spent in online activities are among the Columbia socio-technical environmental factors that may affect adoption of online books.
4.2.2.1 Campus Infrastructure: February 1997
Columbia's campus infrastructure is similar to that of other universities in its components and in its constant expansion to meet community demand for access to email and other Internet services. Currently, a 10BaseT fiber optic campus network connects 65 buildings and a T3 line connects the campus to the Internet. Over 9,000 ports are connected to the network and over 20,000 computers are registered to community members. All fifteen undergraduate residence halls are pre-wired; the residence hall network has over 4,500 ports. Our modem pool is constantly growing to serve demand; 298 modems with SLIP/PPP support now handle over 52,000 calls in a typical week. Email servers managed over 442,000 email messages in 1996. The campus has 366 public workstations, kiosks, and lab computers; all are connected to the network.
4.2.2.2 Community Perceptions of Access To Computing Resources
"Is there a computer (in the library or elsewhere) attached to the campus network (directly or by modem) that you can use whenever you want?" is one of two constant questions on our various questionnaires. The most recent responses to that question came in the Libraries' onsite user survey in March 1997.
| Cohort | Sample Size | Responding YES |
| Faculty Member | 44 | |
| Doctoral Student | 468 | |
| Masters Student | 611 | |
| Undergraduate | 1,065 |
In Fall 1995, we cooperated with the Office of the Provost in conducting a campus computing survey. The initial means of distributing this survey was an "opinion festival" in the rotunda of the main administration building. This festival was billed primarily as a food tasting; it attracted many students and few faculty members. The computing survey garnered 414 student responses - 125 graduate students and 289 undergraduate students spread fairly well across the four classes. To amplify the graduate student and faculty counts we did follow-up mailings - to a sample of 2,000 graduate students and all faculty members. Responses were modest in number and quite skewed by department, especially for the faculty survey, so these data are unlikely to be reliable.
The share of Columbia community members reporting ready access to a network-linked computer (the same question asked in the onsite library survey) by cohort is as follows.
Table 7. Fall 1995 Campus Survey: Is there a computer (in the library or elsewhere) attached to the campus network (directly or by modem) that you can use whenever you want?
| Cohort | ||
| Faculty Member | 143 | |
| Graduate Student | 301 | |
| Senior | 88 | |
| Junior | 71 | |
| Sophomore | 76 | |
| Freshman | 54 |
With such small sample sizes for the undergraduate cohorts, there is no significant relationship between the shares reporting such computer access and level of study.
About 72 percent of undergraduates, 80 percent of graduate students, and 85 percent of faculty members responded Yes in this survey to the question "Do you have your own computer in your residence?" That these values are higher than those for the access question may reflect that some of the students do not have modems or network cards in their computers or do not use them. Questions asking for details about the power of these computers and the degree to which they have communications hardware were not answered fully.
4.2.2.3 Community Use of Online Resources
A related question that we ask on all of our questionnaires regards time spent on online activities. For the 1996 and 1997 onsite library surveys, this was phrased as "On average this semester, how many hours per week do you spend in online activities (Email, Listservs & Newsgroups, CLIO Plus, Text, Image or Numeric Data Sources, Other WWWeb Uses)?" The respondent was instructed to write a value in the blank provided.
The following table gives a grouping of the distribution of the total responses to this question in 1997 in column 2, of the responses by those who claimed easy access to computers with online access in column 3, and of the responses by those who said that they did not have such access in column 4.
| Hours/Week | All Respondents | Ready Access to Networked Computer | No Ready Access |
| 0 | 2.9% | 1.4% | 5.6% |
| 1-3 | 46.4% | 45.2% | 49.8% |
| 4-6 | 23.0% | 23.8% | 21.7% |
| 7-9 | 6.1% | 6.1% | 7.5% |
| 10-12 | 11.7% | 12.4% | 9.3% |
| 13-15 | 3.2% | 3.8% | 1.2% |
| 16-18 | 0.4% | 0.4% | 0.2% |
| 19-21 | 3.7% | 4.0% | 2.8% |
| 22-28 | 0.6% | 0.4% | 1.2% |
| 29-35 | 1.2% | 1.6% | 0.2% |
| More than 35 | 0.8% | 0.9% | 0.5% |
Even those who answered No to the previous question, i.e., they do not feel that they can use a computer attached to the campus network whenever they want, report spending substantial time on online activities each week (column 4 data). The mean number of weekly hours in online activities among those who reported any such use was 5.8 hours; the greatest amount reported was 60 hours (8 respondents).
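The grouping of write-in responses into the hours-per-week bands used in these tables can be sketched as below. This is an illustration only: the function names `hours_band` and `distribution`, the rounding of fractional answers to whole hours, and the sample responses are all assumptions, not the Project's actual tabulation procedure or data.

```python
from collections import Counter

# Inclusive upper bound and label for each band used in the tables.
BANDS = [
    (0, "0"), (3, "1-3"), (6, "4-6"), (9, "7-9"), (12, "10-12"),
    (15, "13-15"), (18, "16-18"), (21, "19-21"), (28, "22-28"), (35, "29-35"),
]


def hours_band(hours):
    """Map a reported number of weekly hours to its band label."""
    whole = round(hours)  # assumption: fractional answers are rounded
    if whole < 0:
        raise ValueError("hours cannot be negative")
    for upper, label in BANDS:
        if whole <= upper:
            return label
    return "More than 35"


def distribution(responses):
    """Return the percentage share of responses falling in each band."""
    counts = Counter(hours_band(h) for h in responses)
    total = len(responses)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical responses, for illustration only (not survey data).
sample = [0, 2, 3, 5, 10, 20, 60, 1]
shares = distribution(sample)
print(shares)
```

Running `distribution` separately on respondents who answered Yes and No to the access question would give the per-group columns in the same way.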
Another way to look at these data is to group the responses by Columbia status of the respondent. This is done below for the four major scholarly components of the community. The cohorts include only those individuals who provided status information. Time spent in online activities was quite consistent across cohorts within the Columbia community; differences among cohorts were not statistically significant.
| Hours/Week | ||||
| 0 | 2.1% | 2.5% | 1.7% | 6.7% |
| 1-3 | 49.4% | 44.8% | 44.2% | 33.3% |
| 4-6 | 22.4% | 23.6% | 23.1% | 26.7% |
| 7-9 | 6.7% | 6.2% | 6.3% | 0.0% |
| 10-12 | 10.2% | 12.9% | 13.8% | 17.8% |
| 13-15 | 2.6% | 3.4% | 3.8% | 8.9% |
| 16-18 | 0.4% | 0.5% | 0.6% | 0.0% |
| 19-21 | 3.3% | 3.9% | 3.6% | 4.4% |
| 22-28 | 0.4% | 1.1% | 0.6% | 0.0% |
| 29-35 | 1.3% | 0.9% | 1.7% | 1.7% |
| More than 35 | 1.3% | 0.3% | 0.6% | 0.0% |
| Mean | 5.7 | 5.9 | 6.3 | 6.5 |
Differences in reporting make comparison with the 1996 results difficult, but it appears that average weekly hours online increased modestly from winter 1996 to winter 1997.
4.3 Findings On Use Of Books In Online Collection
At this point we will report on (1) trends in use of the CNet and CWeb books; (2) user location and cohort as suggested by host computer address; (3) distribution of use by day of week and time of day; (4) patterns of hits per Web session involving online books for two weeks' use and for the overall use of three social work titles; and (5) use of the online books by individuals from March 15 to May 31, 1997. Summarized below are findings in these areas for the various groups of books.
4.3.1 Reference Books
4.3.1.1 Total Use Over Time
Three reference works have been available online long enough to have generated substantial usage data. These are The Concise Columbia Electronic Encyclopedia, Columbia Granger's World of Poetry, and The Oxford English Dictionary. The three Garland titles have been online only since the turn of the year or later, so our usage data for these titles are very short term. All three Garland titles are accessible through both CNet and CWeb.
As of the time of this writing, CWeb usage data extended only through March 14, 1997 on a monthly basis. With the exception of Columbia Granger's World of Poetry, usage (number of hits and unique users) from March 15 to May 31, 1997 was reported as a single number. No data are available for Granger's after March 14th. In the CWeb data reported below, the early March data are included with the newer data to give one value for the three-month period of March to May.
4.3.1.1.1 Concise Columbia Electronic Encyclopedia
The Concise Encyclopedia remains on the older CWIS-gopher platform CNet. Usage declined 84 percent over the past three years, from 1,551 sessions in April 1994 to 250 sessions in April 1997. Usage has declined most in the current academic year; 7,861 sessions were registered from September 1995 to May 1996 and 2,941 sessions (63% fewer) from September 1996 to May 1997.
Potential reasons for this steep decline include:
Columbia scholars seldom use the print copy of the Concise Encyclopedia, which resides behind the Reference desk. Its larger cousin, which is out in the public area, sees much greater use. We plan to put that longer, one-volume CUP encyclopedia online on CWeb this year. Its use patterns will be instructive.
4.3.1.1.2 Columbia Granger's World of Poetry*
Columbia Granger's World of Poetry is available on both CNet and CWeb. The CNet version is a Lynx (non-graphical Web browser) presentation of the CWeb version. This resource, which became available to the community in online form in October 1994, locates a poem in an anthology by author, subject, title, first line, or keywords in its title or first line. In addition, it provides easy access to the 10,000 most often anthologized poems. As the following table shows, total usage declined from 1996 to 1997 - by 49 percent from the first quarter of 1996 to the first quarter of 1997. Even so, the 4,289 hits recorded in 1996 are considerable.
Reference librarians report no more than a handful of uses of the print version of Granger's each year; it is kept behind the main reference desk and lacks the database of poems. The CD-ROM version, which is kept in the Electronic Texts Service, has the same functionality as the online version; it is used once or twice a month on average.
| Jan. | 0 | 222 | 91 | 18 | 0 | 466 | 150 | 222 | 557 | 168 | 151% | -70% | |||
| Feb. | 0 | 204 | 137 | 31 | 0 | 282 | 312 | 204 | 419 | 343 | 105% | -18% | |||
| Mar.* | 0 | 292 | 96 | 41 | 0 | 465 | 236 | 292 | 561 | 277 | 92% | -51% | |||
| April | 0 | 199 | 73 | 34 | 0 | 278 | NA | 199 | 351 | NA | 76% | NA | |||
| May | 0 | 134 | 35 | 17 | 682 | 199 | NA | 816 | 277 | NA | -66% | NA | |||
| June | 0 | 81 | 30 | 695 | 102 | NA | 776 | 239 | NA | -69% | NA | ||||
| July* | 0 | 80 | 71 | 550 | 383 | 630 | 464 | -26% | |||||||
| Aug. | 0 | 78 | 53 | 767 | 27 | 845 | 83 | -90% | |||||||
| Sept. | 0 | 76 | 58 | 596 | 179 | 672 | 238 | -65% | |||||||
| Oct. | NA | 162 | 84 | 863 | 262 | 1,025 | 348 | -66% | |||||||
| Nov. | 311 | 114 | 50 | 800 | 413 | 914 | 465 | 194% | -49% | ||||||
| Dec. | 207 | 68 | 28 | 725 | 257 | 793 | 287 | 283% | -64% | ||||||
| Total | NA | 1,710 | 806 | 5,678 | 3,483 | 6,758 | 4,289 | NC | -37% | ||||||
| Note: * July 1995 CNet hits are estimated. CWeb data are available through March 15, 1997 only; this estimated value is twice the actual count. | |||||||||||||||
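The first-quarter decline cited for Granger's can be checked from the monthly combined CNet and CWeb totals in the table. A minimal sketch, assuming that the three numeric columns immediately before the percentage columns are the combined totals for 1995, 1996, and 1997 (the column headings are not preserved in this copy), and bearing in mind that the March 1997 figure is itself an estimate:

```python
# Combined CNet + CWeb hits for January through March, copied from the table.
# The March 1997 value is an estimate (see the table note).
Q1_TOTALS = {
    1996: [557, 419, 561],
    1997: [168, 343, 277],
}

q1_1996 = sum(Q1_TOTALS[1996])
q1_1997 = sum(Q1_TOTALS[1997])
decline_pct = 100.0 * (q1_1996 - q1_1997) / q1_1996

print(f"Q1 1996: {q1_1996} hits, Q1 1997: {q1_1997} hits")
print(f"Decline: {decline_pct:.0f} percent")
```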
4.3.1.1.3 The Oxford English Dictionary
At this time, The Oxford English Dictionary is the most heavily used reference work in our collection. As noted earlier, it is available on both CNet and CWeb, with the former format having greater functionality but being quite opaque. Users find the latter attractive and easy to use, but it only permits them to look up a definition or browse through the contents.
Usage of the CNet version dropped 59 percent from the fourth quarter of 1994 (2,856 hits) to the first quarter of 1997 (1,167 hits). The CWeb version attracted greater use than the CNet version from its first months. Total usage of the resource was greater with the two versions in place than with only CNet, by 55 percent in February 1997 versus February 1995.
| Jan. | |||||||||||
| Feb. | |||||||||||
| Mar. * | |||||||||||
| April | |||||||||||
| May | |||||||||||
| June | |||||||||||
| July * | |||||||||||
| Aug. | |||||||||||
| Sept. | |||||||||||
| Oct. | |||||||||||
| Nov. | |||||||||||
| Dec. | |||||||||||
| Total | |||||||||||
| Note: * July 1995 CNet usage is estimated, as the true value was unavailable.
# March - May 1997 hits; these data are somewhat under-counted as The OED was not included in the user-identified data set initially and as one form of bookmarked access was not included for the whole period. The OED became available on CNet in August 1994, but usage data are available back to October 1994 only. | |||||||||||
Columbia College has a one semester Logic and Rhetoric course that is required of all its students (about 1,000 each year). Students in this course must complete an assignment involving the OED and are encouraged to use an online version. That assignment occurred in October 1996 and mid-February to early March 1997. In the period preceding mid-March 1997, almost 42 percent of the hits (1,531) on the CWeb OED came from computers in dormitory rooms, suggesting that students are using this resource. This conclusion is confirmed by the analysis of the data by user in the period beginning in mid-March; see section 4.3.4.
Observation and reshelving activity show that scholars frequently use the print copy. However, statistics on use are unavailable as scholars have direct access to several sets in libraries around campus and have not been cooperative in recording use of volumes. In addition, scholars often own their own copies of the compact edition of The OED. Finally, some serious scholars use the CD-ROM version in the Libraries' Electronic Text Service, which allows refined searches with a search engine that is more attractive and user friendly than that in CNet.
4.3.1.1.4 Garland Reference Works
Garland's Chaucer Name Dictionary was added to the CWeb collection at the end of 1996. Native American Women was added in January 1997 and African American Women in February 1997. The first two were added to the CNet collection in February 1997 and the third in March 1997.
| Chaucer Name Dictionary | African American Women | Native American Women | |||||||
| Dec. '96 | 28 | NA | 28 | NA | NA | NC | NA | NA | NC |
| Jan. '97 | 62 | NA | 62 | 8 | NA | 8 | 60 | NA | 60 |
| Feb. | 107 | 15 | 122 | 26 | NA | 26 | 107 | 11 | 118 |
| March | ND | 8 | NC | 31 | 7 | ||||
| April | #72 | 7 | #90 | #90 | 10 | #139 | #63 | 4 | #77 |
| May | 3 | 8 | 3 | ||||||
| Total | 269 | 33 | 302 | 124 | 49 | 173 | 230 | 25 | 255 |
| Note: # March - May 1997 hits. NA - Resource was not available. ND: Data are not available.
NC - Not Calculable. | |||||||||
CWeb is a far more popular means of access to these resources than CNet. Although Chaucer Name Dictionary and Native American Women were both available on CNet from February 3rd, their usage on CNet in February was only 10 to 15 percent of that on CWeb. The Libraries' print copies of these reference books are lightly used, so these hits signify substantial expansion of use of these books.
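The 10 to 15 percent range can be checked against the February 1997 row of the table for the two titles that have February CNet counts there (Chaucer Name Dictionary and Native American Women). A minimal sketch, assuming the first two numbers in each title's group of three are CWeb and CNet hits respectively (the sub-column headings are not preserved in this copy):

```python
# February 1997 hits copied from the table: (CWeb, CNet).
FEBRUARY_HITS = {
    "Chaucer Name Dictionary": (107, 15),
    "Native American Women": (107, 11),
}

# CNet hits expressed as a percentage of CWeb hits for each title.
cnet_share = {
    title: 100.0 * cnet / cweb
    for title, (cweb, cnet) in FEBRUARY_HITS.items()
}

for title, share in cnet_share.items():
    print(f"{title}: CNet usage was {share:.0f} percent of CWeb usage")
```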
4.3.1.2 Host Computers for Reference Book Use
A user location analysis acts as a proxy for user cohort for the early use data. We have grouped host computers into the following ten categories.
cul - computers in the libraries
cunix - in general on campus computers linked directly to a cunix server, also now the host computer for Granger's
cupress - computers at CUP
dialup - computers connected by dialup modem
english - computers in the English department
pols - computers in the Political Science department
rhno - computers on the residence hall network
sipa - computers at the School of International and Public Affairs
ssw - computers in offices and labs at the School of Social Work
other - computers at all other Columbia locations
The distribution of use of the five reference works supplied via CWeb across these categories is shown below. With the exception of the three Garland books, a very small share of the uses of these reference works occurs on computers in the libraries; the Columbia community is taking advantage of the out-of-library access to these resources. As noted earlier, a large share of the use of The OED occurs from students' on-campus residences (rhno host computers).
| Host Computer Type | |||
| cc | |||
| cul | |||
| cunix* | |||
| cupress | |||
| dialup | |||
| english | |||
| pols | |||
| rhno | |||
| sipa | |||
| ssw | |||
| other | |||
| Notes: * In the later part of this period, a Cunix server was given as the host computer for all uses of Granger's.
** Less than .5% | |||