The Journal of American History

Web Site Review

This review appears in the September 2005 issue of the JAH, 707–708.

Digital Early American Imprints, Series I. Evans (1639–1800) (access by subscription). Readex, a Division of Newsbank, Inc. <http://www.readex.com>. Reviewed Jan. 24, 2005, to March 31, 2005.

The online Evans digital collection, part of Readex's Archive of Americana, consists of more than 2.3 million page images of microform copies of more than 37,000 books, pamphlets, and broadsides: every known American imprint prior to 1800, as listed in Charles Evans's American Bibliography (14 vols., 1903–1959). The online Evans series includes neither books by American authors printed overseas nor overseas imprints popular in America, so it is not a complete body of what early Americans read, but it does contain a large portion of it. An optical character recognition (OCR) engine has been used to scan the microforms and translate the page images into ASCII text. The OCR engine provides the main feature of the collection, the ability to perform full-text searches on the complete corpus. The collection is expensive, costing initially somewhere between $20,000 and $100,000 depending on the size of the subscribing institution, with an annual maintenance fee of $2,000.

The digital Evans series has been hailed as revolutionary and democratizing by some of those fortunate enough to be able to afford it. Others—including participants in H-Net's colonial and early American online discussion lists—have described it as part of the twenty-first-century theft of the commons. In my own case, the pricing structure of the Evans series and the other Archive of Americana offerings moves it well out of the range of the possible for a publicly funded state research university in the middle of the Pacific Ocean with only two faculty members working on anything related to early America. Without some gracious (and thus far imaginary) donor, our library simply cannot afford the price of such "democratization" within current budgetary constraints. To be fair, the Evans collection prices out to a little more than two dollars per title for our university. Just the same, finding more than $80,000 (the quoted price, including the trade-in value of the university's Evans series microcards) during a time when the university is losing and not replacing many humanities faculty members is a dim prospect.

Alternative pricing structures that take into account how many people would use the resource rather than the size of the institution would move the digital Evans series—and the rest of the Archive of Americana, once it is ready—more toward being democratizing in any serious sense of the word. For example, I could see writing a grant for a few hundred or even thousand dollars for single-user access to the resource for a week, a month, or some other set period of time. Such access could be priced about the same as a research trip to an archive.

The digital Evans series has some problems under the hood. First and most important is the somewhat erratic OCR engine. A search for "creole" gave back "[in-]crease," "credible," "people," "seele," "groote" (the latter two being German words), "credit," "[illegible]," "creek," "Greele" (a surname), "geese," and "treble," along with 11 instances of "creole," among the first twenty or thirty of 136 results. Such generous interpretations mean that few instances of "creole" will be missed, but there are a lot of false hits, so any statistical usage of the results would need to be carefully and laboriously checked for validity. The inaccuracy of the OCR is most likely why the promised (and powerful) feature of making the underlying text available in ASCII form is missing. Any findings must be transcribed by hand rather than being copied and pasted. A second problem is that sessions time out without being saved. If one wants to explore the 8,670 instances of the word "freedom," she can forget about lunch unless it is at the computer. After fifteen minutes or so the session times out, saving neither results nor searches. A new search must be started, and the reader has to navigate back to the place where she left off, a time-consuming process. While complex Boolean searching is a boon, it is advisable to save search strings in a separate notepad in order to retrace one's path through the results if one wants to replicate a timed-out search exactly. Even then, the limits on an advanced search have to be reset each time, introducing a window for errors. While still much quicker than combing through microcards, the interface is slow, even on a fast connection.

Perhaps such a critical assessment of the digital Evans series sounds cranky, but Readex makes big claims and costs a lot, setting expectations high. Despite its problems, it would be revolutionary if made more accessible. The ability to do full-text searches on such a large corpus of materials changes the nature of what kind of historical research is possible, opening up avenues of inquiry that were only imaginable a few years ago. As it stands, though, the digital Evans series is a revolution for the privileged.

Richard Cullen Rath
University of Hawaii
Honolulu, Hawaii