Prosperity and Thrift Home Page

Building the Digital Collection

Project Phases | Documents | Photographs | Motion Pictures | Sound Recordings

Project Phases

The production of Prosperity and Thrift: The Coolidge Era and the Consumer Economy, 1921-1929 proceeded in two phases. The initial phase occurred in 1994-95 as part of the American Memory pilot project. Two factors then interrupted production: the need to determine the best course of action for the inclusion of unpublished or copyrighted content, and the migration of the Library of Congress's electronic collections to the new medium of the World Wide Web. In 1998-99, production resumed. The legal advisor for the National Digital Library Program (NDLP) led an analysis of the copyright issues, and the modifications necessary for display in the Web environment were made. The technical descriptions that follow refer to work carried out in these two distinct periods.

Documents

The documents included in Prosperity and Thrift comprise manuscripts and printed matter. Each document is reproduced as a set of facsimile page images and a searchable text. For 104 documents, the accompanying searchable text consists of a transcription of the full text, while in 178 cases, the accompanying guide text consists of a summary or abstract. In some instances, the guide text includes a table of contents for the work. The selection of items for full transcription was based on several criteria, including assessments of the importance of the document to the collection, the value of a searchable text for user navigation, and the cost of production. In general, longer works (books with more than one hundred pages) were favored for transcription. Issues of periodicals, however, many of which exceed one hundred pages, were not transcribed. The project budget and production time frame did not permit the development of an approach to transcription that accommodated the complex layout of 1920s magazines, in which numerous articles and advertisements intertwine.

The documents were digitized by Systems Integration Group of Lanham, Maryland. The facsimile images were made first, followed by the transcription and encoding of the texts selected for full transcription. The image capture occurred at the Library and, in order to preserve the originals, bound works were scanned in their bindings. The preparation of the searchable texts occurred offsite, where a subcontractor rekeyed the documents from the page images. The texts have a transcription accuracy of about 99.95 percent and are marked up with SGML (Standard Generalized Markup Language). The online presentation offers access to the SGML versions of the texts for users with appropriate viewing software. The SGML markup produced in 1994-95 used the then-current version of the American Memory document type definition (DTD). This DTD was modified in 1996, however, and NDLP staff upgraded the marked-up texts in 1999. The online presentation of the texts also includes a version in HTML (HyperText Markup Language). This version was produced by the Library in an automated process. Since no special software is required, the HTML version is easier for most users to access. The guide texts were produced by NDLP staff in 1998-99 and exist only in HTML form.

Although a few document images were produced in 1998-99, most of the collection's documents were scanned in 1994-95 using the Xerox (Kurzweil) K5200 scanner. Like certain specialized photocopiers, this scanner has a book-edge design that requires a bound volume to be inverted and positioned along a beveled edge for scanning. Only one page is scanned at a time. The master or archival version of the images for most pages that consist of typography and line art is a 300 dpi bitonal image in the TIFF format, using ITU Group IV compression. Higher resolutions were not considered because there were no plans to produce new paper copies of these books. In any case, the exigencies of scanning the volumes in their bindings precluded the creation of high-resolution bitonal images of the type associated with digital preservation reformatting projects carried out in some university libraries.
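As a rough illustration of the master image format described above, the following Python sketch estimates the pixel dimensions and uncompressed size of a 300 dpi bitonal page. The 8.5 x 11-inch page size and the function name are assumptions made for the example, not figures from the project. Group IV compression would normally make the stored TIFF file much smaller than the uncompressed figure computed here.

```python
import math

def bitonal_page_size(width_in, height_in, dpi=300):
    """Return (width_px, height_px, uncompressed_bytes) for a 1-bit scan.

    The page dimensions passed in are hypothetical; only the 300 dpi
    bitonal format comes from the text.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # At 1 bit per pixel, 8 pixels fit in a byte; each row is padded
    # to a whole number of bytes, as in uncompressed TIFF data.
    bytes_per_row = math.ceil(width_px / 8)
    return width_px, height_px, bytes_per_row * height_px

# Assumed letter-size page, 8.5 x 11 inches.
print(bitonal_page_size(8.5, 11.0))  # (2550, 3300, 1052700)
```

At roughly one megabyte per uncompressed page, the choice of a compressed bitonal format for typographic pages is easy to understand.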

In addition to the master images of typographic or line-art pages, the Library created additional image types to reproduce printed halftone illustrations. Printed halftones present special problems that result from interference between the spatial frequency of the halftone dot pattern and the spatial frequency applied by scanning and/or output devices. When the two frequencies combine, the interference between them manifests itself as moiré patterns that degrade the image.
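The interference described above can be illustrated with a simplified two-grating model: when two periodic patterns are multiplied together, the result contains a component at the difference of their frequencies, and that slow component is what the eye sees as moiré. The sketch below is a textbook idealization, not a model of any particular scanner, and the frequencies used (133 and 150 cycles per inch) are hypothetical values chosen only for the example.

```python
import math

def beat_frequency(f1, f2):
    """Difference frequency produced when two periodic patterns combine."""
    return abs(f1 - f2)

def beat_period(f1, f2):
    """Spacing of the moire bands, in the same length unit as 1/f."""
    diff = beat_frequency(f1, f2)
    return math.inf if diff == 0 else 1.0 / diff

def product_of_gratings(f1, f2, x):
    """Two cosine gratings multiplied at position x."""
    return math.cos(2 * math.pi * f1 * x) * math.cos(2 * math.pi * f2 * x)

def sum_and_difference(f1, f2, x):
    """The same signal rewritten as difference and sum frequency terms."""
    return 0.5 * (math.cos(2 * math.pi * (f1 - f2) * x)
                  + math.cos(2 * math.pi * (f1 + f2) * x))

# Hypothetical halftone and device frequencies, in cycles per inch.
print(beat_frequency(150, 133))  # 17
```

In this idealized case the bands repeat 17 times per inch, far coarser than either source pattern, which is why they are so visible.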

Most of the images of printed halftones in books were produced on the Xerox K5200 scanner, which offers a diffuse-dithering algorithm that randomizes the scanner's pattern of dots to produce bitonal images in which the moiré pattern is suppressed or reduced. The algorithm adds speckles to white areas surrounding an illustration, however, adversely affecting the legibility of captions or other typography included in the same scan as the illustration. When the Xerox K5200's diffuse-dithering treatment is applied, the software creates files in the PCX format. Some of these files were later migrated to the TIFF format in an automated process. The Library's diffuse-dithered random-dot-pattern images can be printed on a laser printer with good results, but do not rescale well for screen display.
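The text does not document how the K5200's diffuse-dithering algorithm works internally. As a stand-in, the sketch below shows a generic random-threshold dither, one simple way to produce a bitonal image whose dot pattern is randomized rather than regular. The function name, the seed parameter, and the pixel convention (0 to 255 gray in, 1 for white and 0 for black out) are all assumptions for illustration.

```python
import random

def random_dither(gray_rows, seed=0):
    """Convert 0-255 gray pixels to bitonal using a random threshold per pixel.

    Generic technique, not the scanner's documented algorithm.
    Returns rows of 1 (white) and 0 (black).
    """
    rng = random.Random(seed)
    # Thresholds fall in 0..254, so pure white (255) always stays white
    # and pure black (0) always stays black.
    return [[1 if px > rng.randrange(255) else 0 for px in row]
            for row in gray_rows]

# A mid-gray patch comes out as a random mix of black and white dots.
patch = random_dither([[128] * 8 for _ in range(4)])
```

In a scheme like this, any pixel that is slightly darker than pure white has a small chance of being set to black. That is one plausible source of the speckling around illustrations mentioned above, though the text does not state the actual cause.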

Since the American Memory pilot team saw that the randomization approach was only partially successful in reproducing printed halftones, the 1994-95 production phase of Prosperity and Thrift offered an early opportunity to test grayscale scanning of printed matter. Thus a few items are reproduced in grayscale JPEG images. When this approach was adopted for this collection, the entire work, including exclusively typographic pages, was scanned in grayscale.

The Xerox K5200 scanner does not accept pages over 8.5 inches in width. Larger pages, including many of the periodical issues in the collection, were scanned on other tabletop scanners. The particular scanners available for the project, however, had no effective means for treating printed halftones. For this reason, the illustrations within many of the periodicals are less successfully reproduced than those in the books. It is worth noting that American Memory scanning projects from 1995 forward have used better methods for reproducing oversize pages.

The browser display images for all document pages are in the GIF format. The NDLP staff produces these images by processing batches of the master or archival images. When bitonal images are being processed, gray tones are added and the resulting image is blurred to mimic grayscale. Then the image is reduced in scale to fit the typical display monitor and sharpened to enhance legibility. When the source image is grayscale, only rescaling and sharpening are carried out.
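The derivation steps for bitonal sources (add gray tones, blur, reduce, sharpen) can be sketched in plain Python as follows. This is a minimal illustration, not the NDLP's batch process: the 3 x 3 box blur, the block-average reduction, the unsharp-style sharpening, and every function name and parameter are assumptions, and a real workflow would use an imaging library rather than nested lists.

```python
def to_gray(bitonal_rows):
    """Map bitonal pixels (0 = black, 1 = white) onto a 0-255 gray scale."""
    return [[255 * px for px in row] for row in bitonal_rows]

def box_blur(rows):
    """Average each pixel with its 3x3 neighbourhood, clipped at the borders."""
    h, w = len(rows), len(rows[0])
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            window = [rows[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out

def reduce_by(rows, factor):
    """Shrink the image by averaging each factor x factor block of pixels."""
    h, w = len(rows) // factor, len(rows[0]) // factor
    return [[sum(rows[y * factor + j][x * factor + i]
                 for j in range(factor) for i in range(factor)) / factor ** 2
             for x in range(w)]
            for y in range(h)]

def sharpen(rows, amount=0.5):
    """Unsharp-style sharpening: push each pixel away from its blurred value."""
    blurred = box_blur(rows)
    return [[min(255.0, max(0.0, px + amount * (px - b)))
             for px, b in zip(row, brow)]
            for row, brow in zip(rows, blurred)]

def display_image(bitonal_rows, factor=2):
    """Bitonal master -> blurred gray -> reduced -> sharpened display image."""
    return sharpen(reduce_by(box_blur(to_gray(bitonal_rows)), factor))

# A 4x4 checkerboard becomes a 2x2 image of intermediate grays.
checker = [[(x + y) % 2 for x in range(4)] for y in range(4)]
small = display_image(checker)
```

For a grayscale source, only the `reduce_by` and `sharpen` steps would apply, matching the shorter path described in the text.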

Photographs

The digital reproductions of the 185 photographs in Prosperity and Thrift were produced by scanning 8 x 10-inch negatives that represent copies of prints in the collections of the Library's Prints and Photographs Division. During the 1994-95 production phase, a set of digital images was produced from these negatives with a nominal spatial resolution of 1024 x 768 pixels. From this original round of scanning, two images remain: the portraits of presidential physician Joel Boone and Federal Reserve Bank governor Charles Hamlin, both copied from the collections of the Manuscript Division. The remaining 183 images, from the holdings of the Prints and Photographs Division, have been replaced by higher-resolution scans. The replacement images resulted from a large-scale digitization effort undertaken by the Division during 1997-99 that has seen the systematic rescanning of about 130,000 copy negatives.

The images for the Prints and Photographs Division project were produced by JJT Incorporated of Austin, Texas, in two production series. The first series of about 100,000 items was scanned from full-frame images on 35 mm film, what might be called "re-copy" negatives. These long rolls of 35 mm motion-picture film served as intermediates in a videodisc project in 1990-91. The images on the film are negative because a reversal film stock was used when the 8 x 10-inch negatives were copied. The 35 mm film was used as a source for digitization because the process is very efficient and relatively inexpensive compared to rescanning the 8 x 10-inch negatives. But the use of multiple generations of analog film meant that some quality had been lost. A series of tests established that a scanning resolution of about 1500 x 1000 pixels extracted everything that these 35 mm frames had to offer, and this is the spatial resolution for the master files in this series. The JJT team captured some of these images using a Kontron (ProgRes) digital camera and others with the new MARC II camera system described below. The JPEG compressed images provided for service have a nominal resolution of 640 x 480 pixels.

Images in the first Prints and Photographs scanning series carry digital identifiers ("item numbers") that begin 3a and 3b. About thirty of the Prosperity and Thrift selections fall in this series.

The Prints and Photographs Division's second series of about 30,000 images is being scanned directly from the 8 x 10-inch negatives. These are copy negatives that were produced after the end of the videodisc project and for which no 35 mm copies exist. The JJT team is using its new overhead-capture MARC II digital camera and is producing master images with a nominal resolution of 4000 x 3200 pixels. The camera's initial capture is at 12 bits per pixel; image processing is then executed at 16 bits per pixel. When the images are saved for delivery, the tonal resolution is reduced to 8 bits per pixel. There are two JPEG service images in this series: one at a nominal 640 x 480 pixels and a second at 1024 x 768. These are created by JJT in a post-processing step.
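The changes in tonal resolution described above (12 bits at capture, 16 bits during processing, 8 bits on delivery) can be illustrated with a simple linear rescaling between bit depths. The text does not say how the MARC II software performs these conversions, so the function below is only a generic sketch with a hypothetical name. The size calculation also assumes one 8-bit sample per pixel, which the text implies but does not state outright.

```python
def rescale_depth(value, from_bits, to_bits):
    """Linearly map a sample from one bit depth to another, with rounding."""
    from_max = (1 << from_bits) - 1
    to_max = (1 << to_bits) - 1
    # Integer arithmetic with round-half-up avoids floating-point surprises.
    return (value * to_max + from_max // 2) // from_max

# Full-scale white survives each step of the 12 -> 16 -> 8 bit chain.
captured = 4095                                # brightest 12-bit value
processed = rescale_depth(captured, 12, 16)    # 65535
delivered = rescale_depth(processed, 16, 8)    # 255

# Uncompressed size of a 4000 x 3200 master, assuming 1 byte per pixel.
master_bytes = 4000 * 3200 * 1                 # 12,800,000 bytes
```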

Images in the second Prints and Photographs scanning series carry digital identifiers ("item numbers") that begin 3c. About 150 of the Prosperity and Thrift selections fall in this series.

Motion Pictures

The original motion pictures included in Prosperity and Thrift are 35 mm prints, which were transferred to BetaSP videotape at Roland House in Arlington, Virginia. In the video mastering process, the playback speeds were adjusted to present the appearance of natural motion to the greatest degree possible. The film transfers for Flash Cleaner, Warner Corsets, and Onward Flour were made at 16 frames per second (fps); for Buy an Electric Refrigerator at 18 fps; and for Visitin' Around Coolidge Corners at 22 fps. The videotapes were then sent to Crawford Multimedia in Atlanta, Georgia, which made the MPEG and QuickTime digital files. The MPEG files have a spatial resolution of 320 x 240 pixels and run at a nominal rate of 30 fps, i.e., the frame rate of video. The QuickTime files have a resolution of 160 x 120 pixels and run at 10-12 fps; some video frames were omitted to keep the segments to the proper time and speed. In 1999, NDLP staff created a set of RealMedia streaming video files for the collection by reprocessing the QuickTime files.
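To show why a slow film speed can still be carried on 30 fps video, the sketch below maps each video frame to the film frame that would be on screen at that moment, so that some film frames are simply held for more than one video frame. The text does not describe the actual transfer technique used at Roland House, so this nearest-earlier-frame mapping, along with the function names, is an assumption. The second function illustrates dropping video frames to reach a lower rate such as 10 fps.

```python
def film_frame_for_video_frame(n, film_fps, video_fps=30):
    """Index of the film frame shown during video frame n (both 0-based)."""
    return n * film_fps // video_fps

def keep_every(frames, step):
    """Drop frames by keeping only every step-th one."""
    return frames[::step]

# One second of video (30 frames) for a film transferred at 16 fps.
one_second = [film_frame_for_video_frame(n, 16) for n in range(30)]
distinct_film_frames = len(set(one_second))   # 16 film frames in 30 video frames

# Keeping every third video frame yields 10 frames per second.
reduced = keep_every(list(range(30)), 3)
```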

In order to facilitate downloading by American Memory users, the file sizes of the best quality (and hence largest) MPEG versions have been limited to 40 megabytes or less, which represents a running time of about four minutes. Since Visitin' Around Coolidge Corners has a duration of a little more than six minutes, it has been divided into segments with running times of 2:19 and 3:50 minutes.
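The figures above imply an average data rate for the MPEG files, and the two segment lengths can be checked against the stated total duration. The short sketch below does both. It assumes a megabyte of 1,000,000 bytes (the text does not say which convention applies), and the helper names are hypothetical.

```python
def to_seconds(mm_ss):
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)

def average_bitrate(size_bytes, duration_s):
    """Average bits per second for a file of the given size and duration."""
    return size_bytes * 8 / duration_s

MEGABYTE = 1_000_000  # assumption: decimal megabytes

total = to_seconds("2:19") + to_seconds("3:50")
rate = average_bitrate(40 * MEGABYTE, 4 * 60)

print(f"{total // 60}:{total % 60:02d}")   # 6:09
print(round(rate / 1000), "kbit/s")        # 1333 kbit/s
```

The two segments together run 6:09, consistent with "a little more than six minutes", and a 40-megabyte file lasting four minutes averages about 1.3 megabits per second under the stated assumption.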

Sound Recordings

The seven sound recordings included in Prosperity and Thrift are a set of Calvin Coolidge speeches selected from the online collection American Leaders Speak: Recordings from World War I and the 1920 Election. In some cases, multiple takes of the same speech have been included. The original recordings are 78 rpm phonograph records. These were copied to digital audio tape (DAT) in the Recording Laboratory of the Library's Motion Picture, Broadcasting, and Recorded Sound Division. The DAT tapes were used by NDLP staff to produce digital computer files in WAVE and RealAudio formats. The WAVE files are sampled at 22.05 kHz using 16-bit words. The RealAudio files are formatted for users with modems capable of 14.4 kilobits per second or more.
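The WAVE parameters given above determine the data rate of the uncompressed audio, which helps explain why a separate RealAudio version was made for modem users. The sketch below works through the arithmetic. It assumes single-channel (mono) files, which the text does not state, and the function name is hypothetical.

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels=1):
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels

wave_rate = pcm_bytes_per_second(22050, 16)   # 44100 bytes per second (mono assumed)
wave_bits = wave_rate * 8                     # 352800 bits per second
modem_bits = 14400                            # 14.4 kilobits per second

# How many times faster than a 14.4 kbit/s modem the WAVE data arrives.
ratio = wave_bits / modem_bits                # 24.5
```

Under these assumptions, uncompressed WAVE audio needs about 24.5 times the capacity of a 14.4 kbit/s modem, so real-time listening over such a connection requires a heavily compressed streaming format.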

