UC’s Internet Archive-digitized Books Now Loading into the HathiTrust

By Heather Christenson, CDL HathiTrust Project Manager

Through a collaborative effort between the California Digital Library and the University of Michigan, the University of California volumes digitized by Internet Archive are now flowing into the HathiTrust Digital Library. This fulfills a long-held goal: to unite the mass-digitized collections from across the UC Libraries in a common location for digital preservation and access.

The Internet Archive-digitized volumes will be accessible and viewable within HathiTrust via the same mechanisms as the Google-digitized volumes: catalog search, full-text search, and a pageturner display. They will also be discoverable in the Next Generation Melvyl Pilot in the same manner as the Google-digitized items. The focus of the Internet Archive digitization has been public domain volumes, so these items will be presented in full view within the HathiTrust.

Very appropriately, the first Internet Archive-digitized volume to go into the HathiTrust is entitled “The Dawn of All”:

Other examples include:

HathiTrust is currently loading collections digitized by Internet Archive from SRLF and UCLA. The initial set of approximately 97,000 books will be loaded over the coming weeks. UC Davis, NRLF, and UC Berkeley Bancroft Library collections will follow.

Once again, many thanks go to the staff and students at the UC libraries and regional library facilities (NRLF and SRLF) who made their collections available for scanning. Special thanks also go to the core project team, who persevered in the pioneering and technically challenging development of the “Internet Archive ingest” process: Lynne Cameron, Heather Christenson, Stephanie Collett, Paul Fogel, and Andy Mardesich at CDL, and Shane Beers, Jessica Feeman, Chris Powell, Tim Prettyman, Jon Rothman, Cory Snavely, John Weise, and Jeremy York at the University of Michigan.

The recent HathiTrust March Update contains more technical details:

For an overview of the collections digitized via UC’s mass digitization work with the Internet Archive, please see the related CDLINFO post:

More information on the UC Libraries’ mass digitization projects can be found on the CDL web site