Spotlight on AGUA Development: Metadata Validation

One of the major projects WEST undertook in 2021 to enhance AGUA was improving its metadata validation practices, both to support more rigorous data analysis by the AGUA Technical Team and to report findings back to WEST members in support of local cleanup projects. To accomplish these goals, the AGUA team adopted a Metadata Validator developed by the Center for Research Libraries (CRL). WEST uses the CRL Validator to analyze Archivers’ disclosure files as well as the unarchived holdings files that WEST members submit ahead of the biennial collections analysis. The Validator reviews the metadata in each contributor record and compares it against the OCLC WorldCat database to identify inconsistencies, missing data, and incorrect data. The Technical Team uses the resulting reports to identify critical errors that will hamper the collections analysis, as well as non-critical errors that may be of interest to members.

This analysis was performed using the unarchived holdings files submitted by WEST members ahead of the Cycles 12 & 13 collections analysis. WEST received files for 58 OCLC symbols with a total of 1,771,712 records. The Validator detected a total of 494,888 errors, 307,727 of which were ‘critical’ (that is, they affect the WEST analysis).
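
To put these totals in proportion, the short sketch below derives a few ratios from the figures quoted above. The variable names are illustrative only, and the derived values are simple arithmetic on the reported totals, not additional output from the Validator. Because a single record may carry more than one error, the errors-per-100-records figure is an average rate and not the share of records affected.

```python
# Totals quoted in the paragraph above (Cycles 12 & 13 unarchived holdings files).
oclc_symbols = 58
total_records = 1_771_712
total_errors = 494_888
critical_errors = 307_727

# Derived values: plain arithmetic on the reported totals.
non_critical_errors = total_errors - critical_errors
critical_share = critical_errors / total_errors
# Average error rate; one record can have several errors, so this is
# not the percentage of records that contain an error.
errors_per_100_records = 100 * total_errors / total_records
mean_records_per_symbol = total_records / oclc_symbols

print(f"Non-critical errors:     {non_critical_errors:,}")
print(f"Critical share:          {critical_share:.1%}")
print(f"Errors per 100 records:  {errors_per_100_records:.1f}")
print(f"Mean records per symbol: {mean_records_per_symbol:,.0f}")
```

On these figures, roughly three in five detected errors are critical, and the files average a little under 28 errors per 100 records.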

See the full description of each problem with a brief analysis of the dataset: Analysis of CRL Validator Reporting – Bibliographic Records.docx