Birds, Bees, and EZIDs: Where Do CDL’s Persistent Identifiers Come from?
Leading up to UC-DLFx, CDL staff had been asked to explain how our services work with EZID and how the identifiers used in each system are related. At the conference we gave a brief presentation illustrating how Merritt, eScholarship, and Dash each work with EZID and which types of identifiers each service uses. Below is a synopsis of that presentation:
Merritt, CDL Preservation Repository:
Merritt uses ARKs (from EZID) as the primary identifier for its digital objects and collections. A depositor can provide a DOI (and in fact any other form of ID) as a secondary “local” ID which can also be used to look up or update an object.
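The primary/secondary identifier relationship described above can be sketched as a small lookup structure. This is only an illustrative model, not Merritt's actual implementation: the class names `MerrittObject` and `ObjectIndex`, the `find` method, and the sample identifiers (which use the test-only ARK and DOI prefixes) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class MerrittObject:
    """Hypothetical stand-in for a preserved object."""
    ark: str                                           # primary identifier (an ARK from EZID)
    local_ids: Set[str] = field(default_factory=set)   # optional secondary IDs, e.g. a DOI


class ObjectIndex:
    """Illustrative index: resolve an object by its ARK or by any local ID."""

    def __init__(self) -> None:
        self._by_ark: Dict[str, MerrittObject] = {}
        self._ark_by_local_id: Dict[str, str] = {}

    def add(self, obj: MerrittObject) -> None:
        self._by_ark[obj.ark] = obj
        for local_id in obj.local_ids:
            # Each secondary ID points back at the primary ARK.
            self._ark_by_local_id[local_id] = obj.ark

    def find(self, identifier: str) -> Optional[MerrittObject]:
        # Try the primary ARK first, then fall back to secondary IDs.
        if identifier in self._by_ark:
            return self._by_ark[identifier]
        ark = self._ark_by_local_id.get(identifier)
        return self._by_ark.get(ark) if ark is not None else None


# Made-up identifiers on test prefixes, for illustration only.
index = ObjectIndex()
dataset = MerrittObject(
    ark="ark:/99999/fk4example",
    local_ids={"doi:10.5072/FK2EXAMPLE"},
)
index.add(dataset)
```

In this sketch, looking up either `ark:/99999/fk4example` or `doi:10.5072/FK2EXAMPLE` returns the same object, which mirrors how a depositor-supplied DOI can stand in for the ARK when finding or updating an object.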
Once upon a time (like, up through about November 2016), Merritt would mint DOIs through EZID for certain collections: DataUP, Dash, SSCZO (Southern Sierra Critical Zone Observatory). The DOIs would resolve to the Merritt landing page, and users could update the DOI metadata by depositing updated DataCite XML files in Dash or Merritt.
Merritt will still accept DOIs as optional secondary identifiers, but getting the DOIs minted, assigning their landing pages, and managing their metadata is the responsibility of the depositor or the depositing system—for instance, Dash now mints DOIs before depositing datasets in Merritt, and provides their landing pages. The DataUP and SSCZO deposits have been folded into Dash.
eScholarship, UC’s Open Access and Repository Platform:
eScholarship assigns an ARK to all primary content, using that to create the permalink for each item. We are scoping out a feature to assign identifiers to supplemental content and are strongly considering using DataCite DOIs for that, given the greater specificity in the metadata profile for non-textual material. We are also in the initial scoping and planning phase of assigning DOIs to all content that doesn’t have one.
In addition to the above, eScholarship journals can choose to have DOIs assigned to their content. eScholarship generates these automatically along with the Crossref metadata, then passes both to EZID, where the identifier and metadata are registered first in EZID and then with Crossref.
Finally, all eScholarship content is submitted to Merritt for preservation, at which point Merritt assigns its own ARK to each item. ETDs are the only exception to this content flow, since they are initially deposited to Merritt by the campuses or via ProQuest and so are assigned a Merritt ARK first. eScholarship harvests ETDs from Merritt into eScholarship, assigning an eScholarship ARK as part of that process.
Dash, Data Publication Platform:
Dash is a data publishing platform that uses Merritt as a preservation repository and storage broker. When you publish (or version) data in Dash, EZID acts as a DOI broker to retrieve a DataCite DOI. This DOI is displayed in the citation on the Dash landing page. For non-UC campuses, Dash can work directly with DataCite for DOI minting.
Key Takeaways
- Update metadata in the service where your object resides. For example, if you submit an article to eScholarship, use eScholarship to update the article metadata, even though the article has an EZID identifier.
- Each CDL service is intended for a different type of research output, and each provides a stable identifier: a Crossref DOI, a DataCite DOI, or an EZID ARK. Please continue to encourage researchers to link up these works by citing related work identifiers.