The California Digital Library exists to support the University of California community’s pursuit of scholarship and extend the University’s public service mission. The Merritt curation repository and its companion research data publication portal, Dash, are core CDL services available for use by all members of the UC community for managing, preserving, publishing, and sharing the University’s valuable digital content. While Dash gives the appearance of being a repository in its own right, it is actually a thin overlay integrated with Merritt that provides alternative self-service interfaces and scholar-focused views of selected research data collections. All data managed in Dash is automatically preserved in Merritt.
Merritt and Dash comply with the general CDL terms of service as well as the following policy terms.
Merritt and Dash administrators may be contacted at firstname.lastname@example.org, which automatically opens a new issue in CDL’s internal ticketing system. To report an urgent problem, call the CDL Help Line at (510) 987-0555.
Merritt and Dash are available on a nominal 24x7x52 basis. The current status of Merritt and Dash availability can be found on the CDL system status page.
Whenever possible, major service outages for purposes of preventative maintenance and periodic enhancement are scheduled outside of normal business hours, Monday – Friday, 8:00 AM – 5:00 PM PT, and announced two weeks before the scheduled outage. In some cases unanticipated conditions may require immediate intervention without prior announcement in order to prevent damage or loss to managed content. However, Merritt’s architecture has been carefully designed for robust fault-tolerance to minimize this necessity. Most diagnostic and maintenance activities can take place without any service interruption.
Merritt and Dash comply with CDL’s and UC’s accessibility policy, which promotes an accessible IT environment at the University of California to help ensure that as broad a population as possible may access, benefit from, and contribute to the University’s electronic programs and services.
By contributing to Merritt or Dash, content owners and curators are acknowledging that they have followed all applicable laws, regulations, policies, ethical concerns, and disciplinary best practices regarding the creation and acquisition of that content, including obligations regarding intellectual property rights, privacy, IRB review, and accepted norms of scholarly discourse, and that they assign to CDL the non-exclusive, perpetual, revocable right to save, copy, enhance, federate, create derivatives for purposes of long-term preservation, and provide access to contributed content, subject to curatorially-designated access controls. Contributors exhibiting inappropriate behavior will be subject to loss of user privileges.
Merritt and Dash are not appropriate repositories for managing content including clinical or personally identifiable information (PII) whose disclosure would constitute a violation of HIPAA/HITECH, FERPA, or other similar statutory, regulatory, or ethical regimes. Content containing PII must be redacted or anonymized prior to submission to Merritt or Dash.
Merritt and Dash are operated on a partial cost-recovery basis, as described in the Pricing section below. Content that is not paid for on a timely basis will be considered abandoned and may be subject to being de-accessioned.
Contributors may request a bulk export of their content, for which CDL may impose a one-time fee to cover the reasonable costs of the export.
By using Merritt and Dash to search for, find, or retrieve managed content, users are acknowledging that they will follow all applicable laws, regulations, policies, ethical concerns, and disciplinary best practices regarding the use of that content, including obligations regarding intellectual property rights, privacy, and accepted norms of scholarly discourse. The latter includes an obligation to provide complete citation to Merritt or Dash in any redistribution of the content or publications and presentations incorporating an analysis of or substantively based upon the content. Users exhibiting inappropriate behavior will be subject to loss of user privileges.
The CDL accepts, manages, and provides access to digital content in order to support the University’s research, teaching, learning, and public service mission. The CDL will not exploit managed content in profit-generating activity without express permission of its legal owners.
The CDL makes reasonable efforts to provide managed content with the highest level of preservation assurance that is consistent with the form, structure, and packaging of the content, the degree to which it is accompanied by authoritative and comprehensive metadata, the availability of appropriate tools, and other organizational priorities. Note that this implies a continuum of preservation outcomes dependent upon the nature of the content. At a minimum, however, CDL is committed to providing bit-level preservation of all content. CDL offers consultation and guidance on ways to acquire or create digital content in a manner that is most amenable to the highest level of future preservation service.
Merritt maintains a complete change history of managed content as it may evolve over time. The repository relies upon a primary preservation strategy of replication of content to geographically-dispersed sites and technological heterogeneity. Merritt incorporates a process of continual verification of cryptographic message digests of all content replicas to detect and correct any bit-level damage. The design, implementation, and operation of Merritt are consistent with the community-accepted standard ISO 14721 Open Archive Information System (OAIS) reference model.
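The continual verification described above can be illustrated with a minimal sketch. It assumes SHA-256 as the digest algorithm and models storage sites as an in-memory dictionary; neither detail comes from these terms, and the function names are hypothetical rather than part of Merritt.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a replica (the algorithm is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(replicas: dict[str, bytes], recorded_digest: str) -> list[str]:
    """Check every replica against the recorded digest and repair damaged ones.

    Returns the names of the replicas found damaged. Raises RuntimeError
    if no intact replica remains to repair from.
    """
    damaged = [name for name, data in replicas.items()
               if digest(data) != recorded_digest]
    if not damaged:
        return []
    intact = [name for name in replicas if name not in damaged]
    if not intact:
        raise RuntimeError("no intact replica available for repair")
    good_copy = replicas[intact[0]]
    for name in damaged:
        replicas[name] = good_copy  # overwrite the damaged replica with a good copy
    return damaged


# Hypothetical example: one of two replicas has suffered bit-level damage.
original = b"example dataset bytes"
sites = {"site_a": original, "site_b": b"example dataset bytez"}
print(audit_and_repair(sites, digest(original)))
```

Repair is only possible while at least one replica still matches the recorded digest, which is one reason replication and fixity checking are used together.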
Any changes to the Merritt fee structure will be provided to content owners at least 60 days prior to the effective date of the change.
In the event that CDL is unable or unwilling to continue operation of Merritt, it will make reasonable efforts to find another curatorial organization, within or outside the UC system, willing to take on custodial responsibility for all managed content. If that is not possible, CDL will return all content to its contributors at no added expense.
Merritt and Dash will accept submissions in any genre, format, and package. CDL believes that the most significant impediment to the future use of managed content is not insufficiently-complete curation, but the lack of collection and management under an appropriate and proactive stewardship regime. Consequently, Merritt and Dash have been designed and are operated so as to maximize opportunities for self-service deposit of digital content. Once under secure management, this content is susceptible to ongoing review and enrichment by campus-based curators, collection managers, and RDM specialists to maintain and increase its curatorial value and provide a higher level of assurance of its ongoing availability and usability.
Dash data contributors are encouraged to follow the UK Data Service recommendations on formats and the DataONE recommendations regarding the form, structure, and description of data. CDL provides Guidelines for Digital Objects, which can be used as recommendations for material contributed to Merritt.
Persistent Identification and Citation
All objects managed in Merritt are assigned unique, persistent Archival Resource Key (ARK) identifiers using CDL’s EZID service. All Dash datasets also receive Digital Object Identifiers (DOIs) from DataCite. All Merritt and Dash object landing pages prominently display the object’s actionable persistent identifier(s) for use in citations. Landing pages in Dash feature pre-formatted citations conforming to the 2014 FORCE11 Joint Declaration on Data Citation Principles.
Merritt and Dash are strongly versioned. Any change to data or metadata automatically results in the creation of a new version of the data object. Versioning relies on file-level backwards deltas to minimize duplicative file storage. Individual file-level components are never edited or replaced; new versions of files are added as components of the new dataset version. All previous versions can be retrieved through the Merritt and Dash UI and API.
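The append-only behavior described above can be pictured with a simplified sketch. It uses content-addressed storage with one manifest per version, which is a stand-in technique chosen for brevity and not Merritt's actual backwards-delta mechanism; the class and method names are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class VersionedObject:
    """Hypothetical append-only object: file bytes stored once, versions are manifests."""
    store: dict[str, bytes] = field(default_factory=dict)         # digest -> file bytes
    versions: list[dict[str, str]] = field(default_factory=list)  # filename -> digest

    def add_version(self, files: dict[str, bytes]) -> int:
        """Record a new version and return its 1-based version number."""
        manifest = {}
        for name, data in files.items():
            key = hashlib.sha256(data).hexdigest()
            self.store.setdefault(key, data)  # unchanged files are not stored again
            manifest[name] = key
        self.versions.append(manifest)        # earlier manifests are never modified
        return len(self.versions)

    def get_file(self, version: int, name: str) -> bytes:
        """Retrieve a file exactly as it existed in the given version."""
        return self.store[self.versions[version - 1][name]]


# Hypothetical example: only the README changes between versions.
obj = VersionedObject()
obj.add_version({"data.csv": b"a,b\n1,2\n", "README.txt": b"v1"})
obj.add_version({"data.csv": b"a,b\n1,2\n", "README.txt": b"v2"})
print(len(obj.store), obj.get_file(1, "README.txt"))
```

In this sketch the unchanged data file is stored once and shared by both versions, while the first version of the README remains retrievable after the second is added.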
Federation and Internet Search
Content submitted to curatorially-designated collections may be federated with external systems and services to enhance long-term preservation and accessibility. The Merritt collections underlying ONEShare, a public data repository for the earth and environmental sciences open for public submission by UC- or non-UC-affiliated researchers, are federated with the DataONE network, an NSF-funded data grid promoting preservation of scientific data. Descriptive metadata associated with datasets that have been assigned DOIs by Merritt or Dash is registered with DataCite, where it is indexed for online search. The Dash research portal implements affirmative search engine optimization (SEO) techniques to ensure that managed content is indexed by well-known search engines, such as Bing, Google, and Yahoo, for enhanced opportunities for internet search and discovery. Merritt is also open to indexing by these search engines.
Submission Agreements and Licensing Terms
Data submitted to Dash are associated with standard Creative Commons CC-BY licenses or CC0 public domain dedications covering terms of access and acceptable use. Material contributed to Merritt is covered by the terms of campus-level agreements granting CDL a non-exclusive, perpetual, revocable license to save, copy, enrich, federate, create derivatives, and, if so curatorially-designated, distribute for non-commercial use. Access to content contributed without explicit associated terms is determined by the curatorially-assigned access control rules for the collection of which the content is a member, which permit designation for either authenticated access and use only by a restricted set of individuals, or unconstrained public access and use.
Restricted Access During Peer Review
Dash datasets underlying articles being peer-reviewed may be designated for access restrictions for a period of up to six months. During that time, no public data downloads are allowed, although certain minimal descriptive information (contributor(s), title, and date) will be presented on the dataset landing page.
The procedures for responding to DMCA-compliant take-down requests are defined as part of the CDL’s general terms of service.
Merritt and Dash operate on a partial cost-recovery basis. There is no service fee for their use, but CDL recoups its costs for provisioning preservation storage, which is typically billed at the campus level. The current nominal pricing is $650/TB/year, but this is pro-rated to reflect actual daily storage usage. Usage accounting is based on the sum total of byte-days of usage over the year, assessed at $0.00000000000178 per byte-day ($650/TB/year ÷ 1,000,000,000,000 bytes/TB ÷ 365 days/year). Because of byte-day accounting, contributors pay only for the days their content is actually stored, so the timing of a deposit within the billing year carries no penalty. 1 TB deposited on the first day of a billing year and saved for the entire year will accrue a cost of approximately $650 (1 TB * 365 days * 1,000,000,000,000 bytes/TB * $0.00000000000178/byte-day). That same 1 TB deposited on the last day of the billing year will cost only $1.78 (1 TB * 1 day * 1,000,000,000,000 bytes/TB * $0.00000000000178/byte-day).
The billing year is aligned with the University of California fiscal year, July through June. Storage usage for the previous year is billed early in the subsequent year, and payment is due within 60 days of billing.
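The byte-day arithmetic above can be reproduced with a short sketch. The rate and the bytes-per-TB figure are taken from the pricing terms; the function name and the rounding to whole cents are assumptions made for illustration, not CDL's billing implementation.

```python
from decimal import Decimal

# Published rate: $650/TB/year expressed per byte-day, as rounded in the terms.
RATE_PER_BYTE_DAY = Decimal("0.00000000000178")
BYTES_PER_TB = 10**12


def storage_cost(byte_days: int) -> Decimal:
    """Return the dollar cost, rounded to cents, for a total number of byte-days."""
    return (Decimal(byte_days) * RATE_PER_BYTE_DAY).quantize(Decimal("0.01"))


# 1 TB held for the full 365-day billing year.
print(storage_cost(BYTES_PER_TB * 365))  # 649.70
# The same 1 TB deposited on the last day of the billing year.
print(storage_cost(BYTES_PER_TB * 1))    # 1.78
```

With the rounded per-byte-day rate, a full year of 1 TB comes to $649.70, which corresponds to the nominal $650 figure. For content that grows over the year, the byte-day total is the sum of the bytes stored on each day.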
Third-party Service Providers
Merritt and Dash use CDL’s EZID service for assigning, managing, and resolving object ARK identifiers and DataCite, an international non-profit membership organization, of which CDL is a founding member, for assigning, managing, and resolving DOIs.
Merritt relies on external storage providers for primary and replication storage in its preservation system. One storage provider within UC is the San Diego Supercomputer Center (SDSC)’s cloud storage. The service level agreements defining the terms of the contractual arrangements are available at:
SDSC’s cloud storage is routinely subject to Nessus scans, a professional auditing service that probes for vulnerabilities and malware.
Merritt also relies on a non-UC commercial service provider, Amazon Web Services (AWS), for preservation storage using S3 and Glacier, database hosting using RDS, and virtual server hosting using EC2. The service level agreements defining the terms of the contractual relationship between CDL and Amazon are available at:
- Amazon AWS customer agreement
- Amazon AWS S3/Glacier service level agreement
- Amazon AWS EC2 service level agreement
AWS complies with a number of regulatory and professional IT standards and certification programs, including CSA, FERPA, FISMA, HIPAA, ISO 9001, 27001, 27017, SOC 1, 2, 3, and others: AWS Compliance.
In addition, Merritt relies on the University of New Mexico (UNM) for preservation storage for ONEShare collections, providing open data sharing. The memorandum of understanding defining the terms of the relationship between CDL and UNM is available at:
Dash relies on ORCID, an international non-profit membership organization, of which CDL is a member, for managing unique, persistent research identifiers, and Crossref’s Funder Registry, for funding agency identifiers. The membership agreement defining the terms of the relationship between CDL and ORCID is available at:
The Funder Registry data are available as CC0 Public Domain via an open public API requiring no prior contractual relationship.
All Merritt and Dash content is already subject to Merritt-managed replication to two independent storage locations. In addition, curatorially-selected Merritt content, including all Dash-managed datasets, is contributed to the Digital Preservation Network (DPN), of which CDL is a founding member, where it is subject to additional replication to three geographically-dispersed and technologically heterogeneous repository nodes on the DPN network, selected from amongst Academic Preservation Trust (APT), DuraCloud, HathiTrust (HT), Stanford Digital Repository (SDR), and Texas Preservation Node (TPN). The DPN business and sustainability plan is based upon an endowment pricing model intended to fund preservation management fully over a 20-year term. At the end of that term, CDL will reassess the data in terms of its ongoing value and impact to determine its suitability for DPN renewal. In the event that CDL decides not to renew, DPN submission agreements provide for the legal transfer of stewardship responsibilities to the DPN central organization or any of its individual members. The membership agreement defining the terms of the relationship between CDL and DPN is available at:
CDL makes no representations or warranties with respect to Merritt or Dash, and disclaims any liability arising out of their use. Neither the CDL nor Merritt or Dash users shall be liable for any indirect, special, incidental, punitive or consequential damages arising out of that use. Liability for direct damages is limited to the dollar amount of the fee paid for the service. By making use of Merritt or Dash, users are indemnifying, defending, and holding harmless CDL, its officers, employees, and agents from and against any liability and damages, including any reasonable attorney’s fees, that arise from that use. No limitation of liability set forth elsewhere in these terms applies to this indemnification; further, this indemnification shall survive the termination of these terms.