The Basics of findability and interoperability of digital collections

The Basics define "findability" as 'a condition in which digital objects and their metadata are available, uniquely identified and reusable by humans and machines.' By applying the Basics of findability to your digital collections, you ensure that these collections (in which substantial investments have usually already been made) are accessible to a wider and more diverse group of users.

Implementing the guidelines of the Basics ensures that digital objects can be found by search engines, that persistent references can be made to them, and that they can be used on platforms other than your own website (think of initiatives like Europeana and other regional, thematic or national portals). Finally, the Basics are a first step towards publishing your collections as linked (open) data.

Four key concepts

The Basics of findability are divided into four key concepts:
  1. Identification of data: the data have a unique and, preferably, sustainable identifier.
  2. Accessibility of data: the data are accessible through the Internet.
  3. Search engine readability of data: the data are presented in such a way that search engines can index them.
  4. Reuse of data: the data can (easily) be harvested and technically reused by third parties (for instance through linked data) and linked to other data sets.
The concept of 'data' refers both to the digital objects and the metadata.

1. Identification of data

The first step is to create a unique identifier for the digital object, for the original analogue object (if available) and for its metadata. This process is divided into three phases:
  1. In the first phase, you assign a unique identifier to every digital object. This means that every digital object (within the institution) gets a unique number or code.
  2. In the second phase, you create a URI (Uniform Resource Identifier) for each digital object. This is a unique reference to a digital object. On the Internet, URIs are often represented as a URL (Uniform Resource Locator: a location reference on the server), a URN (Uniform Resource Name: a name reference, not tied to a physical location on a server) or a combination of both. A URI is the starting point for publishing your data as linked (open) data.
  3. In the third phase, you make sure your URIs are persistent. The URI exists as an independent reference and remains valid even if the actual location of the digital source changes. The file location and the URI are connected through a table; looking up the current location for a URI is called 'resolving' or 'redirecting'. A number of methods and standards are currently available for persistent identification.
The Basics do not prescribe a specific standard for persistent identification, but advise using one of the available standards.
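
The three phases can be illustrated with a minimal sketch in Python. It is not one of the persistent identifier standards referred to above, only an illustration of the table that connects a URI to a file location. The base URI, the identifiers and the file locations are all invented for this example.

```python
from typing import Optional

# Hypothetical base URI for persistent identifiers (an assumption for this sketch).
PERSISTENT_BASE = "https://example.org/id/"

# Resolver table: local identifier (phase 1) -> current file location.
# All identifiers and locations are invented for illustration.
resolver_table = {
    "obj-000123": "https://media.example.org/2013/scans/000123.tif",
    "obj-000124": "https://media.example.org/2013/scans/000124.tif",
}


def persistent_uri(local_id: str) -> str:
    """Phase 2: build the URI that is published for a digital object."""
    return PERSISTENT_BASE + local_id


def resolve(uri: str) -> Optional[str]:
    """Phase 3: return the current location for a URI, or None if unknown."""
    if not uri.startswith(PERSISTENT_BASE):
        return None
    local_id = uri[len(PERSISTENT_BASE):]
    return resolver_table.get(local_id)


uri = persistent_uri("obj-000123")
print(resolve(uri))

# When the file moves, only the table entry changes; the published URI stays the same.
resolver_table["obj-000123"] = "https://cdn.example.org/images/000123.tif"
print(resolve(uri))
```

In a real setup the lookup would typically sit behind a web server that answers a request for the URI with an HTTP redirect to the current location.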

2. Accessibility of data

To make the data accessible on the Internet, it must be possible to serve them over the HTTP (or secure HTTPS) protocol. You can also use the FTP protocol, especially for downloads of larger datasets.

3. Search engine readability of data

A few extra steps are needed to make the digital collections (so not only the website, but also its underlying databases) readable for search engines. As a minimum requirement, there should be a landing page for every digital object, so that search engines can identify all the available digital objects. This can be either a static page or a dynamic page. The Basics advise using one of the options below, or a combination of both:
  1. Using Sitemaps: with the Sitemaps protocol you publish a list of URLs with metadata (such as the date of last modification) that search engines use to index the website and the underlying database.
  2. Using hypertext links: URLs that can be reached through hyperlinks can be indexed by a search engine (a process known as spidering or crawling). You can, for instance, divide the collection into sections, sub-headings and so forth to enable browsing.
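
As an illustration of the first option, the sketch below generates a small Sitemaps file with Python's standard library. The landing page URLs and dates are invented, and the function name build_sitemap is only a label chosen for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a Sitemaps document from (landing_page_url, last_modified) pairs."""
    # Serialize with the Sitemaps namespace as the default namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical landing pages, one per digital object.
pages = [
    ("https://example.org/collection/obj-000123", "2013-05-01"),
    ("https://example.org/collection/obj-000124", "2013-05-02"),
]
print(build_sitemap(pages))
```

The resulting file is usually placed on the web server and announced to search engines, for example through a Sitemap line in robots.txt.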

4. Reuse of data

The Basics contain two requirements to enable technical reuse of data:
  1. Publish your data in a structured and open format to make them interoperable. An obvious choice would be (meaningful) XML, preferably validated against an XSD or DTD file.
  2. Use the Dublin Core Metadata Element Set for the interoperability of metadata, and use at least the limited set of 15 fields (simple Dublin Core), which provides a common ground for all cultural heritage institutions.
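
To make these two requirements concrete, the following sketch writes one record as XML using the 15 simple Dublin Core elements. It wraps the fields in the oai_dc container that is commonly used for harvesting; that choice, the function name and the sample values are assumptions of this example, not something the Basics prescribe.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# The 15 elements of simple Dublin Core.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_record(fields):
    """Serialize (element, value) pairs as a simple Dublin Core XML record.

    Pairs are used instead of a dict because Dublin Core elements may repeat.
    """
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, value in fields:
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a simple Dublin Core element: {name}")
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample record.
print(dublin_core_record([
    ("title", "View of a harbour"),
    ("creator", "Unknown"),
    ("identifier", "https://example.org/id/obj-000123"),
]))
```

The sketch only checks the element names; validating the XML against an XSD, as recommended above, would need a separate tool.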
On top of these requirements, the Basics also recommend using:
  1. OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting), which enables automatic retrieval of metadata from your information systems and allows automatic harvesting from various sources. OAI-PMH (together with Dublin Core) is an important minimum requirement for making your data available to aggregators or portals such as Europeana.
  2. Alternative, specialized metadata exchange standards, such as EAD (for archival collections), LIDO (for museum collections), and MODS or MARC XML (for library collections), for the exchange of a rich data set.
  3. JSON as an alternative to XML. It is based on JavaScript syntax and enables a more lightweight exchange than XML. JSON is used in many open data projects.
  4. RESTful APIs as another way of sharing data. These are web services that follow the REST architectural style over HTTP; through the API, the dataset (or parts of it) can be reused in a different context.
  5. RDF (Resource Description Framework) to publish your data as RDF triples. This is one of the key ingredients of linked data.
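
A few of these recommendations can be sketched with Python's standard library: building an OAI-PMH ListRecords request, writing a record as JSON, and writing the same record as RDF triples in N-Triples notation. The endpoint address, the record and the helper names are hypothetical; using Dublin Core element URIs as RDF predicates is one possible choice made for this sketch.

```python
import json
from urllib.parse import urlencode

DC_NS = "http://purl.org/dc/elements/1.1/"

# Invented sample record; the identifier doubles as the RDF subject.
record = {
    "identifier": "https://example.org/id/obj-000123",
    "title": "View of a harbour",
    "creator": "Unknown",
}


def oai_pmh_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the request URL a harvester would send for ListRecords."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def to_json(rec):
    """Serialize the record as JSON."""
    return json.dumps(rec, ensure_ascii=False, indent=2)


def to_ntriples(rec):
    """Serialize the record as RDF triples, one per line (N-Triples)."""
    subject = rec["identifier"]
    lines = []
    for field, value in rec.items():
        if field == "identifier":
            continue
        # Escape characters that may not appear raw in an N-Triples literal.
        literal = (value.replace("\\", "\\\\").replace('"', '\\"')
                        .replace("\n", "\\n").replace("\r", "\\r"))
        lines.append(f'<{subject}> <{DC_NS}{field}> "{literal}" .')
    return "\n".join(lines)


# Hypothetical OAI-PMH endpoint.
print(oai_pmh_list_records_url("https://example.org/oai"))
print(to_json(record))
print(to_ntriples(record))
```

For production use, a dedicated RDF library and an existing OAI-PMH server implementation would normally be preferable to hand-written serialization.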

Licenses for reuse of data

The reuse of data also has a legal component, but the Basics of findability focus only on the technical side of reuse. A question like "to what extent do I want to open up my collections for reuse?" will be addressed in the soon-to-be-published Basics for copyright.


The following guidelines are the minimum standards:  

Liability and contribution to the Basics

This text is a revised version of the Basics. The first version was written in 2007 and reviewed in 2013 during a meeting with Dutch experts working in the field of digital heritage. Professionals are also invited to comment on this text and share their experience with the Basics through or by emailing us:  

