Data sources

A Data source is a service where published material (metadata and files) is stored, preserved, and made discoverable and accessible. A Data source is described by the EOSC Profile for data sources.

Example:

Episciences is an overlay platform supporting the management of open-access journals on top of the Open Access repository HAL. In this context, episciences.org is a publishing Venue (journal, open access, open peer review), while HAL is a Data source. Articles published via episciences.org will therefore be linked to the respective journal (publishing Venue) and to the Data source HAL. However, HAL is also a publishing Venue for researchers who upload their Research products directly: more specifically, a publishing Venue with peer-review and some support for metadata curation. In this case, a Research product will be linked to HAL both as a publishing Venue and as a Data source.

Note

Each Research product must be associated with its publishing Venue and its Data source.

This section describes the metadata fields for a Data source.

local_identifier

String (mandatory): Unique code identifying a Data source in the SKG (if any, otherwise “stateless identifier”).

"local_identifier": "123"

identifiers

List (optional): A list of objects representing external identifiers for the entity. Each object is structured as follows.

  • scheme String (mandatory): The scheme for the external identifier (e.g., a DOI).

  • value String (mandatory): The external identifier.

"identifiers": [
    {
        "scheme": "doi",
        "value": "https://doi.org/..."
    }
]
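
The structure above can be checked mechanically. The following Python sketch is illustrative only: the function name validate_identifiers is hypothetical and not part of the specification, and it checks nothing beyond the two mandatory string fields described above.

```python
def validate_identifiers(identifiers):
    """Return a list of error messages for an `identifiers` value.

    Hypothetical helper: checks only the constraints stated above
    (a list of objects, each with mandatory string `scheme` and `value`).
    """
    if not isinstance(identifiers, list):
        return ["identifiers must be a list"]
    errors = []
    for index, entry in enumerate(identifiers):
        if not isinstance(entry, dict):
            errors.append(f"entry {index} must be an object")
            continue
        for key in ("scheme", "value"):
            # A missing key yields None, which fails the string check.
            if not isinstance(entry.get(key), str) or not entry[key]:
                errors.append(f"entry {index}: '{key}' must be a non-empty string")
    return errors


example = [{"scheme": "doi", "value": "https://doi.org/..."}]
print(validate_identifiers(example))  # prints []
```

An entry that lacks either field produces one message per missing field, so a caller can report all problems at once instead of stopping at the first.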

name

String (mandatory): Name of the Data source.

"name": "Zenodo"

submission_policy_url

String (recommended): URL of the Data source's submission policy. This policy provides a comprehensive framework for the contribution of research products. It can state the criteria for submitting content to the repository as well as product preparation guidelines, and it may describe concepts for quality assurance.

"submission_policy_url": "https://..."

preservation_policy_url

String (recommended): URL of the Data source's preservation policy. This policy provides a comprehensive framework for the long-term preservation of the research products. Principles, aims, and responsibilities must be clarified. An important aspect is the description of the preservation concepts that ensure the technical and conceptual utility of the content.

"preservation_policy_url": "https://..."

version_control

Boolean (optional): Whether data versioning is supported, i.e. whether the Data source explicitly allows the deposition of different versions of the same object.

"version_control": true
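
JSON booleans are written in lowercase (true and false). When a record is produced from Python, the standard json module performs that conversion. The sketch below is only an illustration, and the record variable is hypothetical.

```python
import json

# Hypothetical one-field record; Python's True becomes JSON's true.
record = {"version_control": True}
serialized = json.dumps(record)
print(serialized)  # prints {"version_control": true}

# Parsing the JSON text gives back the original Python value.
assert json.loads(serialized) == record
```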

persistent_identity_systems

List (recommended): The persistent identifier systems that the Data source uses to identify the product types it supports. Each element is structured as follows.

  • product_type String (mandatory): The product type to which the persistent identifier refers. Follows the EOSC vocabulary Research Product Type.

  • pid_schemes List (mandatory): The list of persistent identifier schemes used to refer to products of this type. Each element must be drawn from the EOSC vocabulary Persistent Identity Scheme.

"persistent_identity_systems": [
    {
        "product_type": "Research Literature",
        "pid_schemes": ["DOI", "Handle"]
    }
]
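
As an illustration of how a consumer might read this field, the Python sketch below looks up the schemes declared for one product type. The helper name pid_schemes_for is hypothetical, and the sample values are the ones from the example above.

```python
def pid_schemes_for(systems, product_type):
    """Collect the PID schemes declared for one product type.

    Hypothetical helper; `systems` is the value of
    `persistent_identity_systems` as shown above.
    """
    schemes = []
    for entry in systems:
        if entry["product_type"] == product_type:
            # Preserve declaration order and skip duplicates.
            for scheme in entry["pid_schemes"]:
                if scheme not in schemes:
                    schemes.append(scheme)
    return schemes


systems = [
    {"product_type": "Research Literature", "pid_schemes": ["DOI", "Handle"]}
]
print(pid_schemes_for(systems, "Research Literature"))  # prints ['DOI', 'Handle']
```

A product type that the Data source does not declare simply yields an empty list.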

jurisdiction

String (mandatory): The property defines the jurisdiction of the users of the Data source, based on the vocabulary Jurisdiction.

"jurisdiction": "National"

data_source_classification

String (mandatory): The specific type of the Data source based on the vocabulary Data Source Classification.

"data_source_classification": "Journal Archive"

research_product_type

List (mandatory): The types of OpenAIRE entities managed by the Data source, based on the vocabulary Research Product Type.

"research_product_type": []

thematic

Boolean (mandatory): Whether the Data source is dedicated to a given discipline or is instead discipline-agnostic.

"thematic": false

research_product_license

List (recommended): Licenses under which the research products contained within the Data source can be made available. Repositories can allow a license to be defined for each research product, while for scientific databases the database is typically provided under a single license. Each element in the list is structured as follows:

  • name String (mandatory): The research product license name.

  • url String (mandatory): The research product license URL.

"research_product_license": [
    {
        "name": "...",
        "url": "https://..."
    }
]

research_product_access_policy

List (recommended): List of terms drawn from the vocabulary COAR Access Rights 1.0.

"research_product_access_policy": ["open access"]
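
A minimal sketch of a vocabulary check for this field follows. It assumes that the labels of COAR Access Rights 1.0 are the four listed in the code; the set should be verified against the published vocabulary before being relied upon, and the helper name unknown_access_terms is hypothetical.

```python
# Assumption: the four concept labels of COAR Access Rights 1.0.
COAR_ACCESS_RIGHTS = {
    "open access",
    "embargoed access",
    "restricted access",
    "metadata only access",
}


def unknown_access_terms(policy):
    """Return the terms in `policy` that are not in the assumed vocabulary."""
    return [term for term in policy if term not in COAR_ACCESS_RIGHTS]


print(unknown_access_terms(["open access"]))            # prints []
print(unknown_access_terms(["open access", "public"]))  # prints ['public']
```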

research_product_metadata_license

List (recommended): Licenses governing access to and re-use of the metadata that describes the items in the repository. Each element has the following properties:

  • name String (mandatory): The metadata license name.

  • url String (mandatory): The metadata license URL.

"research_product_metadata_license": [
    {
        "name": "...",
        "url": "https://..."
    }
]

research_product_metadata_access_policy

List (recommended): List of terms drawn from the vocabulary COAR Access Rights 1.0.

"research_product_metadata_access_policy": ["open access"]
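
To show how the fields in this section fit together, the Python sketch below assembles one record and checks that the fields marked mandatory are present. It is an illustration under stated assumptions: the helper missing_mandatory_fields is hypothetical, and the sample values are copied from the individual examples in this section, so the record is not an accurate description of any real Data source.

```python
# Fields marked "mandatory" in this section.
MANDATORY_FIELDS = (
    "local_identifier",
    "name",
    "jurisdiction",
    "data_source_classification",
    "research_product_type",
    "thematic",
)


def missing_mandatory_fields(data_source):
    """Return the mandatory field names absent from a Data source record."""
    return [field for field in MANDATORY_FIELDS if field not in data_source]


# Sample values reused from the per-field examples; purely illustrative.
data_source = {
    "local_identifier": "123",
    "name": "Zenodo",
    "jurisdiction": "National",
    "data_source_classification": "Journal Archive",
    "research_product_type": ["Research Literature"],
    "thematic": False,
    "research_product_access_policy": ["open access"],
}

print(missing_mandatory_fields(data_source))  # prints []
print(missing_mandatory_fields({"name": "Zenodo"}))
```

The check only tests for presence. Type and vocabulary constraints (for example, that jurisdiction comes from the Jurisdiction vocabulary) would need separate validation.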