Further Acknowledgements
Section 4 and Section 6 of this document were created with the help of the Live OWL Documentation Environment (LODE) service (available on Github).
Copyright © 2016 . This document is available under the W3C Document License. See the W3C Intellectual Rights Notice and Legal Disclaimers for additional information.
The DataID core ontology defines concepts and properties to describe simple and complex datasets in an interoperable way. DataID expands the Data Catalog Vocabulary [DCAT] with capabilities to describe dataset hierarchies, fine-grained technical details of datasets, dataset permissions, dataset distributions and machine-readable licensing information. The descriptions of provenance information are based on the PROV Ontology [PROV-O] and include relations between datasets and agents, such as persons or organisations, with regard to their rights and responsibilities. In addition, DataID core provides the means to describe inter-dataset relationships and versioning of datasets and metadata. DataID core offers a well-structured, comprehensive format to approach a unified view of dataset metadata and constitutes the core of a metadata system built around this ontology.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications can be found in the W3C technical reports index at http://www.w3.org/TR/.
By publishing this document, W3C acknowledges that the Submitting Members have made a formal Submission request to W3C for discussion. Publication of this document by W3C indicates no endorsement of its content by W3C, nor that W3C has, is, or will be allocating any resources to the issues addressed by it. This document is not the product of a chartered W3C group, but is published as potential input to the W3C Process. A W3C Team Comment has been published in conjunction with this Member Submission. Publication of acknowledged Member Submissions at the W3C site is one of the benefits of W3C Membership. Please consult the requirements associated with Member Submissions of section 3.3 of the W3C Patent Policy. Please consult the complete list of acknowledged W3C Member Submissions.
This document describes the set of classes, properties, and restrictions that constitute the DataID core ontology, encoded in the OWL 2 Web Ontology Language [OWL2]. This ontology specification supplies the essential concepts for structured dataset metadata for any domain, type of data or dataset structure. Particular emphasis was placed on maintaining the Extensibility of general vocabularies like DCAT (which is imported) and preserving Interoperability with other existing dataset metadata formats (ontologies, XSLT schemata etc.).
The DataID ecosystem consists of multiple ontologies arranged around the DataID core ontology, which is introduced in this document.
DataID core is a midweight ontology that can be adopted in a wide range of scenarios. DataID core reflects the essential concepts and properties needed to describe a dataset and all its manifestations comprehensively, with a focus on metadata for Provenance, Licensing and Access of data. To achieve this, well-established ontologies (namely DCAT and PROV-O) are imported, and innovative vocabularies with sufficient maturity (such as VOID and LEXVO) are reused throughout this ontology. The second focal point of this vocabulary is to qualify relations between Agents and (dataset) Entities by introducing a flexible management of roles, actions and responsibilities. A collection of individuals for roles and actions was added as a common example of how to model a file-based dataset environment. All of these instances can easily be replaced to fit the use case at hand.
DataID core was designed with compliance with the OWL 2 RL profile [OWL2 Profiles] in mind. This is achieved with the exception of the 5 axioms of the PROV Ontology pointed out in the PROV-O specification. With one exception (see dataid:Superset), the DataID core ontology does not enforce cardinality restrictions. This decision was made mainly to conform with OWL 2 RL and to ease the extension of DataID core. To define a suitable profile for a domain or use case, the authors suggest using SHACL [SHACL].
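As an illustration of such a profile, the following minimal SHACL sketch requires every Dataset to carry at least one license statement. It is not part of DataID core: the `ex:` namespace and the shape name are hypothetical, and the constraint itself is only an assumed example of what a use case might demand.

```turtle
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix dct:    <http://purl.org/dc/terms/> .
@prefix dataid: <http://dataid.dbpedia.org/ns/core#> .
@prefix ex:     <http://example.org/profile#> .   # hypothetical namespace

# Hypothetical profile rule: each dataid:Dataset needs at least one dct:license.
ex:LicensedDatasetShape
    a sh:NodeShape ;
    sh:targetClass dataid:Dataset ;
    sh:property [
        sh:path dct:license ;
        sh:minCount 1
    ] .
```

Keeping cardinality rules in a separate shapes graph leaves the ontology itself untouched, so different use cases can apply different profiles to the same DataID documents.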
This document is based on the following publications:
The DataID ontology was originally created to satisfy the need for detailed dataset metadata in the context of DBpedia in general, and of its periodically released datasets in particular. The general lack of specificity in most available dataset metadata led to the extension, modularisation and refinement of the DataID ontology to fit any usage scenario and any type of data (introduced here in version 2.0.0).
This document is aimed at both dataset publishers (people involved in maintaining, administering and hosting datasets), and data users (people involved in finding, querying, crawling and indexing datasets).
Readers of this document should be familiar with the core concepts of RDF [RDF-PRIMER], RDF Schema [RDFS] and OWL 2 Web Ontology Language [OWL2]. Knowledge of the Turtle syntax [TURTLE] for RDF is required to read the examples. Some knowledge of widely-used vocabularies (Dublin Core [DCTERMS], Friend of a Friend [FOAF], Data Catalog Vocabulary [DCAT], PROV Ontology [PROV], Vocabulary of Interlinked Datasets [VOID]) is also assumed.
The table below indicates the full list of namespaces and prefixes used in this document and for all examples.
Prefix | Namespace |
---|---|
dataid | http://dataid.dbpedia.org/ns/core# |
dataid-ld | http://dataid.dbpedia.org/ns/ld# |
datacite | http://purl.org/spar/datacite/ |
dcat | http://www.w3.org/ns/dcat# |
dct | http://purl.org/dc/terms/ |
foaf | http://xmlns.com/foaf/0.1/ |
geonames | http://www.geonames.org/ontology# |
lit | http://www.essepuntato.it/2010/06/literalreification/ |
lvont | http://lexvo.org/ontology# |
odrl | http://www.w3.org/ns/odrl/2/ |
owl | http://www.w3.org/2002/07/owl# |
prov | http://www.w3.org/ns/prov# |
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs | http://www.w3.org/2000/01/rdf-schema# |
sd | http://www.w3.org/ns/sparql-service-description# |
skos | http://www.w3.org/2004/02/skos/core# |
spdx | http://spdx.org/rdf/terms/# |
time | http://www.w3.org/2006/time# |
vann | http://purl.org/vocab/vann/ |
void | http://rdfs.org/ns/void# |
xsd | http://www.w3.org/2001/XMLSchema# |
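For readers who want to load the examples into an RDF tool, the table above can be written as Turtle prefix declarations. The block below is a direct transcription of the table and adds no namespaces of its own.

```turtle
@prefix dataid:    <http://dataid.dbpedia.org/ns/core#> .
@prefix dataid-ld: <http://dataid.dbpedia.org/ns/ld#> .
@prefix datacite:  <http://purl.org/spar/datacite/> .
@prefix dcat:      <http://www.w3.org/ns/dcat#> .
@prefix dct:       <http://purl.org/dc/terms/> .
@prefix foaf:      <http://xmlns.com/foaf/0.1/> .
@prefix geonames:  <http://www.geonames.org/ontology#> .
@prefix lit:       <http://www.essepuntato.it/2010/06/literalreification/> .
@prefix lvont:     <http://lexvo.org/ontology#> .
@prefix odrl:      <http://www.w3.org/ns/odrl/2/> .
@prefix owl:       <http://www.w3.org/2002/07/owl#> .
@prefix prov:      <http://www.w3.org/ns/prov#> .
@prefix rdf:       <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:      <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sd:        <http://www.w3.org/ns/sparql-service-description#> .
@prefix skos:      <http://www.w3.org/2004/02/skos/core#> .
@prefix spdx:      <http://spdx.org/rdf/terms/#> .
@prefix time:      <http://www.w3.org/2006/time#> .
@prefix vann:      <http://purl.org/vocab/vann/> .
@prefix void:      <http://rdfs.org/ns/void#> .
@prefix xsd:       <http://www.w3.org/2001/XMLSchema#> .
```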
The following keywords are used throughout the text without any further explanation.
In general, keywords such as MediaType (and their plural form) refer to instances of the concept with the same name, or to named individuals in the DataID core ontology. This is only true for concepts of this ontology. When referring to a Distribution, an instance of dataid:Distribution is addressed and not, for example, the concept with the same name in the DCAT vocabulary [DCAT]. There is one exception: Entity refers to an instance of the concept prov:Entity. It is generally used in the context of this document to summarise all instances of concepts in the DataID core ontology which are subclasses of prov:Entity: dataid:DataId, dataid:Dataset and dataid:Distribution.
When referring to a concept or property itself (and not an instance), its name together with the appropriate prefix (e.g. dataid:authorizedFor) is used instead.
The keyword Sub-Dataset is used to point out a Dataset which is a sub-dataset of a different Dataset, while a Superset is a representation of multiple Sub-Datasets.
The five important aspects of dataset metadata (see Section 2) are also adopted as keywords: Provenance, Licensing, Access, Extensibility, Interoperability.
There are multiple interpretations of the word/acronym DataID depending on the context. It can refer to a DataID metadata document, which results from applying the described ontology to one or more datasets and consists of a collection of RDF statements based on this ontology. Alternatively, it is used to name an instance of the concept dataid:DataId, meaning the entry in a dcat:Catalog, the most abstract entity in every DataID. The authors will be explicit and use the terms DataID document and DataId resource (or instance, entity) in the remainder of this submission.
The keywords MAY, MUST, SHALL, MUST NOT, RECOMMENDED, and SHOULD are to be interpreted as described in RFC2119.
Since its introduction, the DCAT vocabulary [DCAT] has been widely adopted as a foundation for dataset metadata in research, government and industry [needs at least one link to back this up]. The very general approach adopted by the authors of DCAT allows for portraying any given (digital) object with this ontology. Extending DCAT is very easy and mappings to other metadata formats are not difficult to achieve.
Conversely, the general approach of DCAT, in combination with the Dublin Core vocabulary, is often too imprecise where specificity is needed.
A list of the more pressing issues resulting from this imprecision:
Similar conclusions were reached at the W3C/VRE4EIC workshop Smart Descriptions & Smarter Vocabularies (SDSVoc) in 2016 [reference to the proceedings (not yet published)]. As a result of this lack of specificity, current representations of datasets with DCAT often do not contribute to the main benefits of publishing data on the web [DWBP]: Reuse, Comprehension, Linkability, Discoverability, Trust, Access, Interoperability and Processability. This, in turn, amplifies broader problems with published datasets, especially in the open data community. These problems are reflected in the Open Data Strategy proposed by the European Commission in 2011, which defines the following six barriers for “open public data”:
A basis for solving those problems is sufficiently detailed and structured metadata. The following short list of important aspects of dataset metadata focuses on those shortcomings of DCAT which have to be solved to break down the barriers to open public data.
Provenance: A crucial aspect of data, required to assess correctness and completeness of data conversion, as well as the basis for the trustworthiness of the data source (no trust without provenance).
Licensing: Machine-readable licensing information provides the possibility to automatically publish, distribute and consume only data that explicitly allows these actions.
Access: Publishing and maintaining these kinds of metadata together with the data itself serves as documentation benefiting the potential user of the data as well as the creator by making it discoverable and crawlable.
Extensibility: Extending a given core metadata model in an easy and reusable way, while leaving the original model uncompromised, expands its range of application to fit many different use cases.
Interoperability: Interoperability with other metadata models is a hallmark for a widely usable and reusable dataset metadata model.
Improving the portrayal of Provenance, Licensing and Access, while maintaining the easy Extensibility and Interoperability of DCAT, are the linchpin objectives in our effort to present a comprehensive, extensible and interoperable metadata vocabulary.
In accordance with the main goals of the ALIGNED project [ALIGNED], the DataID core ontology was created:
While this section does not directly describe the DataID core ontology, it serves as a backdrop, necessary to understand certain design decisions made in the core ontology.
The former DataID ontology (version 1.0.0) was modularised into a multilayer composition arranged around a single core ontology. This was necessary to preserve Extensibility and Interoperability, as the vocabulary was growing due to a plethora of requirements of different use cases. The scaling approach adopts principles of the modular programming technique, separating concepts and properties of a large ontology into independent, interchangeable modules, specialised to fit common use cases, dependent only on DataID core.
The DataID Ecosystem is a suite of ontologies comprised of DataID core and its extensions, which are summarised in the common (or mid) layer of the ecosystem (see sphere model below). In a wider sense, this ecosystem is extended by every ontology importing DataID core to satisfy a specific use case by adding additional knowledge (represented by the outermost layer of the model).
Alongside DataID core, multiple extensions were created to satisfy different use cases in a reusable manner. They are part of the mid-layer of the DataID Ecosystem. The onion-like layer model below illustrates the different types of ontologies in the DataID Ecosystem.
DataID core provides the basic description of a Dataset and serves as a foundation for all extensions in the mid-layer or use case-specific ontology extensions.
Linked Data extends DataID core with many concepts of the VOID vocabulary [VOID] and some additional properties specific to Linked Open Data (LOD) Datasets.
Activities & Plans provides provenance information about activities which generated, changed or used Datasets. The goal is to record all activities needed to replicate a Dataset as described by a DataID document. Plans can describe which steps (activities, precautionary measures) are put in place to reach a certain goal. This extension relies heavily on the PROV Ontology [PROV-O].
Statistics will provide the necessary measures to publish multi-dimensional data, such as statistics about Datasets or surveys as a Dataset. This recently created ontology will be based on the Data Cube Vocabulary [DATA-CUBE].
Other common extensions of similar general character as the ontologies of that layer, which could be useful in multiple use cases.
Multiple requirements are planned to be enforced for the adoption of new ontologies in the common (or mid) layer of the DataID Ecosystem.
They might contain (while not being restricted to):
Deciding on which combination of DataID ontologies to use for a Dataset description is a domain and problem dependent process.
It may be necessary to add additional properties on top of the provided metadata properties.
For example, a DataID based ontology for LOD Datasets dealing with multi-dimensional data, may look schematically like this:
(importing DataID core and the extensions for Linked Data and Statistics, as well as some additional properties only used in the use case at hand.)
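In Turtle, such a use-case ontology could be declared roughly as follows. This is only a sketch under assumptions: the ontology IRIs for DataID core and the Linked Data extension are guessed from their namespaces, the IRI for the Statistics extension is a placeholder, and the `ex:` namespace and the property `ex:surveyWave` are purely hypothetical.

```turtle
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix dataid: <http://dataid.dbpedia.org/ns/core#> .
@prefix ex:     <http://example.org/my-dataid#> .   # hypothetical namespace

# Hypothetical use-case ontology in the outermost layer of the ecosystem.
<http://example.org/my-dataid>
    a owl:Ontology ;
    owl:imports <http://dataid.dbpedia.org/ns/core> ,      # DataID core (IRI assumed from the namespace)
                <http://dataid.dbpedia.org/ns/ld> ,        # Linked Data extension (IRI assumed from the namespace)
                <http://example.org/dataid-statistics> .   # placeholder for the Statistics extension

# An additional property that is only needed in the use case at hand (hypothetical).
ex:surveyWave
    a owl:DatatypeProperty ;
    rdfs:domain dataid:Dataset ;
    rdfs:range xsd:string .
```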
An alphabetical index of DataID core terms, by class (concepts) and properties is given below. All the terms are hyperlinked to their detailed description for quick reference.
This section gives an overview of the DataID core ontology, sectioned by concepts. The depiction below shows most of the concepts and properties which make up DataID core. A complete description of all concepts, properties and individuals is available in Section 6.
DataID core is founded on two pillars: the DCAT [DCAT] and PROV [PROV-O] ontologies. The base structure of DCAT (Catalog, CatalogRecord, Dataset and Distribution) is clearly visible in the upper part of the diagram. The PROV ontology is used to qualify the relation an Agent might have with an Entity. The AgentRole an Agent has in regard to an Entity and what this AgentRole might entail is depicted in the lower part of the diagram.
Throughout the DataID core ontology, multiple OWL restrictions were put in place to restrict ranges of properties (such as dct:language or dct:temporal) to concepts of vocabularies which have either proven to be sound and expressive enough or have been widely adopted in the Linked Data community. This step is necessary to provide a unified model of structured metadata without ambiguity. For example, the ODRL ontology [ODRL] was selected to describe licenses and other policies (dct:license). To specify dct:language, the Lexvo ontology [LEXVO] and dataset have been chosen.
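A range restriction of this kind might be expressed as shown in the following sketch. It illustrates the general OWL pattern only and is not copied from the ontology: the actual axioms in DataID core may be attached to other classes or phrased differently, and the use of lvont:Language as the Lexvo language class is an assumption.

```turtle
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dct:    <http://purl.org/dc/terms/> .
@prefix odrl:   <http://www.w3.org/ns/odrl/2/> .
@prefix lvont:  <http://lexvo.org/ontology#> .
@prefix dataid: <http://dataid.dbpedia.org/ns/core#> .

# Illustrative only: local range restrictions on a class.
dataid:Dataset rdfs:subClassOf
    [ a owl:Restriction ;
      owl:onProperty dct:language ;
      owl:allValuesFrom lvont:Language ] ,   # assumed Lexvo class
    [ a owl:Restriction ;
      owl:onProperty dct:license ;
      owl:allValuesFrom odrl:Policy ] .
```

Using owl:allValuesFrom on a class, instead of changing the global rdfs:range of dct:language or dct:license, leaves the reused properties unchanged for consumers outside DataID.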
This section is partitioned into instances of classes introduced in the DataID core ontology, exemplifying their use.
For illustration purposes, a running example was woven into the descriptions of concepts and properties. This example is a reduced version of an original DataID document of the Arabic DBpedia (release: 2015-10) [DBPEDIA]. Under the main dataset, only two sub-datasets are shown (as opposed to over 50 in the real-world example from which this is drawn). Some instances referenced in the example were left out to reduce redundancy. The example omits the more common properties of DCTERMS [DCTERMS] and RDFS such as dct:title, dct:description, dct:modified, dct:issued and rdfs:label to make it easier to read. It was chosen to cover many aspects of DataID core and to provide an easy use case which could arise in a similar fashion outside the DBpedia domain. Remarks were added next to statements, explaining their use. The full example is available in Turtle serialisation [TURTLE] here: 2015-10_dataid_ar.ttl. The basic structure of its DataID document is outlined in figure 4; short names translate to the following URIs:
Short name | URI |
---|---|
DataId | <dataid_ar.ttl> |
MainDataset | <dataid_ar.ttl?set=maindataset> |
Dataset A | <dataid_ar.ttl?set=interlanguage_links> |
Dataset C | <dataid_ar.ttl?set=long_abstracts_en_uris> |
Distribution A1 | <dataid_ar.ttl?file=interlanguage_links_ar.ttl.bz2> |
Distribution A2 | <dataid_ar.ttl?file=interlanguage_links_ar.tql.bz2> |
Distribution C1 | <dataid_ar.ttl?file=long_abstracts_en_uris_ar.ttl.bz2> |
Distribution C2 | <dataid_ar.ttl?file=long_abstracts_en_uris_ar.tql.bz2> |
Distribution C3 | <dataid_ar.ttl?sparql=DBpediaSparqlEndpoint> |
The base URI used in our example is:
@base <http://downloads.dbpedia.org/2015-10/core-i18n/ar/2015-10_> .
The class dataid:DataId inherits from dcat:CatalogRecord, which does not represent a dataset, but metadata about a dataset's entry in a catalogue. Additionally, in the context of DataID core, it represents metadata about a DataID document (graph), such as version pointers, modification dates and relations to its context (such as Agents, Catalogs, Repositories). This DataId resource is the most abstract Entity in any DataID graph. The dataid:inCatalog property, as the inverse of dcat:record, references the catalogue a DataId resource is entered in. The property foaf:primaryTopic points out the Dataset a DataId resource represents. Since this is a functional property, publishers who want to represent multiple datasets under one DataId resource have to make use of the dataid:Superset class.
The DataId instance below exemplifies a typical DataId entry in a dcat:Catalog.
<dataid_ar.ttl>
    a dataid:DataId ;
    dataid:associatedAgent <http://wiki.dbpedia.org/dbpedia-association> ;
    # Points out the dcat:Catalog this DataID document is recorded in.
    dataid:inCatalog <http://downloads.dbpedia.org/2015-10/2015-10_dataid_catalog.ttl> ;
    # Makes use of the version pointers of DataID core. This is the latest version.
    dataid:latestVersion <http://downloads.dbpedia.org/2016-04/core-i18n/ar/2016-04_dataid_ar.ttl> ;
    dataid:previousVersion <http://downloads.dbpedia.org/2015-04/core-i18n/ar/2015-04_dataid_ar.ttl> ;
    # The inverse property of dataid:authorizationScope helps to identify the Agents responsible for an Entity.
    dataid:underAuthorization <dataid_ar.ttl?auth=creatorAuthorization> ;
    dct:hasVersion <dataid_ar.ttl?version=1.0.0> ;
    dct:issued "2016-08-02"^^xsd:date ;
    dct:modified "2016-10-13"^^xsd:date ;
    # The DataID core way of providing provenance on Agents is in place, but using established properties to point out Agents does not go amiss.
    dct:publisher <http://wiki.dbpedia.org/dbpedia-association> ;
    dct:title "DataID metadata for the Arabic DBpedia"@en ;
    # foaf:primaryTopic points out the main dataset described by a DataID document.
    foaf:primaryTopic <dataid_ar.ttl?set=maindataset> .
The dataset concepts of both DCAT [DCAT] and VOID [VOID] were merged into dataid:Dataset, providing useful properties about the content of a dataset from both ontologies. In particular, the property void:subset allows for the creation of dataset hierarchies, while dcat:distribution points out the Distributions of a Dataset. The dataid:Superset, as a subclass of dataid:Dataset, SHALL be used to represent multiple Sub-Dataset entities, portraying dataset collections or hierarchical dataset structures in general. As opposed to a conventional Dataset, a Superset is prohibited from possessing Distributions (referred to with dcat:distribution). It is strongly RECOMMENDED that each Dataset have at least one Distribution or at least one Sub-Dataset. In the running example, the main dataset (an instance of dataid:Superset) is used as a hierarchical root, representing all Sub-Datasets clustered around a common topic. In the case of DBpedia, all Datasets were arranged under a Superset representing a DBpedia language edition.
Multiple properties for textual statements on different (general) aspects of a Dataset were added: dataid:dataDescription, dataid:openness, dataid:growth, dataid:reuseAndIntegration, dataid:similarData, dataid:usefulness. All of these provide Publishers, Maintainers etc. with a way to convey general information about the topics represented by these properties. This information will be useful in many scenarios related to dissemination tasks, for example, those described by the Horizon 2020 data management plan guidelines [needs link].
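The two structural rules for Datasets and Supersets could be checked with SHACL shapes along the following lines. This is a minimal sketch and not part of DataID core: the `ex:` namespace and the shape names are hypothetical, and modelling the recommendation as a warning instead of a violation is a design assumption.

```turtle
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix dcat:   <http://www.w3.org/ns/dcat#> .
@prefix void:   <http://rdfs.org/ns/void#> .
@prefix dataid: <http://dataid.dbpedia.org/ns/core#> .
@prefix ex:     <http://example.org/profile#> .   # hypothetical namespace

# A Superset is prohibited from possessing Distributions.
ex:SupersetShape
    a sh:NodeShape ;
    sh:targetClass dataid:Superset ;
    sh:property [
        sh:path dcat:distribution ;
        sh:maxCount 0
    ] .

# RECOMMENDED: a Dataset has at least one Distribution or one Sub-Dataset.
ex:DatasetContentShape
    a sh:NodeShape ;
    sh:targetClass dataid:Dataset ;
    sh:severity sh:Warning ;
    sh:or (
        [ sh:path dcat:distribution ; sh:minCount 1 ]
        [ sh:path void:subset ; sh:minCount 1 ]
    ) .
```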
The Superset of this example is a container of Datasets and has no Distributions. It represents a slice of a DBpedia release, comprised of all Datasets of the Arabic language edition:
<dataid_ar.ttl?set=maindataset>
    a dataid:Superset ;
    dataid:associatedAgent <http://wiki.dbpedia.org/dbpedia-association> ;
    # General statements about this dataset.
    dataid:growth <dataid_ar.ttl?stmt=growth> ;
    dataid:openness <dataid_ar.ttl?stmt=openness> ;
    dataid:reuseAndIntegration <dataid_ar.ttl?stmt=reuseAndIntegration> ;
    dataid:similarData <dataid_ar.ttl?stmt=similarData> ;
    dataid:usefulness <dataid_ar.ttl?stmt=usefulness> ;
    dct:hasVersion <dataid_ar.ttl?version=1.0.0> ;
    dct:issued "2016-07-02"^^xsd:date ;
    # Languages in use are referenced by using instances of the lexvo.org dataset.
    dct:language <http://lexvo.org/id/iso639-3/ara> ;
    # License cc-by-sa3.0 is further described as an instance of odrl:Policy here: ODRL licenses
    dct:license <http://purl.oclc.org/NET/rdflicense/cc-by-sa3.0> ;
    dct:modified "2016-08-01"^^xsd:date ;
    dct:publisher <http://wiki.dbpedia.org/dbpedia-association> ;
    dct:rights <dataid_ar.ttl?rights=dbpedia-rights> ;
    dct:title "DBpedia root dataset for Arabic, version 2015-10"@en ;
    # void:subset points to sub-datasets.
    void:subset <dataid_ar.ttl?set=long_abstracts_en_uris> ,
        <dataid_ar.ttl?set=interlanguage_links> ;
    # The ontology/schema used for structuring the underlying data.
    void:vocabulary <http://downloads.dbpedia.org/2015-04/dbpedia_2015-10.owl> ;
    dcat:keyword "maindataset"@en , "DBpedia"@en ;
    # Reference to a web page of general character, related to this dataset.
    dcat:landingPage <http://dbpedia.org/> ;
    # Inverse of foaf:primaryTopic, pointing back to the dataid:DataId instance.
    foaf:isPrimaryTopicOf <dataid_ar.ttl> ;
    # Detailed documentation about the dataset at hand.
    foaf:page <http://wiki.dbpedia.org/Downloads2015-10> .
The second Dataset is a Sub-Dataset of the first and provides actual data (Distributions). This Linked Data dataset is fitted with properties such as void:triples, void:sparqlEndpoint and sd:defaultGraph, introduced (or especially useful) with the Linked Data extension (dataid-ld:) of the DataID mid layer.
<dataid_ar.ttl?set=long_abstracts_en_uris>
    a dataid:Dataset, dataid-ld:LinkedDataDataset ;
    dataid:associatedAgent <http://wiki.dbpedia.org/dbpedia-association> ;
    # The qualified dataset relation to the source dataset (see section Dataset Relationships).
    dataid:qualifiedDatasetRelation <dataid_ar.ttl?relation=source&target=pages_articles> ;
    dataid:relatedDataset <dataid_ar.ttl?set=pages_articles> ;
    dct:hasVersion <dataid_ar.ttl?version=1.0.0> ;
    # As a sub-dataset, this dataset is part of <dataid_ar.ttl?set=maindataset>.
    dct:isPartOf <dataid_ar.ttl?set=maindataset> ;
    dct:issued "2016-07-02"^^xsd:date ;
    dct:language <http://lexvo.org/id/iso639-3/ara> ;
    dct:license <http://purl.oclc.org/NET/rdflicense/cc-by-sa3.0> ;
    dct:modified "2016-08-02"^^xsd:date ;
    dct:publisher <http://wiki.dbpedia.org/dbpedia-association> ;
    dct:title "long abstracts en uris"@en ;
    void:rootResource <dataid_ar.ttl?set=maindataset> ;
    # Since this is a LOD dataset, properties specific to this context (like void:triples) are a useful contribution.
    void:triples 232801 ;
    # This dataset is part of the data from the official SPARQL endpoint of DBpedia.
    void:sparqlEndpoint <http://dbpedia.org/sparql> ;
    # dcat:distribution points out the actual manifestations of a dataset. In this case the dataset is available in two different RDF serialisations and via a SPARQL endpoint.
    dcat:distribution <dataid_ar.ttl?sparql=DBpediaSparqlEndpoint> ,
        <dataid_ar.ttl?file=long_abstracts_en_uris_ar.ttl.bz2> ,
        <dataid_ar.ttl?file=long_abstracts_en_uris_ar.tql.bz2> ;
    dcat:keyword "long_abstracts_en_uris"@en , "DBpedia"@en ;
    dcat:landingPage <http://dbpedia.org/> ;
    # When loading this LOD dataset into a triple store, this should be the graph under which to store it.
    sd:defaultGraph <http://ar.dbpedia.org> ;
    foaf:page <http://wiki.dbpedia.org/Downloads2015-10> .
The class dataid:Distribution is a subclass of dcat:Distribution and provides the technical description of the data itself. In addition, it serves as documentation of how to access the data described (via dcat:accessURL or dcat:downloadURL), and which conditions apply (e.g. dataid:accessProcedure, dataid:softwareRequirement). Every Distribution of a Dataset MUST contain the whole Dataset in the format and location described. It MAY contain additional data on top of that, for example when describing a ServiceEndpoint. Two Distributions of the same Dataset, therefore, must either contain the exact same data (for example in two different serialisations), or one Distribution must completely subsume the other. The Distribution concept, introduced by the DCAT vocabulary [DCAT], is crucial to be able to automatically retrieve and use the data described in a DataID document, simplifying, for example, data analysis. Additional subclasses, to further distinguish how the data is available on the web, were added:
Except for dataid:SingleFile, all of these subclasses may need additional semantics to describe them in a useful manner. This is not the task of DataID core; further extensions of this ontology will address these issues.
The range of dataid:checksum is spdx:Checksum [SPDX], providing spdx:checksumValue and spdx:algorithm properties. The dcat:mediaType property is restricted to the range of dataid:MediaType. Similar to dcat:byteSize, dataid:uncompressedByteSize provides an integer value for the number of bytes stored as content inside a compressed or archive file. A glimpse of the (uncompressed) data of a Distribution can be offered with the property dataid:preview. This is useful when data providers want to convey the format or an example of the data of a very large or compressed file.
The first example is an instance of dataid:SingleFile, describing a single RDF file (which contains the whole Dataset) in Turtle syntax [TURTLE], compressed with bzip2. It can be downloaded directly (without any intermediate steps), hence the property dcat:downloadURL is used to point out the resource on the web. Since it is a compressed file, the byte size in its compressed and uncompressed state is provided. An instance of spdx:Checksum was included, providing the checksum value for this Distribution.
<dataid_ar.ttl?file=long_abstracts_en_uris_ar.ttl.bz2>
    a dataid:SingleFile ;
    # Repeating the license terms in Distribution instances is good practice. There are cases where these differ from Dataset license terms.
    dct:license <http://purl.oclc.org/NET/rdflicense/cc-by-sa3.0> ;
    dct:publisher <http://wiki.dbpedia.org/dbpedia-association> ;
    dataid:associatedAgent <http://wiki.dbpedia.org/dbpedia-association> ;
    # The checksum value (see instance below).
    dataid:checksum <dataid_ar.ttl?file=long_abstracts_en_uris_ar.ttl.bz2&checksum=md5> ;
    # The inverse of dcat:distribution.
    dataid:isDistributionOf <dataid_ar.ttl?set=long_abstracts_en_uris> ;
    # dataid:preview points to a web resource with a short snippet of the actual data (in the Turtle serialisation) of this file.
    dataid:preview <http://downloads.dbpedia.org/preview.php?file=2015-10_sl_core-i18n_sl_ar_sl_long_abstracts_en_uris_ar.ttl.bz2> ;
    # Since .bz2 indicates that this is a bzip2 archive file, this property is used to specify the uncompressed size of the content in bytes.
    dataid:uncompressedByteSize 186573907 ;
    # dcat:byteSize specifies the size of the file described in bytes.
    dcat:byteSize 33428372 ;
    # dcat:downloadURL points out the URL under which this file is available.
    dcat:downloadURL <long_abstracts_en_uris_ar.ttl.bz2> ;
    # The media type of the content described with this instance of a distribution (see bottom of this example).
    dcat:mediaType dataid:MediaType_turtle_x-bzip2 .

# The Checksum concept from the SPDX vocabulary of the Linux Foundation is used to provide checksum values for different checksum algorithms.
<dataid_ar.ttl?file=long_abstracts_en_uris_ar.ttl.bz2&checksum=md5>
    a spdx:Checksum ;
    # Multiple algorithms are defined by the SPDX ontology (md5 is used in this example).
    spdx:algorithm spdx:checksumAlgorithm_md5 ;
    # The checksum hex value.
    spdx:checksumValue "2503179cd96452d33becd1e974d6a163"^^xsd:hexBinary .
The second example is an instance of dataid-ld:SparqlEndpoint, a subclass of dataid:ServiceEndpoint and sd:Service which was introduced with the DataID extension for Linked Data (dataid-ld). Additional properties from the SPARQL 1.1 Service Description language (sd:) are used to further describe the endpoint. Both examples are Distributions of Dataset <dataid_ar.ttl?set=long_abstracts_en_uris>.
<dataid_ar.ttl?sparql=DBpediaSparqlEndpoint>
    a dataid-ld:SparqlEndpoint ;
    # Please notice: a different agent than for all other entities displayed is associated with this entity.
    dataid:associatedAgent <http://support.openlinksw.com/> ;
    # Provides a statement about how to access the data provided with this endpoint (see Section SimpleStatement).
    dataid:accessProcedure <dataid_ar.ttl?stmt=sparqlaccproc> ;
    dct:hasVersion <dataid_ar.ttl?version=1.0> ;
    dct:license <http://purl.oclc.org/NET/rdflicense/cc-by-sa3.0> ;
    # As opposed to dataid:SingleFile, dataid:ServiceEndpoint provides a dcat:accessURL. The data is accessible via this URL (e.g. by applying a query).
    dcat:accessURL <http://dbpedia.org/sparql> ;
    # A different MediaType than in the last example is used to specify the result format of a SPARQL endpoint.
    dcat:mediaType <http://dataid.dbpedia.org/ns/mt#MediaType_sparql-results+xml> ;
    # The address of the SPARQL endpoint (same as dcat:accessURL).
    sd:endpoint <http://dbpedia.org/sparql> ;
    # Supported query language.
    sd:supportedLanguage sd:SPARQL11Query ;
    # Supported result formats of the endpoint as RDF serialisations (note: list is truncated).
    sd:resultFormat <http://www.w3.org/ns/formats/RDF_XML> ,
        <http://www.w3.org/ns/formats/Turtle> .
DCAT [DCAT] does not offer an intrinsic way of specifying the exact format of the content described by a Distribution. While the property dcat:mediaType does exist, its expected range dct:MediaTypeOrExtent is an empty concept, not extended by the DCAT vocabulary. Therefore, dataid:MediaType was introduced to better qualify this crucial piece of information. The following properties are of interest:
The following snippet exemplifies the use of these properties:
(note: the namespace http://dataid.dbpedia.org/ns/mt# for common MediaTypes is used on a preliminary basis)
<http://dataid.dbpedia.org/ns/mt#MediaType_turtle_x-bzip2>
    a dataid:MediaType ;
    # Points to the MediaType of the file inside the compressed file.
    dataid:innerMediaType <http://dataid.dbpedia.org/ns/mt#MediaType_turtle> ;
    dataid:typeName "Turtle Bzip2" ;
    # The extension of the outermost format.
    dataid:typeExtension ".bz2" ;
    # The IANA MIME type string of the outermost format.
    dataid:typeTemplate "application/x-bzip2" .

<http://dataid.dbpedia.org/ns/mt#MediaType_turtle>
    a dataid:MediaType ;
    dataid:typeName "Turtle" ;
    dataid:typeExtension ".ttl" ;
    # There are MediaTypes with multiple valid MIME types.
    dataid:typeTemplate "application/x-turtle" ;
    dataid:typeTemplate "text/turtle" .
An Agent is something or someone that bears some form of responsibility for an Entity or for activities which create, transform or manage Entities in some way. Agents are real or legal persons, groups of persons, programs, organisations etc. The class dataid:Agent subsumes both agent concepts of the PROV [PROV-O] and FOAF [FOAF] ontologies. The property dataid:hasAuthorization was added as the inverse property of dataid:authorizedAgent, pointing out the Authorizations which grant this Agent some kind of authority over Entities, which is explained in detail in the example on Authorizations.
The example of an Agent portrays an organisation:
<http://wiki.dbpedia.org/dbpedia-association>
    a dataid:Agent ;
    dataid:hasAuthorization <dataid_ar.ttl?auth=creatorAuthorization> ; # Pointing out an Authorization this Agent is responsible for.
    foaf:homepage <http://dbpedia.org> ;
    foaf:mbox "dbpedia@infai.org" ; # To identify and contact an Agent, it is useful to add more than the name to an instance of dataid:Agent.
    foaf:name "DBpedia Association" .
One objective of DataID core is the detailed expression of the relations between Agents and Entities. To qualify these relations (summarised under the property dataid:associatedAgent), AgentRoles (such as Maintainer, Publisher etc.) have to be assigned to the involved Agents. This is achieved by the class dataid:Authorization, which is a subclass of prov:Attribution, a qualification of the property prov:wasAttributedTo. It states which AgentRole(s) (pointed out with dataid:authorityAgentRole) an Agent (dataid:authorizedAgent) has regarding a certain collection of Entities (dataid:authorizedFor). This mediator is further qualified by an optional period of time for which it is valid, and by access restrictions set by the Entities themselves, which allow only specific Authorizations to exert influence over them (see dataid:needsSpecialAuthorization). The property dataid:underAuthorization is the inverse of dataid:authorizationScope and provides the Authorization context for an Entity. The property dataid:authorizedAction can be inferred from the AgentRole assigned and will be addressed in detail in the next section.
<http://wiki.dbpedia.org/dbpedia-association/persons/Freudenberg>
    a dataid:Agent ; # dataid:Agent is at its core a foaf:Agent with additional properties.
    dataid:hasAuthorization <dataid_ar.ttl?auth=maintainerAuthorization> ;
    foaf:mbox "freudenberg@informatik.uni-leipzig.de" ;
    foaf:name "Markus Freudenberg" ;
    dataid:identifier <http://www.researcherid.com/rid/L-2180-2016> . # Reference to a ResearcherID (see below).
<dataid_ar.ttl?auth=maintainerAuthorization>
    a dataid:Authorization ; # An Authorization instance pertaining to <http://wiki.dbpedia.org/dbpedia-association/persons/Freudenberg>.
    dataid:authorityAgentRole dataid:Maintainer ; # The role in the context of this Authorization for the pertaining Agent is a Maintainer.
    dataid:authorizedAgent <http://wiki.dbpedia.org/dbpedia-association/persons/Freudenberg> ;
    dataid:authorizedFor <dataid_ar.ttl> . # This property, together with its sub-properties, points out the Entities for which it is valid. In this case this Authorization is valid for all Entities (datasets, distributions etc.) of this DataID.
<http://wiki.dbpedia.org/dbpedia-association>
    a dataid:Agent ; # Definition of an organization with the Creator role assigned to it.
    dataid:hasAuthorization <dataid_ar.ttl?auth=creatorAuthorization> ;
    foaf:homepage <http://dbpedia.org> ;
    foaf:mbox "dbpedia@infai.org" ;
    foaf:name "DBpedia Association" .
<dataid_ar.ttl?auth=creatorAuthorization>
    a dataid:Authorization ;
    dataid:authorityAgentRole dataid:Creator ; # In the predefined AgentRole hierarchy of DataID core the Creator is the highest ranking AgentRole. This could be interpreted for deciding on which AgentRole has the power to override AuthorizedActions of other AgentRoles.
    dataid:authorizedAgent <http://wiki.dbpedia.org/dbpedia-association> ;
    dataid:authorizedFor <dataid_ar.ttl> .
The AgentRole assigned to an Agent in the context of a dataid:Authorization is defined only by the property dataid:allowsFor, pointing out the AuthorizedActions it entails. A dataid:AuthorizedAction SHALL either be a dataid:EntitledAction, representing an action an Agent could take, or a dataid:ResponsibleAction, representing an action an Agent has to take. AuthorizedActions and AgentRoles defined in this ontology are only examples of possible implementations, reflecting a common environment of a File or Document Management System. They can be replaced to fit the use case at hand. Implementing them as a skos:ConceptScheme offers additional semantics, for example in determining which AgentRole can override AuthorizedActions initiated by Agents with other AgentRoles for the same Entity.
This example is not part of our running example but has been borrowed from the DataID core ontology document itself. Here the AgentRole 'Contact' is defined in the context of a skos:ConceptScheme.
dataid:Contact
    a owl:NamedIndividual, dataid:AgentRole ;
    rdfs:isDefinedBy <http://dataid.dbpedia.org/ns/core#> ;
    dataid:allowsFor dataid:ReadDataId ; # Pointing out AuthorizedActions which are allowed for an Agent holding this AgentRole.
    dataid:allowsFor dataid:ModifyContent ;
    dataid:allowsFor dataid:ReadContent ;
    dataid:allowsFor dataid:ResponseToContact ;
    rdfs:comment "Contact agent. An agent that can be contacted for general requests about the resource."@en ;
    skos:prefLabel "contact"@en ;
    skos:inScheme dataid:AgentRoleScheme ; # The skos:ConceptScheme this individual belongs to.
    skos:broader dataid:Publisher, dataid:Maintainer . # Defining the hierarchical structure with skos:broader, skos:narrower.
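How the AuthorizedActions of an Authorization follow from its AgentRoles can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the two dictionaries are hand-written copies of the dataid:allowsFor statements of dataid:Contact and of the action kinds given in the descriptions of the individual actions, and the helper names authorized_actions and split_by_kind are hypothetical.

```python
# Actions allowed for the 'Contact' AgentRole, copied from its
# dataid:allowsFor statements.
ROLE_ALLOWS_FOR = {
    "dataid:Contact": {
        "dataid:ReadDataId",
        "dataid:ModifyContent",
        "dataid:ReadContent",
        "dataid:ResponseToContact",
    },
}

# Kind of each action, as given in the descriptions of these individuals.
ACTION_KIND = {
    "dataid:ReadDataId": "EntitledAction",
    "dataid:ModifyContent": "EntitledAction",
    "dataid:ReadContent": "EntitledAction",
    "dataid:ResponseToContact": "ResponsibleAction",
}


def authorized_actions(agent_roles, role_allows_for):
    """Collect every AuthorizedAction entailed by the given AgentRoles."""
    actions = set()
    for role in agent_roles:
        actions |= role_allows_for.get(role, set())
    return actions


def split_by_kind(actions, action_kind):
    """Separate entitlements (may do) from responsibilities (has to do)."""
    entitled = {a for a in actions if action_kind[a] == "EntitledAction"}
    responsible = {a for a in actions if action_kind[a] == "ResponsibleAction"}
    return entitled, responsible


# Hypothetical Authorization whose dataid:authorityAgentRole is dataid:Contact.
roles_of_authorization = ["dataid:Contact"]
actions = authorized_actions(roles_of_authorization, ROLE_ALLOWS_FOR)
entitled, responsible = split_by_kind(actions, ACTION_KIND)
```

With the Contact role, the sketch reports three entitlements and one responsibility (dataid:ResponseToContact). It deliberately ignores the skos:broader hierarchy, which an implementation could use to decide which AgentRole overrides another.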
A DatasetRelationship is a qualification of the generic property dataid:relatedDataset (which is a sub-property of dct:relation). While dataid:relatedDataset is a symmetric property, its qualification is directed. The dataid:DatasetRelationship is a subclass of prov:EntityInfluence and is defined by three properties:
As for all subclasses of prov:Role, for dataid:DatasetRelationRole some generic relation roles are already defined in the DataID core ontology (e.g. dataid:SourceRole or dataid:DerivatRole). They serve as common examples of how to use this concept.
In the example, the Wikipedia source dataset (named 'pages_articles') of a given DBpedia dataset is referred to with the help of this concept.
<dataid_ar.ttl?relation=source&target=pages_articles> a dataid:DatasetRelationship ; dataid:datasetRelationRole dataid:SourceRole ; dataid:qualifiedRelationOf <dataid_ar.ttl?set=long_abstracts_en_uris> ; dataid:qualifiedRelationTo <dataid_ar.ttl?set=pages_articles> .
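The interplay between the directed qualification and the symmetric property dataid:relatedDataset can be sketched as follows. The Python lines assume, in the manner of the PROV qualification pattern, that a DatasetRelationship implies the unqualified dataid:relatedDataset statements; the dictionary layout and the function name related_dataset_triples are hypothetical and only serve the illustration.

```python
# Hand-written copy of the DatasetRelationship from the example above.
relationship = {
    "datasetRelationRole": "dataid:SourceRole",
    "qualifiedRelationOf": "dataid_ar.ttl?set=long_abstracts_en_uris",
    "qualifiedRelationTo": ["dataid_ar.ttl?set=pages_articles"],
}


def related_dataset_triples(rel):
    """Derive the symmetric dataid:relatedDataset statements of a relationship.

    The qualification itself stays directed (from qualifiedRelationOf to
    qualifiedRelationTo); only the unqualified property holds in both directions.
    """
    origin = rel["qualifiedRelationOf"]
    triples = set()
    for target in rel["qualifiedRelationTo"]:
        triples.add((origin, "dataid:relatedDataset", target))
        triples.add((target, "dataid:relatedDataset", origin))
    return triples


derived = related_dataset_triples(relationship)
```

The result contains one statement per direction, while the role dataid:SourceRole remains attached to the directed relationship only.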
The class dataid:Identifier uniquely identifies any resource (incl. Entities and Agents), given an identifier as a literal (dataid:literal) and a corresponding datacite:IdentifierScheme (e.g. ORCID, ResearcherID etc.). Typically an organisation is responsible for issuing and managing Identifiers described with this concept, which can be referred to with dct:creator. DataID core adopted this approach from the DataCite ontology to provide a schematic way of adding existing identifiers to Entities and Agents referenced in a DataID document.
The following Identifier is an actual ResearcherID identifier for one of the involved Agents.
<http://www.researcherid.com/rid/L-2180-2016>
    a dataid:Identifier ;
    dataid:literal "L-2180-2016" ; # The actual identifier as literal.
    dct:issued "2016-08-01"^^xsd:date ;
    dct:references <http://www.researcherid.com/rid/L-2180-2016> ; # If available, the pertaining web resource for more information about the Agent or Entity.
    datacite:usesIdentifierScheme datacite:researcherid . # The scheme of the Identifier is defined in the Datacite ontology.
The concept dataid:SimpleStatement is intended as a tool for conveying a statement, definition or point of view about a certain topic. It does so either with a simple literal (dataid:literal) that provides a quotation, or by referencing a web resource that provides or represents the statement in any medium (picture, text, video etc.). This class implements the following Dublin Core classes:
With this measure, it is possible to use dataid:SimpleStatement as the range of dct:rights and its sub-properties, dct:provenance, dct:conformsTo and others. This reification approach with an intermediate resource was chosen to cover as many scenarios as possible, including many edge cases which do not have to be modelled explicitly. This approach also makes it easy to attach provenance information to a statement.
The two instances from the running example exemplify different usage scenarios of this concept.
<dataid_ar.ttl?rights=dbpedia-rights>
    a dataid:SimpleStatement ; # A simple textual rights statement.
    dataid:literal """DBpedia is derived from Wikipedia and is distributed under the same licensing terms as Wikipedia itself. As Wikipedia has moved to dual-licensing, we also dual-license DBpedia starting with release 3.4. Data comprising DBpedia release 3.4 and subsequent releases is licensed under the terms of the Creative Commons Attribution-ShareAlike 3.0 license and the GNU Free Documentation License. Data comprising DBpedia releases up to and including release 3.3 is licensed only under the terms of the GNU Free Documentation License."""@en .
<dataid_ar.ttl?stmt=sparqlaccproc>
    a dataid:SimpleStatement ; # This statement is used for a SPARQL endpoint to point with property dataid:accessProcedure to the official SPARQL 1.1 definition.
    dct:references <https://www.w3.org/TR/sparql11-overview/> ; # Instead of a statement, a reference to a web site containing the statement is possible as well.
    dataid:literal "An endpoint for sparql queries: provide valid queries."@en .
The dataid:Authorization is one of the more complex concepts of the DataID core ontology. In particular, the impact of dataid:authorizationScope and its sub-properties is difficult to understand at first sight.
The following axioms for the transitive property dataid:authorizationScope would be desirable to extend it along any property path combined of foaf:primaryTopic, void:subset and dcat:distribution.
A series of properties (dataid:authorizationChain1 to dataid:authorizationChain9) was created to simulate this behaviour with the help of the OWL property chain axiom (which is possible since every subsequent property path behind dataid:authorizationScope has at least two triples when not pointed out directly with dataid:authorizedFor).
For example: if a knowledge base (KB) holds a statement such as:
ex:someAuthorization dataid:authorizedFor ex:DataId .
and further statements:
ex:DataId foaf:primaryTopic ex:RootDataset .
ex:RootDataset void:subset ex:DatasetC .
we can infer the following statements:
ex:someAuthorization dataid:authorizationScope ex:DataId .
ex:someAuthorization dataid:authorizationScope ex:RootDataset .
ex:someAuthorization dataid:authorizationScope ex:DatasetC .
by inferring these statements first:
ex:someAuthorization dataid:authorizationChain7 ex:RootDataset .
ex:DataId dataid:authorizationChain2 ex:DatasetC .
and applying the transitive trait of dataid:authorizationScope.
With these auxiliary properties the inference of dataid:authorizationScope along the depicted property paths becomes feasible:
Here the Authorization for Agent Yellow is not only valid for the DataId entity referred to via dataid:authorizedFor. By inferring additional statements of this kind, the scope of this Authorization is extended to every Dataset and Distribution connected via foaf:primaryTopic, dcat:distribution and void:subset. In this way the influence (or scope) of an Authorization is extended over multiple Entities without having to point out all of them with dataid:authorizedFor, without changing the definitions of the external properties involved, and without including rule-based axioms.
The automatic extension of an Authorization also has drawbacks, which are addressed here:
When multiple Authorizations in the context of a DataID document provide the same AgentRole for an Entity, the author of the DataID document can encounter unintended behaviours.
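The effect of this scope extension can be reproduced outside of an OWL reasoner with a plain graph traversal. The following Python lines are a minimal sketch and not the mechanism of the ontology itself: they do not use the authorizationChain properties, but simply follow foaf:primaryTopic, void:subset and dcat:distribution starting from the Entities pointed out with dataid:authorizedFor. The triples are those of the ex: example above; the function name authorization_scope is hypothetical.

```python
from collections import deque

# Properties along which the scope of an Authorization is extended.
SCOPE_EXTENDING = {"foaf:primaryTopic", "void:subset", "dcat:distribution"}

# The statements of the knowledge base from the ex: example.
triples = [
    ("ex:someAuthorization", "dataid:authorizedFor", "ex:DataId"),
    ("ex:DataId", "foaf:primaryTopic", "ex:RootDataset"),
    ("ex:RootDataset", "void:subset", "ex:DatasetC"),
]


def authorization_scope(authorization, kb):
    """Return all Entities in the (inferred) scope of an Authorization."""
    # Entities named directly with dataid:authorizedFor are the starting points.
    start = [o for s, p, o in kb
             if s == authorization and p == "dataid:authorizedFor"]
    scope = set(start)
    queue = deque(start)
    # Breadth-first traversal along the scope-extending properties.
    while queue:
        entity = queue.popleft()
        for s, p, o in kb:
            if s == entity and p in SCOPE_EXTENDING and o not in scope:
                scope.add(o)
                queue.append(o)
    return scope


scope = authorization_scope("ex:someAuthorization", triples)
```

For the example knowledge base the scope consists of ex:DataId, ex:RootDataset and ex:DatasetC, which corresponds to the three dataid:authorizationScope statements a reasoner would infer.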
In this example the previous context is enriched by introducing an additional Agent Blue:
Dataset C (and all its Distributions) has two Maintainers, both equally permitted to perform the AuthorizedActions given by the definition of dataid:Maintainer. This behaviour may or may not be intended by the author. To provide the means for restricting Entities to specific Authorizations, the property dataid:needsSpecialAuthorization was introduced. This sub-property of dataid:underAuthorization (the inverse of dataid:authorizationScope) makes it possible to point out those Authorizations with sufficient importance to exert their authority over an Entity, to the exclusion of other Authorizations referenced via dataid:authorizationScope.
The following example again expands the already known scenario, by introducing a third Authorization for Agent Green:
While Dataset A and Distributions A1 and A2 are under the Authorizations of Agent Yellow and Agent Green, only Distribution A2 will be maintained by both Agents. Dataset A and Distribution A1 require specifically the Authorization of Agent Green for the purpose of providing the AgentRole of Maintainer.
This mechanism is useful when introducing different levels of privacy into the domain of, for example, a Document Management System (DMS). Two groups of users are specified: The first group (yellow group) should only be able to read the content of a given collection of documents, while the second group (blue group) is also allowed to modify these documents. Therefore, defining two new AgentRoles is advisable. AgentRole 'Reader' can only read the content of Entities available to it, while the 'Editor' allows also for modifying the content. These AgentRoles are linked to via dataid:authorityAgentRole from the respective Authorizations of the two groups (dataid:authorizedAgent points out the members of a group).
Both Authorizations are authorised for the same document collection (Dataset) and its Distributions, which are PDF and MS Word versions of the same content in the DMS. Since the MS Word version of the documents is used for editing the content, while its PDF counterpart is the publishing version, it is sensible to allow only the Editors (blue group) access to the MS Word Distribution by using dataid:needsSpecialAuthorization.
The existence of this statement:
ex:distributionMsWord dataid:needsSpecialAuthorization ex:authorizationBlue .
lets us directly infer:
ex:authorizationBlue dataid:authorizationScope ex:distributionMsWord .
since dataid:needsSpecialAuthorization is a sub-property of dataid:underAuthorization which, in turn, is the inverse property of dataid:authorizationScope.
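One possible reading of dataid:needsSpecialAuthorization in the DMS scenario is sketched below in Python. The interpretation is an assumption of this sketch: if an Entity names special Authorizations, only those are considered effective; otherwise every Authorization with the Entity in its scope is effective. The identifiers ex:authorizationYellow, ex:documents and ex:distributionPdf, as well as the function name effective_authorizations, are hypothetical.

```python
# Scope of each Authorization (as it could result from scope inference).
scope = {
    "ex:authorizationYellow": {
        "ex:documents", "ex:distributionPdf", "ex:distributionMsWord",
    },
    "ex:authorizationBlue": {
        "ex:documents", "ex:distributionPdf", "ex:distributionMsWord",
    },
}

# dataid:needsSpecialAuthorization statements, keyed by Entity.
needs_special = {
    "ex:distributionMsWord": {"ex:authorizationBlue"},
}


def effective_authorizations(entity, scope_by_auth, special_by_entity):
    """Return the Authorizations that may exert authority over an Entity."""
    required = special_by_entity.get(entity, set())
    if required:
        # Special Authorizations exclude all others for this Entity.
        return set(required)
    return {auth for auth, entities in scope_by_auth.items()
            if entity in entities}


for_word = effective_authorizations("ex:distributionMsWord", scope, needs_special)
for_pdf = effective_authorizations("ex:distributionPdf", scope, needs_special)
```

Under this reading only the blue Authorization (the Editors) is effective for the MS Word Distribution, while both groups remain authorised for the PDF Distribution.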
IRI: http://dataid.dbpedia.org/ns/core#Agent
IRI: http://dataid.dbpedia.org/ns/core#AgentRole
AgentRoles are defined by a set of rights and responsibilities (see AuthorizedAction) an Agent, assigned with this Role, has to address or can execute.
IRI: http://dataid.dbpedia.org/ns/core#Authorization
This concept is a mediator between a set of Entities (or scope) defined by the property dataid:authorizedFor, a set of AgentRoles defining rights and responsibilities (dataid:authorityAgentRole) and Agents (dataid:associatedAgent), fulfilling the AgentRoles, in regard to the collection of Entities. This subclass of prov:Attribution qualifies the relation prov:wasAttributedTo between prov:Entity and prov:Agent with the AgentRoles and AuthorizedActions of the DataID-domain.
IRI: http://dataid.dbpedia.org/ns/core#AuthorizedAction
AuthorizedActions may be comprised either of activities an Agent can execute in regard to a collection of Entities (dataid:EntitledAction), or of responsibilities an Agent should act upon (dataid:ResponsibleAction) (e.g. make certain decisions if necessary or act upon certain events).
IRI: http://dataid.dbpedia.org/ns/core#DataId
The dataid:DataId class is the most generic entity in a DataID graph about one or more datasets. As a subclass of dcat:CatalogRecord / void:DatasetDescription it describes not a dataset itself but provides metadata about its entry in a dcat:Catalog and/or its relations to repositories or other data collections.
IRI: http://dataid.dbpedia.org/ns/core#Dataset
A collection of data, available for access in one or more formats. Dataset resources describe the concept of the dataset, not its manifestation (the data itself), which can be acquired as a Distribution. Datasets are prov:Entities and can be generated by prov:Activities.
IRI: http://dataid.dbpedia.org/ns/core#DatasetRelationRole
Provides the role of a dataid:DatasetRelationship (e.g. Linkset, Source, Derivat, Similarity etc.).
IRI: http://dataid.dbpedia.org/ns/core#DatasetRelationship
Portrays a generic relation between two or more datasets.
IRI: http://dataid.dbpedia.org/ns/core#Directory
A dedicated file system directory holding (multiple) files of the same Dataset, which, when put together, make up the whole Dataset.
IRI: http://dataid.dbpedia.org/ns/core#Distribution
Distributions describe the technical details of a single manifestation of the pertaining Dataset (for example, its data format/serialisation or a ServiceEndpoint including the procedure required to access the data).
IRI: http://dataid.dbpedia.org/ns/core#EntitledAction
EntitledActions describe actions an Agent is allowed to perform when holding a certain AgentRole. These Actions may address restricted matters like access, modification rights and others.
IRI: http://dataid.dbpedia.org/ns/core#FileCollection
Multiple files constituting one (complete) Dataset (files in different directories or servers are allowed, as opposed to Directory).
IRI: http://dataid.dbpedia.org/ns/core#Identifier
Uniquely identifies any resource, given an identifier as literal (see dataid:literal) and a corresponding identifier scheme (e.g. an aid/pid scheme such as ORCID, ResearcherID etc. pointed out via datacite:usesIdentifierScheme). Optionally it can point out a reference document on the web (about this Identifier) using dct:references.
IRI: http://dataid.dbpedia.org/ns/core#MediaType
Extends the dct:MediaTypeOrExtent class of Dublin Core, providing the IANA media type description (or mime type) and common file extensions used. A pointer to an inner media type (dataid:innerMediaType) can describe (multiple) layers of compression and all internal media types.
IRI: http://dataid.dbpedia.org/ns/core#ResponsibleAction
AgentRoles provide rights as well as responsibilities an agent has to attend to in order to fulfil this Role. ResponsibleActions should describe actions pertaining to responsibilities an agent is supposed to do when holding a specific AgentRole.
IRI: http://dataid.dbpedia.org/ns/core#ServiceEndpoint
A specific Distribution, which is accessible via an access URL and provides data as a web service in a certain format.
IRI: http://dataid.dbpedia.org/ns/core#SimpleStatement
A SimpleStatement is intended as a tool for conveying a statement, definition or point of view about a certain topic. It does so either with a simple literal (dataid:literal) that provides a quotation, or by referencing a web resource that provides or represents the statement in any given medium (picture, text, video etc.). This class implements several classes of Dublin Core which are not further specified within DC.
IRI: http://dataid.dbpedia.org/ns/core#SingleFile
A single data dump file representing the whole Dataset, in a certain format/serialisation.
IRI: http://dataid.dbpedia.org/ns/core#Superset
This dedicated version of a dataid:Dataset has exactly one purpose: to point out all its Sub-Datasets with void:subset. A dataid:Superset has no data itself and is therefore prohibited from pointing out Distributions with dcat:distribution. It can be used in a dataset hierarchy (e.g. as a root dataset), or as a container for other datasets.
IRI: http://dataid.dbpedia.org/ns/core#accessProcedure
Describes the steps which have to be taken to gain access to the described data at the location of a Distribution (e.g. register an account to gain dct:accessRights).
IRI: http://dataid.dbpedia.org/ns/core#allowsFor
An AgentRole allows an Agent to execute certain AuthorizedActions.
IRI: http://dataid.dbpedia.org/ns/core#associatedAgent
An Agent which is generally connected to the Dataset. Their exact function in regard to the Entity has to be specified by the qualification of an Authorization.
IRI: http://dataid.dbpedia.org/ns/core#authorityAgentRole
Assigns an AgentRole for an Authorization to an Agent, thereby allowing for certain AuthorizedActions this Agent can execute on the Entities defined by the scope of the Authorization.
IRI: http://dataid.dbpedia.org/ns/core#authorizationChain1
IRI: http://dataid.dbpedia.org/ns/core#authorizationChain2
IRI: http://dataid.dbpedia.org/ns/core#authorizationChain3
IRI: http://dataid.dbpedia.org/ns/core#authorizationChain4
IRI: http://dataid.dbpedia.org/ns/core#authorizationChain5
IRI: http://dataid.dbpedia.org/ns/core#authorizationChain6
IRI: http://dataid.dbpedia.org/ns/core#authorizationChain7
IRI: http://dataid.dbpedia.org/ns/core#authorizationChain8
IRI: http://dataid.dbpedia.org/ns/core#authorizationChain9
IRI: http://dataid.dbpedia.org/ns/core#authorizationScope
This property defines the scope of an Authorization. An Agent has the right to execute AuthorizedActions for all Entities of this scope. Together with its sub-properties (dataid:authorizedFor, dataid:authorizationChainX), it defines how the scope of an Authorization is extended to other Entities. Since this property can be inferred by its sub-properties it shall not be instantiated in a DataID document.
IRI: http://dataid.dbpedia.org/ns/core#authorizedAction
Points out AuthorizedActions Agents are allowed to execute under a given Authorization (and its chosen AgentRoles).
IRI: http://dataid.dbpedia.org/ns/core#authorizedAgent
Points out an Agent which holds the rights granted by an Authorization (e.g. to modify the metadata of a Dataset).
IRI: http://dataid.dbpedia.org/ns/core#authorizedFor
Points out the Entities which are under the direct influence of an Authorization. An Agent has the right to execute AuthorizedActions (as defined by the pertaining AgentRole) for these Entities. This property shall be used to point out the initial Entities for which an Authorization is valid. Inference via authorizationChain properties may extend the scope of this Authorization further over multiple Entities along the hierarchical structure of a DataID.
IRI: http://dataid.dbpedia.org/ns/core#checksum
The checksum value allows the contents of a file to be authenticated, since it changes even with small changes to the file.
IRI: http://dataid.dbpedia.org/ns/core#dataDescription
Provides a detailed description of the data represented by this Dataset.
IRI: http://dataid.dbpedia.org/ns/core#datasetRelationRole
The role which qualifies a dataid:DatasetRelationship. It specifies which relationship datasets (pointed out with dataid:qualifiedRelationTo) have in regard to the dataset referred to with dataid:qualifiedRelationOf (e.g. source datasets when dataid:SourceRole is in place).
IRI: http://dataid.dbpedia.org/ns/core#growth
Indicates the approximate final volume of the Dataset.
IRI: http://dataid.dbpedia.org/ns/core#hasAuthorization
Provides an Agent with the Authorization for a given scope of Entities (e.g. Dataset, Distribution etc.), granting rights to execute certain AuthorizedActions.
IRI: http://dataid.dbpedia.org/ns/core#identifier
A unique identifier for an Agent or Entity (for other, non DataID related identifiers).
has characteristics: inverse functional
IRI: http://dataid.dbpedia.org/ns/core#identifierScheme
Points out which scheme of identifiers is used to uniquely identify resources inside a Dataset.
IRI: http://dataid.dbpedia.org/ns/core#inCatalog
The inverse property of dcat:record, pointing back to the dcat:Catalog in which this DataId is recorded.
IRI: http://dataid.dbpedia.org/ns/core#innerMediaType
Points out the MediaType of the (compressed) file inside another file (relevant for archive files).
has characteristics: functional
IRI: http://dataid.dbpedia.org/ns/core#isDistributionOf
Inverse property of dcat:distribution, linking a Distribution to a Dataset.
IRI: http://dataid.dbpedia.org/ns/core#latestVersion
Latest version of a DataId/Dataset/Distribution.
IRI: http://dataid.dbpedia.org/ns/core#needsSpecialAuthorization
Points out an Authorization which grants some degree of authority over this resource to the exclusion of other Authorizations which are not referred with this property.
IRI: http://dataid.dbpedia.org/ns/core#nextVersion
Next version of a DataId/Dataset/Distribution.
IRI: http://dataid.dbpedia.org/ns/core#openness
General description of how data will be shared. For example embargo periods (if any), outlines of technical mechanisms for dissemination or a definition of whether access will be widely open or restricted to specific groups. In case the Dataset cannot be shared, the reasons for this should be mentioned (e.g. ethical, rules of personal data, intellectual property, commercial, privacy-related, security-related).
IRI: http://dataid.dbpedia.org/ns/core#preview
Provides the URL of a short preview of the data provided by a Distribution, helpful when conveying type and format of the data provided as an example.
IRI: http://dataid.dbpedia.org/ns/core#previousVersion
Previous version of a DataId/Dataset/Distribution.
IRI: http://dataid.dbpedia.org/ns/core#qualifiedDatasetRelation
Qualifies the dataid:relatedDataset property with a DatasetRelationship.
has characteristics: inverse functional
IRI: http://dataid.dbpedia.org/ns/core#qualifiedRelationOf
Inverse property of dataid:qualifiedDatasetRelation, pointing out a dataset which has related datasets referred to via dataid:qualifiedRelationTo.
has characteristics: functional
IRI: http://dataid.dbpedia.org/ns/core#qualifiedRelationTo
Points out datasets which are related to the dataset referred to via dataid:qualifiedRelationOf.
IRI: http://dataid.dbpedia.org/ns/core#relatedDataset
Points to other Datasets containing related data. Can be qualified with dataid:DatasetRelationship. (note: while this property is symmetric, its qualification is not)
has characteristics: symmetric
IRI: http://dataid.dbpedia.org/ns/core#reuseAndIntegration
Information on the possibilities for integration and reuse of the Dataset.
IRI: http://dataid.dbpedia.org/ns/core#similarData
Information on the existence (or absence) of similar data (see also dataid:similarDataset).
IRI: http://dataid.dbpedia.org/ns/core#softwareRequirement
Software needed to access the data provided via this Distribution or otherwise relevant for consuming the data.
IRI: http://dataid.dbpedia.org/ns/core#typeReference
Refers to a standard document of the MediaType described.
IRI: http://dataid.dbpedia.org/ns/core#underAuthorization
Points out an Authorization which grants some degree of authority over this resource (inverse property of dataid:authorizationScope).
IRI: http://dataid.dbpedia.org/ns/core#usefulness
Description of to whom the Dataset could be useful, and whether it underpins a scientific publication.
IRI: http://dataid.dbpedia.org/ns/core#isInheritable
Indicates whether this Authorization will be valid after an update to the DataID or a scope element.
has characteristics: functional
IRI: http://dataid.dbpedia.org/ns/core#literal
An actual textual statement or literal (string).
IRI: http://dataid.dbpedia.org/ns/core#typeExtension
Lists file extensions commonly used with this MediaType.
IRI: http://dataid.dbpedia.org/ns/core#typeName
Names the MediaType described.
IRI: http://dataid.dbpedia.org/ns/core#typeTemplate
The template (or mime type string) of a MediaType.
has characteristics: functional
IRI: http://dataid.dbpedia.org/ns/core#uncompressedByteSize
Records the byte size of the uncompressed content of an archive file.
IRI: http://dataid.dbpedia.org/ns/core#validFrom
The influence an Agent has over an Entity is valid from (inclusive) a certain point in time.
has characteristics: functional
IRI: http://dataid.dbpedia.org/ns/core#validUntil
The influence an Agent has over an Entity is valid until (exclusive) a certain point in time.
has characteristics: functional
IRI: http://dataid.dbpedia.org/ns/core#AgentRoleScheme
The AgentRole scheme (hierarchy) provided by DataID core, depicting Roles commonly used in the context of a file or document management system. This scheme should be replaced in other use cases.
IRI: http://dataid.dbpedia.org/ns/core#AgentSupervision
The responsibility (ResponsibleAction) to supervise other Agents.
IRI: http://dataid.dbpedia.org/ns/core#AllEntitlements
Represents all entitlements (EntitledActions). The top concept of the AuthorizedActionScheme hierarchy.
IRI: http://dataid.dbpedia.org/ns/core#AllResponsibilities
Encompasses all responsibilities (ResponsibleActions). The top concept of the AuthorizedActionScheme hierarchy.
IRI: http://dataid.dbpedia.org/ns/core#AuthorizedActionScheme
Description of the AuthorizedAction hierarchy.
IRI: http://dataid.dbpedia.org/ns/core#Contact
Contact agent. An Agent that can be contacted for general requests about the resource.
IRI: http://dataid.dbpedia.org/ns/core#Contributor
Contributor to the resource. An Agent that was involved in creating or maintaining the resource but does not have the main part in this activity.
IRI: http://dataid.dbpedia.org/ns/core#CopyRole
This role specifies a dataid:DatasetRelationship where one dataset is an exact copy of a second dataset (e.g. when republished under a different domain).
IRI: http://dataid.dbpedia.org/ns/core#Creator
Creator of the resource. An AgentRole that is credited with the main part in the initial creation of the resource.
IRI: http://dataid.dbpedia.org/ns/core#DeleteContent
EntitledAction to delete some content of an entity.
IRI: http://dataid.dbpedia.org/ns/core#DerivatRole
This role specifies a dataid:DatasetRelationship where one dataset points out a second dataset, which is a derivative of the first.
IRI: http://dataid.dbpedia.org/ns/core#GenericRelation
This role specifies a dataid:DatasetRelationship between two datasets which have a relation of an unknown quality.
IRI: http://dataid.dbpedia.org/ns/core#Guest
A visitor or anonymous Agent who has only read rights on public documents.
IRI: http://dataid.dbpedia.org/ns/core#GuestAgent
Use this Agent as a stand in for any Agent not specifically defined in a domain, granting public access to an Entity.
IRI: http://dataid.dbpedia.org/ns/core#GuestAuthorization
This Authorization can be used to point out that the content of an entity is public and can be read by anyone (see also dataid:Guest).
IRI: http://dataid.dbpedia.org/ns/core#Maintainer
Maintainer of the Dataset. An Agent that ensures the technical correctness, accessibility and up-to-dateness of a Dataset.
IRI: http://dataid.dbpedia.org/ns/core#ModifyAgentRoles
EntitledAction to modify the role of Agents on certain Entities.
IRI: http://dataid.dbpedia.org/ns/core#ModifyAuthorization
EntitledAction to modify an Authorization.
IRI: http://dataid.dbpedia.org/ns/core#ModifyAuthorizedAgents
EntitledAction to modify which Agents are authorized on certain Entities.
IRI: http://dataid.dbpedia.org/ns/core#ModifyContent
EntitledAction to modify the content of an Entity.
IRI: http://dataid.dbpedia.org/ns/core#Publisher
Publisher of the Dataset. An Agent that makes the Dataset accessible online on a server or repository without necessarily being involved in its creation.
IRI: http://dataid.dbpedia.org/ns/core#PublishingDecision
The responsibility (ResponsibleAction) to decide if an Entity should be published.
IRI: http://dataid.dbpedia.org/ns/core#ReadContent
EntitledAction to read the content of an Entity.
IRI: http://dataid.dbpedia.org/ns/core#ReadDataId
EntitledAction to read the DataID dataset metadata.
IRI: http://dataid.dbpedia.org/ns/core#ResponseToContact
The responsibility (ResponsibleAction) to respond to contact attempts by external Agents. A contact point for the Entity.
IRI: http://dataid.dbpedia.org/ns/core#ResponseToLifeCycleEvent
The responsibility (ResponsibleAction) to manage changes and react to bugs and issues that are reported concerning a Dataset.
IRI: http://dataid.dbpedia.org/ns/core#SimilarityRole
This role specifies a dataid:DatasetRelationship where one dataset has a significant similarity to another dataset (without any assertion as to dimension of similarity).
IRI: http://dataid.dbpedia.org/ns/core#SourceRole
This role specifies a dataid:DatasetRelationship where one dataset is created by transforming/collecting data from another dataset.
IRI: http://dataid.dbpedia.org/ns/core#UpdateDataId
The responsibility (ResponsibleAction) to update dataset metadata.
Agents are real or legal persons, groups of persons, programs, organisations etc.