RDF-ISSUE-131 (mobile-datasets): How can one create an RDF dataset without being a web server? [RDF Graphs] from RDF Working Group Issue Tracker on 2013-05-17 (public-rdf-wg@w3.org from May 2013)

From: RDF Working Group Issue Tracker <sysbot+tracker@w3.org>
Date: Fri, 17 May 2013 13:24:53 +0000
To: public-rdf-wg@w3.org
Message-Id: <E1UdKeT-0007eM-8p@tibor.w3.org>

RDF-ISSUE-131 (mobile-datasets): How can one create an RDF dataset without being a web server? [RDF Graphs]

http://www.w3.org/2011/rdf-wg/track/issues/131

Raised by: Sandro Hawke
On product: RDF Graphs

In general, the SPARQL definition of datasets (adopted into RDF 1.1 by WG resolution on 29 October 2012) satisfies our charter deliverable of allowing people to work with multiple graphs. However, it requires that each graph be labeled with an IRI, and creating such an IRI can be problematic.

It's easy enough for software to make up IRIs for graphs if it happens to be a web server, in charge of some range of web addresses. But how can other software do this? For instance, how can a web client create a dataset to send as one of several parameters in an HTTP POST operation? And how can a web client use datasets for HTTP PATCH (as the LDP Working Group wants to do)? And how can something use datasets in a UDP or TCP based protocol?

At the moment, a few options come to mind:

Option 1 - Use RFC-4122 Random UUIDs as graph names. These are IRIs that look like urn:uuid:7a745845-5a5e-46ad-9ae7-6ec202741183, where the hex digits encode 122 random bits and 6 fixed bits (4 for the version, 2 for the variant). In theory, collision is unlikely if a good source of randomness is available. Perhaps the randomness can be improved by including a hash of the other parts of the dataset. Note that use of non-resolvable IRIs like this is bad practice for Linked Data.

<urn:uuid:7a745845-5a5e-46ad-9ae7-6ec202741183> { ... contents of graph ... }
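A minimal Python sketch of Option 1, assuming the standard-library uuid module is an acceptable source of randomness. The helper name mint_graph_name is hypothetical and only here for illustration:

```python
import uuid


def mint_graph_name() -> str:
    """Return a fresh urn:uuid IRI (RFC 4122 version 4, random) for use as a graph name."""
    # uuid4() draws its bits from the operating system's random source.
    return uuid.uuid4().urn


name = mint_graph_name()
# Emit the graph in the same TriG-like shape used in this message.
print(f"<{name}> {{ ... contents of graph ... }}")
```

No web server or address range is needed here, which is the point of the option; the resulting IRI is still not resolvable.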

Option 2 - Use a UUID-like string as an IRI base or prefix for graph names. (Slight variation on Option 1.) By going outside the RFC-4122 syntax, we can include a "local part" in the IRI. Something like:

@prefix my: <tag:w3.org,2013:uuid:7a745845-5a5e-46ad-9ae7-6ec202741183:> .
...
my:g2 { ... contents of graph 2 ... }
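Roughly, Option 2 could be sketched in Python like this. The tag: authority and date simply mirror the example above, and mint_prefix and the g1, g2, g3 local names are hypothetical:

```python
import uuid


def mint_prefix(authority: str = "w3.org", date: str = "2013") -> str:
    """Build a UUID-based IRI prefix; one random UUID is shared by the whole dataset."""
    return f"tag:{authority},{date}:uuid:{uuid.uuid4()}:"


prefix = mint_prefix()
# Local parts only need to be unique within this one dataset.
graph_names = [f"{prefix}g{i}" for i in range(1, 4)]
for graph_name in graph_names:
    print(f"<{graph_name}> {{ ... }}")
```

Only one random value is generated per dataset, and the local part keeps the individual graph names short and readable.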

Option 3 - Use a "relative" dataset, where the graph names are written as relative IRIs but the base for IRI-resolution is not known to the system generating the dataset; instead, each receiver picks some new, unique IRI as the base. This is arguably not licensed by the current RDF drafts or the SPARQL 1.1 spec. Some client libraries will not store or serialize RDF with relative IRIs.

<#g3> { ... contents of graph 3 ... }
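On the receiving side, Option 3 might look something like the following Python sketch. The base http://receiver.example/datasets/... is a made-up address, and resolve_graph_names is a hypothetical helper, not something taken from any RDF library:

```python
import uuid
from urllib.parse import urljoin


def resolve_graph_names(relative_names, base=None):
    """Resolve relative graph names against a base chosen by the receiver."""
    if base is None:
        # Assumption: the receiver controls this (made-up) address space
        # and mints a fresh base for every dataset it receives.
        base = f"http://receiver.example/datasets/{uuid.uuid4()}"
    return {rel: urljoin(base, rel) for rel in relative_names}


resolved = resolve_graph_names(["#g3"], base="http://receiver.example/datasets/d1")
print(resolved["#g3"])
```

Two receivers of the same dataset would end up with different absolute graph names, which is part of why this option sits uneasily with the current drafts.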

Option 4 - Use blank nodes as graph names. This is not allowed in Datasets as defined in the current RDF drafts or the SPARQL 1.1 spec. Some client libraries will not store or serialize RDF datasets with blank node graph names. As with other uses of blank nodes, knowing they cannot be referenced by other documents allows certain optimizations, and they can be Skolemized for use in systems that do not want/allow blank nodes.

_:g4 { ... contents of graph 4 ... }
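For the Skolemization mentioned under Option 4, here is a rough Python sketch. It assumes graph names arrive as plain strings with blank nodes written as _:label, uses the /.well-known/genid/ path suggested for Skolem IRIs in the RDF 1.1 Concepts draft, and treats http://example.org as a placeholder authority; skolemize_graph_names is a hypothetical helper:

```python
import uuid


def skolemize_graph_names(labels, authority="http://example.org"):
    """Replace blank-node graph names (_:x) with fresh Skolem IRIs; leave IRIs alone."""
    mapping = {}
    for label in labels:
        if label.startswith("_:") and label not in mapping:
            # The same blank node label must always map to the same Skolem IRI.
            mapping[label] = f"{authority}/.well-known/genid/{uuid.uuid4().hex}"
    return [mapping.get(label, label) for label in labels]


names = skolemize_graph_names(["_:g4", "http://example.org/g1", "_:g4"])
print(names)
```

Note that a real Skolemizer would also have to rewrite occurrences of the same blank node inside the graphs themselves, which this sketch does not attempt.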

Option 5 - Do not directly support this use case in RDF 1.1. Instead, require systems to use an extended RDF which allows blank node graph names, e.g., JSON-LD, or variations on TriG and N-Quads which may arise for this purpose.

Received on Friday, 17 May 2013 13:24:54 UTC