Re: Input needed from RDF group on JSON-LD skolemization from David Booth on 2013-07-01 (www-archive@w3.org from July 2013)

From: David Booth <david@dbooth.org>
Date: Mon, 01 Jul 2013 10:06:53 -0400
To: Markus Lanthaler <markus.lanthaler@gmx.net>
CC: 'Pat Hayes' <phayes@ihmc.us>, 'www-archive' <www-archive@w3.org>, 'Andy Seaborne' <andy@apache.org>, 'Manu Sporny' <msporny@digitalbazaar.com>, "'Hawke, Sandro'" <sandro@w3.org>, "'Wood, David'" <david@3roundstones.com>, 'David Longley' <dlongley@digitalbazaar.com>, 'Gregg Kellogg' <gregg@greggkellogg.com>
Message-ID: <51D18CFD.2020108@dbooth.org>
On 07/01/2013 05:41 AM, Markus Lanthaler wrote:
> On Sunday, June 30, 2013 10:45 PM, David Booth wrote:
>> On 06/30/2013 10:25 AM, Pat Hayes wrote:
>>>
>>> On Jun 27, 2013, at 10:19 PM, David Booth wrote:
>>>
>>>> [Copying public archive www-archive.w3.org for lack of a better
>>>> option]
>>>>
>>>> PROBLEM SUMMARY
>>>>
>>>> GOAL: Any two JSON-LD-compliant parsers should produce the exact
>>>> same RDF triples when parsing the same JSON-LD document, except for
>>>> blank node labels and (possibly) datatype conversions.
>>>>
>>>> CURRENT PROBLEM: JSON-LD is intended to be a concrete RDF syntax,
>>>> but the JSON-LD data model has some extensions to the RDF data
>>>> model, and this causes some non-determinism and/or important
>>>> information loss when interpreting JSON-LD as RDF.
>>>
>>> Wait. There are two issues getting muddled here. Yes, there can be
>>> information loss in JSON-LD ==> RDF. No, it does not follow that the
>>> mapping is nondeterministic or ambiguous. So information loss does
>>> not compromise the GOAL as stated.
>>
>> True.  I thought it would be obvious that information loss is
>> undesirable (since otherwise we could just map to the empty graph), but
>> to clarify: the goal is to have a deterministic mapping *with* minimum
>> information loss.
>
> How do you define "loss"?

In this case, when interpreting JSON-LD as RDF, information loss means 
discarding triples that have blank nodes as predicates, because RDF does 
not allow blank nodes as predicates.
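
To make that definition concrete, here is a minimal sketch in Python. It assumes triples are plain (subject, predicate, object) tuples of strings and that a blank node is any term whose label starts with "_:". The names is_blank and drop_bnode_predicates are hypothetical; they are not taken from the JSON-LD spec or from any library.

```python
# Illustrative sketch only. Assumptions: triples are tuples of strings, and a
# term is a blank node exactly when its label starts with "_:".

def is_blank(term):
    """Return True if the term looks like a blank node label."""
    return term.startswith("_:")

def drop_bnode_predicates(triples):
    """Split triples into (kept, lost); 'lost' have a blank node predicate."""
    kept = [t for t in triples if not is_blank(t[1])]
    lost = [t for t in triples if is_blank(t[1])]
    return kept, lost

# Hypothetical input: one legal RDF triple and one with a blank node predicate.
generalized = [
    ("http://example.com/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
    ("http://example.com/alice", "_:b0", "some value"),
]
kept, lost = drop_bnode_predicates(generalized)
```

The split is fully deterministic, but whatever ends up in lost is what I am calling information loss.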

> The data is obviously in the JSON(-LD).

Yes -- represented using blank nodes that, when converted to the RDF 
model, would appear in the predicate position.

> You seem
> to suggest that we remove a mechanism that allows data to be mapped to something
> close enough to RDF that some RDF systems already support.

Yes, I am suggesting that the feature of permitting blank nodes in that 
JSON-LD position -- the position that would result in blank node 
predicates in the RDF model -- be removed.   Users would be required to 
use URIs instead of blank nodes in those cases.
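
As a rough illustration of what such a restriction would flag, the sketch below checks a JSON-LD document, held as a Python dict, for properties that resolve to blank node identifiers. It is heavily simplified and rests on assumptions of mine: the document is a single flat node object, "@context" is an inline object (not a URL or an array), and the function name bnode_properties is hypothetical. It does not implement the real JSON-LD expansion algorithm.

```python
# Illustrative sketch only. Assumptions: one flat node object, an inline
# "@context" object, and no nested contexts, prefixes or @vocab handling.

def bnode_properties(doc):
    """Return the keys of doc that would become blank node predicates."""
    context = doc.get("@context", {})
    found = []
    for key in doc:
        if key.startswith("@"):
            continue  # keywords such as @context and @id are not properties
        mapped = context.get(key, key)
        if isinstance(mapped, dict):
            # Expanded term definition, e.g. {"@id": "_:b0"}
            mapped = mapped.get("@id", key)
        if isinstance(mapped, str) and mapped.startswith("_:"):
            found.append(key)
    return found

# Hypothetical document: "note" is mapped to a blank node identifier.
doc = {
    "@context": {
        "name": "http://xmlns.com/foaf/0.1/name",
        "note": "_:b0",
    },
    "@id": "http://example.com/alice",
    "name": "Alice",
    "note": "some value",
}
offending = bnode_properties(doc)
```

Under my suggestion the author of this document would replace "_:b0" in the context with a URI of their choosing, and nothing else would change.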

> IMO, that's also information loss.

I would consider it a feature loss rather than information loss, since a 
user could still represent the information.  The user would just have to 
use a URI instead of a blank node.   Is there an important use case for 
permitting blank nodes in the JSON-LD position that would map them to 
blank node predicates in RDF?  If so, what is it?

David

>
>
>> David
>>
>>>
>>> Pat .
>>>> At present, the results of JSON-LD-compliant parsing of a JSON-LD
>>>> document to produce a set of RDF triples is non-deterministic
>>>> because JSON-LD allows blank node predicates and RDF does not.
>>>
>>> That is a non sequitur. There is a perfectly deterministic algorithm
>>> to map JSON-LD into RDF, with information loss. Option (a) below, for
>>> example.
>>>
>>>> The JSON-LD specification currently suggests three potential
>>>> solutions but does not mandate one of them: (a) discard triples
>>>> that contain blank node predicates; (b) retain triples that contain
>>>> blank node predicates; or (c) skolemize blank nodes that are used
>>>> in the predicate position.
>>>>
>>>>
>>>> RANGE OF POTENTIAL SOLUTIONS
>>>>
>>>> 1. Change JSON-LD to prohibit JSON-LD blank nodes in positions
>>>> where the RDF interpretation of JSON-LD would cause them to be
>>>> mapped to illegal RDF blank nodes.
>>>>
>>>> Pros: Easy enough spec change.
>>>>
>>>> Cons: Loss of JSON-LD functionality?  (Is there an important use
>>>> case for having blank nodes in predicate positions in JSON-LD?)
>>>>
>>>> My comments: This seems to me like the best available option.
>
> How is that different from the current situation? Instead of mapping
> predicates to bnode identifiers people won't map them at all then. The
> resulting RDF is the same.
>
>
>>>> 2. Change RDF to permit blank nodes as predicates.
>>>>
>>>> Pros: Avoids information loss.
>>>>
>>>> Cons: Not possible in the current RDF working group, because it is
>>>> specifically specified in the charter as being out of scope:
>>>> http://www.w3.org/2011/01/rdf-wg-charter "Some features are
>>>> explicitly out of scope for the Working Group . . . Removing
>>>> current restrictions in the RDF model (e.g., . . . blank nodes as
>>>> predicates"
>>>>
>>>> My comments: To my mind, this would have been a second-best option
>>>> if it were available.
>>>>
>>>>
>>>> 3. Change the JSON-LD-to-RDF-model mapping to specify that illegal
>>>> triples are discarded.
>>>>
>>>> Pros: Easy change to the JSON-LD spec.
>>>>
>>>> Cons: Significant information loss when interpreting JSON-LD as
>>>> RDF.
>>>>
>>>> My comments: Not acceptable, due to the information loss.
>
> Again, the same result IMO.
>
>
>>>> 4. Require skolemization of bnodes that appear in the predicate
>>>> position.   (Note that if skolemization of a bnode is performed,
>>>> it must be performed uniformly on all instances of that bnode that
>>>> arise from that JSON-LD document.)  RDF-standards-based
>>>> round-trippable skolemization would permit round-tripping of the
>>>> skolemized bnodes back to the original JSON-LD even if the return
>>>> trip is performed by a different party.
>>>>
>>>> Pros: Avoids information loss.
>>>>
>>>> Cons: (a) More complex than other options; (b) To avoid possible
>>>> URI clashes, the skolemizer would need a user-specific URI prefix
>>>> as a parameter, such as
>>>> http://example.com/.well-known/genid/alice/
>>>>
>>>> My comments: Complex, but acceptable.
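
For illustration, the skolemization described in option 4 above could look roughly like the Python sketch below. It assumes triples are (subject, predicate, object) tuples of strings, that blank nodes are labeled with a "_:" prefix, and that no literal or URI in the input already starts with the chosen prefix. The function names are hypothetical, and the prefix is the example one from the cons listed above.

```python
# Illustrative sketch only. Assumptions: terms are plain strings, blank nodes
# start with "_:", and nothing in the input already starts with the prefix.

def skolemize_predicates(triples, prefix):
    """Replace bnodes that occur as predicates with URIs under prefix."""
    # First pass: collect every blank node used in the predicate position.
    mapping = {}
    for _, p, _ in triples:
        if p.startswith("_:") and p not in mapping:
            mapping[p] = prefix + p[2:]
    # Second pass: replace those bnodes uniformly, wherever they occur.
    return [tuple(mapping.get(term, term) for term in t) for t in triples]

def unskolemize(triples, prefix):
    """Turn URIs under prefix back into blank node labels (the return trip)."""
    def restore(term):
        return "_:" + term[len(prefix):] if term.startswith(prefix) else term
    return [tuple(restore(term) for term in t) for t in triples]

PREFIX = "http://example.com/.well-known/genid/alice/"
original = [
    ("http://example.com/alice", "_:b0", "some value"),
    ("_:b0", "http://www.w3.org/2000/01/rdf-schema#label", "note"),
    ("_:b1", "http://xmlns.com/foaf/0.1/name", "Bob"),
]
skolemized = skolemize_predicates(original, PREFIX)
restored = unskolemize(skolemized, PREFIX)
```

Here _:b0 is replaced in both the predicate and the subject position, while _:b1 is left alone because it never appears as a predicate. The return trip needs only the prefix, so a different party can perform it.
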
>
> Yes, skolemization might be necessary... but not on the JSON-LD side. It is
> the consumer that has to skolemize if it can't accept the data otherwise.
>
>
>>>> Are there other options or pros/cons that I did not list?  Which
>>>> options would be preferable, acceptable or not acceptable to you?
>>>>
>>>> I suggest adopting #1, but also adding a note to the JSON-LD spec
>>>> that recommends that parsers offer an *option* (disabled by
>>>> default) to retain triples with a blank node predicate.
>
> That's a contradiction. You can't prohibit blank-node-predicates at the
> syntax level and provide a flag to allow it. All such documents would be
> invalid.  I think what you mean here is that they are discarded by default
> when converting to RDF but retained if that option is set, right? If so,
> then this is again exactly the same point as we currently have but
> instead of doing that filtering in the toRDF algorithm, it would be the
> consumer of the toRDF result to do the filtering.
>
>
> --
> Markus Lanthaler
> @markuslanthaler
>
>
>
>
Received on Monday, 1 July 2013 14:07:23 UTC