RE: Input needed from RDF group on JSON-LD skolemization from Markus Lanthaler on 2013-07-01 (www-archive@w3.org from July 2013)

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Mon, 1 Jul 2013 11:41:01 +0200
To: "'David Booth'" <david@dbooth.org>, "'Pat Hayes'" <phayes@ihmc.us>
Cc: "'www-archive'" <www-archive@w3.org>, "'Andy Seaborne'" <andy@apache.org>, "'Manu Sporny'" <msporny@digitalbazaar.com>, "'Hawke, Sandro'" <sandro@w3.org>, "'Wood, David'" <david@3roundstones.com>, "'David Longley'" <dlongley@digitalbazaar.com>, "'Gregg Kellogg'" <gregg@greggkellogg.com>
Message-ID: <00bc01ce763f$1c868c00$5593a400$@lanthaler@gmx.net>

On Sunday, June 30, 2013 10:45 PM, David Booth wrote:
> On 06/30/2013 10:25 AM, Pat Hayes wrote:
> >
> > On Jun 27, 2013, at 10:19 PM, David Booth wrote:
> >
> >> [Copying public archive www-archive.w3.org for lack of a better
> >> option]
> >>
> >> PROBLEM SUMMARY
> >>
> >> GOAL: Any two JSON-LD-compliant parsers should produce the exact
> >> same RDF triples when parsing the same JSON-LD document, except for
> >> blank node labels and (possibly) datatype conversions.
> >>
> >> CURRENT PROBLEM: JSON-LD is intended to be a concrete RDF syntax,
> >> but the JSON-LD data model has some extensions to the RDF data
> >> model, and this causes some non-determinism and/or important
> >> information loss when interpreting JSON-LD as RDF.
> >
> > Wait. There are two issues getting muddled here. Yes, there can be
> > information loss in JSON-LD ==> RDF. No, it does not follow that the
> > mapping is nondeterministic or ambiguous. So information loss does
> > not compromise the GOAL as stated.
> 
> True.  I thought it would be obvious that information loss is
> undesirable (since otherwise we could just map to the empty graph), but
> to clarify: the goal is to have a deterministic mapping *with* minimum
> information loss.

How do you define "loss"? The data is obviously in the JSON(-LD). You seem
to suggest that we remove a mechanism that allows data to be mapped to
something close enough to RDF that some RDF systems already support it.
IMO, that's also information loss.


> David
> 
> >
> > Pat .
> >> At present, the results of JSON-LD-compliant parsing of a JSON-LD
> >> document to produce a set of RDF triples is non-deterministic
> >> because JSON-LD allows blank node predicates and RDF does not.
> >
> > That is a non sequitur. There is a perfectly deterministic algorithm
> > to map JSON-LD into RDF, with information loss. Option (a) below, for
> > example.
> >
> >> The JSON-LD specification currently suggests three potential
> >> solutions but does not mandate one of them: (a) discard triples
> >> that contain blank node predicates; (b) retain triples that contain
> >> blank node predicates; or (c) skolemize blank nodes that are used
> >> in the predicate position.
> >>
> >>
> >> RANGE OF POTENTIAL SOLUTIONS
> >>
> >> 1. Change JSON-LD to prohibit JSON-LD blank nodes in positions
> >> where the RDF interpretation of JSON-LD would cause them to be
> >> mapped to illegal RDF blank nodes.
> >>
> >> Pros: Easy enough spec change.
> >>
> >> Cons: Loss of JSON-LD functionality?  (Is there an important use
> >> case for having blank nodes in predicate positions in JSON-LD?)
> >>
> >> My comments: This seems to me like the best available option.

How is that different from the current situation? Instead of mapping
properties to bnode identifiers, people will then simply not map them at
all. The resulting RDF is the same.
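
To illustrate, here is a toy sketch in Python. It is not the toRDF
algorithm from the spec; the function naive_triples, the example IRIs and
the "nick" property are made up for illustration, and it assumes a flat
document with a simple term-to-IRI context. It only shows that a property
mapped to a bnode identifier (and then dropped) and a property that is not
mapped at all end up as the same triples.

```python
def naive_triples(doc, drop_bnode_predicates=True):
    """Toy conversion of a flat JSON-LD-like dict into triples.

    Assumption: the context maps each term directly to an IRI or to a
    blank node identifier ("_:..."). Terms without a mapping are ignored.
    """
    context = doc.get("@context", {})
    subject = doc.get("@id", "_:b0")
    triples = []
    for key, value in doc.items():
        if key.startswith("@"):
            continue  # keywords are not properties
        predicate = context.get(key)
        if predicate is None:
            continue  # unmapped property: nothing to emit
        if drop_bnode_predicates and predicate.startswith("_:"):
            continue  # bnode predicate: dropped on the way to RDF
        triples.append((subject, predicate, value))
    return triples


# "nick" is mapped to a blank node identifier.
mapped_to_bnode = {
    "@context": {"name": "http://xmlns.com/foaf/0.1/name", "nick": "_:nick"},
    "@id": "http://example.com/markus",
    "name": "Markus",
    "nick": "m",
}

# Same data, but "nick" is not mapped at all.
not_mapped = {
    "@context": {"name": "http://xmlns.com/foaf/0.1/name"},
    "@id": "http://example.com/markus",
    "name": "Markus",
    "nick": "m",
}

same_result = naive_triples(mapped_to_bnode) == naive_triples(not_mapped)
```

With drop_bnode_predicates=False the first document keeps its extra
(generalized) triple, which is the information the second document can
never expose.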


> >> 2. Change RDF to permit blank nodes as predicates.
> >>
> >> Pros: Avoids information loss.
> >>
> >> Cons: Not possible in the current RDF working group, because it is
> >> specifically specified in the charter as being out of scope:
> >> http://www.w3.org/2011/01/rdf-wg-charter "Some features are
> >> explicitly out of scope for the Working Group . . . Removing
> >> current restrictions in the RDF model (e.g., . . . blank nodes as
> >> predicates"
> >>
> >> My comments: To my mind, this would have been a second-best option
> >> if it were available.
> >>
> >>
> >> 3. Change the JSON-LD-to-RDF-model mapping to specify that illegal
> >> triples are discarded.
> >>
> >> Pros: Easy change to the JSON-LD spec.
> >>
> >> Cons: Significant information loss when interpreting JSON-LD as
> >> RDF.
> >>
> >> My comments: Not acceptable, due to the information loss.

Again, the same result IMO.


> >> 4. Require skolemization of bnodes that appear in the predicate
> >> position.   (Note that if skolemization of a bnode is performed,
> >> it must be performed uniformly on all instances of that bnode that
> >> arise from that JSON-LD document.)  RDF-standards-based
> >> round-trippable skolemization would permit round-tripping of the
> >> skolemized bnodes back to the original JSON-LD even if the return
> >> trip is performed by a different party.
> >>
> >> Pros: Avoids information loss.
> >>
> >> Cons: (a) More complex than other options; (b) To avoid possible
> >> URI clashes, the skolemizer would need a user-specific URI prefix
> >> as a parameter, such as
> >> http://example.com/.well-known/genid/alice/
> >>
> >> My comments: Complex, but acceptable.

Yes, skolemization might be necessary... but not on the JSON-LD side. It is
the consumer that has to skolemize if it can't accept the data otherwise.
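
A minimal sketch of what such consumer-side skolemization could look like,
in Python. The function names and the triple representation (plain strings,
bnodes starting with "_:", literals written in double quotes) are
assumptions for illustration only; the prefix is the example one from
David's mail. Only bnodes that occur in predicate position are replaced,
and they are replaced uniformly wherever they occur.

```python
def is_bnode(term):
    # Assumption: blank nodes are strings starting with "_:"; literals are
    # written with surrounding double quotes, so they never match.
    return term.startswith("_:")


def skolemize_predicate_bnodes(triples, prefix):
    """Replace every bnode that is used as a predicate by a Skolem IRI.

    The replacement is applied to all occurrences of that bnode (subject,
    predicate and object position), so the graph stays consistent. Reusing
    the bnode label as IRI suffix lets a later step map the IRI back.
    """
    used_as_predicate = {p for (_s, p, _o) in triples if is_bnode(p)}
    mapping = {b: prefix + b[2:] for b in used_as_predicate}
    return [
        tuple(mapping.get(term, term) for term in triple)
        for triple in triples
    ]


PREFIX = "http://example.com/.well-known/genid/alice/"

generalized = [
    ("_:a", "_:p", "_:b"),                  # bnode predicate
    ("_:p", "http://example.com/label", '"some property"'),
]

skolemized = skolemize_predicate_bnodes(generalized, PREFIX)
```

Bnodes that only appear as subject or object ("_:a" and "_:b" above) are
left untouched, since they are already legal RDF.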


> >> Are there other options or pros/cons that I did not list?  Which
> >> options would be preferable, acceptable or not acceptable to you?
> >>
> >> I suggest adopting #1, but also adding a note to the JSON-LD spec
> >> that recommends that parsers offer an *option* (disabled by
> >> default) to retain triples with a blank node predicate.

That's a contradiction. You can't prohibit blank node predicates at the
syntax level and provide a flag to allow them. All such documents would be
invalid. I think what you mean here is that they are discarded by default
when converting to RDF but retained if that option is set, right? If so,
then this is again exactly the situation we have today, except that
instead of doing the filtering in the toRDF algorithm, the consumer of the
toRDF result would have to do it.
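
For what it's worth, that filtering is trivial wherever it is done. A
rough Python sketch, assuming triples are tuples of strings with bnodes
starting with "_:" (the function name and the flag name are made up, they
are not taken from the spec):

```python
def filter_for_rdf(triples, keep_bnode_predicates=False):
    """Drop triples whose predicate is a blank node unless told otherwise.

    With keep_bnode_predicates=True the input is passed through unchanged,
    i.e. the caller gets generalized RDF.
    """
    if keep_bnode_predicates:
        return list(triples)
    return [t for t in triples if not t[1].startswith("_:")]


to_rdf_result = [
    ("http://example.com/markus", "http://xmlns.com/foaf/0.1/name", '"Markus"'),
    ("http://example.com/markus", "_:nick", '"m"'),
]

default_output = filter_for_rdf(to_rdf_result)
generalized_output = filter_for_rdf(to_rdf_result, keep_bnode_predicates=True)
```

Whether this runs as the last step of toRDF or as the first step of the
consumer makes no difference to the triples that come out.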


--
Markus Lanthaler
@markuslanthaler

Received on Monday, 1 July 2013 09:41:39 UTC