JSON-LD Telecon Minutes for 2013-02-05

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Tue, 05 Feb 2013 13:56:19 -0500
To: Linked JSON <public-linked-json@w3.org>
CC: RDF WG <public-rdf-wg@w3.org>
Message-ID: <511155D3.6010803@digitalbazaar.com>

Thanks to Niklas for scribing for 2 hours straight! The minutes from
today's telecon are now available.

http://json-ld.org/minutes/2013-02-05/

Full text of the discussion follows including a link to the audio
transcript:

--------------------
JSON-LD Community Group Telecon Minutes for 2013-02-05

Agenda:
http://lists.w3.org/Archives/Public/public-linked-json/2013Feb/0003.html
Topics:
1. New Alternate Algorithms Review
2. RDF Algorithms Section
3. ISSUE-217: Disallow BNode identifier as Graph Name
4. JSON-LD 1.0 Final Community Group Specification
Resolutions:
1. Adopt the 'purpose' and 'general solution' language in Dave
Longley's (alternate2.html) specification.
Chair:
Manu Sporny
Scribe:
Niklas Lindström
Present:
Manu Sporny, Niklas Lindström, Gregg Kellogg, Dave Longley,
Paul Kuykendall, Markus Lanthaler, David I. Lehn
Audio:
http://json-ld.org/minutes/2013-02-05/audio.ogg

Manu Sporny: Any additions to the Agenda?
Niklas Lindström is scribing.
Niklas Lindström: We might want to discuss the reply and
suggested change to Eric's recent mail regarding when a default
graph is turned into a named graph. [scribe assist by Manu
Sporny]
Niklas Lindström: If we have time, I'd like to describe some
potential future needs when working with National Library of
Sweden stuff. We're officially using JSON-LD there now, maybe
some framing-like issues and @rev like stuff. [scribe assist by
Manu Sporny]
Gregg Kellogg: I may do a CG proposal for @ordered [scribe
assist by Manu Sporny]
Gregg Kellogg: I'm probably going to do a community document
regarding @list, like @ordered.

Topic: New Alternate Algorithms Review

Dave Longley: I have worked on merging the current and alternate
texts for the algorithms, e.g. including the lookup table
(inverse context) for term selection, also added examples and a
visual description
… my goal is also for an implementation (the one used on
the playground) to implement this new spec text (alternate2)
… I've also detected things that were missing, like keyword
aliasing
… we have heard of at least one processor that didn't work
when impl. the described algorithms
… I've left out the 2 or 3 controversial issues that we have
left (like relative IRIs)
… I've also added some sections to describe the general
problem (or "purpose" as gregg suggested) in paragraph form, for
people who do their own algorithms
… it's difficult to wrap your mind around how all the
algorithms work together; so I've attempted to address that as
well, e.g. by using the notion of a "subalgorithm"
… not yet updated: the flattening and node mapping algorithms
… also the mapping to rdf concepts may need more review
… in summary: we have a working implementation of this, and it
should be noted that it didn't really need that many updates
Gregg Kellogg: I like the context processing; it's the term
selection that seems problematic
Dave Longley: re context processing: I reordered to put it first
… we tailored the API to do async processing
… but it may be better to retrieve all the contexts
beforehand, and then do the processing
… this is also much more beneficial for our payswarm work
… caching processed contexts
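
The "retrieve all the contexts beforehand, then process" idea can be sketched roughly as below. This is only an illustration under stated assumptions, not the API under discussion: the function names, the `load` callback and the cache shape are all hypothetical, and contexts that themselves reference further remote contexts are not followed.

```python
from typing import Any, Callable, Dict, Set


def collect_context_urls(doc: Any) -> Set[str]:
    """Gather every remote context reference (a string under "@context")."""
    urls: Set[str] = set()
    if isinstance(doc, list):
        for item in doc:
            urls |= collect_context_urls(item)
    elif isinstance(doc, dict):
        for key, value in doc.items():
            if key == "@context":
                entries = value if isinstance(value, list) else [value]
                urls.update(e for e in entries if isinstance(e, str))
            else:
                urls |= collect_context_urls(value)
    return urls


def preload_contexts(doc: Any, load: Callable[[str], Any],
                     cache: Dict[str, Any]) -> Dict[str, Any]:
    """Fetch each referenced context once; `load` is the only step doing I/O."""
    for url in collect_context_urls(doc):
        if url not in cache:
            cache[url] = load(url)
    return cache


# Usage with a stand-in loader (hypothetical; a real one would do HTTP).
doc = {
    "@context": ["http://example.org/ctx",
                 {"name": "http://xmlns.com/foaf/0.1/name"}],
    "name": "Niklas",
    "knows": {"@context": "http://example.org/ctx", "name": "Dave"},
}
calls = []


def fake_load(url: str) -> Any:
    calls.append(url)
    return {"@context": {}}


cache = preload_contexts(doc, fake_load, {})
```

With the contexts already in `cache`, the later processing steps could run without any asynchronous calls, and the same cache could be reused across documents.
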
Niklas Lindström: Thanks for the great summary and work - sounds
awesome. [scribe assist by Manu Sporny]
Niklas Lindström: It's interesting about pre-loading the
context stuff - it dawned on me a while ago, but didn't have the
time to digest the idea. [scribe assist by Manu Sporny]
Niklas Lindström: Things we've talked about regarding
asynchronous processing in general would be affected if you
needed all the contexts beforehand. I wonder if there is
something more to that idea that would affect the API as well.
We've discussed async vs. sync approaches - maybe the API needs
to be modified... maybe the transformation step is purely
functional? [scribe assist by Manu Sporny]
Dave Longley: I'm not sure we can eliminate that, since someone
might need to do something async during the processing
Dave Longley: I don't know if we can eliminate async entirely -
there may be ways to make it simpler, but I don't think we can
remove some of the stuff from the API. If anybody wanted to do
anything that we don't think of in this group, we might cripple
them. [scribe assist by Manu Sporny]
Paul Kuykendall: I'm in the group implementing the C# impl. I'd
like to note that this new layout of the spec looks very helpful.
… e.g. putting context processing first, and the explanatory
additions to each algorithm are valuable
Manu Sporny: so, the question is if we want to move forward with
one of the three spec alternatives we have before us
Markus Lanthaler: I think we should include the prose of this
directly in the spec. We could agree on that and then discuss the
algorithms separately.
Manu Sporny: sounds good
Gregg Kellogg: I see value in being able to take some sample
data, and walk through the algorithm step by step to see what's
going on
… there is something on term selection which seems to
intentionally be similar to term ranking
… and what's the relation to inverse contexts
Dave Longley: so inverse contexts gives a lookup from iri to
possible terms, and term selection goes through the alternatives;
first building the container and then going through if there's a
language, etc.
… when doing compaction: get info for property IRI, then match
values which apply; and then term selection looks for specificity
to select the proper one
… think of the new term selection algorithm as similar to
markus' querying of inverse context
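
As a rough illustration of the lookup Dave describes (IRI to candidate terms, then narrowing by container and language), here is a heavily simplified sketch. It is not the alternate2.html algorithm: the data layout, the function names and the "shortest term wins" tie-break are assumptions made for this example, and type coercion is left out entirely.

```python
NAME = "http://xmlns.com/foaf/0.1/name"

# Simplified term definitions (assumed shape, for illustration only).
context = {
    "name": {"@id": NAME},
    "name_sv": {"@id": NAME, "@language": "sv"},
    "names": {"@id": NAME, "@container": "@set"},
}


def build_inverse_context(ctx):
    """Map IRI -> container -> language -> term."""
    inverse = {}
    # Shorter terms first, so the first term stored in a slot is preferred.
    for term in sorted(ctx, key=lambda t: (len(t), t)):
        definition = ctx[term]
        container = definition.get("@container", "@none")
        language = definition.get("@language", "@none")
        slot = inverse.setdefault(definition["@id"], {}).setdefault(container, {})
        slot.setdefault(language, term)
    return inverse


def select_term(inverse, iri, containers, languages):
    """Try the most specific container/language first; None if nothing fits."""
    by_container = inverse.get(iri, {})
    for container in containers:
        by_language = by_container.get(container, {})
        for language in languages:
            if language in by_language:
                return by_language[language]
    return None


inverse = build_inverse_context(context)
swedish = select_term(inverse, NAME, ["@none"], ["sv", "@none"])
english = select_term(inverse, NAME, ["@none"], ["en", "@none"])
as_set = select_term(inverse, NAME, ["@set", "@none"], ["@none"])
```

Here a Swedish-tagged value picks `name_sv`, an English-tagged one falls back to `name`, and a value compacted as a set picks `names`.
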
Gregg Kellogg: what might help is a picture or table to
illustrate this
Dave Longley: yes, a table would be helpful, and show with
arrows what is selected
Dave Longley: i wanted to be clear that we're not going to
modify the data, therefore I used the notion of a shallow copy
Gregg Kellogg: I think we need to move forward with this, and
dave's rewrite addresses our major issues with complexity.
Compaction is still very complicated, but I think this is the
path to go
Dave Longley: there are also places where we explain over again
local processing steps which we could probably explain the gist
of and define them (and then link to them)
Gregg Kellogg: like a micro-algorithm section, sounds good
Manu Sporny: my high level read-over gives me the same
impression as gregg; the purpose and direction of this is where
we want to go
… the things fit together much better now
… and the algorithm work has been very thorough
… so now it's much easier to get an overview
Markus Lanthaler: was the error stuff removal a conscious
decision?
Dave Longley: I wanted to get away from a lot of MUST and SHOULD
language
… so I combined markus' and gregg's error description
… but we should probably add technical (API) error text back
… we should combine the MUST/SHOULD with that
Gregg Kellogg: re. MUST text, if we use that, and we're
duplicating normative text that should exist in the normative
grammar, we should look for something better than repeating that
… using an error code seemed incongruous with an algorithm
which is much more mathematical in nature
… it'd be better with a constant with a title
… e.g. a "list-of-list error" (could be a tref)
… I prefer something less prescriptive than "raise an error"
… but we need to be explicit about what is an exceptional
error, and leave to the API to define what that is
Markus Lanthaler:
http://json-ld.org/spec/latest/json-ld-api/#idl-def-JsonLdErrorCode
Gregg Kellogg: there is a circular dependency issue of letting
the algorithm reference to the API, we need something separate,
and let the API also refer to that
… the algorithms should exist without the API
Manu Sporny: so both could refer to the lookup table, defined in
prose
Gregg Kellogg: yes, and it could also be used to index back to
the normative text describing this
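
One way to picture the arrangement Gregg and Manu describe, where the algorithms and the API both refer to a shared table of named conditions, is the sketch below. Everything in it is hypothetical: the class names, the check function and the API code string are invented for illustration and are not taken from the spec text.

```python
from enum import Enum


class ErrorCondition(Enum):
    """Named conditions shared by the algorithm text and any API on top."""
    LIST_OF_LISTS = "list of lists"


class AlgorithmError(Exception):
    """Raised by algorithm steps; carries only the named condition."""

    def __init__(self, condition: ErrorCondition):
        super().__init__(condition.value)
        self.condition = condition


def check_list_items(items):
    """Algorithm-side check: a list must not directly contain another list."""
    for item in items:
        if isinstance(item, dict) and "@list" in item:
            raise AlgorithmError(ErrorCondition.LIST_OF_LISTS)
    return items


# Hypothetical API-side table: the API decides how a condition is reported.
API_ERROR_CODES = {ErrorCondition.LIST_OF_LISTS: "LIST_OF_LISTS_DETECTED"}


def api_code_for(items):
    """API-side wrapper translating a condition into an API error code."""
    try:
        check_list_items(items)
    except AlgorithmError as err:
        return API_ERROR_CODES[err.condition]
    return None
```

In this layout the algorithm code never mentions API codes, so it can exist without the API, while the API layer maps each condition to whatever it exposes.
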
Markus Lanthaler: I don't see how the constants are coupling the
algorithms with the API
Manu Sporny: let's take this part back to the list
… can we do a proposal on the high level text, and next week
propose on the algorithms?
Gregg Kellogg: I would like to come back also to the RDF
algorithms
dave+manu: also include the feature definition language?

PROPOSAL: Adopt the 'purpose' and 'general solution' language in
Dave Longley's (alternate2.html) specification.

Manu Sporny: +1
Gregg Kellogg: +1
Dave Longley: +1
Niklas Lindström: +1
Markus Lanthaler: +1
Paul Kuykendall: +1

RESOLUTION: Adopt the 'purpose' and 'general solution' language
in Dave Longley's (alternate2.html) specification.

Manu Sporny: Markus to review the algorithms; next week we'll
handle whether or not we want to include Dave Longley's algorithm
rewrites.

Topic: RDF Algorithms Section

Gregg Kellogg: there have been some issues regarding aligning
with the RDF concepts, we need to determine the status of that
… also, to add explicit examples
Manu Sporny: yes, would be good (using turtle)
Markus Lanthaler: does it require the input to be
expanded+flattened?
Gregg Kellogg: they're based upon expanded; there may be some
recursion issue, but I'll look at whether it would be simplified
by flattening
… complexity on the order of turtle parsing
Manu Sporny: it might be easier to explain without recursion
… looping over flattened input is probably easier to explain
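
To illustrate why looping over flattened input may be easier to explain than recursion, here is a minimal sketch. It assumes the input is already expanded and flattened (every node has an "@id", every value is a node reference or a value object) and it ignores datatypes, language tags and @list. The function names are made up for this example, and the output is a list of N-Triples-like term tuples.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def term(value):
    """Serialize a node reference or a plain literal (simplified)."""
    if "@id" in value:
        ident = value["@id"]
        return ident if ident.startswith("_:") else f"<{ident}>"
    text = str(value["@value"]).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def flattened_to_triples(nodes):
    """One pass over the node list, no recursion needed."""
    triples = []
    for node in nodes:
        subject = term({"@id": node["@id"]})
        for prop, values in node.items():
            if prop == "@id":
                continue
            if prop == "@type":
                triples.extend((subject, f"<{RDF_TYPE}>", f"<{t}>")
                               for t in values)
            else:
                triples.extend((subject, f"<{prop}>", term(v))
                               for v in values)
    return triples


flattened = [
    {"@id": "http://example.org/manu",
     "@type": ["http://xmlns.com/foaf/0.1/Person"],
     "http://xmlns.com/foaf/0.1/knows": [{"@id": "_:b0"}]},
    {"@id": "_:b0",
     "http://xmlns.com/foaf/0.1/name": [{"@value": "Gregg"}]},
]
triples = flattened_to_triples(flattened)
for s, p, o in triples:
    print(s, p, o, ".")
```
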

Topic: ISSUE-217: Disallow BNode identifier as Graph Name

Manu Sporny: https://github.com/json-ld/json-ld.org/issues/217
Manu Sporny: about using blank node identifiers as a graph name.
We raised this with the RDF group. Their response is that graph
names can only be IRIs.
… this is problematic when doing graph normalization. When you
have two graphs, without bnode names, you have to generate a name
… we can't use a hash of the content to name the graph, we
could use fragment IDs, but we'd be specifying something new.
Basically, if we invent a new mechanism, we're just re-inventing
bnode identifiers.
Dave Longley: if you have two graphs without id but same values,
you'd have to assume they're the same graph, which is not
correct.
Gregg Kellogg: what if we say that if a graph occurs without an
@id, it's a default graph?
Gregg Kellogg: according to RDF concepts, you can't have two
graphs that don't have names
… when turning that into RDF, you cannot.
Manu Sporny: Well, if you named them with "blank graph names"
you could. RDF Concepts states that you can't have anonymous
graph names that are local to the document, which is a mistake.
Gregg Kellogg: fragment identifiers do that
Gregg Kellogg: you can't process the same document twice and get
the same bnode out
Gregg Kellogg: we're setting ourselves up for problems if we
diverge from RDF
Niklas Lindström: I agree with Gregg in principle - it'll just
cause more problems if we diverge from RDF WG. [scribe assist by
Manu Sporny]
Niklas Lindström: We support bnode names for properties, right?
[scribe assist by Manu Sporny]
Manu Sporny: Yep. [scribe assist by Manu Sporny]
Niklas Lindström: Terms that don't have explicit @id of @null
are dropped? [scribe assist by Manu Sporny]
Markus Lanthaler: yes. [scribe assist by Manu Sporny]
Niklas Lindström: We support blank nodes for properties, but not
graphs? Syntax for @id supports bnode @ids, maybe we should do a
SHOULD NOT support bnode IDs for properties and graphs? [scribe
assist by Manu Sporny]
Niklas Lindström: The reason to have two blank node identifiers
is to say that there are two graphs that are not named. [scribe
assist by Manu Sporny]
Manu Sporny: yes, the problem is that the RDF model doesn't
allow two different graphs to exist without having names, which
is dumb because they allow two different nodes to exist without
names. Seems like a completely arbitrary decision.
Manu Sporny: the reason for disallowing this seems more
political than logical - no consensus to do anything, so don't do
anything. This has a real-world consequence in that it will break
the RDF Dataset Normalization Algorithm.
Gregg Kellogg: if the name must be an IRI, there is no issue.
What we need to do is note that it's a violation if it's a bnode
id.
Niklas Lindström: I haven't read RDF concepts in detail about
this recently, one thing that strikes me as odd is that you never
in any part of RDF Concepts, expect the IRI or bnode to be
"different", apart from lists. [scribe assist by Manu Sporny]
Manu Sporny: the current RDF 1.1 concepts spec doesn't say that
node and graph are disjoint
Manu Sporny: you can have two blank nodes that refer to each
other. You cannot do that with two "blank" graphs. Why?
Gregg Kellogg: are we really bound to the RDF data model and WG?
I think we are.
Manu Sporny: Gregg and I disagree here. We have done as much
alignment as possible. There are minute differences where JSON-LD
is explicitly more lax and accommodating. E.g. bnodeid's for
properties.
… and up to last week support for bnode id's for graph ids
Dave Longley: it doesn't help with normalization though, which
is tied to quads, we need to be able to use /something/ in the
graph position. We've been using a blank-node like identifier.
… I think we need to say that if you're gonna use @graph other
than as default; you need an @id
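
The need for "something" in the graph position can be sketched as follows. This is not the normalization algorithm. It only shows, under the assumption that unnamed graphs receive fresh blank-node-like labels, how two unnamed graphs with identical content stay distinct once they are written as quads. The function name and the `_:g` label scheme are hypothetical.

```python
from itertools import count


def to_quads(graphs):
    """graphs: list of (name or None, triples); returns (s, p, o, g) tuples."""
    labels = count()
    quads = []
    for name, triples in graphs:
        # Assumption: an unnamed graph gets a fresh label that is only
        # stable within this run (document-local, like a blank node id).
        graph = name if name is not None else f"_:g{next(labels)}"
        quads.extend((s, p, o, graph) for s, p, o in triples)
    return quads


same_triple = ("<http://example.org/a>",
               "<http://example.org/p>",
               '"v"')
quads = to_quads([
    (None, [same_triple]),                       # first unnamed graph
    (None, [same_triple]),                       # second, same content
    ("<http://example.org/g>", [same_triple]),   # graph named with an IRI
])
graph_names = [q[3] for q in quads]
```

Without a label in the fourth position the first two quads would be indistinguishable, which is the situation Dave describes for graphs that have no @id but the same values.
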
Discussion around the effect that bnode ids for graphs won't
match, since those ids aren't stable... though, identifiers will
be internally stable (to the document or quad-store).
Markus Lanthaler: are you normalizing datasets or graphs?
Gregg Kellogg: Datasets
Markus Lanthaler: but the algorithm is called graph
normalization?
Manu Sporny: Datasets didn't exist when we wrote the first
version of the spec.
Dave Longley: you normalize to quads
Gregg Kellogg: this is an issue for the RDF WG
Manu Sporny: yes. But it's important to understand that code
we're deploying in two weeks uses bnode ids for graphs. If the
normalization algorithm changes that's a problem
Dave Longley: It could work for payswarm if we disallow it; we
can adapt
Manu Sporny: It'll be hard to convince the RDF WG that the RDF
Concepts model is broken.

PROPOSAL: Disallow blank node identifiers for graph names.

Manu Sporny: -1 (I think if we do this, we align with the RDF
data model, which is broken - no reason to disallow
blank-node-like identifiers for graph names)
Gregg Kellogg: +1
Dave Longley: a sad -0
Markus Lanthaler: +0.5
David I. Lehn: -0
Niklas Lindström: +0.1
Manu Sporny: Is there anything that would get more consensus
than this?
Markus Lanthaler: Yeah, what's in the spec right now.
Markus Lanthaler: if we don't support it in the data model we
have to throw an error
Gregg Kellogg: by sticking with SHOULD, you allow for usage to
evolve which could affect future RDF
Dave Longley: do we have feedback from RDF WG on SHOULD NOT vs
MUST NOT?
Manu Sporny: not really...
Gregg Kellogg: my guess is they'd grudgingly go along with
should not, but it would further convince them that JSON-LD is
deviating from RDF unnecessarily.
Markus Lanthaler: this is the current spec text: "Each named
graph is a pair consisting of an IRI or blank node identifier
(the graph name) and a JSON-LD graph. Whenever possible, the
graph name should be an IRI."
Gregg Kellogg: what would it mean to use the bnode-id as a
subject in a description, and use the same bnode-id for the
graph?
… this should be brought up to the working group now.
… it *may* result in a retreat of a MUST NOT
Niklas Lindström: We may want to discuss this with the
Provenance WG. [scribe assist by Manu Sporny]

PROPOSAL: Graph names SHOULD use IRIs. The JSON-LD Data model
supports identifiers for graphs that are IRIs and identifiers
that look like blank node identifiers, but instead identify
graphs. The RDF Conversion algorithm SHOULD generate an error
when a non-absolute IRI is detected when converting to RDF.

Markus Lanthaler: Counter Proposal: Keep the current spec text:
"Each named graph is a pair consisting of an IRI or blank node
identifier (the graph name) and a JSON-LD graph. Whenever
possible, the graph name should be an IRI."
Markus Lanthaler:
http://json-ld.org/spec/latest/json-ld-syntax/#relationship-to-rdf
Gregg Kellogg: with no change, we need an issue marker for bnode
ids as graph ids
Markus Lanthaler: That's already in the spec: "In contrast to the
RDF data model as defined in [RDF-CONCEPTS], JSON-LD allows blank
nodes as property labels and graph names. This feature is
controversial in the RDF WG and may be removed in the future."
Manu Sporny: we can leave it open if we mark it as at risk (we
can still go to LC)
… we'll bring it up again in the RDF WG

Topic: JSON-LD 1.0 Final Community Group Specification

Markus Lanthaler:
http://json-ld.org/spec/FCGS/json-ld-syntax/20130202/
Manu Sporny: Should we publish the FCGS specification?
Niklas Lindström: We still have outstanding issues, why now?
Manu Sporny: Because we want to get the Intellectual Property
agreements in place while RDF WG is reviewing a semi-finalized
specification.
Gregg Kellogg: I think we should wait a week, then try again.
JSON-LD 1.0 Community Group agrees to wait a week.

-- manu

--
Manu Sporny (skype: msporny, twitter: manusporny, G+: +Manu Sporny)
President/CEO - Digital Bazaar, Inc.
blog: Aaron Swartz, PaySwarm, and Academic Journals
http://manu.sporny.org/2013/payswarm-journals/

Received on Tuesday, 5 February 2013 18:56:56 UTC