JSON-LD Telecon Minutes for 2012-11-27 from Manu Sporny on 2012-12-06 (public-rdf-wg@w3.org from December 2012)

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Wed, 05 Dec 2012 20:21:20 -0500
To: Linked JSON <public-linked-json@w3.org>
CC: RDF WG <public-rdf-wg@w3.org>
Message-ID: <50BFF310.4020302@digitalbazaar.com>

Seems as if I forgot to send out the telecon minutes for 2012-11-27.
Thanks to François for scribing last week! The minutes from last week's
call are now available (I have yet to clean up and upload the audio):

http://json-ld.org/minutes/2012-11-27/

Full text of the discussion follows including a link to the audio
transcript:

--------------------
JSON-LD Community Group Telecon Minutes for 2012-11-27

Agenda:
   http://lists.w3.org/Archives/Public/public-linked-json/2012Nov/0019.html
Topics:
   1. ISSUE-182: Dataset vs. Graph
   2. ISSUE-113: Define exactly how (IRI) compaction is supposed
      to work
   3. ISSUE-172: Should each member in a list contribute to term
      rank?
   4. ISSUE-200: JSON-LD API Review by Robin Berjon
Resolutions:
   1. When compacting lists, the most specific term that matches
      all of the elements in the list, taking into account the default
      language, must be selected.
   2. The callback signature for the .toRDF() method should
      accept Quad[]. That is, the callback is called once after all
      processing has been completed.
Chair:
   Manu Sporny
Scribe:
   François Daoust
Present:
   François Daoust, Markus Lanthaler, Manu Sporny, Gregg Kellogg,
   Niklas Lindström, David I. Lehn
Audio:
   http://json-ld.org/minutes/2012-11-27/audio.ogg

François Daoust is scribing.
Markus Lanthaler: We should discuss
   https://github.com/json-ld/json-ld.org/issues/182 first, there
   was some discussion about it on last week's RDF WG telecon.

Topic: ISSUE-182: Dataset vs. Graph

Markus Lanthaler:  I gave a quick update on JSON-LD during last
   week's RDF telecon
Gregg Kellogg: http://www.w3.org/2011/rdf-wg/track/issues/105
   … and we came across issue 105 about dataset syntaxes vs.
   graph syntaxes
   … the issue is that if we dereference a URI and get a graph,
   it wouldn't be the same as getting a dataset even if the data is
   the same.
   … One solution is that we put in JSON-LD spec that we treat
   the data in the default graph in the JSON-LD Dataset as the graph
   in a usual graph-based serialization.
Manu Sporny:  That seems a reasonable way to address the issue.
   … What you're saying is that the RDF WG wouldn't have a
   problem with that?
Markus Lanthaler:  The RDF WG does not say anything about these
   semantics.
   … Richard made the comment that this could be generalized. The
   idea would be that we come up with a proposal and push it to the
   RDF WG. They don't have a lot of interest in the issue otherwise
   and might just close it.
Gregg Kellogg:  I wonder if default graph is really the right
   choice.
   … Let's say that you have a datasource in Turtle that
   describes a book.
   … It would be natural to put the metadata about the book in
   the default graph in JSON-LD.
   … and you could put the description of the book in the named
   graph whose name is the location of the document.
   … If the @graph keyword is used, then perhaps it makes things
   more explicit.
Manu Sporny:  It seems that it would be more of a Best Practice
   thing.
   … Does not seem to require any MUST or SHOULD.
   … It seems that what Markus is proposing is easier.
Gregg Kellogg:  I think it's less a JSON-LD issue than a dataset
   issue.
   … If you use a dataset in a graph, you could use all the data,
   and it's not wrong. You'd have more data.
Markus Lanthaler:  That was also discussed during the telecon, but that
   would mean that you could not generically do content-negotiation
   with JSON-LD because it would be up to the application to decide
   where it puts the information.
   … By default, if you put all info in a named graph that has
   the same URI, you end up with sort of two default graphs, which
   sounds weird.
Gregg Kellogg:  that would be two named graphs.
   … Question is what does the receiver do with the data it
   receives?
Markus Lanthaler:  problem is that there would be no way to
   interpret the data in a generic way.
Gregg Kellogg:  we just need to describe what the behavior should
   be on the client side.
   … I could see an argument for just flattening, basically just
   stripping off the named graphs.
Manu Sporny:  concerned about data loss, meaning references to
   the named graph.
   … We wouldn't know where the triples originate from.
Gregg Kellogg:  the only other solution would be to reify it. No,
   thank you.
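
Illustration (not from the call): a minimal sketch of the data-loss
concern about flattening. The Quad type, the example IRIs and the
flatten() helper are all hypothetical; the default graph is modelled
as a graph name of None.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    subject: str
    predicate: str
    object: str
    graph: Optional[str]  # None stands for the default graph (assumption)


def flatten(quads):
    """Strip the graph names, keeping only the distinct triples."""
    return {(q.subject, q.predicate, q.object) for q in quads}


# Hypothetical dataset: one triple in a named graph, one in the default graph.
dataset = [
    Quad("http://example.org/book", "dc:title", "Moby-Dick",
         "http://example.org/doc"),
    Quad("http://example.org/doc", "dc:creator", "http://example.org/alice",
         None),
]

triples = flatten(dataset)
```

After flattening, nothing in the result records which triple came
from the named graph, which is the loss being discussed.
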
Manu Sporny:  most natural thing would be to use the default
   graph. If the server is mixing and matching datasets and graphs,
   the lowest denominator should be used, which means the default
   graph.
[discussion about PaySwarm implementation]
Gregg Kellogg:  it seems to me that there is a trend towards
   supporting named graphs.
   … I can certainly see that happening. I think it would be
   natural to do things. Signing information is a useful use case.
   … The source of the important info is likely to be in a named
   graph unless we add more semantics to the default graph.
Manu Sporny:  in the use case where there is signature, the
   "default" graph is effectively going to be named.
Gregg Kellogg:  yes, and the name could be the URI of the
   document.
Manu Sporny:  In PaySwarm, we actually don't use named graphs yet
   because RDFa doesn't support them yet. We talk about the
   signature on the graph as another set of triples, which is a bit
   awkward, but it works.
Gregg Kellogg:  We could support some of it in the RDF conversion
   algorithms. One of Robin's comments is about calling only one
   callback. We could do some magic there if we have that.
   … I think we really want to push JSON-LD to the mainstream of
   RDF, not to the fringe.
Niklas Lindström: +1, this is the crucial part
Gregg Kellogg:  It's not just JSON-LD. For JSON-LD, a document is
   generally limited in size, but with quads, it can be gigabytes,
   and you cannot wait until you have ingested the whole thing
   before asserting things.
[scribe missed some of Gregg's comments]
Niklas Lindström:  It would be good if we could formulate some
   concrete suggestion to the RDF WG.
   … For one, if I understood correctly, the concept of datasets
   within RDF 1.1 does not allow nesting datasets.
Gregg Kellogg:  correct.
Gregg Kellogg: Basically, the argument is that if expecting a
   graph, a consumer should extract the graph with the name
   equivalent to the location.
Gregg Kellogg: … We can change the to/from RDF algorithm to take
   a JSON-LD document with only a default graph and output it using
   a name based on the location.
Niklas Lindström:  To me, it's a clear indication that the grouping
   of triples is outside the notion of graphs. It's just a way
   to group sets. There should be no semantics between the set of
   triples and the groups that contain these triples.
   … The union of triples should be treated the same way as if
   they were together.
   … If we make a difference, it's http-range-14 times 10.
Manu Sporny:  So I'm having a hard time finding the difference
   between your two views. Could you formulate something?
Gregg Kellogg:  I pasted my proposal on IRC: "Basically, the
   argument is that if expecting a graph, a consumer should extract
   the graph with the name equivalent to the location."
Manu Sporny:  How does that translate to JSON-LD?
   Content-negotiating between Turtle and JSON-LD, what would the
   resulting JSON-LD graph contain?
Gregg Kellogg:  with my proposal, if you have a named graph, you
   use that, otherwise you use the default graph.
Manu Sporny:  How does that affect the JSON-LD document?
Markus Lanthaler: My proposal would be to say that you can use
   JSON-LD as a graph source. The consumer would just use the
   default graph in that case
Gregg Kellogg:  It doesn't. If we're returning quads in JSON-LD
   with no name, the intent is clear. If the name is the
   same-document relative URI, then that's the same thing.
Markus Lanthaler: The problem is (as I've found out last week)
   that graphs can be treated as logical expressions, but not
   datasets
Markus Lanthaler: see:
   http://www.w3.org/2011/rdf-wg/meeting/2012-11-21#line0244
Gregg Kellogg:  It does not have implications on the JSON-LD
   syntax.
Manu Sporny:  I guess I'm unclear about the differences between
   what you're proposing and what Markus is proposing.
   … It seems that your proposals are parallel. Neither of them
   requires us to change JSON-LD at all.
Markus Lanthaler:  If I understood Gregg correctly, there would
   be no default graph when turning to RDF
Gregg Kellogg:  when turning to RDF, that's correct. It would
   return quads that are named according to the document location.
   This would address the use case where the default graph is used
   to provide provenance information.
Markus Lanthaler:  You prevent another use case. You cannot put
   anything in the default graph.
Gregg Kellogg:  No, you can! In JSON-LD, you can have an empty
   named graph: @graph with an empty object as a value. It doesn't
   put any triples in the graph.
Markus Lanthaler:  you would put the data in the named graph if
   there is no such named graph in the first place?
Gregg Kellogg:  yes.
Markus Lanthaler:  I don't really like that. It means your data
   moves if you later decide to change the graph and add such a
   named graph.
Niklas Lindström:  I think the problem here is that the notion of
   graph is the domain of the keeper of information. In Gregg's
   example, you have a URI for the document, and you return a
   dataset with assertions with a named graph that uses that URI.
   From a consumer perspective, you would want to put provenance
   information in your default graph. There is a clash of two
   worlds: a conflict between default graph and source of each graph.
Gregg Kellogg:  we could just say that provenance information
   should not be written in the default graph.
   … That would allow us to use the default graph as now.
   … We have examples that might be worth re-writing, in
   particular when we talk about signing information.
   … Chicken-and-egg situation as a named graph needs to be
   included in the default graph in JSON-LD
Manu Sporny:  I suggest we push the issue off to the issue
   tracker. Niklas, Gregg, Markus, please put some proposals there.
Niklas Lindström:  named graph with provenance data. I have
   minted special URIs for Atom entries. Sort of similar to distinct
   named graph with provenance information as Gregg suggests.
   … There may be something substantially useful there.
Manu Sporny:  OK, let's see concrete proposals and get back to it
   next week.
Niklas Lindström:
   http://www.w3.org/2011/rdf-wg/meeting/2012-11-21#line0285
Niklas Lindström: Sandro: "you can treat this as a graph source,
   if you want, and when you do, you get the default graph"
Niklas Lindström:  Sandro said something that looks like Markus'
   proposal.
Gregg Kellogg:  yes, but we need to think through the provenance
   issues.
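
Illustration (not from the call): a minimal sketch contrasting the
two consumer behaviours discussed under this topic, for a consumer
that expects a graph but receives a dataset. The Quad type, helper
names and example IRIs are hypothetical, the default graph is
modelled as a graph name of None, and a named graph with no quads is
treated as absent.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    subject: str
    predicate: str
    object: str
    graph: Optional[str]  # None stands for the default graph (assumption)


def graph_of(quads, name):
    """Return the triples of the graph with the given name."""
    return {(q.subject, q.predicate, q.object) for q in quads
            if q.graph == name}


def default_graph_strategy(quads, location):
    """Markus' suggestion: a graph consumer reads the default graph."""
    return graph_of(quads, None)


def location_graph_strategy(quads, location):
    """Gregg's suggestion: prefer the graph named after the document
    location, and fall back to the default graph otherwise."""
    named = graph_of(quads, location)
    return named if named else graph_of(quads, None)


LOCATION = "http://example.org/doc"  # hypothetical document location
provenance = Quad(LOCATION, "dc:creator", "http://example.org/alice", None)
content = Quad("http://example.org/book", "dc:title", "Moby-Dick", LOCATION)

with_named_graph = [provenance, content]
default_only = [provenance]
```

With default_only both strategies return the same triples. Once a
graph named after the location is added, location_graph_strategy()
switches to it, which is the "your data moves" concern Markus raised.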

Topic: ISSUE-113: Define exactly how (IRI) compaction is supposed to work

Manu Sporny: https://github.com/json-ld/json-ld.org/issues/113
Manu Sporny:  two proposals on the table with concerns from
   Markus that we may be missing the point.
   … This is the whole term-ranking discussion. Markus proposes
   updates to the algorithm. Gregg and Dave thought it would just be
   different, not necessarily better.
Manu Sporny: PROPOSAL 1: Clarify parts of the IRI compaction
   algorithm that need to change, but do not change the algorithm in
   any large way as it works and has been implemented by two
   different people.
Manu Sporny: PROPOSAL 2: Adopt Markus' proposed algorithm above
   for the IRI compaction algorithm.
Manu Sporny:  It seems the first proposal has the most support.
   … I guess Markus' point is that clarification is not enough.
Markus Lanthaler:  It's not clear to me what this proposal means.
   It's too abstract for me.
Manu Sporny:  The main thing that proposal is trying to convey is
   that the algorithm is the one that is in the spec. So it's about
   clarifying the parts that are not clear.
Gregg Kellogg:  This also intersects with possible changes we
   need to make to deal with property generators.
   … and language maps. It's possible that the term ranking
   algorithm may need to be revisited in light of these. If it does,
   it could be good to improve it if we can.
Manu Sporny:  If we work on it heavily, it could modify a number
   of test cases.
Gregg Kellogg:  It's easy to find test cases that will be more
   appropriately dealt with by a given algorithm, but that's not the
   point of test cases which should test the actual algorithm that
   is in the spec.
   … If you're abusing term ranking with lists, I guess we
   should make things much simpler in such cases.
Markus Lanthaler:  but we never discussed that. It says something
   in between.
Manu Sporny:  The best way to solve it might be to re-write the
   algorithm. If it addresses the compaction issues, I don't really
   care what it looks like. It needs to be simple and do the job.
   Someone just needs to do it.
Markus Lanthaler:  I don't care if it's my algorithm but I do
   care what the output of the algorithm is. That's why I would like
   to decide what the desired output is.
Gregg Kellogg:  I think it's clear for everything but lists.
   … It's really when you get to what is the best term to use for
   a list that things get tricky.
   … I can certainly see that I might want to select a term to
   express that list. When you have a list with different languages,
   it's a bit nonsensical.
Niklas Lindström:  The only applicable term with mixed content
   should be the one that has no type and language. You can't split
   the list. That's the simplest solution to me.
   … If it's a mixed list, you must treat that data with lots
   of inline knowledge in your code.
Gregg Kellogg:  That would alter the algorithm as it is written
   now to reject a term [scribe missed exact change, it's kind of
   hard to scribe algorithms expressed orally ;)]
Niklas Lindström:  The only case where I used mixed lists was to
   report errors. I have to pick up the specific details of that, so
   no coercion.
Manu Sporny:  Going back, I think we have agreement on how this
   should work. Someone needs to sit down and re-write the algorithm.
   Whoever does it first and implements it wins :)
   … I'm fine with Markus re-writing the algorithm if he takes
   other people's comments into account.
Gregg Kellogg:  This should be the final version.
Manu Sporny:  Right, it should include everything.

PROPOSAL:  When compacting lists, the most specific term that
   matches all of the elements in the list, taking into account the
   default language, must be selected.

Gregg Kellogg: +1
Manu Sporny: +1
François Daoust: +1
Niklas Lindström: +1
Markus Lanthaler: +1

RESOLUTION: When compacting lists, the most specific term that
   matches all of the elements in the list, taking into account the
   default language, must be selected.
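
Illustration (not from the call): a much simplified sketch of the
selection rule in this resolution. It is not the algorithm from the
spec. The Term type, the INHERIT marker and the two-level specificity
score are assumptions: a term with a matching type or language
coercion counts as specific, a term with neither counts as generic
and applies to any list, and a term that declares no language of its
own is assumed to inherit the default language.

```python
from dataclasses import dataclass
from typing import Optional

INHERIT = object()  # hypothetical marker: the term declares no language


@dataclass(frozen=True)
class Term:
    name: str
    type: Optional[str] = None
    language: object = INHERIT  # None means an explicit "no language"


def coercion(term, default_language):
    """Return the (kind, value) a term coerces to, or None if generic."""
    if term.type is not None:
        return ("type", term.type)
    language = default_language if term.language is INHERIT else term.language
    return ("language", language) if language is not None else None


def signature(value):
    """Return the (kind, value) carried by one expanded list element."""
    if "@type" in value:
        return ("type", value["@type"])
    if "@language" in value:
        return ("language", value["@language"])
    return None


def select_list_term(terms, elements, default_language=None):
    """Pick the most specific term that matches every list element."""
    best = None
    for term in terms:
        wanted = coercion(term, default_language)
        if wanted is None:
            specificity = 0  # generic term: usable for any list
        elif all(signature(e) == wanted for e in elements):
            specificity = 1  # coercion matches all elements
        else:
            continue  # coercion fails for at least one element
        if best is None or specificity > best[0]:
            best = (specificity, term)
    return best[1].name if best else None


terms = [Term("label"), Term("label_de", language="de"),
         Term("raw", language=None), Term("date", type="xsd:date")]
en = {"@value": "hello", "@language": "en"}
de = {"@value": "hallo", "@language": "de"}

same_as_default = select_list_term(terms, [en, en], default_language="en")
all_german = select_list_term(terms, [de, de], default_language="en")
mixed = select_list_term(terms, [en, de], default_language="en")
```

Under these assumptions the mixed list can only use the term with no
type and no language, as Niklas suggested.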

Manu Sporny:  do we need to do anything else to address this
   issue here?
   … OK, moving on, then.

Topic: ISSUE-172: Should each member in a list contribute to term rank?

Manu Sporny: https://github.com/json-ld/json-ld.org/issues/172
Manu Sporny:  Basically, that's what we just discussed. The
   answer is "yes" but not quite straightforward. Each member in the
   list is checked and the most specific term that matches all the
   elements in the list is taken.

Topic: ISSUE-200: JSON-LD API Review by Robin Berjon

Manu Sporny: https://github.com/json-ld/json-ld.org/issues/200
Manu Sporny:  Review by Robin Berjon.
   … Ivan felt that it would be good to have an API review by
   someone that has a lot of experience with WebIDL and Javascript
   APIs.
   … I see that Markus has already responded.
Gregg Kellogg:  I certainly think we should talk about the use of
   IRI vs. URL.
Manu Sporny:  Robin suggests we use URL instead of IRI, even
   though IRI is more correct.
Gregg Kellogg:  HTML5 modifies what URL means, at least last time
   I checked, and we put some provision in RDFa I think about that.
Manu Sporny:  The plan is to update the URL spec to absorb the
   IRI spec, but I'm not positive about that.
François Daoust:  One thing that wasn't said - we said we're
   using URL to mean IRI. [scribe assist by Manu Sporny]
David I. Lehn: can i vote for URI? :)
Gregg Kellogg:
   http://dev.w3.org/html5/spec/single-page.html#resolving-urls
Gregg Kellogg:  maybe URI, as it's more commonly understood than
   IRI. We could use URI and say that we conform to IRI spec.
Niklas Lindström:  The problem is that, technically, URI and IRI
   are not the same thing. I think we should stick to IRI until
   someone is really pushing for the change.
Manu Sporny:  Agree, let's move on.
[Manu going over Robin's comments]
Manu Sporny:  changing JSON Object to reference JSON spec?
Markus Lanthaler:  yes, much clearer in Syntax spec.
[discussion on NoInterfaceObject on JsonLdProcessor]
Manu Sporny:  I'm going to push back on that.
   … that's how JSON works. JSON.parse, JSON.stringify.
   … that's probably what we want to follow.
Markus Lanthaler:  You could have a private constructor.
Manu Sporny:  we might want to ask the whatwg channel. I'm not
   convinced that constructors are the right way to go. That's what
   I did previously and received a lot of pushback.
Manu Sporny:  ref. asynchronous/synchronous. We could say that
   this is an asynchronous API but that implementations in other
   languages may use a synchronous version.
   … I don't think that adding a synchronous API buys us a lot of
   things.
Niklas Lindström:  Do we need to rephrase the note to say that it
   only applies when you don't want to implement the API but want
   to follow the gist of it?
Manu Sporny:  Yes, we should clarify the wording. I also think we
   should not specify a synchronous API and we should also not claim
   that the API is the only way to implement the algorithms.
Markus Lanthaler: I think the spec is quite clear on this:
   http://json-ld.org/spec/latest/json-ld-api/#jsonldprocessor
Manu Sporny:  ref. error constants, that's true, something we
   never have had time to review so far.
Manu Sporny:  ref. losing information, I'm pretty sure that's what
   we're doing.
Gregg Kellogg:  we lose information for terms that are not
   defined.
Markus Lanthaler:  we still have a constant that is not used
   anywhere. That may have triggered the comment.
   … "lossy compaction", let's remove that.
Manu Sporny:  re. modification in place, it's true. We should
   probably be modifying a copy of the provided input.
Markus Lanthaler:  yes.
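
Illustration (not from the call): a minimal sketch of working on a
copy of the provided input instead of modifying it in place. The
process() function is a hypothetical stand-in for an API call and
does no real JSON-LD processing.

```python
import copy


def process(document):
    """Hypothetical API call that mutates only a private deep copy."""
    working = copy.deepcopy(document)
    working["@context"].clear()  # nested mutation stays local to the copy
    working.pop("@context")
    working["processed"] = True
    return working


original = {
    "@context": {"name": "http://xmlns.com/foaf/0.1/name"},
    "name": "Manu",
}
result = process(original)
```

A deep copy is used here because a shallow copy would still share
the nested @context object with the caller's input.
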
Manu Sporny:  re. "string" and "number" in WebIDL. OK, we'll have
   a look at WebIDL for numbers.
Manu Sporny:  re. toRDF designed wrong, true for the final call.
   We wanted to provide feedback about how many triples have been
   generated. I'm afraid that if we call back with an array of
   quads, that would make a lot of data. That said, we'll need to
   keep that data in memory, so that memory is needed anyway. Does
   anyone have a feeling about one callback total vs. one callback
   per quad?
Markus Lanthaler:  It's much easier to pass all the quads at
   once.
Gregg Kellogg:  Agree.
Niklas Lindström:  any way to say that it's an enumerable of any
   kind in WebIDL?
Manu Sporny:  I don't think so.

PROPOSAL:  The callback signature for the .toRDF() method should
   accept Quad[]. That is, the callback is called once after all
   processing has been completed.

Gregg Kellogg: +1
Manu Sporny: +1
François Daoust: +1
Niklas Lindström: +1
Markus Lanthaler: +1
David I. Lehn: +0

RESOLUTION: The callback signature for the .toRDF() method should
   accept Quad[]. That is, the callback is called once after all
   processing has been completed.
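
Illustration (not from the call): a minimal sketch of the delivery
pattern in this resolution, where the callback is invoked once with
all quads rather than once per quad. The Quad type, generate_quads()
and the single-argument callback are hypothetical and are not the
signatures from the API spec.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    subject: str
    predicate: str
    object: str
    graph: Optional[str]  # None stands for the default graph (assumption)


def generate_quads(graphs):
    """Hypothetical stand-in for the conversion: yields quads lazily."""
    for graph_name, triples in graphs.items():
        for s, p, o in triples:
            yield Quad(s, p, o, graph_name)


def to_rdf(graphs, callback):
    """Collect every quad first, then invoke the callback exactly once."""
    quads = list(generate_quads(graphs))  # whole result is held in memory
    callback(quads)


calls = []
to_rdf(
    {
        None: [("ex:doc", "dc:creator", "ex:alice")],
        "ex:g": [("ex:book", "dc:title", "Moby-Dick"),
                 ("ex:book", "dc:creator", "ex:melville")],
    },
    calls.append,
)
```

Here calls ends up holding a single list of three quads, which also
gives the caller the total count without a separate final call.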

Markus Lanthaler:  One quick question about the error handler that
   Dave was to work on?
Manu Sporny:  no news up until the end of the year, I think.
   Maybe we should simplify that. Markus, is that what you would
   suggest?
Markus Lanthaler:  yes.
Manu Sporny:  feel free to do that and let's see what it looks
   like after that. If fixing the data really ends up being
   necessary, we can always improve that later on, but I would
   expect people to lint the data before they pass it on to the
   processor.
[Call adjourned]

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: HTML5 and RDFa 1.1
http://manu.sporny.org/2012/html5-and-rdfa/
Received on Thursday, 6 December 2012 01:21:55 UTC