Re: [JSON] new yellow box, proposed solution from Sandro Hawke on 2011-03-23 (public-rdf-wg@w3.org from March 2011)

From: Sandro Hawke <sandro@w3.org>
Date: Tue, 22 Mar 2011 20:13:29 -0400
To: Eric Prud'hommeaux <eric@w3.org>
Cc: Andy Seaborne <andy.seaborne@epimorphics.com>, boaz <boaz@bocoup.com>, RDF Working Group <public-rdf-wg@w3.org>
Message-ID: <1300839209.3138.888.camel@waldron>
On Tue, 2011-03-22 at 18:39 -0400, Eric Prud'hommeaux wrote:
> Cc += boaz <boaz@bocoup.com> who will hopefully validate or refute my
> assertions about JS/JSON.
> 
> Boaz, as you're not a member of the WG, your replies will only be
> visible to the WG in my replies to your replies.
> 
> * Sandro Hawke <sandro@w3.org> [2011-03-22 12:26-0400]
> > I think we might be able to get away without the second table.
> > 
> > After the JSON task force meeting yesterday, it seemed to me the main
> > opportunity for standards in the second table can also fit on the
> > first one, because of the level 7 convergence.   That is, 7A is
> > publishers with RDF and consumers who don't want anything to do with
> > RDF.   I made this a yellow box on the first table.   I'm a bit fuzzy
> > on some boxes in the second table, but I'm not seeing anything not
> > addressed in the first one, at the moment.
> > 
> > http://www.w3.org/2011/rdf-wg/wiki/JSON_User_Segments
> > 
> > Thinking about this yellow box (7A), I guess the Linked Data API is
> > aimed at this space.  So is Steve Harris' "CONSTRUCT JSON" idea for
> > SPARQL.
> > 
> > Thinking about it yesterday, I came up with another approach, which I'll
> > explain now, while I'm thinking about it.  Not sure how relevant it is
> > to this WG.   The approach is based on the idea that we could address
> > these folks with SPARQL 1.1, just by defining a "simplified" json
> > results format.   Something like this:
> > 
> > Example Data in Turtle:
> >     _:x foaf:name "Cassia"; foaf:age 7 .
> >     _:y foaf:name "Aubrey"; foaf:age 8 .
> > 
> > Query:
> >   SELECT ?name ?age WHERE { ?person foaf:name ?name; foaf:age ?age }
> > 
> > JSON result:
> > 
> > [ { "name": "Cassia", "age": 7 }, { "name": "Aubrey", "age": 8 } ]
> > 
> > In JSON, the lang tags, datatypes, and node type would be lost, but you
> > could get that information if you wanted it by using a different query,
> > using SPARQL 1.1's new select expressions [1].  For example, if we add
> > this triple, with a language tag on "Ivan" (so we know how to pronounce
> > it):
> > 
> > More Turtle:
> >    _:z foaf:name "Iván"@hu; foaf:age 9 .
> > 
> > New Query:
> >   SELECT ?name (lang(?name) AS ?namelang) ?age WHERE ...
> > 
> > giving us:
> > 
> > [ { name: "Cassia", namelang: "", age: 7 }, 
> >   { name: "Aubrey", namelang: "", age: 8 },
> >   { name: "Iván", namelang: "hu", age: 9 } ]
> 
> If you followed the precedent that unbounds are simply not mentioned,
> you'd get:
> 
>   [ { name: "Cassia",               age: 7 }, 
>     { name: "Aubrey",               age: 8 },
>     { name: "Iván", namelang: "hu", age: 9 } ]

I wondered about that, but I looked it up in your spec [1] :-) and found
it says lang() returns "" if there is no language tag, not unbound.  

I don't think javascript users care much about "" vs absent entry, but I
wanted to stick to 100% pure SPARQL, not change the semantics.

That is, I want to be able to code this as a wrapper around a SPARQL
endpoint, where the wrapper doesn't need to know SPARQL at all.

[1] http://www.w3.org/TR/rdf-sparql-query/#func-lang
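
A minimal sketch of such a wrapper, in Python, assuming the endpoint returns
the standard SPARQL JSON results layout ("head" plus "results"/"bindings").
The names simplify_term and simplify_results are hypothetical, and the choice
of which XSD datatypes map to native numbers is an assumption, not something
fixed by this thread. The wrapper never parses the query; it only reshapes
the results, so lang() still yields "" exactly as the endpoint reported it.

```python
# Sketch only: reshape standard SPARQL JSON results into plain JSON rows.
# simplify_term / simplify_results are hypothetical names for illustration.
XSD = "http://www.w3.org/2001/XMLSchema#"
INTEGER_TYPES = {XSD + t for t in ("integer", "int", "long", "short", "byte")}
FLOAT_TYPES = {XSD + t for t in ("float", "double")}


def simplify_term(term):
    """Turn one binding ({"type": ..., "value": ...}) into a plain JSON value."""
    value = term["value"]
    if term["type"] == "bnode":
        return "_:" + value           # bnodes share the id slot with URIs
    datatype = term.get("datatype")   # present on typed literals only
    if datatype in INTEGER_TYPES:
        return int(value)
    if datatype in FLOAT_TYPES:
        return float(value)
    return value                      # URIs, plain and language-tagged literals


def simplify_results(results):
    """Map a SPARQL JSON results document to a list of flat dicts."""
    return [
        {var: simplify_term(term) for var, term in row.items()}
        for row in results["results"]["bindings"]
    ]


# Results for: SELECT ?name (lang(?name) AS ?namelang) ?age WHERE ...
standard = {
    "head": {"vars": ["name", "namelang", "age"]},
    "results": {"bindings": [
        {"name": {"type": "literal", "value": "Cassia"},
         "namelang": {"type": "literal", "value": ""},
         "age": {"type": "literal", "datatype": XSD + "integer", "value": "7"}},
        {"name": {"type": "literal", "xml:lang": "hu", "value": "Iván"},
         "namelang": {"type": "literal", "value": "hu"},
         "age": {"type": "literal", "datatype": XSD + "integer", "value": "9"}},
    ]},
}

simple = simplify_results(standard)
```

Variables that are unbound in a row are simply absent from that row's
binding object in the standard format, so they stay absent here too.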

> 
> > This puts all the RDF-knowledge into the SPARQL query, and keeps the
> > json RDF-free (so it's okay for Group A).   Some URLs could be set up
> > with fixed or parameterized SPARQL queries.  I guess this is pretty
> > close to the Linked Data API.   (Looking through that spec, I don't see
> > how things like language tags are addressed.  Ah, there's an open issue
> > on it, I think.)
> > 
> > So.   Pretty simple solution for the yellow box folks; the only standard
> > required is a very, very simple new SPARQL results format.  Well, and
> > maybe some of the other LDAPI stuff.  :-)
> 
> We're relying on JSON-native representations of RDF atoms. That is,
> you don't serialize "7"^^xsd:integer as "7", but as 7. This strikes me
> as perfectly sensible. (Guessing at JS a bit) this would exactly
> translate the primitive numeric datatypes integer, float and double,
> and encompass the restricted subtypes like byte, short int, long¹
> without capturing the exact restriction. This captures most of the
> "data-only" queries, which has the appeal of keeping the simple stuff
> simple.
> 
> For non-native types, the current:

Current?   Oh, you mean the current SPARQL JSON Results Format:
http://www.w3.org/TR/rdf-sparql-json-res/


>   "person":{"type":"uri", "value":"http://a.example/cassia"}
>   "person":{"type":"bnode", "value":"cassia"}
>   "name":{"type":"literal", "xml:lang":"hu", "value":"Iván"}

(Note that I didn't SELECT ?person, but yes I guess your example is
right if we add that)

> syntax would break the pattern of where the average JSON user could
> look for the lexical value. 

Right.  I think people want to write more obvious code, like they could
with my proposal.   And, mostly, this is for GROUP A -- people who don't
want anything to do with RDF.   If they see type:"bnode" in the json, I
don't think they'll be happy.

> OTOH, your enhanced query approach:
> 
>   "personURI":"http://a.example/cassia"
>   "personBNode":"cassia"
>   "name":"Iván", "nameLang":"hu"
> 
> allows the consumer to be easily fooled. (E.G. when the object of
> dc:author is a BNode, I might report that label as the author's name.)
> The other prob is that user has to account for all of these types in
> their query, e.g. to dump a graph:
>   SELECT ?sBNode ?sURI ?pURI ?oURI ?oBNode ?oLexical ?oDT ?oLangTag…
> (and three more to dump named graphs).

No, I'd just smoosh bnodes and URIs together.   Most clients can treat
them as opaque ids, and if they need to distinguish, it's trivial to
look for the leading "_:" which I'd put there.  

I didn't really want to get into bnodes.  There are several approaches,
and I'd rather just have someone upstream skolemize them, but if we have
to support them we can do it.  Maybe use uuids so folks can merge
results downstream without collisions.   So, revisiting my example, but
with ids: 

Query:
  SELECT ?id ?name (lang(?name) AS ?namelang) ?age 
  WHERE { ?id foaf:name ?name; foaf:age ?age }

Result:

[ { "id": "_:u23ae1a8e-ce0e-4807-800b-963c1adf05ca",
    "name": "Cassia", "namelang":"", "age": 7 }, 
  { "id": "_:u358e1769-62d6-421e-a224-bb28994938fd", 
    "name": "Aubrey", "namelang":"", "age": 8 },
  { "id": "_:u5cb0a70f-3ae9-4769-8bed-19fa425c6e35",
     "name": "Iván", "namelang": "hu", "age": 9 } ]

Alternatively, we could use urn:uuid: to make it more clear where we're
getting these labels from. 
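
A rough sketch of the uuid relabeling, in Python. The helper names
make_bnode_relabeler and is_bnode_id are hypothetical, and the "_:u" prefix
just mirrors the ids in the example result. Passing "urn:uuid:" as the prefix
gives the alternative form.

```python
import uuid


def make_bnode_relabeler(prefix="_:u"):
    """Return a function mapping local bnode labels to globally unique ids.

    Use one relabeler per source, so the same local label always maps to
    the same id within that source but never collides with other sources.
    """
    seen = {}

    def relabel(label):
        if label not in seen:
            seen[label] = prefix + str(uuid.uuid4())
        return seen[label]

    return relabel


def is_bnode_id(identifier):
    """Clients that care can tell bnodes from URIs by the leading "_:"."""
    return identifier.startswith("_:")


relabel = make_bnode_relabeler()
cassia_id = relabel("x")
aubrey_id = relabel("y")
```

Note that with the "urn:uuid:" prefix the ids are ordinary URIs, so the
leading "_:" check no longer distinguishes them.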
 
> There's also the ASP.NET precedent for serializing e.g. Dates as
>   "birthdate":"/Date(12341232212)/"
> which could use ISO datetimes instead as the constructor for
> Date takes milliseconds since epoch and ISO datetimes. 

This is tricky.  I didn't know about the /Date(...)/ convention, and I
don't know how widely it's accepted.

Reading a bit more I see it's sort of actually \/Date(...)\/.   
http://weblogs.asp.net/bleroy/archive/2008/01/18/dates-and-json.aspx

If JSON.stringify(new Date()) had written something like that for me, 
I would have been happy, but no.   Maybe that's the wrong test....

Very, very messy.
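
To make the mess concrete, here is a small Python sketch of a consumer that
accepts both conventions. It assumes the number inside /Date(...)/ is
milliseconds since the Unix epoch in UTC, and that the alternative is an ISO
8601 string. The function name parse_json_date and the sample timestamp are
made up for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Matches /Date(123)/ and also the escaped form \/Date(123)\/.
ASPNET_DATE = re.compile(r"^\\?/Date\((-?\d+)\)\\?/$")


def parse_json_date(text):
    """Parse either an ASP.NET-style date string or an ISO 8601 datetime."""
    match = ASPNET_DATE.match(text)
    if match:
        # Assumption: the payload is milliseconds since the epoch, UTC.
        return EPOCH + timedelta(milliseconds=int(match.group(1)))
    if text.endswith("Z"):
        # Older Python versions do not accept a trailing "Z".
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


from_aspnet = parse_json_date("/Date(1300839209000)/")
from_iso = parse_json_date("2011-03-23T00:13:29.000Z")
```

Either way the consumer has to know in advance that a given string is meant
to be a date; nothing in JSON itself says so.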

    -- Sandro


> Is there a
> package for arbitrary precision decimals? Do we care about them?

Well, json.org doesn't list any limits to the sizes of numbers in the
syntax.  My FF3.5 seems to understand a string of digits as an integer
until it's too big, then as a double until that's too big, then as
infinity.   I don't think this is our problem to solve.
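
For comparison, a quick Python sketch of the same experiment. This shows
Python's standard json module, not any browser's behaviour, and the sample
numbers are arbitrary. The parser keeps integers exact, a consumer limited to
doubles loses precision past 2**53, and exact decimals are available if the
consumer opts in.

```python
import json
from decimal import Decimal

BIG = "9007199254740993"          # 2**53 + 1, not representable as a double

exact = json.loads(BIG)           # Python keeps JSON integers exact
as_double = float(exact)          # what a double-only consumer would hold
overflow = json.loads("1e999")    # beyond double range: becomes infinity
precise = json.loads("0.1", parse_float=Decimal)  # opt in to exact decimals
```

So the limits come from the consumer's number type rather than from the
JSON syntax.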

    -- Sandro


> ¹http://www.w3.org/TR/rdf-sparql-query/#operandDataTypes
> 
> 
> > Meanwhile, this *might* also address parts of the green box, but it
> > starts to get more complicated.  The green box is about mapping between
> > nice-json and RDF, and some kinds of those mappings can be defined by
> > these SPARQL queries.   While the mapping is expressed as a way to
> > extract JSON from RDF, its semantics are declarative, so it could be
> > "run backwards"; you can (with some queries) reconstruct the RDF.
> > 
> > What I'm imagining here is people publishing the SPARQL query that would
> > have been used to generate the given JSON from some RDF.  Given that,
> > you can map back to the RDF.   Does this work?   First check, looking at
> > the JSON that twitter uses for streaming, I see... no, because it's
> > nested; they have nested lists and objects within the values.  Hrm.
> > That could get complicated.
> > 
> > Anyway...    it's an idea.  
> > 
> >    -- Sandro
> > 
> > [1] http://www.w3.org/TR/sparql11-query/#select_expressions
> > 
> > 
> > On Sun, 2011-03-20 at 20:22 +0000, Andy Seaborne wrote:
> > > 
> > > On 20/03/11 17:16, Manu Sporny wrote:
> > > > Agenda
> > > >
> > > > 1. General discussion on what we're attempting to accomplish with
> > > >     the various communities and long-term (market segments)
> > > >     http://www.w3.org/2011/rdf-wg/wiki/JSON_User_Segments
> > > > 2. JSON as RDF Proposal
> > > >     http://lists.w3.org/Archives/Public/public-rdf-wg/2011Mar/0447.html
> > > 
> > > That's very much making JSON appear as RDF; JSON source, RDFish application.
> > > 
> > > This can be contrasted with RDF for JSON: making published RDF 
> > > > accessible to "normal" JSON applications, with varying degrees of 
> > > > "RDF-ness" in the JSON application.
> > > 
> > > PROPOSAL: The RDF Working Group JSON Task Force will work on a way of
> > > making published RDF accessible to JSON applications.
> > > 
> > > > Unlike a serialization of RDF in JSON, this may be lossy - i.e. when presented 
> > > to the application some details may be lost (e.g. some datatypes).
> > > 
> > > Drawn in a Sandro-matrix: with levels of data publishers:
> > > 
> > > level P1: RDF publisher willing to publish according to a fixed, 
> > > universal JSON presentation
> > > 
> > > level P2: RDF publisher willing to provide a JSON friendly form to all 
> > > applications; (i.e. one presentation, but specific to this data).
> > > 
> > > level P3: RDF publisher willing to provide a JSON friendly form based on 
> > > the application accessing the data (i.e. several presentations, based on 
> > > this data and accessing application)
> > > 
> > > > Group A1: applications willing to do whatever it takes to get 
> > > RDF-published data (inc. read Turtle)
> > > 
> > > Group A2: applications wanting a JSON data structure
> > > 
> > > Group A3: applications willing to use a library/API
> > > 
> > >  Andy
> > > 
> > > > 3. Express all RDF in JSON Proposal
> > > >     http://lists.w3.org/Archives/Public/public-rdf-wg/2011Mar/0450.html
> > > > 4. Addressing multiple, seemingly divergent communities
> > > >     * For example: Can we draw consensus by combining object-based
> > > >                    vs. triple-based formats into a single format?
> > > > 5. Review/Explanation/QA on proposed formats
> > > >     http://www.w3.org/2011/rdf-wg/wiki/TF-JSON#Inputs
> > > 
> > > 
> > 
> > 
> > 
>
Received on Wednesday, 23 March 2011 00:13:40 UTC