RE: Trailing content in JSON-LD

On 23 Aug 2015 at 12:15, Andy Seaborne wrote:
> I'm having trouble pinning down what the spec status is of this input
> (this is for an issue in jsonld-java).
> 
> Does the trailing content mean it is illegal JSON-LD or not or is it
> outside the spec altogether in some cases?
> 
> ----------------------
> {
>    "@id" : "http://example/s",
>    "http://example/p" : "str"
> }
> xxxxxxxxx
> ----------------------

Clearly illegal. It's not even valid JSON (unless the x's represent whitespace).


> The question is whether the whole input is the "JSON Document" or
> whether the trailing junk is considered to be outside the JSON Document.

Why should it be outside?


> In the first case, it is a parse error, and any output is undefined.
> In the second case, there would be triples and no parse error.
> 
> I currently think that the spec says this is illegal JSON-LD but the
> argument is convoluted and relies on the input coming from HTTP.  If it
> were some other source (a file with a non jsonld extension [tut, tut]),
> it is unstated.
> 
> The spec chase:
> 
> Section 8 =>
> 
> """
> A JSON-LD document MUST be a valid JSON document as described in [RFC4627].
> 
> A JSON-LD document MUST be a single node object or an array whose
> elements are each node objects at the top level.
> """
> 
> RFC4627 is the media type registration for JSON.
> 
> The definition link for "JSON-LD document" is descriptive:
> """
> A JSON-LD document serializes a generalized RDF Dataset
> [RDF11-CONCEPTS], which is a collection of graphs that comprises exactly
> one default graph and zero or more named graphs.
> """
> 
> so it does not say, to my reading, that the "JSON-LD document" includes
> or excludes the content after the "}".
> 
> RFC4627 talks about a "JSON text" when defining the media type.
> Because that is the whole of the HTTP body, I think it means that "JSON
> text" includes everything. Then "MUST be a single node object" applies
> => it's a parse error.
> 
> Proposed spec fix 1: If it said that """ A JSON-LD document MUST be a
> valid JSON *text* as described in [RFC4627]. """

Yep, that would have been clearer wording, but I think the combination of the two statements you quote from the JSON-LD spec above is unambiguous as well.


> then it would be clearer but still only applies if the media type can be
> invoked and sometimes it can't (e.g a stream of chars from a non-HTTP
> stream).

JSON is not really a streaming format. Both specs are based on the notion of "documents", so you would need to split the input stream at document boundaries before invoking any of the algorithms. How you do that is undefined.
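
To illustrate what "splitting at document boundaries" could look like, here is a minimal Python sketch. Neither spec defines this, so it is just one possible convention: it assumes the stream has already been read into a string and that consecutive JSON texts are separated only by optional whitespace. The function name split_json_texts is made up for this example. Note also that Python's json module is more lenient than RFC 4627 in places (it accepts scalars at the top level, for instance).

```python
import json

# The four whitespace characters allowed by the RFC 4627 grammar.
JSON_WS = " \t\n\r"

def split_json_texts(stream_text):
    """Split a string holding several concatenated JSON texts.

    Assumption (not from any spec): texts follow each other with
    only optional whitespace in between.
    """
    decoder = json.JSONDecoder()
    docs = []
    pos = 0
    n = len(stream_text)
    while True:
        # raw_decode() does not skip leading whitespace, so do it here.
        while pos < n and stream_text[pos] in JSON_WS:
            pos += 1
        if pos == n:
            break
        # Returns the parsed value and the index just past it;
        # raises json.JSONDecodeError if no JSON value starts at pos.
        value, pos = decoder.raw_decode(stream_text, pos)
        docs.append(value)
    return docs
```

With this convention, input like the example at the top of the thread fails at the first "x" with a json.JSONDecodeError, because no JSON value starts there.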


> A sentence in the grammar explicitly, making it a syntax issue, not a
> context issue, stating that no trailing content is permitted would cover
> all cases.

A JSON text (a single object or array) doesn't allow that either. Only whitespace may follow it [RFC4627]:

      JSON-text = object / array
      ...
      end-array       = ws %x5D ws  ; ] right square bracket
      end-object      = ws %x7D ws  ; } right curly bracket
      ...
      ws = *(
                %x20 /              ; Space
                %x09 /              ; Horizontal tab
                %x0A /              ; Line feed or New line
                %x0D                ; Carriage return
            )
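
As a rough illustration of that grammar rule (a sketch, not how jsonld-java or any other processor actually does it), the check below uses Python's standard json module. json.loads() already rejects anything other than whitespace after the first value. The extra isinstance test is added here because Python accepts scalars at the top level, whereas RFC 4627 restricts a JSON text to an object or array. The function name parse_json_text is hypothetical.

```python
import json

def parse_json_text(text):
    """Parse text as a single RFC 4627 style JSON text.

    Raises json.JSONDecodeError (a ValueError subclass) on trailing
    non-whitespace content, and ValueError if the top-level value is
    not an object or array.
    """
    # json.loads() parses one value, skips trailing space/tab/LF/CR,
    # and raises if any other character remains.
    value = json.loads(text)
    if not isinstance(value, (dict, list)):
        raise ValueError("a JSON text must be an object or an array")
    return value

good = '{\n   "@id" : "http://example/s",\n   "http://example/p" : "str"\n}\n'
bad = good + "xxxxxxxxx\n"

parse_json_text(good)          # trailing newline only: accepted
try:
    parse_json_text(bad)       # trailing junk: rejected
except json.JSONDecodeError as err:
    print("rejected:", err.msg)
```

So a parser that follows the grammar has to treat the input from the start of this thread as a syntax error, regardless of where the bytes came from.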


Cheers,
Markus



--
Markus Lanthaler
@markuslanthaler

Received on Sunday, 23 August 2015 19:58:10 UTC