Re: Trailing content in JSON-LD

From: Nate Otto <nate@ottonomy.net>
Date: Sun, 23 Aug 2015 11:28:14 -0700
Message-ID: <CAPk0ug=uR3VkUuoXOxV53Tg1QJoCKCioLngqLbVktJZ3pLxJmg@mail.gmail.com>
To: Linked JSON <public-linked-json@w3.org>
As someone writing software to make requests for JSON-LD documents and
parse them, I am really not interested in trying to figure out how to parse
the input you provided. There could be thousands of different ways a
response could contain a fragment that is a valid JSON-LD document and
_other stuff_, and I don't want to support one case of _other stuff_,
because there's no end to the possibilities of what else I might find. If I
request 'application/ld+json', I don't want anything other than the
document that I can send straight to my JSON parser.
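A minimal sketch of that strict behaviour, assuming Ruby's standard json library. The helper name parse_strict is hypothetical and not from any implementation mentioned in this thread; it only illustrates a client that hands the body to the parser unchanged and rejects anything that is not a single JSON document.

```ruby
require 'json'

# Hypothetical helper: parse a response body with no clean-up step.
# Returns the parsed value, or nil when the body is not one JSON document.
def parse_strict(body)
  JSON.parse(body)
rescue JSON::ParserError
  nil
end

# A clean body parses; a body with trailing text is rejected outright.
doc = parse_strict('{"@id": "http://example.org/a"}')
rejected = parse_strict('{"@id": "http://example.org/a"} trailing text')
```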


On Sun, Aug 23, 2015 at 11:00 AM, Andy Seaborne <andy@seaborne.org> wrote:

> On 23/08/15 18:34, Gregg Kellogg wrote:
>
>> However, as a practical matter, JSON may be included in an HTML script
>> tag, which could conceivably be in CDATA. Sometimes other non-JSON content
>> (such as a // comment) is also found. Because these are seen in the wild,
>> my reader removes everything preceding “{” or “[” and everything trailing
>> “}” or “]” to look for a valid JSON document. The specific substitution
>> pattern I use is the following:
>>
>> input.to_s.sub(%r(\A[^{\[]*)m, '').sub(%r([^}\]]*\Z)m, '')
>>
>> While this is technically invalid IMO, practically speaking not eating
>> such garbage will break real-world usage (perhaps mostly in schema.org
>> examples). I could see generating an error if this is seen when validating,
>> but otherwise I’m inclined to eat such garbage in my implementation.
>
> Agreed, embedding is important.
>
> I think it's better to talk about an extraction step to identify the
> content before invoking the JSON(-LD) specs.  There may be escaping or
> encoding issues from the enclosing content.  I see chopping junk as an
> example of such a step.
>
>         Andy
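
The extraction step described in the quoted messages could be sketched as a separate function that runs before the JSON parser. This is an illustration only, assuming Ruby's standard json library: it reuses the substitution pattern quoted in the thread, while the names extract_json and load_embedded are hypothetical and not taken from any actual reader.

```ruby
require 'json'

# Hypothetical extraction step: drop everything before the first "{" or "["
# and everything after the last "}" or "]", using the pattern from the thread.
def extract_json(input)
  input.to_s.sub(%r(\A[^{\[]*)m, '').sub(%r([^}\]]*\Z)m, '')
end

# Hypothetical wrapper: extract first, then hand the result to the parser.
def load_embedded(input)
  JSON.parse(extract_json(input))
end

# Example input with a leading comment line and trailing non-JSON text.
wrapped = "// schema.org-style example\n" \
          "{\"@id\": \"http://example.org/a\"}\n" \
          "-- end of example\n"
doc = load_embedded(wrapped)
```

Keeping extraction separate from parsing means a validator could skip extract_json and report the surrounding text as an error, while a lenient reader calls load_embedded. Note that this pattern only trims text that contains no brackets or braces of its own; a wrapper such as a CDATA marker, which includes "[" and "]", would not be removed cleanly and would need its own handling.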
Received on Sunday, 23 August 2015 18:28:42 UTC
