- From: Nate Otto <nate@ottonomy.net>
- Date: Sun, 23 Aug 2015 11:28:14 -0700
- To: Linked JSON <public-linked-json@w3.org>
- Message-ID: <CAPk0ug=uR3VkUuoXOxV53Tg1QJoCKCioLngqLbVktJZ3pLxJmg@mail.gmail.com>
As someone writing software to make requests for JSON-LD documents and parse them, I am really not interested in trying to figure out how to parse the input you provided. There could be thousands of different ways a response could contain a fragment that is a valid JSON-LD document plus _other stuff_, and I don't want to support even one case of _other stuff_, because there's no end to the possibilities of what else I might find. If I request 'application/ld+json', I don't want anything other than a document that I can send straight to my JSON parser.

Nate

On Sun, Aug 23, 2015 at 11:00 AM, Andy Seaborne <andy@seaborne.org> wrote:

> On 23/08/15 18:34, Gregg Kellogg wrote:
>
>> However, as a practical matter, JSON may be included in an HTML script
>> tag, which could conceivably be in CDATA. Sometimes other non-JSON content
>> (such as a // comment) is also found. Because these are seen in the wild,
>> my reader removes everything preceding "{" or "[" and everything trailing
>> "}" or "]" to look for a valid JSON document. The specific substitution
>> pattern I use is the following:
>>
>> input.to_s.sub(%r(\A[^{\[]*)m, '').sub(%r([^}\]]*\Z)m, '')
>>
>> While this is technically invalid IMO, practically speaking not eating
>> such garbage will break real-world usage (perhaps mostly in schema.org
>> examples). I could see generating an error if this is seen when validating,
>> but otherwise I'm inclined to eat such garbage in my implementation.
>
> Agreed, embedding is important.
>
> I think it's better to talk about an extraction step to identify the
> content before invoking the JSON(-LD) specs. There may be escaping or
> encoding issues from the enclosing content. I see chopping junk as an
> example of such a step.
>
> Andy
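
To make the two positions in this thread concrete, here is a minimal Ruby sketch. It assumes the standard `json` library and reuses the substitution pattern quoted above, with plain ASCII quotes. The method names `chop_junk`, `parse_strict` and `parse_lenient` and the sample input are hypothetical, chosen only for illustration; this is not taken from any actual implementation.

```ruby
require 'json'

# The substitution pattern from the quoted message, with plain ASCII quotes.
# The method name chop_junk is hypothetical.
def chop_junk(input)
  input.to_s
       .sub(%r(\A[^{\[]*)m, '')   # drop everything before the first "{" or "["
       .sub(%r([^}\]]*\Z)m, '')   # drop everything after the last "}" or "]"
end

# Strict behaviour: hand the body straight to the JSON parser.
def parse_strict(body)
  JSON.parse(body)
end

# Lenient behaviour: run the extraction step first, then parse.
def parse_lenient(body)
  JSON.parse(chop_junk(body))
end

# Hypothetical input: a JSON-LD document wrapped in an HTML script tag.
wrapped = %(<script type="application/ld+json">{"@id": "http://example.org/a"}</script>)

begin
  parse_strict(wrapped)
rescue JSON::ParserError
  puts 'strict: rejected'
end

puts parse_lenient(wrapped)['@id']
```

Keeping the extraction in its own method means a client that requests 'application/ld+json' can skip it entirely and treat anything the parser rejects as an error, while a reader of embedded content can opt in.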
Received on Sunday, 23 August 2015 18:28:42 UTC