HTML Entities and Escaping in JSON-LD Literals

Dear all:

I think we need to clarify in the schema.org documentation whether HTML entities and numeric character references in literals, namely text, should or can be kept as they are, or need to be unescaped, inside JSON-LD values. I assume the answer might be different for

a) stand-alone JSON-LD documents and 
b) when JSON-LD is embedded inside HTML via <script> elements.

In particular, I would like to know whether they must, should, or may be left in their HTML-encoded forms.

Literals provided by backend databases will often be encoded for HTML environments and will, for example, contain HTML entities like &amp; for the ampersand character, or numeric character references for a Unicode character, like &#160; for a non-breaking space.
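
To illustrate what is at stake for the stand-alone case a): as far as I can tell, a plain JSON parser knows nothing about HTML character references and passes them through unchanged. A minimal Python sketch, in which the product name is made up for illustration:

```python
import json

# Hypothetical literal as it might come out of an HTML-oriented backend.
doc = '{"@type": "Product", "name": "Fish &amp; Chips&#160;Ltd."}'

# json.loads only understands JSON escapes (\n, \uXXXX, ...), not HTML ones.
name = json.loads(doc)["name"]
print(name)  # the entity text is still present, character for character
```

So if the entities are kept, a consumer receives the literal text "&amp;" and has to decide on its own whether to decode it.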

Developers will often face the task of reusing a template variable that contains such escaped characters in JSON-LD code in <script> elements.
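
One possible way to handle this task is sketched below in Python. It rests on two assumptions that are exactly the open points of this message: that the values should be unescaped before serialization, and that the page is served as text/html, where the contents of <script> are not entity-decoded. The helper name jsonld_script and the example type are hypothetical.

```python
import html
import json

def jsonld_script(template_value: str) -> str:
    """Hypothetical helper: wrap an HTML-escaped value in a JSON-LD <script>."""
    # Assumption: decode character references first ("&amp;" -> "&").
    text = html.unescape(template_value)
    data = {
        "@context": "http://schema.org",
        "@type": "Organization",
        "name": text,
    }
    body = json.dumps(data, ensure_ascii=False)
    # Write "<" as a JSON escape so a value containing "</script>" cannot
    # end the element early; a JSON parser turns it back into "<".
    body = body.replace("<", "\\u003c")
    return '<script type="application/ld+json">' + body + "</script>"

print(jsonld_script("Fish &amp; Chips&#160;Ltd."))
```

The replacement of "<" relies only on standard JSON string escapes, so it does not depend on any JSON-LD-specific escaping rule.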

The Google Structured Data Testing Tool seems pretty tolerant of this, but I would like to know the proper way of encoding text in JSON-LD values.

The only guidance I found online was the simple statement

    "Depending on how the HTML document is served, certain strings may need to be escaped."

in 

    http://www.w3.org/TR/json-ld/

To make things more complicated, it seems that JSON-LD introduces novel escaping requirements for <, >, @ and ^:

    http://json-ld.org/spec/ED/json-ld-syntax/20100529/#escape-character

Does anybody know a definitive reference for this?

Best wishes

Martin

-----------------------------------
martin hepp  http://www.heppnetz.de
mhepp@computer.org          @mfhepp

Received on Friday, 19 June 2015 10:02:07 UTC