
HTML Content Algorithms don't take external JSON-LD data into account

From: Hoekstra, Rinke (ELS-AMS) <r.hoekstra@elsevier.com>
Date: Tue, 21 Apr 2020 12:28:14 +0000
To: "public-json-ld-wg@w3.org" <public-json-ld-wg@w3.org>
CC: "Breebaart, Matthijs (ELS-AMS)" <m.breebaart@elsevier.com>, "Townsend, Andrew S. (ELS)" <a.townsend@elsevier.com>
Message-ID: <BYAPR08MB57679B221A02E4F236AB9A86E3D50@BYAPR08MB5767.namprd08.prod.outlook.com>
Hi All,

We stumbled upon something odd when going through the HTML Content Algorithms (section 9.5 of the JSON-LD 1.1 API document [1]).

The algorithm extracts the JSON-LD from the textContent of script elements whose "type" attribute is set to a JSON-LD MIME type.
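
As a rough illustration of that behaviour, the following minimal Python sketch collects the inline text of JSON-LD script elements. It is not the normative algorithm from the API document: the class and function names are hypothetical, only the base type "application/ld+json" is matched, and any further handling the specification defines is left out.

```python
import json
from html.parser import HTMLParser


class JsonLdScriptExtractor(HTMLParser):
    """Hypothetical helper: gathers the text content of JSON-LD script elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            # Ignore any media type parameters (assumption: only the base type matters here).
            script_type = (dict(attrs).get("type") or "").split(";")[0].strip().lower()
            if script_type == "application/ld+json":
                self._in_jsonld = True
                self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            text = "".join(self._buffer).strip()
            if text:  # an empty element contributes nothing
                self.documents.append(json.loads(text))


def extract_inline_jsonld(html):
    """Return the parsed JSON-LD found inline in the given HTML string."""
    parser = JsonLdScriptExtractor()
    parser.feed(html)
    return parser.documents
```

Only text that sits inside the script element is ever seen by such an extractor; attributes other than "type" play no role.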

We have cases where, similar to e.g. JavaScript, our HTML documents refer to JSON-LD data that is hosted external to the HTML document itself.

Our current approach is to use an empty script element with "type" set to the JSON-LD MIME type, and "src" set to the dereferenceable IRI of the JSON-LD dataset that we want to process.

Our assumption was that JSON-LD processing of HTML documents would automatically consume these external datasets, but the current algorithm doesn't allow for this. That is, if we indeed read the specs correctly.
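
For illustration only, a processor that chose to support this pattern might first collect the "src" IRIs of JSON-LD script elements, along the lines of the Python sketch below. This is an assumption about how such optional behaviour could look, not something the current specification defines; the names are hypothetical, the example IRI is made up, and actually dereferencing the IRIs is left out.

```python
from html.parser import HTMLParser


class JsonLdSrcCollector(HTMLParser):
    """Hypothetical helper: records the src IRIs of JSON-LD script elements."""

    def __init__(self):
        super().__init__()
        self.external_iris = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attributes = dict(attrs)
        # Ignore any media type parameters (assumption: only the base type matters here).
        script_type = (attributes.get("type") or "").split(";")[0].strip().lower()
        src = attributes.get("src")
        if script_type == "application/ld+json" and src:
            self.external_iris.append(src)


def find_external_jsonld(html):
    """Return the src IRIs of JSON-LD script elements, in document order."""
    collector = JsonLdSrcCollector()
    collector.feed(html)
    return collector.external_iris


# Example with a made-up IRI:
page = '<script type="application/ld+json" src="https://example.org/data.jsonld"></script>'
print(find_external_jsonld(page))
```

An extractor that reads only textContent finds nothing in such an element, since the element itself is empty.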

I appreciate that it's a bit late in the game, but it would be good to at least have the algorithm state explicitly that loading such external JSON-LD data using a "src" attribute is OPTIONAL. We'd rather not standardise on this internally when the JSON-LD spec may opt for using e.g. link elements at a later stage.



[1] https://www.w3.org/TR/json-ld11-api/#html-content-algorithms

Dr. Rinke Hoekstra
Lead Architect - Knowledge
Elsevier, Amsterdam


Received on Tuesday, 21 April 2020 14:35:08 UTC
