- From: John Walker <john.walker@semaku.com>
- Date: Fri, 9 Oct 2015 13:20:15 +0200 (CEST)
- To: public-hydra@w3.org, public-linked-json@w3.org
On a forthcoming project we need to transfer information/data/content from one system to another, and I have questions about the best way to deal with HTML/XML markup.

My assumption is that we will expose an API of some kind, most likely JSON (ideally JSON-LD). The question is whether it is good/best practice to:

a. include HTML/XML markup in literal values, or
b. refer out to a separate resource for these.

An example of the first approach:

    {
      "@context": "http://schema.org/",
      "@id": "#id",
      "@type": "Product",
      "mpn": "ABC123",
      "name": "ACME thingamyjig",
      "description": "the ACME thingamyjig is our <b>new</b> wonderful product with some <sub>subscript</sub> stuff.<br/>A new line"
    }

For me this is bad because the "description" is a string, but it contains HTML markup such as <br/>. How is a client to know how to process this? Should the "<br/>" be displayed literally or rendered as a line break? What if the content contains < characters (common for technical products): should these be escaped as HTML entities (&lt;)?

Of course one could add the datatype rdf:HTML to this literal to indicate that it is HTML.

In our case these literals could be quite large and contain extensive markup. Additionally, if we had these literals directly on the product entities, there would be significant repetition, as many products have the same content (DRY).

The second option would be to refer to some external resource:

    {
      "@context": "http://schema.org/",
      "@id": "#id",
      "@type": "Product",
      "mpn": "ABC123",
      "name": "ACME thingamyjig",
      "description": { "@id": "content/4y7dh2" }
    }

This could support conneg, allowing us to serve multiple representations at a single URL (e.g. HTML, DITA and plain text). It would also reduce repetition and allow client-side caching of these resources. It would also potentially play nicely with things like HTML Imports [4] [5].

IMHO, from a principled/architectural perspective the second option is best. However, we do not see this second option as a widely deployed pattern. Why is that?
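To make the client-side question concrete, here is a rough Python sketch of how a consumer might tell the cases apart. It is only an illustration under some assumptions, not a recommendation: the function name plan_description, the base URL and the Accept header value are all made up for the example, the reference is assumed to be written as a JSON-LD node reference ({"@id": "content/4y7dh2"}), and the compact form "rdf:HTML" is assumed to be resolvable through the context.

```python
import html
from urllib.parse import urljoin

# Full IRI of the rdf:HTML datatype.
RDF_HTML = "http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML"


def plan_description(product, base_url):
    """Decide how a client could treat product["description"].

    Hypothetical helper: returns a small dict describing what to do,
    without performing any network request.
    """
    desc = product.get("description")

    # Case 1: plain string. Nothing says it is markup, so treat it as
    # text and escape it before putting it into an HTML page.
    if isinstance(desc, str):
        return {"kind": "text", "safe_html": html.escape(desc)}

    # Case 2: typed literal. Only rdf:HTML is treated as markup here.
    # Accepting the compact "rdf:HTML" form assumes the context maps
    # the rdf prefix accordingly.
    if isinstance(desc, dict) and "@value" in desc:
        if desc.get("@type") in (RDF_HTML, "rdf:HTML"):
            return {"kind": "html", "markup": desc["@value"]}
        return {"kind": "text", "safe_html": html.escape(str(desc["@value"]))}

    # Case 3: reference to a separate resource. Resolve it against the
    # document URL and let the server pick a representation (conneg).
    if isinstance(desc, dict) and "@id" in desc:
        return {
            "kind": "reference",
            "url": urljoin(base_url, desc["@id"]),
            # Illustrative preference order only.
            "accept": "text/html, text/plain;q=0.5",
        }

    raise ValueError("unsupported description value")
```

With a (made-up) document URL of http://example.com/products/abc123, the reference content/4y7dh2 resolves to http://example.com/products/content/4y7dh2, and the client can then fetch and cache it independently of the product data. A plain string such as "size < 5 mm" comes back escaped, which is the safe default when no datatype is given.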
To go to the other extreme, why not inline images as data URIs in the RDF? Clearly this is possible, but it is quite uncommon. Developers are evidently comfy with the idea of images as resources, but not textual content. Is that a step too far, or is the support lacking in programming languages/libraries?

Thoughts/opinions welcome.

John

[4] http://www.w3.org/TR/html-imports/
[5] http://www.html5rocks.com/en/tutorials/webcomponents/imports/
Received on Friday, 9 October 2015 11:20:48 UTC