Re: API design principles - HTML/XML literals from Asbjørn Ulsberg on 2015-10-09 (public-linked-json@w3.org from October 2015)

From: Asbjørn Ulsberg <asbjorn@ulsberg.no>
Date: Fri, 9 Oct 2015 16:44:04 +0200
To: John Walker <john.walker@semaku.com>
Cc: Hydra <public-hydra@w3.org>, public-linked-json@w3.org
Message-ID: <CAEdRHi6kLcqb6+BBGNxE=jC3bGA4g0-jHa4UR6HmKXm_=8qZ7w@mail.gmail.com>

2015-10-09 13:20 GMT+02:00 John Walker <john.walker@semaku.com>:

> The question is if it is good/best practice to
> a. include HTML/XML markup in literal values, or
> b. refer out to a separate resource for these

If possible, I'd say: C. Refer to other representations of the same
resources in your MIME type of choice. If the resource itself is
different, then B.

> {
>   "@context": "http://schema.org/",
>   "@id": "#id",
>   "@type": "Product",
>   "mpn": "ABC123",
>   "name": "ACME thingamyjig",
>   "description": "the ACME thingamyjig is our <b>new</b> wonderful product with
> some <sub>subscript</sub> stuff.<br/>A new line"
> }

I'm still not familiar enough with Hydra to be able to express this,
but I would instead make "description" a type with an "@id" pointing
to where the HTML description can be downloaded and perhaps provide a
plain text "@value" inline.
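
A rough sketch of what I mean, built as a Python dict and serialized
with the standard json module. As far as I can tell, a JSON-LD value
object cannot combine "@value" with "@id", so the sketch uses a nested
node with a "text" property for the plain-text fallback instead. The
example.com URL and the choice of "text" are assumptions of mine, not
something Hydra or schema.org prescribes for this case.

```python
import json

# Hypothetical URL for the rich-text description; not from the original post.
DESCRIPTION_URL = "http://example.com/content/4y7dh2"

product = {
    "@context": "http://schema.org/",
    "@id": "#id",
    "@type": "Product",
    "mpn": "ABC123",
    "name": "ACME thingamyjig",
    # A node object: "@id" says where the HTML can be fetched, and the
    # "text" property (an assumed choice) carries a plain-text fallback.
    "description": {
        "@id": DESCRIPTION_URL,
        "text": "The ACME thingamyjig is our new wonderful product.",
    },
}

print(json.dumps(product, indent=2))
```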

> Should the "<br/>" be displayed or rendered as a line break?

Just strip all markup. There's no (standard) way to represent <b>, for
instance, in JSON or text/plain. Just reference a resource that can
represent the description in several different content types.
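
Stripping can be done with nothing but the Python standard library. A
minimal sketch follows; the names MarkupStripper and strip_markup are
mine, and turning <br> into a newline rather than dropping it is an
assumption you may or may not want.

```python
from html.parser import HTMLParser


class MarkupStripper(HTMLParser):
    """Collects text content and drops every tag."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True decodes entities like &lt;
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Assumption: keep <br> as a newline instead of dropping it silently.
        if tag == "br":
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(html):
    parser = MarkupStripper()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


print(strip_markup("our <b>new</b> product with <sub>subscript</sub> stuff.<br/>A new line"))
```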

> What if the content contains < characters (common for technical products),
> should these be escaped as HTML entities &lt;?

Anything that is allowed within the limits of JSON should be
unencoded. If you need to add any encoding, do it with what is
required to produce valid JSON, but no more.
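
To illustrate with Python's json module: a serializer escapes the
double quote and the newline because JSON requires it, and leaves "<"
and "&" alone. The sample text is made up.

```python
import json

# Sample text is made up: it contains "<", "&", a double quote and a newline.
description = 'Input voltage < 5 V & "low" noise\nA new line'

# json.dumps applies only the escaping that JSON itself requires.
encoded = json.dumps({"description": description})
print(encoded)

# Decoding gives back the original string, with no HTML entities involved.
decoded = json.loads(encoded)["description"]
```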

> In our case these literals could be quite large and contain extensive markup.

Another reason to just provide a simple text/plain representation
inline and reference an external resource which you can GET with
'Accept: text/html'.
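
On the client side that could look roughly like this, using only
urllib from the Python standard library. The example.com URL is a
placeholder and the helper names are mine; whether the server actually
honours the Accept header is of course up to the server.

```python
from urllib.request import Request, urlopen

# Hypothetical absolute URL for the description resource.
DESCRIPTION_URL = "http://example.com/content/4y7dh2"


def build_request(url, media_type):
    """Prepare a GET request asking for one specific representation."""
    return Request(url, headers={"Accept": media_type}, method="GET")


def fetch(url, media_type):
    """Perform the request and return the body decoded as text."""
    with urlopen(build_request(url, media_type)) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


# Same URL, two different representations requested.
html_request = build_request(DESCRIPTION_URL, "text/html")
text_request = build_request(DESCRIPTION_URL, "text/plain")
```

Calling fetch(DESCRIPTION_URL, "text/html") would then perform the
actual request.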

> Additionally, if we had these literals directly on the product entities, there
> would be significant repetition as many products have the same content (DRY).

Another reason to reference.

> {
>   "@context": "http://schema.org/",
>   "@id": "#id",
>   "@type": "Product",
>   "mpn": "ABC123",
>   "name": "ACME thingamyjig",
>   "description": <content/4y7dh2>
> }

Much better.

> This could support conneg allowing to serve multiple representations on a single
> URL (e.g. HTML, DITA and plain text).

Sounds good.

> IMHO from a principled/architectural perspective the second option is best.

It is.

> However we do not see this second option as a widely deployed pattern.

What do you mean? Is not conneg a widely deployed pattern? Or are you
referring to something else?

> To go to other extreme, why not inline images as data URIs in the RDF?

You can do that. RFC 2397 describes how.
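
A minimal sketch of the base64 form in Python; the function name
to_data_uri is mine and the payload is a tiny made-up stand-in for
real image bytes.

```python
import base64


def to_data_uri(payload, media_type):
    """Encode bytes as a base64 'data:' URI in the RFC 2397 form."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


# Tiny made-up payload; a real image would just be a longer byte string.
print(to_data_uri(b"hi", "text/plain"))  # data:text/plain;base64,aGk=
```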

> Clearly this is possible, but quite uncommon.

It is quite common in HTML e-mails and not unheard of on the web.
Error pages in HTML, for instance, tend to bundle up everything they
can inline so they are as self-contained as possible, in order to
successfully render a layout with CSS and images that don't depend on
external (quite possibly failing) resources.

> Clearly developers are comfy with the idea of images as resources, but not
> textual content.

Most of the web consists of textual resources like text/html.

-- 
Asbjørn Ulsberg           -=|=-        asbjorn@ulsberg.no
«He's a loathsome offensive brute, yet I can't look away»

Received on Friday, 9 October 2015 14:44:32 UTC