Re: [w3c/manifest] Identifier for external interpretation (Issue #1104) from Danny Ayers on 2023-11-02 (public-webapps-github@w3.org from November 2023)

From: Danny Ayers <notifications@github.com>
Date: Wed, 01 Nov 2023 19:55:12 -0700
To: w3c/manifest <manifest@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <w3c/manifest/issues/1104/1789982384@github.com>

I have to confess spoofing issues hadn't crossed my mind at all.

I bet there's overlap, but we're at cross purposes. What I'm trying to highlight is that, as defined in the current draft, as far as the Web is concerned, many manifests might as well be a random jumble of bytes. Unnecessarily so, and losing potential utility.

I'm not thinking in terms of a specific manifest associated with a specific app (though I am now starting to imagine the spoofing angle, yeah, I see that could easily get ugly). Rather, I mean what every manifest looks like to whatever receives it.

Say my browser GETs an HTML page and in there is a link to style.css. The browser GETs style.css, interprets it according to the CSS definitions, and applies whatever styling it declares. How does it know what to do? The foundational authority is an IANA-registered special string, "text/css", delivered in an HTTP header. But if the header said "image/jpeg", what then? OK, this is contrived; there are loads of workarounds built into browsers.

webmanifest has "application/manifest+json", that's sound. But this file might legitimately arrive as "application/json". All that says is that it conforms to a particular syntax that can represent a JavaScript object (and similar things).
Yet it contains information that's clearly defined in a (draft) specification. It has fields like id, name, description. There's no additional security risk, spoofing or otherwise, in making that information more accessible to a client. If there's a lie in that information, it'll look the same to any agent that interprets the text according to the spec. If, say, a layer of signing/encryption is added, it may take some handshaking of keys or whatever, but any agent with appropriate credentials can potentially see it.
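
A rough sketch of that gap, in Python and purely illustrative: the lookup table and the `interpretation` helper are made up for this example, not taken from any browser or from the spec. It only shows how much a client can infer from the media type alone.

```python
# Hypothetical table: what the media type by itself tells a client.
KNOWN_TYPES = {
    "text/css": "stylesheet, interpret per CSS",
    "application/manifest+json": "web app manifest, interpret per the manifest spec",
    "application/json": "some JSON value, meaning unknown",
}


def interpretation(content_type_header: str) -> str:
    """Return what can be inferred from a Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return KNOWN_TYPES.get(media_type, "unknown")


print(interpretation("application/manifest+json"))
print(interpretation("application/json; charset=utf-8"))
```

The same bytes served under the second header lose the link to the manifest spec, which is the case I'm worried about.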

It's a painful thing to bring up in this day and age, but what's wanted is something that does what XML namespaces did (still do?) well. There's a base format that a generic parser can run through. But the naming of the elements is qualified by something with a URI. Ideally a dereferenceable URL that an HTTP client can go and look at.
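
For anyone who hasn't touched namespaces in a while, a minimal Python sketch of the effect. The namespace URI and the element names are placeholders I've invented for illustration; no such XML vocabulary exists for manifests.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and document, for illustration only.
DOC = '<app xmlns="https://example.org/ns/manifest"><name>turbotax</name></app>'

root = ET.fromstring(DOC)

# A generic parser needs no prior knowledge of the vocabulary, yet every
# element name comes back qualified by the URI it was declared under.
qualified = [child.tag for child in root]
print(qualified)
```

ElementTree reports each tag in `{namespace-uri}localname` form, so "name" is no longer just a bare string.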

In the same way (not great to begin with, and I've not looked at it in years) https://purl.org/stuff/pets defines a way of describing animal friends in RDF. A major reason is to tie definitions to human concepts, but there are more immediate uses - through a link to WordNet, the French word for 'dog' is available. (Well, would be if danbri got someone in to fix that server).

So "name" in the context of webmanifest has a specific meaning, defined at https://www.w3.org/TR/appmanifest/#name-member
But if someone (or their app) finds a webmanifest in the wild, how could they ever know that?

Right now, as far as I can tell, unless an agent is told ahead of time "this is a webmanifest", it's effectively meaningless.
A JSON interpretation will give you a pair "name": "turbotax", but it conveys no more information than "a string": "another string". Meanwhile there's a document that's had 10 years of development that can tell you exactly what it means, if only you (/your software agent) knew where to look.
A day after bringing it up I still haven't checked the JSON-LD docs to see if that approach might work.
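
To make the idea concrete without claiming anything about JSON-LD itself, here is a rough Python sketch of member names being qualified by a URI. It is a plain dictionary lookup, not a JSON-LD processor; the "@context" key in the manifest and the `expand` helper are assumptions for illustration, and nothing like them is in the current draft.

```python
import json

# Hypothetical manifest carrying a "@context" that maps member names to URIs.
# Only the "name" URL is real (it is the spec's definition of that member).
MANIFEST_TEXT = """
{
  "@context": {"name": "https://www.w3.org/TR/appmanifest/#name-member"},
  "name": "turbotax"
}
"""


def expand(doc: dict) -> dict:
    """Replace each key the context maps with its URI; leave other keys as-is."""
    context = doc.get("@context", {})
    return {context.get(key, key): value
            for key, value in doc.items()
            if key != "@context"}


expanded = expand(json.loads(MANIFEST_TEXT))
print(expanded)
```

After expansion the key is the URL of the definition, so an agent that has never heard of webmanifest at least knows where to look.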

I believe these are strong reasons to address this point *somehow*. Rejection on pragmatic grounds is fair enough, but please, only after a little consideration.

--
Reply to this email directly or view it on GitHub:
https://github.com/w3c/manifest/issues/1104#issuecomment-1789982384

Received on Thursday, 2 November 2023 02:55:20 UTC