Web linking: type=, media type names, media type references, media ranges

RFC 8288 defines Web Linking; while this is formally an individual submission by Mark Nottingham, it is probably best regarded as “owned” by the httpbis WG.

In applying the concept of Web Linking to machine-to-machine applications (and extending it to something much more complex called “Thing Descriptions”), we ran into an interesting mélange of concepts around media types.

Probably the most definitive document about media types themselves right now is RFC 6838, which defines a media type as being identified by the combination of a type-name and a subtype-name, with:

     type-name = restricted-name
     subtype-name = restricted-name

     restricted-name = restricted-name-first *126restricted-name-chars
     restricted-name-first  = ALPHA / DIGIT
     restricted-name-chars  = ALPHA / DIGIT / "!" / "#" / […]

So »text/plain; charset=utf-8« and »text/plain; charset=us-ascii« are referencing the same media type, identified as »text/plain«.
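To make the "same media type" point concrete, here is a minimal Python sketch that reduces a value to its media type name and checks it against the restricted-name rule. The function name media_type_name is made up for illustration, and the character class is my reading of the full restricted-name-chars rule in Section 4.2 of RFC 6838 (the excerpt above is abbreviated), so check it against the RFC before relying on it.

```python
import re

# restricted-name as in RFC 6838, Section 4.2: one ALPHA / DIGIT, followed by
# up to 126 characters from the restricted set (my reading of the full rule).
RESTRICTED_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&\-^_.+]{0,126}"
MEDIA_TYPE_NAME = re.compile(rf"({RESTRICTED_NAME})/({RESTRICTED_NAME})")


def media_type_name(value: str) -> str:
    """Return the type/subtype part of a value, dropping any parameters.

    Hypothetical helper: type and subtype names are case-insensitive,
    so the result is lower-cased.
    """
    name = value.split(";", 1)[0].strip()
    if MEDIA_TYPE_NAME.fullmatch(name) is None:
        raise ValueError(f"not a media type name: {name!r}")
    return name.lower()


if __name__ == "__main__":
    print(media_type_name("text/plain; charset=utf-8"))
    print(media_type_name("text/plain; charset=us-ascii"))
```

Both calls yield the same media type name, which is all that the “type” target attribute can carry.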

This is also the definition of media type used by RFC 8288 (Web Linking), which has the target attribute name “type” and the corresponding value production RHS:

     type-name "/" subtype-name ; see Section 4.2 of [RFC6838]

So RFC 6838 and RFC 8288 agree on what a media type is; I’ll call this a “media type name” for now.  A web link can have the hint that something is a »text/plain«, but can’t tell you whether to expect UTF-8 or KOI8-R.

RFC 7231 has a more specific concept:

     media-type = type "/" subtype *( OWS ";" OWS parameter )
     type       = token
     subtype    = token
     parameter      = token "=" ( token / quoted-string )

This is used in the Content-Type header field.  To distinguish this construct from the mere “media type name”, I’ll call it “media type reference” (suggestions for better names are welcome).
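As a rough illustration of what a media type reference carries beyond the name, the following Python sketch parses a value according to the media-type production quoted above. The function name parse_media_type_reference is hypothetical, and the quoted-string handling is more lenient than RFC 7230's qdtext rule, so treat this as a sketch and not as a conforming parser.

```python
import re

# token / quoted-string roughly as in RFC 7230; the quoted-string pattern
# is deliberately lenient about which characters it accepts.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
QUOTED = r'"(?:[^"\\]|\\.)*"'
TYPE_RE = re.compile(rf"({TOKEN})/({TOKEN})")
PARAM_RE = re.compile(rf"[ \t]*;[ \t]*({TOKEN})=({TOKEN}|{QUOTED})")


def parse_media_type_reference(value: str):
    """Split a Content-Type style value into (type, subtype, parameters)."""
    value = value.strip()
    head = TYPE_RE.match(value)
    if head is None:
        raise ValueError(f"no type/subtype in {value!r}")
    params = {}
    pos = head.end()
    while pos < len(value):
        param = PARAM_RE.match(value, pos)
        if param is None:
            raise ValueError(f"malformed parameter at offset {pos}")
        raw = param.group(2)
        if raw.startswith('"'):
            # Strip the quotes and undo quoted-pair escaping.
            raw = re.sub(r"\\(.)", r"\1", raw[1:-1])
        params[param.group(1).lower()] = raw
        pos = param.end()
    return head.group(1).lower(), head.group(2).lower(), params


if __name__ == "__main__":
    print(parse_media_type_reference("text/plain; charset=utf-8"))
    print(parse_media_type_reference('text/plain; charset="us-ascii"'))
```

The first two elements of the result are the media type name; the parameter dictionary is exactly the part that a “type” target attribute cannot express.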

There is no predefined way to apply this more specific concept to web linking; however, sometimes this more specific hint is quite useful (e.g., to say that something is an »application/cose; cose-type=cose-encrypt« and not just any type of »application/cose«).  We can do that in RFC 6690 by using “ct”, the CoRE Content-Format number, which is an encoded form of a media type reference, as long as one of these numbers is already allocated for the media type reference needed.  We probably want to add something for media type references that do not have a content-format number assigned; this would then be a new target attribute named “content-type”, for least surprise to people used to HTTP or MIME.
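A small Python sketch of how a link producer might choose between the two attributes: use “ct” when a Content-Format number exists, and otherwise fall back to the proposed “content-type” attribute. Note that “content-type” is only the proposal made in this message, not a registered target attribute; the helper names are made up, the table is a deliberately tiny excerpt of the IANA CoAP Content-Formats registry as I recall it, and the lookup key normalization is crude.

```python
def _key(reference: str) -> str:
    """Crude lookup key: drop whitespace and double quotes (an assumption,
    not a real normalization of media type references)."""
    return "".join(reference.split()).replace('"', "")


# A few entries from the IANA "CoAP Content-Formats" registry; incomplete.
CONTENT_FORMATS = {
    _key("text/plain; charset=utf-8"): 0,
    _key("application/json"): 50,
    _key("application/cbor"): 60,
    _key('application/cose; cose-type="cose-encrypt"'): 96,
}


def type_hint(reference: str) -> str:
    """Return a link-format attribute describing the target's representation."""
    number = CONTENT_FORMATS.get(_key(reference))
    if number is not None:
        return f"ct={number}"
    # Hypothetical attribute from this proposal; quoted because the value
    # contains characters such as ";" and "=".
    escaped = reference.replace("\\", "\\\\").replace('"', '\\"')
    return f'content-type="{escaped}"'


def link(target: str, reference: str) -> str:
    """Format a single RFC 6690 style link with one representation hint."""
    return f"<{target}>;{type_hint(reference)}"


if __name__ == "__main__":
    print(link("/msg", "application/cose; cose-type=cose-encrypt"))
    print(link("/log", "text/plain; charset=koi8-r"))
```

The second call has no Content-Format number in this table, so it is the case where the new attribute would be needed.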

So much for more specific hints.

RFC 7231 also defines a form of potentially less specific hints, the media range, for use in Accept:

     media-range    = ( "*/*"
                      / ( type "/" "*" )
                      / ( type "/" subtype )
                      ) *( OWS ";" OWS parameter )

So this can be less specific than media type names, as it allows wildcards for type names and subtype names, or more specific, as it allows parameters.  E.g., »*/*; charset=utf-8« is an interesting media range — give me anything, as long as it is UTF-8 encoded…  It is conceivable web links might want to allow this kind of hint, too.  We don’t have a specific use case in mind, so we are not defining anything at this time.  We could, of course, define the “content-type” target attribute in terms of media-range, but that might be surprising.
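To show how a media range can be both less and more specific, here is a Python sketch that checks a media type reference against a media range. The matching rule for parameters (every parameter in the range must appear in the reference with the same value) is my assumption for illustration; the parsing ignores quoted-strings and the Accept "q" weight, and comparing values case-insensitively is only known to be right for parameters like charset.

```python
def _split(value: str):
    """Crude split into (type, subtype, params); no quoted-string support."""
    head, *rest = [part.strip() for part in value.split(";")]
    type_, _, subtype = head.partition("/")
    params = {}
    for item in rest:
        name, _, val = item.partition("=")
        params[name.strip().lower()] = val.strip().lower()
    return type_.lower(), subtype.lower(), params


def matches(media_range: str, reference: str) -> bool:
    """Return True if the media type reference falls within the media range."""
    r_type, r_sub, r_params = _split(media_range)
    t_type, t_sub, t_params = _split(reference)
    if r_type != "*" and r_type != t_type:
        return False
    if r_sub != "*" and r_sub != t_sub:
        return False
    # Assumed rule: parameters named in the range must match exactly.
    return all(t_params.get(name) == val for name, val in r_params.items())


if __name__ == "__main__":
    print(matches("*/*; charset=utf-8", "text/plain; charset=utf-8"))
    print(matches("*/*; charset=utf-8", "text/plain; charset=us-ascii"))
    print(matches("text/*", "text/plain; charset=utf-8"))
```

A “content-type” attribute defined in terms of media-range would need a rule like this spelled out, which is part of why it might be surprising.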

I don’t want to go into more details of the use cases, as these involve unsavory protocols such as MQTT (and other protocols that don’t support representation metadata).  I’m sending this to httpbis with one question right now: 

 Are we on the right track with defining a “content-type” target attribute for those cases that need it, and leaving media-ranges alone for now?

Grüße, Carsten

PS.: Thanks to Matthias Kovatsch for bringing up many of these questions.  All errors in the above are mine, though.

Received on Friday, 3 August 2018 12:58:36 UTC