Larry Masinter, 12/10/2011, still early draft for discussion
This document is circulated as part of TAG
Please discuss this document on www-tag@w3.org (archived).
This document explores three interrelated topics, and proposes some potential TAG findings for them. It is still very sketchy.
language, format, protocol, protocol element, language term
Technical specifications for languages, formats and protocols make use of identifiers--names chosen from a set of values. In many cases, there are parameters or values which are allowed as extensibility points, where the interpretation of the value cannot be directly determined by the specification and the value itself; instead the meaning is to be discovered by some other process.
The web architecture relies on many extensibility points; for example, content-types, URI schemes, color names, host names, HTML attributes of a given element, country codes, HTTP headers, CSS rounded corners [refs].
Technical specifications intended as long-lived standards often provide extensibility points that allow new identifiers and values to be used. Even after the language, protocol, or format has been defined and deployed, new values may be assigned without updating the specification or creating a new version of it, and so without the attendant cost of protocol/format version management.
There are a variety of ways of managing extensibility of sets of enumerated values, and establishing a mechanism for introducing private and/or public extensions.
Things to consider in any extensibility mechanism:
httpRange14 update/matching reality/discovery/transition/lifetime/availability. Preferred method, modulo longevity of URIs. Note that URN allows naming a registry as a URI.
Discuss each of the considerations.
(example from CSS, analysis of transition path difficulties) Do we make any recommendations over URIs vs. vendor prefixes?
Discuss each of the considerations.
A registry consists of the documentation for a set of registered values and their meaning, where the registry is maintained by an organization (the registrar) with a commitment to maintain the registry and make it publicly available. To ensure that quantities have consistent values and interpretations across all implementations, their assignment must be administered by an "authority": an organization or consortium which manages the values and ensures proper administration.
For example, the Internet Assigned Numbers Authority (IANA)[ref] is the primary organization whose charter and purpose is to maintain registries of values needed for Internet protocols and languages as defined by the IETF.[ref BCP from which this was quoted] IANA administers the registry of many parameters in the core of the Web architecture: the space of URI (and IRI) scheme names, the space of media type identifiers ("MIME types"), a registry of HTTP protocol header values, HTTP status codes, names of character sets and character encoding schemes (charsets) and so forth. The architecture of the World Wide Web relies on extension points using "registration", even in W3C-specified protocols, languages, and formats which are not reviewed or published within the IETF.
Finding.use-IANA:
The "best current practice" specification in [BCP 26][RFC 5226] gives guidelines to protocol designers for establishing the registry rules associated with an IANA registry. Note that IANA acts as the operator of each registry but does not itself evaluate registration requests; it merely administers the process by which requests are accepted by the organization or individuals authorized to review or approve registry entries. These guidelines apply to IANA namespaces established or requested by W3C working groups or task forces.
Sniffing, "Willful Violations", Incomplete Inaccurate Registries
In some cases, community practice has evolved and the registries have not followed: the registries have not tracked the use of extensibility parameters, or extensibility values are often ignored. In some cases, the registry is perceived as a bottleneck. A registry is only useful if values are registered. A registry which does not match actual use (as is currently the case with URI schemes and media types) is not very useful.
Sniffing: the registry has not tracked, or the right extensibility parameter is not used. [ref mime-sniff]
"willful violations": the registry values have been misused, and the technical specification contains new values that do not agree with the registry.
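The sniffing case can be illustrated with a toy receiver that trusts leading bytes over the declared label. This is only a sketch: it is not the algorithm in [sniff], the two signatures are just well-known examples, and the function names and the "sniffed type wins" policy are assumptions made for illustration.

```python
from typing import Optional

# Leading byte signatures for two well-known image formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff(body: bytes) -> Optional[str]:
    """Guess a media type from the first bytes, or None if nothing matches."""
    for signature, media_type in SIGNATURES:
        if body.startswith(signature):
            return media_type
    return None


def effective_type(declared: Optional[str], body: bytes) -> str:
    """Hypothetical policy: a sniffed type overrides the declared label."""
    sniffed = sniff(body)
    if sniffed is not None:
        return sniffed
    # Nothing recognized: fall back to the label, or to a generic type.
    return declared or "application/octet-stream"
```

Once receivers behave this way, the declared Content-Type stops being the extensibility parameter that actually governs interpretation, which is why the registry and deployed practice drift apart.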
Often, a registry does not contain the actual definition of the meaning of a term or value, but rather contains a pointer to a document or document series which defines that value. For example, the Internet Media Type registry defining file formats and languages often contains a pointer to the document or specification. However, specifications themselves update. And sometimes they "fork" -- there can be multiple competing definitions. (In some cases, "forking" is "poaching").
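As a rough sketch of this "pointer, not definition" structure, a registry can be modelled as a table from registered value to a reference. The class and field names here are hypothetical, and the single entry is only illustrative; real registries carry more fields and have their own review procedures.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RegistryEntry:
    """One registered value; its meaning lives in the referenced document."""
    value: str      # the registered identifier, e.g. a media type name
    reference: str  # pointer to the defining specification


class Registry:
    """A hypothetical in-memory registry keyed by case-insensitive value."""

    def __init__(self, registrar: str) -> None:
        self.registrar = registrar
        self._entries: Dict[str, RegistryEntry] = {}

    def register(self, entry: RegistryEntry) -> None:
        key = entry.value.lower()
        if key in self._entries:
            raise ValueError(f"{entry.value!r} is already registered")
        self._entries[key] = entry

    def lookup(self, value: str) -> Optional[RegistryEntry]:
        # None means "not registered", which is not the same as "not in use".
        return self._entries.get(value.lower())


media_types = Registry(registrar="IANA")
media_types.register(RegistryEntry("text/html", "RFC 2854"))
```

Because the entry stores only a reference, the registry says nothing about which revision of the referenced specification applies, or what happens when that specification is updated or forks.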
Requiring the documentation to be stable is another reason why registrations diverge from reality.
Registry values typically go through a life-cycle, where a parameter is introduced experimentally, deployed in a limited or vendor-specific context, and then adopted more broadly.
Frequently, groups with registries or registered values attempt to convey the status of a registered value in the name chosen within the registry, e.g., using an "x-" prefix for experimental names, "vnd." prefixes in Internet media types, etc. In practice, these conventions are counter-productive failures, because there is no simple deployment path when status changes, e.g., when vendor-proposed extensions become public standards or experiments succeed.
W3C staff and working group participants must manage the registration information, and the process itself needs revision. Other registrations have their own administrative procedures. A regular "have obligations related to registration been met" check should be built into the W3C document publication/advancement procedure.
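One visible cost of status-bearing names is that receivers end up carrying alias tables indefinitely. HTTP/1.1 is a concrete case: it asks applications to treat the content-codings "x-gzip" and "x-compress" as equivalent to "gzip" and "compress". A minimal sketch of such a table follows; the function name and the normalization rules are assumptions for illustration.

```python
# Legacy experimental names that stayed in use after standardization.
ALIASES = {
    "x-gzip": "gzip",
    "x-compress": "compress",
}


def canonical_name(name: str) -> str:
    """Map a deployed name to the name a receiver should act on."""
    lowered = name.strip().lower()
    # Unknown names pass through unchanged; only known aliases are rewritten.
    return ALIASES.get(lowered, lowered)
```

Every receiver has to ship and maintain a table like this for as long as any sender still emits the old name, so the prefix ends up conveying history rather than current status.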
In particular, there are two IANA registries essential to the web, Internet Media Types and Charsets.
Both have "willful violations" and "sniffing" in the HTML5 specification.
Fragment identifiers are defined in web architecture, but [MediaRegUpdate] does not sufficiently require registrations to address them.
[BCP26]: Guidelines for Writing an IANA Considerations Section in RFCs, BCP 26, RFC ...
[IABext] Design Considerations for Protocol Extensions work in progress, Internet Draft
[Friendly] Friendly Registries, work in progress, Wiki Page, requirements and a place to gather explicit proposals
[HappyIana] https://www.ietf.org/mailman/listinfo/happiana
[LinkRelation] http://lists.w3.org/Archives/Public/www-tag/2011May/0006.html
[sniff] http://tools.ietf.org/html/draft-ietf-websec-mime-sniff
[MediaTypeFinding] Internet Media Type registration, consistency of use TAG Finding 3 June 2002 (Revised 4 September 2002)
[MIMEGuidelines] Register an Internet Media Type for a W3C Spec (W3C guidelines on registering types)
[MediaRegUpdate] Media Type Specifications and Registration Procedures, Internet Draft, work in progress
[NoX] X- parameters harmful (Peter St. Andre)
[SpecUpdate] Best Practice for Referring to Specifications Which May Update [email draft, H. Thompson, C.M. Sperberg-McQueen]
[VendorFlap]
Here are some notes from discussion not yet incorporated:
Reasons for a "registry":
For example, some protocol designers thought a new URI scheme could cause a lot of extra work. For HTML tags, when you introduce a new section, everyone who implements browsers needs to understand it.
But if you add metadata, it's no skin off anyone's nose. So you have two situations: one in which the whole community needs to get involved, and one which anyone outside a sub-community can ignore.
Only tangentially related to registry-based solutions, Mark Nottingham quotes (http://lists.w3.org/Archives/Public/www-tag/2011Dec/0049.html) Roy Fielding as calling mustUnderstand-based approaches "socially reprehensible". We need a decision tree: questions to answer to understand what kind of extension you're doing and which of these techniques you should use.
Compound extensibility points: when a new version of an extensibility point defines a new context in which old extensibility points are interpreted. (This is "willful violation" territory, if not also "sniffing" territory).
see discussion following http://lists.w3.org/Archives/Public/www-archive/2011Nov/0009.html