Re: ISSUE-41/ACTION-97 decentralized-extensibility

Brendan Eich wrote:
> On Oct 6, 2009, at 12:26 AM, Sam Ruby wrote:
>> Brendan Eich wrote:
>>> But right now I feel like my old blog post was pressed, nay 
>>> dragooned, into service of a cause I do not support.
>> Wow.  I remember talking to you about this F2F.  In any case, my 
>> apologies.
> I said that I supported the idea of D.E. (who doesn't?) and your 
> demonstrations of SVG and MathML in HTML, but I am sure I never endorsed 
> XML namespaces.

I thought we had specifically talked about the following page:

> Anyway.

Agreed.  Whether I asked the question poorly at the time, or did not 
understand your response at the time, is immaterial.  Onward.

> Is there a way to go forward without assuming that XML namespaces are 
> essential to D.E. of HTML5?

We've not even established that D.E. is a requirement.  The documented 
position of the WHATWG is (and has been): [1]

     There is currently no mechanism for introducing new proprietary
     features in HTML documents (i.e. for introducing new elements
     and attributes) without discussing the extension with user agent
     vendors and the wider Web community. This is intentional; we don't
     want user agents inventing their own proprietary elements and
     attributes like in the "bad old days" without working with
     interested parties to make sure their feature is well designed.

And is reflected in the current editor's draft, thus: [2]

     Vendor-specific proprietary extensions to this specification are
     strongly discouraged. Documents must not use such extensions, as
     doing so reduces interoperability and fragments the user base,
     allowing only users of specific user agents to access the content
     in question.

     If vendor-specific markup extensions are needed, they should be done
     using XML, with elements or attributes from custom namespaces.

I see two parts to this question.  But first, for full disclosure, I am 
of the belief that attempts to legislate morality in the long run will 
be as successful as the 18th amendment to the US Constitution was.  I 
also happen to believe that we can't standardize everything at once, so 
at whatever point in time we happen to want to take a snapshot for 
HTML5, there will always be new elements and features (e.g. datagrid) 
which will not yet be standardized, and planning for evolution is the 
responsible path forward.

Back to the question at hand.

The first part is how to deal with markup extensions, whatever the 
source.  Poisoning the discussion by introducing the term "proprietary" 
is counter-productive.  Extensions of all kinds will happen; we can 
choose to channel them, or not.

For many uses, something as simple as stating that names containing a 
dash (or a dot or another character) will never be standardized, and 
encouraging the use of prefixes, may suffice.  A similar approach has 
worked, and worked 
reasonably well, for CSS; though there are enough differences between 
the two languages that it is not at all clear that such an approach 
would work in the context of HTML (example: CSS has a more comprehensive 
approach to fallbacks).
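
To make the dash convention concrete, here is a minimal Python sketch. 
The rule that a separator marks a name as an extension that will never 
be standardized, the function names, and the "acme" prefix are all 
assumptions for illustration; nothing of the sort is in the current 
draft.  Note that a few existing attribute names (accept-charset, 
http-equiv) already contain a dash, so a real rule would need to 
account for them.

```python
def split_extension_name(name, separator="-"):
    """Split 'prefix-rest' into (prefix, rest); return None for plain names.

    Assumption: any name with a non-empty prefix before the separator is
    treated as an extension name that would never be standardized.
    """
    prefix, sep, rest = name.partition(separator)
    if not sep or not prefix or not rest:
        return None
    return prefix, rest


def is_extension_name(name, separator="-"):
    """True if the name follows the hypothetical prefix convention."""
    return split_extension_name(name, separator) is not None


print(is_extension_name("acme-datagrid"))  # hypothetical vendor-prefixed element
print(is_extension_name("datagrid"))       # plain name, left for standardization
```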

The second part is the (unfortunate?) fact that users and tools have 
made use of element and attribute names containing colons.  You have 
seen, and will continue to see, such syntax used in RDFa and in SVG 
fragments that 
users will copy and paste into HTML documents.  As near as I can tell, 
nobody is contesting the current position that none of these elements or 
attributes will affect the rendering of the page one bit.

Three subparts to the second part.

a) The current spec goes beyond the statement above, and makes all such 
uses non-conforming.  My personal belief is that such is entirely 
unnecessary and counter-productive.  And ultimately, self-defeating.

b) HTML parsing can result in localNames containing colons in the DOM. 
Serializing such DOMs as XML /can/ result in parse errors, depending on 
whether or not a corresponding xmlns: declaration is present.  The 
current serialization algorithm doesn't take this into account, and 
instead serializes things like dc:title as dcU00003Atitle.  Such will 
not round trip.  My belief: this is suboptimal.

c) MS's DistributedExtensibility Submission[3] goes beyond this, and 
attempts to make it easier to access this information via ECMAScript 
APIs, and in the process narrow the gap between the DOM produced by an 
XML parser and the DOM produced by an HTML parser when given the same 
polyglot[4] document.  My opinion: I'm not sure that this is possible 
without breaking the web.  Adding new APIs may be possible.  Phillip 
Taylor has pointed out that simply adopting MS's existing tagURN may 
cause pages that employ capability-based browser sniffing to break.[5]
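
The behaviour described in (b) can be sketched in a few lines of 
Python.  This is only an approximation: the set of allowed characters 
is a simplified ASCII stand-in for the XML name rules, the function 
names are made up for illustration, and the Dublin Core namespace URI 
is used purely as an example.  The first function mimics the 
"U plus six hex digits" replacement; the second checks whether a 
colon-bearing name parses as XML with and without an xmlns: 
declaration.

```python
import string
import xml.etree.ElementTree as ET

# Assumption: simplified ASCII-only approximation of XML name characters.
NAME_START = set(string.ascii_letters + "_")
NAME_CHARS = NAME_START | set(string.digits + "-.")


def coerce_local_name(name):
    """Replace each unsupported character with 'U' plus six uppercase hex digits."""
    out = []
    for index, ch in enumerate(name):
        allowed = NAME_START if index == 0 else NAME_CHARS
        out.append(ch if ch in allowed else "U%06X" % ord(ch))
    return "".join(out)


def parses_as_xml(markup):
    """True if the markup is accepted by a namespace-aware XML parser."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


DC = "http://purl.org/dc/elements/1.1/"  # example namespace URI

print(coerce_local_name("dc:title"))        # dcU00003Atitle
print(coerce_local_name("dcU00003Atitle"))  # dcU00003Atitle (same output)

print(parses_as_xml("<head><dc:title/></head>"))                     # False
print(parses_as_xml('<head xmlns:dc="%s"><dc:title/></head>' % DC))  # True
```

Two different input names produce the same coerced name, so the 
mapping cannot be reversed reliably, which is one way to see why the 
result does not round trip.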

> /be

- Sam Ruby


Received on Tuesday, 6 October 2009 16:42:03 UTC