Re: ISSUE-41/ACTION-97 decentralized-extensibility

First to Adrian Bateman,

Thanks to Microsoft and to Tony Ross for submitting this proposal. I 
especially appreciate the opportunities for both discussion and 
compromise included in the proposal.

> > It is a common practice for authors, tool vendors, and library  
> > authors to want to extend languages to represent additional  
> > information that can't be adequately described by the standard  
> > grammar. This might be used to preserve metadata used by one tool in  
> > a chain of operations. It might be actual data to be processed by a  
> > user agent as an extension to the standard processing. Here are a  
> > few examples that apply to HTML:
> >
> >     * A HTML document editor adds information about tool settings so  
> > that a subsequent editing session can continue with the same settings.
> Humans who write HTML save their mental state in HTML by writing HTML  
> comments. Is there a reason why comments with a product-specific  
> internal formatting wouldn't be an appropriate way to serialize  
> authoring tool state?

Yes, but comments in HTML are generally meant to be consumed by humans; 
they're not necessarily machine friendly. Being able to use formal 
markup to annotate the text so that it can be returned to a specific 
state in an editor is not an inconceivable need, and it is a need that 
wouldn't necessarily be met by HTML comments.

> >     * A JavaScript library processes custom tags in a browser and  
> > turns them into custom controls dynamically on the page.
> HTML5 addresses this use case with the data-* attributes. You take the  
> element that gives the best fallback behavior when the script doesn't  
> run and then put the script-sensitive stuff in data-* attributes.

Now, extend this concept to custom tags that can be turned into custom 
controls AND which can also be extracted by a web bot or other external 
service, in order to be combined with like data for additional purposes.

> >     * A browser wants to allow custom behaviors to be defined in one  
> > module and attached automatically to custom elements.
> How would the custom behaviors be implemented? In page-supplied XBL2?  
> In native code specific to a combination of browser, OS and CPU  
> architecture pre-installed prior to loading the page?
> If in XBL2, wouldn't it be sufficient to be able to bind the behavior  
> to class attributes or to local names that have a special character  
> reserved for extensibility (such as '.' or '_' but *not* ':') without  
> having to go through the trouble of changing the namespace URI from  
> what the HTML5 spec says now?
> If in native code, how would the unavailability of the native code  
> behavior for a given browser, OS and CPU combination be less bad for  
> the ability of the user to read Web content than the unavailability of  
> an NPAPI plug-in (or NPAPI-plug-in-like ActiveX control)? That is, how  
> would this proposal be an improvement over the current mechanism for  
> proprietary extensions?
> (I think the discussion about extensibility should framed in terms of  
> the ability of users to read content. Not in terms of the ability of  
> authors to write content.)

I don't think it's appropriate to continue to re-frame these 
discussions. It's just as viable to discuss extensibility in terms of 
authoring content as it is to discuss it in terms of consuming content.

Is your main reason for re-framing this discussion in terms of 
consumers a concern about automated processes reading this data?

> >     * An author includes processing instructions in the document  
> > that will be processed by a server before delivering the document to  
> > a user agent.
> Why does this use case require the complication of dispatching on a  
> {namespace, local} pair as opposed to dispatching on identifiers that  
> are simple strings? Why does this case require any resemblance to what  
> IE does now?
> >     * An author runs a tool on a document to add numbering to  
> > headings and a table of contents. Running this tool leaves custom  
> > metadata tags intact.
> Is the key to this bullet point leaving custom metadata intact or  
> being able to discover what numbers were written by the tool itself in  
> a previous run?
> > Using research data gathered by Microsoft, we identified a number of  
> > these concerns and this proposal was altered to avoid serious issues.
> Have you done analyses on previous cases where proprietary  
> extensibility of HTML has been practiced and checked if the Web had  
> been better if your proposed mechanism had been used for those cases  
> (e.g. <marquee>, <blink>, <canvas>, iPhone viewport meta directive, X-
> UA-Compatible, Palm Pre-specific attributes)?
> >      var myCustomElements =
> >           document.getElementsByTagNameNS("com.mycompany", "*");
> >      @namespace my "com.mycompany";
> >      my|*
> Why do the examples use a non-URI namespace?

I was wondering about that one myself.
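
For comparison, here is a minimal sketch of the same lookup with a URI 
as the namespace name. The URI is made up for illustration and is not 
from the proposal, and findCustomElements is a hypothetical helper, not 
part of any API.

```javascript
// Hypothetical namespace URI; not taken from the Microsoft proposal.
const MY_NAMESPACE = "http://mycompany.example/ns";

// Collect every element in the given namespace from a DOM Document.
// getElementsByTagNameNS is the standard DOM method used in the
// quoted example; "*" matches any local name.
function findCustomElements(doc, namespaceUri) {
  return Array.from(doc.getElementsByTagNameNS(namespaceUri, "*"));
}

// In a browser this would be called as:
//   const myCustomElements = findCustomElements(document, MY_NAMESPACE);
```

Nothing else in the call changes; only the namespace string does.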

> On Oct 1, 2009, at 02:09, Jonas Sicking wrote:
> > I'm not actually a big fan of this proposal. Experience with
> > namespaces in XML has showed (at least to me) that namespaces are too
> > complex for authors to understand.
> [...]
> > So all in all it feels like momentum is moving away from the XML
> > Namespaces model, rather than towards it.
> I agree.

I don't see any studies that reflect this. I don't see any behavior in 
the wild that reflects this. I don't see that problems people have had 
with something like XHTML are specifically, or only, because of 
namespaces. Actually, I don't think people have the problems with 
namespaces you all keep stating again and again.

If you don't like namespaces, say you don't like them. Say why, 
reflecting your own personal experiences. But please stop speaking for 
some nebulous group of "others" who, seemingly, are reduced to quivering 
masses of anguish just because they encounter a namespace in a page.

> > I much rather like the mechanism that CSS is using. Non-standard token
> > names are prepended by "-name-" in order to avoid collisions. Could we
> > do something similar by using "name_" at the beginning of
> > non-standardized names. We could even let people use element/attribute
> > names like "www_myorg_org_myelement".

In other words, let's ignore what's in use on the web, and has been in 
use on the web for over a decade, in favor of something completely new, 
untested, and incompatible.

As to your appreciation of the CSS mechanism, I have to ask: when was 
CSS ever gathered from many different pages and incorporated into a 
single data store?

And what would happen if Joe comes up with -myname- and Sam comes up 
with -myname-, independently and unknowingly? So we use reverse DNS 
identifiers to prevent this? Again? So then we have to educate people 
about this new thing, these reverse DNS identifiers, and then explain 
how they are used, and then we have to determine what is the best format 
for these, to ensure that what's used can be consistently accessed, and 
we'll have to come out with a specification for reverse DNS 
identifiers, to ensure they're used consistently outside of just the 
HTML effort...
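
To make the collision concrete, here is a rough sketch of what happens 
when data from two pages is gathered into a single store. Everything in 
it is hypothetical: the two records, the example.* hosts, the namespace 
URIs, and the mergeRecords helper are assumptions for illustration only. 
Any globally unique name would avoid the clash; the sketch uses a URI 
because that is what namespaces already give us.

```javascript
// Hypothetical records harvested from two unrelated pages.
// Both authors independently picked the short prefix "myname".
const joeRecord = {
  prefix: "myname",
  namespace: "http://joe.example/ns/reviews",
  local: "rating",
  value: "4 stars",
};
const samRecord = {
  prefix: "myname",
  namespace: "http://sam.example/ns/films",
  local: "rating",
  value: "PG-13",
};

// Merge records into one store, grouping values under the key
// produced by keyOf.
function mergeRecords(records, keyOf) {
  const store = new Map();
  for (const record of records) {
    const key = keyOf(record);
    if (!store.has(key)) store.set(key, []);
    store.get(key).push(record.value);
  }
  return store;
}

// Key on the author-chosen prefix, in the style of "name_".
const byPrefix = (r) => `${r.prefix}_${r.local}`;
// Key on the {namespace URI, local name} pair.
const byNamespace = (r) => `{${r.namespace}}${r.local}`;

const prefixStore = mergeRecords([joeRecord, samRecord], byPrefix);
const namespaceStore = mergeRecords([joeRecord, samRecord], byNamespace);

console.log(prefixStore.size);    // 1: Joe's and Sam's data are mixed together
console.log(namespaceStore.size); // 2: the two vocabularies stay distinct
```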

> This is syntactically similar to a recent proposal on xml-dev to use  
> reverse DNS identifiers as names:

Again, let's remove what most people are familiar with, the URI. 
Instead, let's replace it with this Java-like reverse DNS identifier, 
familiar only to some techs who have used it within their programming 

I want to echo Kurt Cagle in the email thread:

"Overall, I'm going to raise this question again - what exactly is it about
namespaces that the HTML crowd doesn't like? If it's the use of complex
namespace URIs, then frankly the ideal solution to that is to provide
guidance on what constitutes a good web URI. If it's the requirement of
using prefixes, then an extension of Micah's pragmatic namespaces solution
seems to be a good start, so long as there is a formal mechanism for
insuring that ANY namespace can be introduced in this matter.

However, if it is simply a desire by a group of people (notably the WHATWG
group) to control the standard at its most conservative, then nothing that
the XML community does, no matter how well intentioned, will make any
difference. This becomes a formal W3C matter (which it ultimately should be)
- not Google, not Ian Hixie, not any of us here individually ... or has the
W3C's focus on the Semantic Web blinded it to the fact that its initial,
primary and ultimate mandate was to act as the custodian of the HTML

> My criticism:
> Follow-up:
> Follow-up to follow-up:
> On Oct 1, 2009, at 02:19, Aryeh Gregor wrote:
> > Some relevant reading (although rather brief at the moment):
> >
> >
> I suggest reading the linked page:
> And this wiki page linked onwards from there:
> > If a proposal like
> > this were adopted, couldn't we allow namespaces, but say that they're
> > just a prefix like "foo:", and drop the association with URLs?
> In that case, you should use "foo.", "foo_" or "foo-" but not "foo:",  
> since you can't get DOM Consistency with "foo:".
> "foo." and "foo:" would both be annoying with Selectors. "foo_" is  
> ugly (IMO). "foo-" is problematic in IE < 8 (and in the IE 5.5 and IE  
> 7 modes of IE8).
> > UAs are supposed to get new features specced, not make up their own  
> > syntax.
> Indeed.
> -- 
> Henri Sivonen

I don't understand this fixation on taking what we know works--we _know_ 
works, we have evidence, it exists, it is neither speculative, nor 
theory--and replacing it with something untested and untried. Moreover, 
something that will forever isolate HTML from XHTML, and generate 
confusion for years to come as people try to understand namespaces for 
older browsers, and whatever new approach the HTML WG deems to be 
"better" for new ones.

The HTML WG is tasked with providing a smooth transition for those 
moving from HTML4 and XHTML 1 to X/HTML5. The Microsoft proposal takes 
this into account. In fact, commendably so.

And with all due respect to Sam, in an earlier email: I don't believe 
this namespace distributed extensibility proposal should be a separate 
document, either, as it is integral to HTML, the markup, rather than 
HTML, the super web application platform. There are many other aspects 
of the current HTML5 specification that, to me, make better candidates 
for being split off. I would love nothing more than to see the core 
HTML document focused on what is core HTML.

I would hate to think that we'll end up with numerous separate 
documents, all coming about basically either because Ian Hickson 
disagrees with the proposal, or the topic is contentious. Or both. That 
would not be a legacy that the HTML WG could be proud of.


Received on Thursday, 1 October 2009 21:23:09 UTC