Re: ISSUE-41/ACTION-97 decentralized-extensibility from Henri Sivonen on 2009-10-01 (public-html@w3.org from October 2009)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Thu, 1 Oct 2009 10:20:05 +0300
To: HTML WG <public-html@w3.org>
Message-Id: <3AED9A48-FDCA-453C-9620-2A28E450B124@iki.fi>

Tony Ross wrote:
> HTML is a standard language used to mark-up hypertext documents.  
> While it is most commonly thought of as the language used to define  
> the browseable web, the set of user agents that process HTML is  
> broader than web browsers. Ensuring interoperability between  
> different user agents is the key goal of a HTML standard.

Do you mean non-browser UAs for the content of the browsable Web  
(mentioned in the following paragraph) or do you also mean HTML UAs  
for environments other than the browsable Web?

> It is a common practice for authors, tool vendors, and library  
> authors to want to extend languages to represent additional  
> information that can't be adequately described by the standard  
> grammar. This might be used to preserve metadata used by one tool in  
> a chain of operations. It might be actual data to be processed by a  
> user agent as an extension to the standard processing. Here are a  
> few examples that apply to HTML:
>
>     * A HTML document editor adds information about tool settings so  
> that a subsequent editing session can continue with the same settings.

Humans who write HTML save their mental state in HTML by writing HTML  
comments. Is there a reason why comments with a product-specific  
internal formatting wouldn't be an appropriate way to serialize  
authoring tool state?

>     * A JavaScript library processes custom tags in a browser and  
> turns them into custom controls dynamically on the page.

HTML5 addresses this use case with the data-* attributes. You take the  
element that gives the best fallback behavior when the script doesn't  
run and then put the script-sensitive stuff in data-* attributes.
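
A minimal sketch of that pattern, for illustration only. The markup, the
attribute names and the readWidgetConfig helper are hypothetical; none of
them come from the proposal or from the HTML5 draft.

```javascript
// Hypothetical markup: a plain <input> is the fallback when the script does not run.
//   <input type="text" name="volume" value="5"
//          data-widget="slider" data-min="0" data-max="10">

// Reads the script-only settings from data-* attributes.
// `el` is anything exposing getAttribute(), such as a DOM element.
function readWidgetConfig(el) {
  var kind = el.getAttribute("data-widget");
  if (kind === null) {
    return null; // Not marked for enhancement; leave the fallback alone.
  }
  return {
    kind: kind,
    // Assumed defaults (0 and 100) when the attribute is absent.
    min: Number(el.getAttribute("data-min") || 0),
    max: Number(el.getAttribute("data-max") || 100)
  };
}
```

A library would call such a helper on each candidate element and replace
only the ones that return a configuration, so unscripted UAs still see a
working text field.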

>     * A browser wants to allow custom behaviors to be defined in one  
> module and attached automatically to custom elements.

How would the custom behaviors be implemented? In page-supplied XBL2?  
In native code specific to a combination of browser, OS and CPU  
architecture pre-installed prior to loading the page?

If in XBL2, wouldn't it be sufficient to be able to bind the behavior  
to class attributes or to local names that have a special character  
reserved for extensibility (such as '.' or '_' but *not* ':') without  
having to go through the trouble of changing the namespace URI from  
what the HTML5 spec says now?

If in native code, how would the unavailability of the native code  
behavior for a given browser, OS and CPU combination be less bad for  
the ability of the user to read Web content than the unavailability of  
an NPAPI plug-in (or NPAPI-plug-in-like ActiveX control)? That is, how  
would this proposal be an improvement over the current mechanism for  
proprietary extensions?

(I think the discussion about extensibility should be framed in terms  
of the ability of users to read content, not in terms of the ability  
of authors to write content.)

>     * An author includes processing instructions in the document  
> that will be processed by a server before delivering the document to  
> a user agent.

Why does this use case require the complication of dispatching on a  
{namespace, local} pair as opposed to dispatching on identifiers that  
are simple strings? Why does this case require any resemblance to what  
IE does now?
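
To show what I mean by dispatching on simple strings, here is a rough
sketch of a server-side preprocessor. Everything in it is an assumption
made up for illustration: the "acme_" names, the handler table and the
expand function are not part of any proposal or product.

```javascript
// Hypothetical handler table keyed by a plain element name.
var handlers = {
  "acme_include": function (attrs) {
    return "<!-- included " + attrs.src + " -->";
  },
  "acme_year": function () {
    return "2009";
  }
};

// Dispatch is a single string lookup. No prefix has to be resolved to a
// namespace URI against in-scope declarations before the handler is found.
function expand(name, attrs) {
  if (!Object.prototype.hasOwnProperty.call(handlers, name)) {
    return null; // Unknown name: leave the element untouched.
  }
  return handlers[name](attrs || {});
}
```

With a {namespace, local} pair, the same lookup would first need the
prefix-to-URI mapping in scope at that point in the document.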

>     * An author runs a tool on a document to add numbering to  
> headings and a table of contents. Running this tool leaves custom  
> metadata tags intact.

Is the key to this bullet point leaving custom metadata intact or  
being able to discover what numbers were written by the tool itself in  
a previous run?

> Using research data gathered by Microsoft, we identified a number of  
> these concerns and this proposal was altered to avoid serious issues.

Have you analyzed previous cases where proprietary extensibility of  
HTML has been practiced and checked whether the Web would have been  
better if your proposed mechanism had been used for those cases  
(e.g. <marquee>, <blink>, <canvas>, iPhone viewport meta directive,  
X-UA-Compatible, Palm Pre-specific attributes)?

>      var myCustomElements =
>           document.getElementsByTagNameNS("com.mycompany", "*");

>      @namespace my "com.mycompany";
>      my|*

Why do the examples use a non-URI namespace?

On Oct 1, 2009, at 02:09, Jonas Sicking wrote:

> I'm not actually a big fan of this proposal. Experience with
> namespaces in XML has showed (at least to me) that namespaces are too
> complex for authors to understand.
[...]
> So all in all it feels like momentum is moving away from the XML
> Namespaces model, rather than towards it.

I agree.

> I much rather like the mechanism that CSS is using. Non-standard token
> names are prepended by "-name-" in order to avoid collisions. Could we
> do something similar by using "name_" at the beginning of
> non-standardized names. We could even let people use element/attribute
> names like "www_myorg_org_myelement".

This is syntactically similar to a recent proposal on xml-dev to use  
reverse DNS identifiers as names:
http://markmail.org/message/mfd4nth7hsoro7dd

My criticism:
http://markmail.org/message/rpl4mdma2smojtaq

Follow-up:
http://markmail.org/message/duxrugqssp2vbhvm

Follow-up to follow-up:
http://markmail.org/message/uywp4yfc4lvz6erd


On Oct 1, 2009, at 02:19, Aryeh Gregor wrote:

> Some relevant reading (although rather brief at the moment):
>
> http://wiki.whatwg.org/wiki/WhyNoNameSpaces

I suggest reading the linked page:
http://wiki.whatwg.org/wiki/Namespace_confusion

And this wiki page linked onwards from there:
http://microformats.org/wiki/namespaces-considered-harmful

> If a proposal like
> this were adopted, couldn't we allow namespaces, but say that they're
> just a prefix like "foo:", and drop the association with URLs?

In that case, you should use "foo.", "foo_" or "foo-" but not "foo:",  
since you can't get DOM Consistency with "foo:".

"foo." and "foo:" would both be annoying with Selectors. "foo_" is  
ugly (IMO). "foo-" is problematic in IE < 8 (and in the IE 5.5 and IE  
7 modes of IE8).
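
To illustrate the Selectors annoyance: '.' starts a class selector and
':' starts a pseudo-class, so either character has to be backslash-escaped
when it is part of an element name in a type selector. A small sketch
(the typeSelectorFor helper is hypothetical and only handles these two
characters):

```javascript
// Escapes "." and ":" so that an element name can be used as a type selector.
// Names using "_" or "-" as the separator pass through unchanged.
function typeSelectorFor(name) {
  return name.replace(/[.:]/g, "\\$&");
}
```

That is, an element named foo.bar has to be selected as foo\.bar, because
the unescaped foo.bar means "foo elements with class bar".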

> UAs are supposed to get new features specced, not make up their own  
> syntax.

Indeed.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Thursday, 1 October 2009 07:20:41 UTC