Re: Possible Compromise solution for namespaces in HTML5 from Sam Ruby on 2009-11-23 (public-html@w3.org from November 2009)

From: Sam Ruby <rubys@intertwingly.net>
Date: Mon, 23 Nov 2009 11:13:21 -0500
To: "public-html@w3.org" <public-html@w3.org>
Message-ID: <4B0AB4A1.6000301@intertwingly.net>
Sam Ruby wrote:
> 
> I do have a few concrete suggestions, ones that I will make in a 
> separate email (with my co-chair hat off).  This email likely won't be 
> sent until tomorrow.

I don't think I'm about to say anything that hasn't already been covered 
in this thread, but here is my personal view of the discussion, with my 
co-chair hat off:

First, as a preface, a quote from the current HTML5 draft:

-----

data-*

These attributes are not intended for use by software that is 
independent of the site that uses the attributes.

-----

Namespaces can be put to a lot of different uses in XML.  Given the 
different patterns of production of HTML ("copy/paste", "script", 
"templates") and huge amount of legacy content, it is an unfortunate 
fact that not all of these use cases should, or even can, be addressed 
by adding namespaces to HTML5.

These discussions seem to spend a lot of time on the topic of namespaces 
not being an appropriate/viable/wise approach for enabling browser 
vendors (like Apple) to create new elements (like canvas).  People won't 
get it right, no mechanism for fallbacks, would break the web, DOM 
consistency, ... the reasons go on and on.

Now, saying that ONE certain use doesn't work doesn't mean that ALL use 
cases are doomed to fail.  So, what is the use case to focus on? 
Flipping the data-* definition on its head, how about "elements and 
attributes with colons in their name are provided primarily to support 
the use case of metadata intended for use by software that IS 
independent of the site that uses the elements and attributes."?

To give a few concrete examples: things like Creative Commons licenses, 
Dublin Core, and FOAF.

The essence of Rob's suggestion is that prefixes matter.  And that there 
are two types of prefixes: registered and unregistered prefixes.  Let's 
explore what this proposal means to various audiences.

-----

Audience 1: parsers, serializers, and the like

This includes browsers.  What does Rob's proposal mean to them?  If we 
accept the limitation on the use case above, it means very little.  The 
most I can come up with is that elements which contain colons in their 
name and which contain a trailing slash (solidus) character are treated 
as empty.  And XML serializers might add missing namespace declarations 
if the prefix is known.  Neither is a hard requirement, and both should 
be discussed more, but unless I am missing something obvious, my point is 
that this is the *MOST* that this proposal requires of such tools, 
beyond what they already do which is to place any and all such 
information in the DOM in a consistent way.
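
To make the serializer half of that concrete, here is a minimal sketch in 
Python of how an XML serializer might work out which namespace 
declarations to add.  The registry, the function name, and its arguments 
are all hypothetical; nothing like this is defined by the proposal, and 
the three URIs are simply the ones commonly used for these vocabularies.

```python
# Hypothetical registry of pre-registered prefixes (an assumption for
# illustration; the proposal does not define its contents).
REGISTERED_PREFIXES = {
    "cc": "http://creativecommons.org/ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def missing_declarations(names, declared):
    """Return the xmlns attributes a serializer could add.

    names:    element and attribute names found in the document
    declared: prefixes that already have an xmlns:prefix declaration
    """
    additions = {}
    for name in names:
        prefix, sep, _local = name.partition(":")
        if not sep or prefix in ("xml", "xmlns"):
            continue  # unprefixed name, or a reserved prefix
        if prefix in declared:
            continue  # the author already declared this prefix
        uri = REGISTERED_PREFIXES.get(prefix)
        if uri is not None:
            # Known prefix with no declaration: supply one.
            additions["xmlns:" + prefix] = uri
        # Unknown prefixes are left alone; there is nothing to add.
    return additions


print(missing_declarations(["svg", "dc:title", "foaf:name", "x:thing"], {"foaf"}))
```

Unregistered prefixes such as x: above are passed through untouched, 
which matches the "means very little" point: the tool only acts when it 
already knows the prefix.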

Audience 2: validators

With this proposal, unrecognized prefix usage (including namespace 
declarations) in elements and attributes would produce a warning. 
Recognized prefix usage would produce errors (if the usage is incorrect) 
or nothing (if the usage is correct).
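
As a rough sketch of that three-way outcome, in Python: the prefix 
registry and the per-prefix name lists below are hypothetical and 
deliberately incomplete, and treating "incorrect usage" as "a local name 
the vocabulary does not list" is my assumption, since what counts as 
incorrect would be up to each registered vocabulary.

```python
# Hypothetical, incomplete vocabularies keyed by registered prefix.
KNOWN_NAMES = {
    "cc": {"license", "attributionName"},
    "dc": {"title", "creator", "date", "description"},
    "foaf": {"name", "mbox", "homepage", "knows"},
}


def check_name(name):
    """Classify one element or attribute name as 'ok', 'warning' or 'error'."""
    prefix, sep, local = name.partition(":")
    if not sep:
        return "ok"  # no colon: not covered by this check

    if prefix == "xmlns":
        # A namespace declaration such as xmlns:dc; judge the declared prefix.
        return "ok" if local in KNOWN_NAMES else "warning"

    if prefix not in KNOWN_NAMES:
        return "warning"  # unregistered prefix: warn, do not fail

    # Registered prefix: the usage is either correct or an error.
    return "ok" if local in KNOWN_NAMES[prefix] else "error"


for candidate in ["dc:title", "dc:bogus", "acme:widget", "xmlns:acme"]:
    print(candidate, check_name(candidate))
```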

Audience 3: authors

Use of registered prefixes is more robust, and less likely to produce 
name collisions.  Use of unregistered prefixes, while less robust and 
more likely to produce name collisions, may make sense in controlled 
environments like company intranets.

Audience 4: script writers

While script writers are not the primary audience for this feature, 
pretending that "intended for sites independent of this site" and 
"intended for use by the site itself" are non-overlapping sets is clearly 
foolish.  As HTML5 is defined today, names (for both elements and 
attributes) that contain colons will be represented differently in the 
DOM when the content is parsed as XML vs as HTML.  This proposal doesn't 
change that.  This simply needs to be documented, noted, and dealt with.
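
A rough analogy using only the Python standard library shows the kind of 
difference a script writer has to cope with.  Note the assumptions: 
html.parser is not an HTML5 parser and builds no DOM, so this only 
illustrates that an XML parser splits a colon name into a namespace plus 
a local name while an HTML-style tokenizer keeps it as one opaque name. 
The helper names are made up for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

MARKUP = (
    '<div xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>Example</dc:title></div>"
)


class TagCollector(HTMLParser):
    """Record start-tag names exactly as the HTML tokenizer reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def names_as_xml(markup):
    # ElementTree resolves prefixes and reports names as "{namespace}local".
    return [element.tag for element in ET.fromstring(markup).iter()]


def names_as_html(markup):
    # The HTML tokenizer knows nothing about namespaces; the colon is
    # just another character in the tag name.
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


print(names_as_xml(MARKUP))
print(names_as_html(MARKUP))
```

Script that looks elements up by name therefore needs to know which 
parser produced the tree it is walking.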

Audience 5: RDFa

I'm including this mostly for completeness; this isn't so much an 
intended audience as an unintended consequence.  If authors can make use 
of Dublin Core inside of SVG inside of HTML without so much as a 
namespace declaration, some will undoubtedly wonder why they can't do 
the same inside of RDFa.  Presuming the use of the same pre-registered 
prefixes may make sense.

Audience 6: XML

While I understand Liam's proposal, I personally don't see it as 
sufficiently broadening the set of HTML documents that can be correctly 
parsed as XML to merit the extra effort.  All it takes is a single 
unadorned ampersand in an href or inline script to trip up an XML 
parser, and Liam's proposal doesn't change that.  And the workaround of 
adding a talisman xmlns declaration in the few places SVG or MathML is 
used is not particularly onerous; furthermore, adding such declarations 
will continue to be necessary for all of the currently deployed XML 
parsers.
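
The ampersand point is easy to check with any XML parser.  Here is a 
small sketch using Python's standard xml.etree.ElementTree; the helper 
name and the sample links are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(markup):
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Common in HTML, but not well-formed XML: "&page" is not followed by ";".
unadorned = '<a href="/search?q=html5&page=2">next</a>'

# The same link with the ampersand escaped.
escaped = '<a href="/search?q=html5&amp;page=2">next</a>'

print(parses_as_xml(unadorned))
print(parses_as_xml(escaped))
```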

- Sam Ruby
Received on Monday, 23 November 2009 16:14:00 UTC