Re: Name namespace, namespaces and names in general

On Wed, 9 Feb 2000, Jonny Axelsson wrote:

> [ re: nonuniqueness of values of NAME attributes on form controls]
> It is also problematic, since references like [HTML401, sect
> 12.2.3] imply an equality between the ID namespace, the A NAME
> namespace and the <formcontrol> NAME space.

That section has an explicit list of element types to which the spec
mandates this "namespace sharing", but form controls are not included.  
Even so, all that the spec is really doing is waving its hands about
the fact that *some* uses of the NAME attribute duplicate the intent
and function of ID, that we should all hope this historical kludge
goes away some day, and that we know it won't.  The spec might have
been written a bit more clearly, nonetheless: 'ID and name attributes
share the same name space' is a poor characterization (compounded by
how easily names and declared values - i.e. 'types' - can be confused).  
The real requirement is that values of NAME attributes of some element
types, as an application convention, should not clash with IDs - which
are a unique namespace, not by convention, but by definition. 

[ It's important to distinguish between 'ID' as a name and ID as a
declared value.  The 'unique namespace', in this case, is defined by
the latter, not the former.  There can be any number of different
*names* for attributes, but if all of them are of ID *declared value*,
then their values are all subject to the same uniqueness constraint.]
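
A minimal sketch of that distinction, in Python.  The element types,
attribute names and the DECLARED table are hypothetical stand-ins for
what a DTD's ATTLIST declarations would say; nothing here is taken
from the HTML DTD itself.  The point is only that the uniqueness check
keys on the declared value, not on the attribute's name.

```python
from collections import Counter

# Hypothetical stand-in for ATTLIST declarations:
# (element type, attribute name) -> declared value.
DECLARED = {
    ("chapter", "id"): "ID",
    ("figure", "label"): "ID",    # different name, same declared value
    ("input", "name"): "CDATA",   # form control NAME: no uniqueness constraint
}


def duplicate_ids(elements, declared=DECLARED):
    """Return values used more than once across all ID-declared attributes."""
    counts = Counter(
        value
        for element_type, attrs in elements
        for attr_name, value in attrs.items()
        if declared.get((element_type, attr_name)) == "ID"
    )
    return sorted(value for value, n in counts.items() if n > 1)


doc = [
    ("chapter", {"id": "intro"}),
    ("figure", {"label": "intro"}),  # clashes with the chapter's ID
    ("input", {"name": "intro"}),    # CDATA: outside the ID name space
    ("input", {"name": "intro"}),    # repeated NAME on form controls is fine
]

print(duplicate_ids(doc))
```

The 'label' attribute collides with 'id' because both are declared as
ID, while the two identically named form controls are not reported at
all.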

> There are some benefits with a hierarchical namespace scenario. 

Such hierarchies can be constructed (and in fact are the specific
subject of location methods in the HyTime standard, for instance [1]).

[1] http://www.ornl.gov/sgml/wg8/docs/n1920/html/clause-7.1.html

> Case 1: "Semantic" IDs (With a view to a database)

Yes, these can be valuable, but that's not what the ID declared value
is for.  (Its purpose and scope are restricted, by design, to the
individual document - "each document has its own namespace", merely
by its existence.  IDs meet *that* need, only.) 

> [ re data from tables in a RDBMS ]
> COL = dbFIELD, COL.CLASS = dbFIELDNAME
> TR =  dbRECORD, TR.ID = dbRECORDID

Oops... It so happens that the HTML lexicon doesn't have anything for
the purpose you have in mind, but that doesn't mean misusing what's
there!

> The same way to do this in XML would be 
> <person id="Employee-ID">

No.  In fact, this is the confusion in a nutshell.  An ID is not the
*source* but the *target* of references.  COL.CLASS is a reference
*to* the dbFIELDNAME; but for a similar reference *to* a dbRECORDID,
you can't use TR.ID, or any SGML/XML ID.  That's not what it's for.  
You want the moral equivalent of an IDREF or HREF.

> The problem arises when two or more db tables (or "XML records")
> are on the same page. It is possible that two TRs (or <person>s)
> would have the same ID. Indeed given the nature of RDBMS, it is
> highly likely.

Precisely.  One can't constrain the number of *references*:)
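
To illustrate, a rough sketch in Python, assuming an XML vocabulary
with a plain (non-ID) attribute that carries the database key.  The
attribute name 'recordref', the element names and the sample data are
all invented for the example.  Because the attribute is a reference
and not an ID, the same key can appear in any number of tables on the
same page.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical page: two tables, both pointing at database record 1017.
PAGE = """
<page>
  <table name="employees">
    <person recordref="1017">Ada</person>
    <person recordref="1018">Grace</person>
  </table>
  <table name="contractors">
    <person recordref="1017">Linus</person>
  </table>
</page>
"""


def rows_by_record(xml_text, attr="recordref"):
    """Group (table name, row text) pairs by the record key they refer to."""
    by_key = defaultdict(list)
    root = ET.fromstring(xml_text)
    for table in root.iter("table"):
        for person in table.iter("person"):
            by_key[person.get(attr)].append((table.get("name"), person.text))
    return dict(by_key)


refs = rows_by_record(PAGE)
print(refs["1017"])
```

Had 'recordref' been declared as an ID, the second occurrence of 1017
would make the document invalid (and the value could not start with a
digit in the first place).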

> Case 2: Generated pages on demand

Without denying that many face this problem, it's still factitious.
Uncontrolled processes have unreliable results: the problem exists
only because people are just "using what's there" rather than truly
addressing their requirements.  Blind *syntactic* combination of IDs
or anything else is impossible in the general case, but this doesn't
constitute a limitation of IDs.  Again, that's not what they're for.
If blind combination is insisted upon, the lesson is clear: don't use
IDs at all.
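
A small sketch of why blind combination fails, using Python's
html.parser (a lenient HTML tokenizer, not an SGML parser).  The
fragments and the helper names are made up for illustration.  Each
fragment is fine on its own; the clash only exists once they are
pasted into one document.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect the value of every 'id' attribute seen in start tags."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        self.ids.extend(value for name, value in attrs if name == "id" and value)


def ids_in(fragment):
    collector = IdCollector()
    collector.feed(fragment)
    collector.close()
    return collector.ids


def collisions(*fragments):
    """Return ID values that occur more than once across the fragments."""
    counts = Counter(i for fragment in fragments for i in ids_in(fragment))
    return sorted(i for i, n in counts.items() if n > 1)


# Hypothetical output of two independent page generators.
staff = '<table><tr id="r1"><td>Ada</td></tr></table>'
orders = '<table><tr id="r1"><td>Order 1</td></tr></table>'

print(collisions(staff))
print(collisions(staff, orders))
```

No amount of care inside either generator can prevent this; only a
process that controls the combined document can.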

> There is a highly restrictive subset of characters allowed in an ID,

Quite generous, in my opinion:)

> A NAME by comparison is "cdata".

The classic trap:)  Here, CDATA doesn't mean what you think it
means...

> You cannot have an ID with the value "Here I am" (spaces), or for
> that matter "Here%20I%20am" (% is not alphanumeric)

True.  ("alphanumeric" isn't it: it's "name character", a class that
is defined rather than left to "ontological" precognition.)
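
For concreteness, a sketch of the check in Python, assuming the rule
as HTML 4.01 states it for ID and NAME tokens: a letter, followed by
any number of letters, digits, hyphens, underscores, colons and
periods.  Other SGML applications may declare a different set of name
characters; the function name is my own.

```python
import re

# HTML 4.01's rule for ID and NAME tokens (assumed here; other SGML
# applications may define name characters differently).
NAME_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9_:.-]*")


def is_name_token(value):
    """True if the whole value is a name start character plus name characters."""
    return NAME_TOKEN.fullmatch(value) is not None


for candidate in ("Here I am", "Here%20I%20am", "Here-I-am", "rec.1017"):
    print(candidate, is_name_token(candidate))
```

The first two candidates fail because space and '%' are not name
characters, which has nothing to do with being "alphanumeric": '-' and
'.' are not alphanumeric either, and they pass.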

> nor "Here&#20;I&#20;am" ("&#;", and the ID value isn't parsed
> anyway).

True, but wrong reason.  The value *is* parsed [2].

[2] http://lists.w3.org/Archives/Public/www-html/1999Dec/0009.html
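
As a rough illustration only: Python's html.parser is not an SGML
parser, but it shows the same general behaviour, namely that character
references inside an attribute value are resolved before anyone gets
to see the value.  The class and function names are mine, and the
reference is written as &#32;, the decimal reference for a space.

```python
from html.parser import HTMLParser


class AttrGrabber(HTMLParser):
    """Remember the attributes of every start tag, as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        self.attrs.update(attrs)


def attribute_values(markup):
    grabber = AttrGrabber()
    grabber.feed(markup)
    grabber.close()
    return grabber.attrs


values = attribute_values('<a id="Here&#32;I&#32;am">x</a>')
print(repr(values["id"]))
```

What comes out is the resolved value, spaces included, and it is that
resolved value which fails to be a name.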

> Nor can you have an ID beginning with a digit, so <tag id="1"> is
> not valid. Is this really desired behaviour? What advantages are
> there to this?

No particular ones, other than what was found convenient in the past.
As a general rule, I'd say a restricted class of name characters (or
"name-valued expressions") makes for greater portability.  Most of the
objections I've seen arise from the confusion of names and references,
i.e. particular systems have a wider class of names that people would
like to "re-use" - except that the intended re-use is *referential*,
something that is *not* the purpose of IDs in an SGML/XML document.


Arjun
 

Received on Wednesday, 9 February 2000 18:53:15 UTC