Re: Name namespace, namespaces and names in general from Jonny Axelsson on 2000-02-13 (www-html@w3.org from February 2000)

From: Jonny Axelsson <jonny@metastasis.net>
Date: Sun, 13 Feb 2000 14:14:11 +0100
To: www-html@w3.org
Message-Id: <3.0.6.32.20000213141411.00842d30@mail.linpro.no>
At 19:11 09.02.00 -0500, Arjun Ray wrote:
[[First, thanks for the URLs you gave]]

>It's important to distinguish between 'ID' as a name and ID as a
>declared value.  The 'unique namespace', in this case, is defined by

Yes, I consciously "overloaded" ID by giving the ID a meaning (a reference
to some external scheme), but I did it for good practical reasons. Not doing
so would add a "conversion layer" to (X)HTML whenever some other process or
person wants to access it. It doesn't have to be much: a two-column table
where one column is the (X)HTML ID, the other the "real" reference. But if
the process adding IDs is unconnected to the one maintaining the external
data store, that table is not so simple to make. Not *having to* use such a
table can easily make (X)HTML "live" today.
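
To make the two alternatives concrete, here is a minimal Python sketch. The
table contents, the "employees/..." reference format and the function names
are all made up for illustration; nothing here is taken from a real system.

```python
# Hypothetical conversion table: (X)HTML ID -> "real" reference in the
# external data store. Both columns are invented for this example.
ID_TO_RECORD = {
    "zfq.0": "employees/1307",
    "zfq.1": "employees/1308",
}


def resolve_via_table(html_id):
    """Look the ID up in the conversion table; None if it is not listed."""
    return ID_TO_RECORD.get(html_id)


def resolve_overloaded(html_id):
    """With an overloaded ID, the ID itself already is the reference."""
    return html_id


print(resolve_via_table("zfq.0"))             # employees/1307
print(resolve_overloaded("employees-1307"))   # employees-1307
```

The first function only works as long as somebody keeps the table in step
with both the page and the data store; the second needs no table at all.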

That "conversion layer" is not a bad thing. The /semantic web/, however
implemented, would add such a layer, making it possible for *any* system
with proper access to link up the page with that data store or other
stores. Shortcuts, such as the direct HTML<->database linking above, may
make the road there easier, and they will not create conceptual traps like
"P is like BR, only with a little more vertical space" (that is, I don't
think overloading IDs is the first step towards any future bad practices).


Solution for Case 2 (SSI and merged pages)
It isn't too hard to find a way to make unique identifiers, even with
multiple sources. If each "page component" (SSI or whatever) has a unique
alphabetical (A-Za-z for HTML, the character list for XML) prefix string,
followed by "." (a good luck charm) and consecutive numbering 0..last-ID,
every ID on every page *will* be unique: zfq.0, zfq.1..zfq.N,
banana.0..banana.N,... This isn't the issue.
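
The prefix scheme above can be sketched in a few lines of Python. This is
only an illustration: the function names and the shape of the input (a
mapping from prefix to a list of element labels) are my own assumptions.

```python
import re
from itertools import count


def id_generator(prefix):
    """Yield prefix.0, prefix.1, ... for one page component."""
    # The prefix is restricted to A-Za-z, as in the HTML case described above.
    if not re.fullmatch(r"[A-Za-z]+", prefix):
        raise ValueError(f"prefix must be alphabetical: {prefix!r}")
    for n in count():
        yield f"{prefix}.{n}"


def merge_components(components):
    """Assign IDs to the elements of several components merged into one page.

    `components` maps each component's unique prefix to its elements
    (a hypothetical input shape). Returns a dict of ID -> element.
    """
    page_ids = {}
    for prefix, elements in components.items():
        ids = id_generator(prefix)
        for element in elements:
            page_ids[next(ids)] = element
    return page_ids


page = merge_components({
    "zfq": ["header", "nav"],
    "banana": ["intro", "body", "footer"],
})
print(sorted(page))
# ['banana.0', 'banana.1', 'banana.2', 'zfq.0', 'zfq.1']
```

Uniqueness on the merged page follows from the prefixes being unique; the
numbering only has to be unique within each component.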

The issue is that I want mnemonic IDs. Just as I prefer simple,
comprehensible URIs, I prefer comprehensible #fragment identifiers. I want
simple rules to generate and refer to a specific part of a page. For a
typical text document a rule could be something like

A headline's ID is its section number
A headline's ID is its sequence number
A headline's ID is its content [uniqueness constraint]
A headline's ID is the first word of its content [uniqueness constraint]

The last two are "designer friendly", but may use characters not allowed in
HTML (XML) IDs. On the other hand, programs like HTML Transit automatically
generate IDs for each element that will be referred to. These IDs may even
be unique in the entire "Transit-space" (not only the page in question),
fulfilling every need for an HTML ID. But they may be a pain to use. If you
want to link to that element (and can't use cut&paste), you have to type in
a long number, and you have no way to check whether the link is right
except by testing it.
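
As a rough sketch, the four headline rules listed above could look like this
in Python. The "sect" and "h" prefixes, the rule names and the policy of
replacing disallowed characters with "-" are assumptions made for the
example, not part of any specification or of HTML Transit.

```python
import re


def to_id_token(text):
    """Reduce text to characters HTML 4 allows in an ID (one possible policy)."""
    token = re.sub(r"[^A-Za-z0-9_:.-]+", "-", text.strip()).strip("-")
    # An HTML 4 ID must begin with a letter; "h" is an arbitrary filler.
    if not token or not token[0].isalpha():
        token = "h" + token
    return token


def headline_ids(headlines, rule):
    """Return one ID per headline; `headlines` is a list of
    (section_number, text) pairs in document order."""
    ids = []
    for seq, (section, text) in enumerate(headlines, start=1):
        if rule == "section":
            candidate = "sect" + section
        elif rule == "sequence":
            candidate = f"h{seq}"
        elif rule == "content":
            candidate = to_id_token(text)
        elif rule == "first-word":
            words = text.split()
            candidate = to_id_token(words[0] if words else "")
        else:
            raise ValueError(f"unknown rule: {rule!r}")
        # The uniqueness constraint: refuse rather than silently rename.
        if candidate in ids:
            raise ValueError(f"uniqueness constraint violated: {candidate!r}")
        ids.append(candidate)
    return ids


headlines = [
    ("1", "Introduction"),
    ("2", "Application form"),
    ("2.1", "Who may apply?"),
]
print(headline_ids(headlines, "section"))     # ['sect1', 'sect2', 'sect2.1']
print(headline_ids(headlines, "content"))
# ['Introduction', 'Application-form', 'Who-may-apply']
print(headline_ids(headlines, "first-word"))  # ['Introduction', 'Application', 'Who']
```

Note that the content-based rules already have to rewrite the headline
("Application form" becomes "Application-form") to stay within the allowed
characters, which is exactly the friction described above.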

Generally you try to avoid it, but sometimes you need to refer to a
fragment in print, like <http://philantroph.net/widows/#application form>
(the UA would send this as
<http://philantroph.net/widows/#application%20form>, and a URL like
<http://philantroph.net/widows/application/> is certainly better when
possible) or <http://my.employer.com/employees.html#Jonny Axelsson>.
<http://my.employer.com/employees.html#JonnyAxelsson> is possible, but more
awkward for the user, while
<http://my.employer.com/employees.html#gc930772015-1307> would be bad.
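
The escaping of the space can be reproduced with Python's standard
urllib.parse.quote. The helper name fragment_url is made up for this sketch.

```python
from urllib.parse import quote


def fragment_url(base, fragment):
    """Append a percent-encoded fragment identifier to a base URL."""
    # safe="" makes quote() escape every reserved character, "/" included.
    return base + "#" + quote(fragment, safe="")


print(fragment_url("http://philantroph.net/widows/", "application form"))
# http://philantroph.net/widows/#application%20form
print(fragment_url("http://my.employer.com/employees.html", "Jonny Axelsson"))
# http://my.employer.com/employees.html#Jonny%20Axelsson
```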

I admit that there *is* a danger with such generative rules: they are and
should be generative, not semantic. The IDs that *are* easiest to generate
within the bounds of ID name characters, section numbers and sequence
numbers, are the worst; in any case, if the contents of the HTML document
change, the
IDs should not. That is, if you have <a
href="//source.org/report-2000-4/#sect3.2">, and some new section 2 is
inserted, the ID should not change to "sect4.2", even though the headline
now will be numbered 4.2 (if you want to refer to section 4.2, no matter
the content, XPath is available). To go back to the database example, the
record would want to refer to a specific part of the document unrelated to
what comes before or after.
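
One way to read "generative, not semantic" is: generate the ID once, then
never touch it again. A minimal Python sketch under that reading follows.
The list-of-dicts document model, the flat (top-level only) numbering and
the "-1" collision suffix are assumptions made for illustration.

```python
def assign_missing_ids(sections):
    """Give an ID to each section that lacks one; never change existing IDs."""
    used = {s["id"] for s in sections if s.get("id")}
    for number, section in enumerate(sections, start=1):
        if section.get("id"):
            continue  # already assigned: keep it, whatever the current number
        candidate = f"sect{number}"
        suffix = 0
        # The current number may clash with an ID handed out earlier.
        while candidate in used:
            suffix += 1
            candidate = f"sect{number}-{suffix}"
        section["id"] = candidate
        used.add(candidate)
    return sections


doc = [{"title": "Scope"}, {"title": "Method"}, {"title": "Results"}]
assign_missing_ids(doc)
print([s["id"] for s in doc])  # ['sect1', 'sect2', 'sect3']

# Insert a new second section; the later sections are renumbered in print,
# but their IDs (and every link pointing at them) stay as they were.
doc.insert(1, {"title": "Background"})
assign_missing_ids(doc)
print([s["id"] for s in doc])  # ['sect1', 'sect2-1', 'sect2', 'sect3']
```

After the insertion "Results" is printed as section 4 but is still reached
as #sect3, which also shows why number-derived IDs are the least mnemonic
choice in the long run.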

-----

As for the "Big Question" in my original message, now rephrased "Would it
be (dis)advantageous to have non-unique IDs in the same HTML/XML document
(URI)?", I am still curious about the answer. Of course, this is in the
"idle speculation" category; there won't be any change for years to come,
even if there *were* advantages to this. It really comes down to the
universal "namespace", which today is URI/BLOB, e.g. the XML page is the
atom. An alternative model would be URI/tree.
Received on Sunday, 13 February 2000 08:53:36 UTC