- From: Jonny Axelsson <jonny@metastasis.net>
- Date: Sun, 13 Feb 2000 14:14:11 +0100
- To: www-html@w3.org
At 19:11 09.02.00 -0500, Arjun Ray wrote:

[[First, thanks for the URLs you gave]]

>It's important to distinguish between 'ID' as a name and ID as a
>declared value. The 'unique namespace', in this case, is defined by

Yes, I consciously "overloaded" ID by giving the ID a meaning (a reference to some external scheme), but I did it for good practical reasons. Not doing so would add a "conversion layer" to (X)HTML whenever some other process or person wants access. It doesn't have to be much: a two-column table where one column is the (X)HTML ID and the other the "real" reference. But if the process adding IDs is unconnected to the one maintaining the external data store, that table is not so simple to make. Not *having to* use such a table can easily make (X)HTML "live" today.

That "conversion layer" is not a bad thing. The /semantic web/, however implemented, would add such a layer, making it possible for *any* system with proper access to link up the page with that data store or other stores. Shortcuts, such as the direct HTML<->database linking above, may make the road there easier, and they will not create conceptual traps like "P is like BR, only with a little more vertical space" (that is, I don't think overloading IDs is the first step towards any future bad practices).

Solution for Case 2 (SSI and merged pages)

It isn't too hard to find a way to make unique identifiers, even with multiple sources. If each "page component" (SSI or whatever) has a unique alphabetical prefix string (A-Za-z for HTML, the character list for XML), followed by "." (a good luck charm) and consecutive numbering 0..last-ID, every ID on every page *will* be unique: zfq.0, zfq.1..zfq.N, banana.0..banana.N,...

This isn't the issue. The issue is that I want mnemonic IDs. Just as I prefer simple, comprehensible URIs, I prefer comprehensible #fragment identifiers. I want simple rules to generate and refer to a specific part of a page.
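The prefix-plus-counter scheme for merged pages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `component_ids` and `merged_page_ids` are hypothetical, and each page component is assumed to know in advance how many IDs it needs.

```python
import re

# Assumption: prefixes are restricted to A-Za-z, as suggested for HTML.
PREFIX_RE = re.compile(r"[A-Za-z]+")


def component_ids(prefix, count):
    """Return the IDs prefix.0 .. prefix.(count-1) for one page component."""
    if not PREFIX_RE.fullmatch(prefix):
        raise ValueError(f"prefix must be purely alphabetical: {prefix!r}")
    return [f"{prefix}.{i}" for i in range(count)]


def merged_page_ids(components):
    """Collect IDs for a page assembled from several components.

    `components` is a list of (prefix, count) pairs. Because every prefix
    is unique and contains no ".", no two generated IDs can collide.
    """
    prefixes = [prefix for prefix, _ in components]
    if len(set(prefixes)) != len(prefixes):
        raise ValueError("each page component needs its own prefix")
    ids = []
    for prefix, count in components:
        ids.extend(component_ids(prefix, count))
    return ids


if __name__ == "__main__":
    print(merged_page_ids([("zfq", 3), ("banana", 2)]))
```

The IDs are guaranteed unique but carry no meaning, which is exactly the drawback raised above: nothing in "zfq.1" tells a reader what it points to.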
For a typical text document a rule could be something like

  A headline's ID is its section number
  A headline's ID is its sequence number
  A headline's ID is its content [uniqueness constraint]
  A headline's ID is the first word of its content [uniqueness constraint]

The last two are "designer friendly", but may use characters not allowed in HTML (XML) IDs.

On the other hand, programs like HTML Transit automatically generate IDs for each element that will be referred to; the IDs may even be unique in the entire "Transit-space" (not only the page in question), fulfilling every need for an HTML ID. But they may be a pain to use. If you want to link to that element (and can't use cut&paste), you have to type in a long number, and you have no way to tell whether the link is right or wrong except by testing it.

Generally you try to avoid it, but sometimes you need to refer to a fragment in print, like <http://philantroph.net/widows/#application form> (the UA would encode this as <http://philantroph.net/widows/#application%20form>, and a URL like <http://philantroph.net/widows/application/> is certainly better when possible) or <http://my.employer.com/employees.html#Jonny Axelsson>. <http://my.employer.com/employees.html#JonnyAxelsson> is possible, but more awkward for the user, while <http://my.employer.com/employees.html#gc930772015-1307> would be bad.

I admit that there *is* a danger with such generative rules: they are and should be generative, not semantic. The IDs that *are* easiest to generate within the bounds of ID name characters, section numbers and sequence numbers, are the worst, but in any case if the contents of the HTML document change, the IDs should not. That is, if you have <a href="//source.org/report-2000-4/#sect3.2">, and some new section 2 is inserted, the ID should not change to "sect4.2", even though the headline will now be numbered 4.2 (if you want to refer to section 4.2, no matter the content, XPath is available).
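As a rough sketch of the "designer friendly" rules (ID from the headline's content, or from its first word, with a uniqueness constraint), the Python below squeezes a headline into the characters HTML 4 allows in an ID: a letter first, then letters, digits, "-", "_", ":" or ".". The function name `headline_id`, the "id-" fallback prefix, and the "-2", "-3" collision suffixes are all assumptions made for illustration, not part of any specification.

```python
import re

# Anything outside the HTML 4 ID character set is dropped.
_NOT_ALLOWED = re.compile(r"[^A-Za-z0-9_:.-]+")
_STARTS_WITH_LETTER = re.compile(r"[A-Za-z]")


def headline_id(text, taken, first_word_only=False):
    """Derive a mnemonic ID from headline text and record it in `taken`.

    `taken` is the set of IDs already used on the page; it enforces the
    uniqueness constraint by appending -2, -3, ... on a collision.
    """
    words = text.split()
    if first_word_only:
        words = words[:1]
    # Spaces are not allowed in an ID, so join the words with hyphens.
    candidate = _NOT_ALLOWED.sub("", "-".join(words))
    # An HTML 4 ID must begin with a letter (hypothetical "id-" fallback).
    if not _STARTS_WITH_LETTER.match(candidate):
        candidate = "id-" + candidate
    unique = candidate
    n = 2
    while unique in taken:
        unique = f"{candidate}-{n}"
        n += 1
    taken.add(unique)
    return unique


if __name__ == "__main__":
    used = set()
    for title in ["application form", "Jonny Axelsson", "application form"]:
        print(headline_id(title, used))
```

A generator like this should run once, when the headline is created. To keep existing links working, the stored ID would then stay fixed even if the headline text or its position later changes.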
To go back to the database example, the record would want to refer to a specific part of the document unrelated to what comes before or after.

-----

As for the "Big Question" in my original message, now rephrased "Would it be (dis)advantageous to have non-unique IDs in the same HTML/XML document (URI)?", I am still curious about the answer. Of course, this is in the "idle speculation" category; there won't be any change for years to come, even if there *were* advantages to this. It really comes down to the universal "namespace", which today is URI/BLOB, e.g. the XML page is the atom. An alternative model would be URI/tree.
Received on Sunday, 13 February 2000 08:53:36 UTC