Re: Options for dealing with IDs

On Saturday, January 11, 2003, 7:02:58 PM, Paul wrote:

PG> At 20:02 2003 01 10 +0100, Chris Lilley wrote:

>>I agree that the "what happens when DTD validation is performed" is
>>still an issue that needs to be addressed. That might be as simple as
>>saying "if you have an external or internal subset and you declare
>>attributes to be of some other type than ID then interoperability will
>>suffer so you should ensure, if you are wise, that the IDness is the
>>same with and without DTD validation".
>>Or we could try and say which wins (but I am fairly sure the DTD
>>would win because of the ass-ba^H^H^H^H^H^H prior declaration wins
>>design and the instance is read last).

Apologies for my aside about the first vs last declaration wins
approach. I still think it's non-obvious, but I do understand it.

PG> Suspecting that, by 'DTD' you mean external subset

No, explicitly not.

PG> and by 'instance'
PG> you mean (as far as declarations go) the internal subset,

Not that, either. I appreciate that these are common newbie errors and
that your first thought may be that I have blundered into them; I have
not.

PG> what you say above isn't correct.

I believe it is correct, because both of your suspicions were wrong. By DTD
validation I mean exactly that - DTD validation. Whether the DTD is
located entirely in the internal subset, entirely in the external
subset, or is partitioned among them both is immaterial to the

And by instance I mean the instance. The first declaration wins, in
SGML and in XML. The internal subset is read before the external
subset. The instance (the part after the internal subset, if any) is
read after that. So, given any method of declaring IDness in the
*instance* (no such method exists in XML 1.0, but most of the options
listed here would add one), any declaration of IDness in the internal
or external subset would come first in parsing order and would
therefore be the winning declaration.
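
To make the ordering argument concrete, here is a minimal sketch in
Python. It is a toy model, not a parser: the names ORDER and
effective_type are made up for illustration, and the "instance" source
is hypothetical, since XML 1.0 has no in-instance way to declare IDness.
It only shows that, under a first-declaration-wins rule, whatever is
read earliest is binding.

```python
from typing import Iterable, Optional, Tuple

# Assumed processing order, as described in the message: internal subset,
# then external subset, then the instance. "instance" is hypothetical.
ORDER = ("internal", "external", "instance")

# One declaration: (source, element name, attribute name, declared type)
Decl = Tuple[str, str, str, str]


def effective_type(
    decls: Iterable[Decl], element: str, attribute: str
) -> Optional[Tuple[str, str]]:
    """Return (source, type) of the binding declaration, or None.

    Declarations are ranked by the order in which their source is read;
    the first match wins and all later ones are ignored.
    """
    ranked = sorted(decls, key=lambda d: ORDER.index(d[0]))  # stable sort
    for source, el, att, typ in ranked:
        if el == element and att == attribute:
            return source, typ
    return None


decls = [
    ("instance", "p", "name", "ID"),     # hypothetical in-instance declaration
    ("external", "p", "name", "CDATA"),  # declaration in the external subset
]
winner = effective_type(decls, "p", "name")
```

Here the external-subset declaration is binding even though the
instance says ID, which is the sense in which "the DTD wins".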

PG> The internal subset is processed before the external subset.

I know.

PG> In the case of entity declarations, all but the first are ignored.
PG> So the first declaration of a given entity in the internal subset 
PG> will cause any other declarations in (or referenced from) the internal 
PG> subset and all declarations in the external subset to be ignored.


PG> As far as ATTLIST declarations, multiple ATTLIST declarations for
PG> a given element are allowed and all take effect (they are "merged").  
PG> "When more than one definition is provided for the same attribute of 
PG> a given element type, the first declaration is binding and later 
PG> declarations are ignored."
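
The merging and first-is-binding behaviour quoted above can be observed
with an ordinary non-validating parser. The sketch below assumes
Python's standard xml.etree.ElementTree (which is built on expat) and
uses attribute defaults rather than ID types, because defaults are what
such a parser visibly applies. The element and attribute names are made
up for the example.

```python
import xml.etree.ElementTree as ET

# Three ATTLIST declarations for the same element type in the internal
# subset. Attribute "a" is declared twice with different defaults.
DOC = """<!DOCTYPE e [
<!ATTLIST e a CDATA "first">
<!ATTLIST e b CDATA "merged">
<!ATTLIST e a CDATA "second">
]>
<e/>"""

root = ET.fromstring(DOC)

# Defaults supplied by the declarations show up as ordinary attributes.
attrs = dict(root.attrib)
```

Both a and b receive defaults (the declarations are merged), and a
takes its value from the earlier of its two declarations.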


PG> In short, unless I've misunderstood you,

You have.

PG> it is not true that "the DTD wins."

I suggest you go back and read the starting post in this thread,
entitled "Options for dealing with IDs", without which any discussion
of declaring IDness in the *instance* (and I do mean the instance
here) will not make much sense.

I then invite you to comment on whether, in those options, the DTD
declarations would win. I still assert they would, for the same reasons
you clearly explained here.

PG> In fact, the internal subset "wins" in most cases that multiple
PG> declarations are allowed because it is processed first.

Yes, I know. I am deleting a bunch of the rest of your message because
it is already established that you misunderstood what I was speaking
about, and your text merely reiterates the relative processing order
of the internal and external subsets.

>>c) best current practice for new document types is to use a single
>>attribute name for all attributes of type ID, where possible

PG> Yep.  The combination of a, b, and c is what led me to mention
PG> SGML's ATTLIST #ANY idea.  If this were allowed, your best practice
PG> suggestion would end up being a suggestion to put something like:


PG> into your internal subset.

Or the similar declaration in the instance, in any of the other XML
1.0-compatible syntaxes proposed.


Received on Saturday, 11 January 2003 14:10:23 UTC