Re: CR-xml-infoset-20010514: unexpanded entity reference

I think there are two quite different cases:

(a) The parser has seen a declaration of the entity as a parsed external 
general entity but, for whatever reason, decides not to include it.

(b) A parser is presented with a document that is not standalone (because 
it lacks a standalone declaration and references an external parameter 
entity or external DTD) but processes only the declarations in the internal 
subset, and subsequently encounters a general entity reference for which it 
has not seen a declaration.

Why do I say these are different?

(a) corresponds to the "Included if Validating" case in 4.4.3, which 
includes the requirement "If a non-validating processor does not include 
the replacement text, it must inform the application that it recognized, 
but did not read, the entity." 4.4.3 is not applicable to (b). References 
to external parsed entities are not allowed in attribute values, thus the 
infoset can completely and correctly deal with this case, and must do so in 
order to satisfy the reporting requirement of 4.4.3.
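
A minimal sketch of case (a), assuming Python's xml.parsers.expat binding 
as the parser (an illustrative choice; nothing in the infoset draft depends 
on it). The document, the entity name "ext" and the system identifier 
"ext.xml" are all hypothetical. The handler records that the entity was 
recognized and then declines to read it, which is the kind of report 4.4.3 
asks for:

```python
import xml.parsers.expat as expat

# Hypothetical documents: "ext" is declared in the internal subset as an
# external parsed general entity. The file ext.xml is never fetched.
DOC_CONTENT = b'<!DOCTYPE doc [<!ENTITY ext SYSTEM "ext.xml">]><doc>&ext;</doc>'
DOC_ATTR = b'<!DOCTYPE doc [<!ENTITY ext SYSTEM "ext.xml">]><doc a="&ext;"/>'


def unread_external_entities(document):
    """Parse `document`; return system IDs of entities seen but not read."""
    seen = []

    def on_external_ref(context, base, system_id, public_id):
        # The declaration is known, but the replacement text is not
        # included: record the reference instead of parsing the entity.
        seen.append(system_id)
        return 1  # nonzero tells expat to carry on with the document

    parser = expat.ParserCreate()
    parser.ExternalEntityRefHandler = on_external_ref
    parser.Parse(document, True)
    return seen
```

Calling unread_external_entities(DOC_CONTENT) reports the one unread 
entity. Passing DOC_ATTR instead raises expat.ExpatError, because a 
reference to an external entity in an attribute value is not well-formed, 
so that situation never needs representing.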

The infoset in general cannot handle case (b). Specifically, a reference to 
an entity declared in an external DTD may occur in an attribute value, but 
there's no way for the infoset to represent this (without substantial 
changes to the way attributes are handled). There's also the related issue 
of attribute value normalization and of default values: if the processor 
has not read the declarations, it cannot guarantee the construction of a 
correct infoset.  There's also no reporting requirement for such entities 
in the XML Rec.
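
A minimal sketch of case (b), again assuming Python's xml.parsers.expat 
binding purely for illustration. The document, the DTD name "doc.dtd" and 
the entity name "ent" are hypothetical. The external subset is never read, 
so the parser has seen no declaration for "ent", and the same reference 
appears once in an attribute value and once in content:

```python
import xml.parsers.expat as expat

# Hypothetical document: not standalone, external DTD subset "doc.dtd" is
# never read, and "ent" has no declaration that the parser has seen.
DOC = b'<!DOCTYPE doc SYSTEM "doc.dtd"><doc a="x&ent;y">p&ent;q</doc>'


def parse_without_external_dtd(document):
    """Return (attributes, skipped entity names) as reported by expat."""
    attributes = {}
    skipped = []

    def on_start(name, attrs):
        # Attribute values arrive as finished strings; there is no slot
        # in which an unexpanded reference could be carried.
        attributes.update(attrs)

    def on_skipped(entity_name, is_parameter_entity):
        skipped.append(entity_name)

    parser = expat.ParserCreate()  # defaults: external subset not loaded
    parser.StartElementHandler = on_start
    parser.SkippedEntityHandler = on_skipped
    parser.Parse(document, True)
    return attributes, skipped
```

With this parser the reference in content is reported through the 
skipped-entity callback, while the one in the attribute value is dropped 
without any report, leaving an attribute value that may well be wrong. 
That is an observation about one parser's behaviour, not something the Rec 
mandates.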

I really think the way you suggest is broken.  The unexpanded entity 
reference info item should just deal with the 4.4.3 case.

At the moment, you are far from fully handling case (b).  I don't think it 
is feasible to do so, and there is no requirement in the Rec that you do so. 
I don't think you can do any better than the [all declarations processed] 
and [standalone] properties you have at the moment.

Your definition of [all declarations processed] is a bit misleading.  If 
[all declarations processed] is false, then the value of attributes is not 
always known (because of attribute value normalization), and even the value 
of the [attributes] property of the element info item is not always known 
(because of default attributes).

My suggestion would be to back off on some of the [all declarations 
processed] stuff you have added relatively recently, and say simply that if 
a processor hasn't read the declarations, then it cannot in general 
guarantee the construction of a correct infoset.

--On 18 July 2001 17:03 +0100 Richard Tobin <richard@cogsci.ed.ac.uk> wrote:

>> As I understand it, an [unexpanded entity reference] info item is only
>> for the case where the processor has seen a declaration for an external
>> parsed entity, but has chosen for whatever reason not to expand.  For
>> other cases (eg a parser that does not read external DTDs or parameter
>> entities), the [unexpanded entity reference] is not applicable.
>
> No, it is also intended to be used for the case where the parser has not
> seen a declaration.  Is there some reason why that should not be so?
>
> -- Richard

Received on Wednesday, 18 July 2001 21:58:32 UTC