Re: Non-Validating XML Parsers: Requirements

John Cowan wrote:
> 
> (There is nothing official about this: it is what I glean from
> reading the XML recommendation plus applying reason and common sense.)
> NVP = non-validating conforming parser(s).  Other capitalized terms
> are used as in RFC 2119.

I think that such requirements MUST be clarified (per RFC 2119 :-)
and hence I cc'd the xml editor alias.  (Thanks for writing this up!)

I'm basically in agreement, but after careful study of the spec I
believe that the treatment of external entities there defines not
two, but THREE (!!) categories of parsers:

	- Validating Parsers

	- Nonvalidating Parsers that read external definitions

	- Nonvalidating Parsers that don't read those definitions

The point is that, as section 5.2 makes clear, some WF errors will
not be reported by the third category of parser, but must be
reported by the other two.

Making the two nonvalidating categories explicit would resolve some
confusion.
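
As a rough illustration, the three categories and the reporting rule
above could be sketched in Python as follows.  The names ParserKind
and must_report_external_wf_errors are hypothetical; they come from
neither the spec nor any real parser, and the sketch only encodes the
reading of section 5.2 given here.

```python
from enum import Enum


class ParserKind(Enum):
    # Hypothetical labels for the three categories listed above.
    VALIDATING = 1
    NONVALIDATING_READS_EXTERNAL = 2
    NONVALIDATING_SKIPS_EXTERNAL = 3


def must_report_external_wf_errors(kind: ParserKind) -> bool:
    """Return True if this kind of parser must report WF errors that
    can only be detected by reading external entities."""
    # Only a parser that never reads external definitions can miss them.
    return kind is not ParserKind.NONVALIDATING_SKIPS_EXTERNAL
```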
 

> 1.  An NVP MUST check the document entity for well-formedness and
> report any violations.
> 
> 2.  An NVP MAY check external entities (including the external DTD
> subset, external parsed entities, and external parameter entities) for
> well-formedness, and if it does so, MUST report any violations.
> 
> 3.  An NVP MUST process certain attribute list and entity declarations,
> and use them to normalize attribute values, include the replacement
> text of internal entities, and supply default attribute values.
> 
> 4.  An NVP MAY process attribute list and entity declarations that
> appear in external entities (including the external DTD subset and
> external parameter entities).
> 
> 5.  An NVP MUST NOT process attribute list and entity declarations that
> logically follow references to any parameter entities that have not
> been read by the NVP.  As usual, everything in the external
> DTD subset logically follows everything in the internal DTD subset.
> 
> 6.  An NVP MAY NOT signal an error if a reference is made to an
> undeclared entity, if the entity was declared in some external entity.

That'd be "SHALL NOT" ... but I think that a parser which
has read the external DTD entities MUST report such problems
as WF errors.  See section 5.2 of the XML specification, which
is explicit that this is a WF error AND that it's not consistently
reported.

One could read the second paragraph of the "entity declared" WF
constraint in 4.1 as making this not be a WF error, in conflict
with 5.2 ... or one could read that as applying only to the case
of standalone documents.  (That is, only for standalone documents
is a nonvalidating parser that doesn't read external entities
able to report this as a WF error.)  I choose the latter reading
since it keeps the spec internally consistent, but perhaps
the XML editors list sees this differently.
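
The reading chosen above can be written down as a small decision
function.  This is only a sketch under stated assumptions: the
function and parameter names are hypothetical, and has_external_decls
is a simplification meaning "the document has an external DTD subset
or refers to external parameter entities".

```python
def must_report_undeclared_entity(has_external_decls: bool,
                                  parser_reads_external: bool,
                                  standalone: bool) -> bool:
    """Return True if a reference to an entity with no declaration
    seen so far must be reported as a WF error, under the reading
    of 4.1 and 5.2 chosen above."""
    if not has_external_decls or standalone:
        # No declaration can be hiding in an entity that was not read,
        # so the "entity declared" constraint applies in full (4.1).
        return True
    # Otherwise only a parser that actually read the external
    # declarations can know the entity is really undeclared (5.2).
    return parser_reads_external
```

For example, a nonvalidating parser that skips the external subset of
a non-standalone document gets False here, while the same parser on a
standalone="yes" document gets True.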


> 7.  An NVP MAY NOT signal an error if a reference is made to an
> unparsed entity, if the entity was declared in some external entity.

As above -- this is a "MUST" if the parser reads external entities,
and a "MUST NOT" otherwise.


> 8.  An NVP MAY NOT signal an error if an entity refers to itself
> directly or indirectly, if either the entity or some other part
> of the entity circle was declared in some external entity.

As above, MUST when the parser has read the external entities.


> 9.  An NVP MAY NOT signal an error if a reference to an external
> entity is made in an attribute value, if the entity was declared
> in some external entity.

As above.

> 10.  An NVP MAY NOT signal an error if a reference to an entity
> (other than a parameter entity) is made within the DTD, if the
> entity was declared in some external entity.

As above.

- Dave

Received on Monday, 3 August 1998 18:40:40 UTC