Re: RE deleta est from Peter Murray-Rust on 1997-06-11 (w3c-sgml-wg@w3.org from June 1997)

From: Peter Murray-Rust <Peter@ursus.demon.co.uk>
Date: Wed, 11 Jun 1997 22:10:24 GMT
To: w3c-sgml-wg@w3.org
Message-Id: <7876@ursus.demon.co.uk>
In message <199706111436.KAA20852@www10.w3.org> Michael Sperberg-McQueen writes:
> On Tue, 10 Jun 1997 23:38:58 -0400 (EDT) Terry Allen said:
[... concerns which I share deleted ...]

[...]
> >The target audience for this spec shouldn't have to figure out
> >what kind of XML processor reads DTDs, because half the time he'll
> >get it wrong.

I think there is a real danger of this.

> 
> Point taken.  I had thought the following was clear enough from
> the spec, but perhaps it's not:
> 
>   - XML is designed to allow processors to skip reading the DTD,
> if there is a DTD and they choose to skip it; if the RMD in the
> XML declaration is correct, such processors can reliably know
> whether skipping the DTD will have any effect on the grove they
> produce.
>   - Validating processors must always read all the DTD.

Agreed.

>   - Non-validating processors need not read the DTD.

But if they do/not they may 'pass different output to the application'.
Is this optional behaviour intended?

>   - A processor may read the entire DTD and still not be a validating
> processor (I might read the entire DTD so as to be able to produce
> the absolutely correct grove, and still not get around to
> checking against content models) -- such processors are unlikely to
> be very very common, but that's irrelevant to the fact that they

The MythicalGradStudent may not have got round to writing a validator in the
week, but can still produce the grove :-)

> are logically possible and should be legal.
> 
> So the difference between processors which read the DTD and those
> which don't is not validating vs. non-validating, and not conformant
> vs. non-conformant.  It's just processors which read the DTD vs.
> those which don't.  The former is a superset of validating
> processors.

Under '5. Conformance' the spec says:
'Conforming XML processors fall into two classes: validating and 
non-validating'.

This implies only two modes of behaviour.  But the 'non-validating' now fall 
into at least two subclasses 'those that read the DTD and those that do not'.
And there is ample scope for subclassing in those which 'read the DTD' and 
vary in action according to whatever the MSG author thinks is reasonable - 
they are only allowed a week to work it out.

What does a reader/non-reader do with:
<!DOCTYPE FOO [
<!ENTITY BAR "baz">
<!ATTLIST FOO BLORT CDATA #FIXED "XYZZY">
]>
<FOO A="B">&BAR;</FOO>

Although it might seem unreasonable, a non-reader could fail to read the entity
declaration and presumably (4.4) its only option is to throw an error - i.e
a non-reader has ipso facto a lower conformance level than a reader.  In 
similar fashion the non-reader will fail to insert the BLORT attribute in the
output - is this really acceptable behaviour?

In fact I can see little justification for non-reading processors at all.  The
only justification could be that the MGS couldn't work out how to implement
PEs, but I gather that is not longer a problem for the MGS.

	P.

-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
Received on Wednesday, 11 June 1997 17:41:51 UTC