Entity resolution

We have been having considerable problems on XML-DEV interpreting
how and when entities are to be resolved, and there is sufficient ambiguity
that it would be useful for the WG/ERB to see if this aspect of the 
XML draft could be clarified.  I apologise to those WG/ERB members who 
will already have seen this on XML-DEV, but I have been encouraged to bring 
it back here.

> Peter Murray-Rust wrote:
> > Entity substitution is very briefly defined in the draft.  I don't know
> > what it's like in 8879 (and I'm not going to find out!).
> > I see the following problems:
> >         - it is *possible* (though I think unlikely) that not everyone on the
> >                 ERB agrees as to what is meant to happen during substitution
> >         - parser implementers may:
> >                 * find the spec not well-enough defined
> >                 * interpret it in different ways
> >         - DTD implementers (i.e. those using PEs) may:
> >                 * find the spec not well-enough defined
> >                 * interpret it in 'incorrect' ways
> > 
> > <FACT>
> > I have found 'programming' in SGML one of the most tedious and
> > counter-intuitive things I have had to do.  The primary problem has been
> > entities, though RE hasn't helped.  I had only two ways of proceeding:
> >         - if it failed with sgmls it was my fault
> >         - Joe English helped a great deal by answering 'simple' questions
> >                 over e-mails.
> > I finally ended up with a complex, hairy, and (to non-SGML folk) totally
> > non-intuitive set of DTDs and 'include' files.  sgmls was the only
> > way that I could tell whether it was 'right'.
> > </FACT>
> > 
> > The only way that we can expect people to develop applications for XML
> > using entities is:
> >         - be absolutely clear what we are doing
> >         - be as consistent as possible with past practice in SGML and
> >                 provide guidance on conversion
> >         - have 100% accurate parsers
> >         - have very clear examples and torture tests
> >         - have tutorials

My own experience comes from trying to 'convert the HTML2.0 DTD to run under
XML'.  (My own DTDs use HTML2.0).  [I appreciate that there is no algorithmic
conversion and that certain constructs (e.g. inclusions) have no equivalent in 
XML].  HTML2.0 uses a significant number of PEs and PEReferences and some
have to be called iteratively.  My question might therefore be rephrased as:
'does XML have the same power of entity substitution as SGML?  If not, where
not?  Where it does, is it the intention of the ERB that it adopts the
same strategy for entity substitution? In any case, can it please be 

[Please accept this question from someone who learns by example, rather than
trying to understand 8879.  I'm afraid there are others like me...]
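
To make the 'called iteratively' case concrete, here is a minimal Python
sketch of one possible reading: purely textual replacement of PEReferences,
repeated until none remain.  The entity names and values are made up (loosely
in the style of the HTML2.0 DTD), and the strategy itself is an assumption on
my part, not something taken from the draft or from 8879.  It ignores where
references may legally occur and any rules about delimiters or spacing.

```python
import re

# Hypothetical parameter entities, loosely in the style of the HTML2.0 DTD.
# The names and replacement texts are illustrative only.
PARAMETER_ENTITIES = {
    "font": "TT | B | I",
    "phrase": "EM | STRONG | CODE",
    "text": "#PCDATA | A | IMG | BR | %phrase; | %font;",
}

# Matches a reference of the form %name; (an assumed, simplified syntax).
PE_REFERENCE = re.compile(r"%([A-Za-z_][A-Za-z0-9._-]*);")


def expand(text, entities, max_passes=16):
    """Replace %name; references pass by pass until none are left.

    Raises KeyError for an undeclared entity and ValueError if references
    remain after max_passes (for example, a recursive declaration).
    """
    for _ in range(max_passes):
        if not PE_REFERENCE.search(text):
            return text
        text = PE_REFERENCE.sub(lambda m: entities[m.group(1)], text)
    if PE_REFERENCE.search(text):
        raise ValueError("references nested too deeply, or recursive")
    return text


# %text; expands to a string that itself contains %phrase; and %font;,
# so a second pass is needed before the content model is reference-free.
content_model = expand("(%text;)*", PARAMETER_ENTITIES)
print(content_model)
```

Under this reading the content model comes out flat after two passes.  What I
cannot tell from the draft is whether a conforming parser is meant to behave
like this, or whether the replacement text is treated differently depending
on where the reference occurs.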


Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences