Re: Loss of information going from SGML to XML
In message <199702212234.AA127644488@hpsgml.fc.hp.com> Dave Hollander writes:
> That Peter MR is an early indicator of how XML will be received by the
> SGML audience, is a point well taken. Perhaps we should consider a "Guide to
> Moving from SGML to XML" to help prevent surprise and confusion.
I think this is an excellent idea. (I'm not volunteering, because I don't
know enough about the practice of SGML.) It's not as bad as perhaps came
over. I can hack my own stuff easily enough.
As is clear I represent the untested market which is (obviously) less well
understood. That has to involve "Moving from HTML to XML", and I have
started to try to address that.
> Befor we consider changing the design based on one person's reaction we
> should first understand the nature of the reaction which the wg list
> did. Then, we should consider if this case is compatible with the
> design goals.
> As I recall Peter's initial dismay/confusion, it was that he would have
> to make changes, not the nature of them. The design never assumed that
> existing SGML applications would be XML complient, just that it would be
> straight forward to make the application complient.
If it helps, the problem was not that I had to make changes to my own
material - I'd expected that. (I had a little whinge about having to
change all the PUBLIC ENTITYs since that was an issue at the time and it
seemed (and still does) unnecessary to remove them.) The problem was
that it suddenly dawned that the whole of the standard existing SGML
components were incompatible, and I got the impression (wrongly)
that this had not been thought about.
> Since Peter's case is not really what we were designing for, then we
> should consider how XML is described and what types of support various
> audiences will need. Here, Peter's case has really helped. Not only
> should the nature of the transform be described, there needs to be work
> with various vendors and standards bodies to address the issues with
This is the key point. It's perfectly possible for Jane and Joe Doe to
hack ISO12083 or 12200 or whatever and use it in privacy. It's a different
matter when someone uses components which belong to someone else and
which cannot interoperate however had you try and code round. For example,
my DTDs include HTML2.0, and I asked Dan Connolly's permission (as author)
to distribute it, and it was (of course) willingly given. However Dan
wasn't at that stage clear what the precise copyright position was on
HTML2.0. I cannot imagine getting a speedy response back from any standards
body to the question "please can I make some (apparently minor) adjustments
to your DTD". I work with (chemical) standards bodies and I would not
contemplate editing any DTD and redistributing it. So the suggestion was
mainly to get round the legal position by using the standard components.
[I would also suggest that thought be given as to how an XML file identifies
itself. If you have (say) two HTML2.0 DTDs (one 'SGML' and the other 'XML'),
how do you know which is which? (apart from grepping for asterisks). Are
they going to have a PI something like <?XML?> just to alert any software?
And do they share the same FPI?]
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences