Re: validating XML made more difficult than necessary

On Thu, 12 Oct 2006 23:21:12 +0200, Eric Bednarz <lists@bednarz.nl> wrote:

[This was quite a while ago, and I missed it, but it definitely
 needs to be answered.]

><?xml
>
>Martin Duerst <duerst@it.aoyama.ac.jp> wrote:
>
>> At 14:54 06/10/05, olivier Thereaux wrote:
>
>>> On Oct 3, 2006, at 10:46 , Martin Duerst wrote:
>
>>>> - A document starting with "<?xml" can easily be guessed to be XML
>>>> rather than SGML.
>
>I guess you are reading XML right now.

If you mean that I was just reading the XML spec for the first
time, then you definitely guessed wrong. I have known XML since
1998 at least.

>>> I guess, although I'm sure the usual suspects on this list will
>>> happily prove you wrong with some fun corner case.
>
>It's the WWW, and it's not (only) about corner cases and standards but  
>specifically about UA's doctype sniffing.
>
>> But even if technically possible,
>
>There's not really an *if*; an SGML document instance that does not  
>contain the SGML declaration -- say, any practical application of HTML --  
>can start with an arbitrary processing instruction.

Yes, but HTML documents don't start with arbitrary processing instructions.
And a correct XML declaration isn't an arbitrary processing instruction,
it's something very specific.

>> anybody who
>> wants to validate a document starting with <?xml with an SGML
>> declaration that does not correspond exactly to XML (at which
>> point we are back to validating with XML :-) is just a danger
>> to him/herself.
>
>Oh, why?  And I wouldn't want that, you were suggesting it.  In practice,  
>the string literal '<?xml >' is good enough for CSS1Compat mode in IE 7 --  
>if you'd want to take advantage of that -- while it keeps one's legacy  
>hacks and cracks in sync with reality (and older versions).

I have no problem with you using '<?xml >' to force IE into
compatibility mode. '<?xml >' isn't XML, so using that isn't
a danger to well-formed XML. That wouldn't create any problems
with my proposal, because of course only correct XML declarations/
text declarations should be accepted as such by the validator.
But if you mean "<?xml version='1.0'?>" or anything similar when
writing '<?xml >', then I'd have to strongly disagree.
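
To make the distinction concrete, here is a rough sketch (my own
illustration, not the validator's actual code) of a check that accepts
only a correct XML declaration at the very start of a document. The
function name starts_with_xml_declaration and the helper quoted are
made up for this example. The pattern follows the XMLDecl production
of the XML 1.0 specification as I read it; text declarations, byte
order marks, and non-ASCII encodings are deliberately left out.

```python
import re

# White space and '=' as defined by the XML 1.0 productions S and Eq.
S = r"[ \t\r\n]+"
EQ = r"[ \t\r\n]*=[ \t\r\n]*"


def quoted(body: str) -> str:
    """Hypothetical helper: allow body in single or in double quotes."""
    return "(?:'" + body + "'|\"" + body + "\")"


# Sketch of XMLDecl: '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
XML_DECL = re.compile(
    r"<\?xml"
    + S + "version" + EQ + quoted(r"1\.[0-9]+")
    + "(?:" + S + "encoding" + EQ + quoted(r"[A-Za-z][A-Za-z0-9._-]*") + ")?"
    + "(?:" + S + "standalone" + EQ + quoted(r"(?:yes|no)") + ")?"
    + r"[ \t\r\n]*\?>"
)


def starts_with_xml_declaration(text: str) -> bool:
    """Return True only if text begins with a correct XML declaration."""
    return XML_DECL.match(text) is not None
```

With such a check, "<?xml version='1.0'?>" would be taken as XML,
while '<?xml >', '<?ie >' and '<?a >' would not, so the short PI
used for doctype sniffing would not be mistaken for an XML declaration.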

You would also, as I announced above, just hurt yourself.
For details, please see:
http://blogs.msdn.com/ie/archive/2005/09/15/467901.aspx

>If neither practical applications nor reality matter, it's still  
>conforming to ISO 8879,

Practical applications matter. IE compatibility mode is, for better
or worse, reality. But XML is also reality.

> and considerably more convenient than e.g.
><!--[if lt IE 7]><!-- -- BackCompat -- --><![endif]-->

Yes. But if it has to be short, why don't you just use something
like '<?ie >', or even a one-letter PI, '<?a >'? That's even more
convenient. And there is absolutely no need to use the letters
'xml' and create confusion.

Regards,    Martin.


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     

Received on Sunday, 27 May 2007 08:05:51 UTC