Re: XML namespaces on the Web from John Cowan on 2009-11-19 (public-html@w3.org from November 2009)

From: John Cowan <cowan@ccil.org>
Date: Thu, 19 Nov 2009 03:21:14 -0500
To: Simon Pieters <simonp@opera.com>
Cc: John Cowan <cowan@ccil.org>, Lachlan Hunt <lachlan.hunt@lachy.id.au>, Liam Quin <liam@w3.org>, public-html@w3.org, public-xml-core-wg@w3.org
Message-ID: <20091119082114.GC30020@mercury.ccil.org>

Simon Pieters scripsit:

> Why would one need to reverse engineer an XML parser? It is defined in XML  
> 1.0 what is an error, so one can just read the XML 1.0 spec and modify the  
> XML5 algorithm accordingly.

Sure, it's possible, but it's about equivalent in complexity to writing
a parser, which has already been done repeatedly.  Wake me up when
it's finished.

> It's not clear to me that that is a goal. It would be possible by making  
> up a bogus root element, but that seems just bogus. :-)

Fair enough, but then there needs to be some kind of restriction on what
documents can and cannot be repaired.

> I see "DOCTYPE internal subset state" and in total 38 tokenizer states  
> dedicated to handling the internal subset in  
> http://xml5.googlecode.com/svn/trunk/specification/Overview.html

Yes, it skips the internal subset all right, but there's no indication
that it uses the information to, for example, correctly implement
attribute value normalization.  Whitespace characters are added to
attribute values just like any other characters.
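
For illustration, here is a minimal sketch of what attribute value
normalization requires, following section 3.3.3 of XML 1.0.  The function
name and the declared_type parameter are hypothetical, and the sketch
assumes that line ends are already normalized and that the value contains
no character or entity references (which a real parser must expand
separately).  The point is that the second step depends on the attribute
type, which can only come from declarations such as those in the internal
subset.

```python
import re


def normalize_attribute_value(raw, declared_type="CDATA"):
    """Illustrative sketch of XML 1.0 attribute value normalization.

    Assumes `raw` holds only literal characters (no references) and that
    `declared_type` is the type from an <!ATTLIST ...> declaration, or
    "CDATA" when no declaration was read.
    """
    # Step 1, all types: each literal tab, LF, or CR becomes one space.
    value = re.sub(r"[\t\n\r]", " ", raw)

    # Step 2, non-CDATA types only: drop leading and trailing spaces and
    # collapse each run of spaces to a single space.
    if declared_type != "CDATA":
        value = re.sub(r" +", " ", value.strip(" "))

    return value


if __name__ == "__main__":
    print(repr(normalize_attribute_value("  x \n y  ", "CDATA")))
    print(repr(normalize_attribute_value("  x \n y  ", "NMTOKENS")))
```

A parser that never consults the declarations would treat every attribute
as CDATA, so the two calls above would yield the same value.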

-- 
Mos Eisley spaceport.  You will never           John Cowan
see a more wretched hive of scum and            cowan@ccil.org
villainy --unless you watch the                http://www.ccil.org/~cowan
Jerry Springer Show.   --georgettesworld.com

Received on Thursday, 19 November 2009 08:21:54 UTC