From: T.V Raman <raman@google.com>
Date: Fri, 1 Sep 2006 10:03:37 -0700
Message-ID: <17656.26601.137249.628102@retriever.corp.google.com>
To: jim@jibbering.com
Cc: public-appformats@w3.org, www-forms@w3.org

Another sensible implementation of the same approach is in Daniel
Veillard's excellent libxslt, which is part of GNOME.

His xsltproc has an option that lets it take tag-soup HTML
and create a clean XML parse tree before applying the transform.
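
For anyone who has not tried it: the option is, I believe, --html, as in
"xsltproc --html style.xsl page.html" (file names made up). The sketch
below is not libxslt; it only illustrates the same idea with Python's
standard library, building a well-formed tree from tag soup that a
transform could then consume. The names SoupToTree, soup_to_xml and VOID
are hypothetical, and the recovery rules are far cruder than a real HTML
parser's.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical helper names; a small subset of HTML's void elements.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class SoupToTree(HTMLParser):
    """Build a well-formed ElementTree from loosely written HTML."""

    def __init__(self):
        super().__init__()
        self.builder = ET.TreeBuilder()
        self.open_tags = []  # stack of elements still waiting for a close tag

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "checked") get an empty string value.
        self.builder.start(tag, {k: v or "" for k, v in attrs})
        if tag in VOID:
            self.builder.end(tag)  # void elements are closed immediately
        else:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray close tag: ignore it
        # Close everything opened after the matching start tag.
        while self.open_tags:
            top = self.open_tags.pop()
            self.builder.end(top)
            if top == tag:
                break

    def handle_data(self, data):
        if self.open_tags:  # drop text that sits outside any element
            self.builder.data(data)

    def finish(self):
        self.close()  # flush any text the parser is still buffering
        while self.open_tags:  # close whatever the author left open
            self.builder.end(self.open_tags.pop())
        return self.builder.close()


def soup_to_xml(html):
    """Return well-formed XML text for a tag-soup HTML fragment."""
    parser = SoupToTree()
    parser.feed(html)
    return ET.tostring(parser.finish(), encoding="unicode")


print(soup_to_xml("<p>one<br><b>two<p>three"))
```

The printed result can be fed straight to a strict XML parser, which is
the whole point: the recovery happens once, up front, and everything
downstream sees clean XML.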

It's always been a mystery to me why people advocating
tag-soup continuation assert "we can build a DOM from tag-soup"
but then immediately insist on never doing failure recovery when
parsing XHTML.
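
As a rough illustration of what failure recovery could look like, the
sketch below tries a strict XML parse first and, if that fails, records
where the error was and falls back to a forgiving pass so that there is
still something to show. It uses only Python's standard library; the
names TextOnly and load are hypothetical, and extracting plain text
merely stands in for real rendering.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextOnly(HTMLParser):
    """Forgiving fallback: collect the text and ignore the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def load(markup):
    """Return (text, error); error is None when the markup is well-formed."""
    try:
        root = ET.fromstring(markup)  # strict XML parse
        return "".join(root.itertext()), None
    except ET.ParseError as exc:
        line, column = exc.position  # where the strict parser gave up
        fallback = TextOnly()
        fallback.feed(markup)
        fallback.close()
        message = f"not well-formed (line {line}, column {column})"
        return "".join(fallback.chunks), message


text, error = load("<p>Hello <b>world</p>")
print(text)   # content is still available to the reader
print(error)  # diagnostic kept for the author or developer
```

The error is reported alongside the content rather than instead of it,
so the author gets the diagnostic and the end user still sees the page.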

Jim Ley writes:
 > "Mark Birbeck" <mark.birbeck@x-port.net> wrote in message 
 > news:640dd5060609010619t4c7d6a88n251b2e28ad81bd27@mail.gmail.com...
 > >
 > > Hi Anne,
 > > >   5. "XML parsing failed: syntax error (Line: 8, Character: 0)" in Opera;
 > > That's interesting...does failing to parse properly necessarily have
 > > to prevent rendering?
 > > In Sidewinder we validate the XHTML against the XML schemas in one
 > > thread, and do some processing on the document before passing it to a
 > > renderer in another thread. (Current renderers are IE and Gecko.) This
 > > means that you'll always see something. We did it this way for two
 > > reasons; firstly, because most of the content that claims to be XHTML
 > > is actually invalid, so there wouldn't be a lot to see! And secondly,
 > > because we felt that the ability to know whether something was valid
 > > or invalid was most probably something that authors and developers
 > > wanted, but most likely means little to an end user.

 > It's a common misconception that the XML 1.0 requirement that, on a
 > validation error, data stop being passed to the application in the
 > normal fashion means that UAs cannot render it.  I don't know why this
 > misconception exists, but it's generally used as a stick to beat XML-based
 > languages.
 > The Sidewinder approach is an extremely sensible one, indeed the only
 > sensible one that I can think of.
 > Jim. 

Best Regards,

Title:  Research Scientist      
Email:  raman@google.com
WWW:    http://emacspeak.sf.net/raman/
Google: tv+raman 
GTalk:  raman@google.com, tv.raman.tv@gmail.com
PGP:    http://emacspeak.sf.net/raman/raman-almaden.asc
Received on Friday, 1 September 2006 17:04:11 UTC
