- From: <bugzilla@wiggum.w3.org>
- Date: Thu, 12 Nov 2009 03:55:06 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=8268
Summary: XMLHttpRequest fails for documents with named entities
due to doctype
Product: HTML WG
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Keywords: NE
Severity: normal
Priority: P2
Component: HTML5 spec bugs
AssignedTo: dave.null@w3.org
ReportedBy: Simetrical+w3cbug@gmail.com
QAContact: public-html-bugzilla@w3.org
CC: ian@hixie.ch, mike@w3.org, public-html@w3.org
Wikipedia just experimented with switching to an HTML5 doctype. A lot of user
tools broke, and after two hours of investigation, we determined that the
problem is intractable and switched back to XHTML 1.0 Transitional.
XMLHttpRequest was historically intended only for XML, and lots of scripts rely
on the responseXML property being set to a Document. In current browsers, this
only happens when the document is actually well-formed XML. But named entities
are treated differently based on the doctype. Consider this document:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html><head>
<title>Hello</title>
</head>
<body>
<p>&nbsp;</p>
</body>
</html>
This works just fine in all browsers I tested in (latestish versions of
Firefox, Chrome, Opera). However, if you serve the exact same document but
replace the doctype with <!DOCTYPE html>, all of them throw a syntax error on
&nbsp;.
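The underlying well-formedness error can also be seen outside a browser. The following is only a rough sketch, assuming Python's standard xml.etree.ElementTree as a stand-in for a browser's XML parser. The names HTML5_NAMED, HTML5_NUMERIC and is_well_formed are made up for illustration. ElementTree has no built-in knowledge of XHTML entities, so it does not reproduce the doctype-dependent behaviour described above, only the failure on an undeclared named entity. A numeric character reference is included purely as a control.

```python
import xml.etree.ElementTree as ET

# Same document as above, collapsed onto one line, with a slot for the
# paragraph content.
BODY = "<html><head><title>Hello</title></head><body><p>{}</p></body></html>"

# Hypothetical test inputs: the HTML5 doctype with a named entity, and
# the same document with a numeric character reference as a control.
HTML5_NAMED = "<!DOCTYPE html>" + BODY.format("&nbsp;")
HTML5_NUMERIC = "<!DOCTYPE html>" + BODY.format("&#160;")


def is_well_formed(document: str) -> bool:
    """Return True if the XML parser accepts the document."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        # Raised for an undeclared entity, among other syntax errors.
        return False
    return True


for name, doc in [("named", HTML5_NAMED), ("numeric", HTML5_NUMERIC)]:
    print(name, is_well_formed(doc))
```

With <!DOCTYPE html> nothing declares nbsp, so the parser rejects the first document and accepts the second. A script relying on responseXML would hit the equivalent of the first case.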
Practically speaking, this means that any site that wants to serve content
compatible with XHR cannot use either of the two doctypes that the spec
recommends for authors. There are a variety of widely-used scripts on
Wikipedia that rely on XHR, so this is currently a blocker for us. It's very
unlikely that we'll deploy HTML5 in the foreseeable future if it means our
users have to rewrite all their scripts. I'm pretty sure that XHR is used for
screen-scraping beyond Wikipedia, too, so this will probably crop up elsewhere.
I don't know what the extent of the magic is that causes this problem. Could
some reasonably minimal, distinctive doctype be invented that would avoid the
problem but not make the document look to humans and validators like it thinks
it's some old version of XHTML? If an existing XHTML doctype must be reused,
should validators continue to raise warnings as they do now, or should an XHTML
doctype be promoted from "obsolete permitted DOCTYPE" to a fully permitted
doctype?
Also, is this a wider problem? Are there any other tools besides browsers that
might be magically allowing named entities for some doctypes only?
Received on Thursday, 12 November 2009 03:55:08 UTC