Re: Bug 7034

From: Henri Sivonen <hsivonen@iki.fi>
Date: Mon, 22 Mar 2010 12:55:14 +0200
Cc: HTMLwg WG <public-html@w3.org>
Message-Id: <90357C43-4D35-4588-B906-64BD49D76A8C@iki.fi>
To: Sam Ruby <rubys@intertwingly.net>

On Mar 20, 2010, at 14:14, Sam Ruby wrote:
> One simple example to show how this relates to issue-41.  Suppose a person authors a page for iPhone users.  This page is to be served using PHP.  This person uses Emacs.  During the course of development, at one point some portion of the page is commented out.  That portion happens to contain consecutive dashes.  Per the current draft, this is a conformance error.  Per Validator.nu, the reason given is that this data can't be serialized as XML 1.0.

I think you are mischaracterizing what Validator.nu says. It gives an error saying that consecutive hyphens aren't allowed. It doesn't give a reason why they aren't allowed. Then it gives a discretionary warning that says the document isn't representable as XML 1.0 due to consecutive hyphens in a comment.
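
To make the XML 1.0 constraint behind that warning concrete, here is a minimal sketch in Python. It is not how Validator.nu implements its check; the function names are made up for illustration, and the only rule assumed is the XML 1.0 one that a comment body may not contain "--" and may not end with "-".

```python
import xml.etree.ElementTree as ET


def comment_fits_xml10(data: str) -> bool:
    """Return True if `data` could be the body of an XML 1.0 comment.

    XML 1.0 forbids "--" inside a comment and forbids a comment body
    that ends with "-". (Hypothetical helper, for illustration only.)
    """
    return "--" not in data and not data.endswith("-")


def parses_as_xml(source: str) -> bool:
    """Return True if the standard-library XML parser accepts `source`."""
    try:
        ET.fromstring(source)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    for body in (" one - two ", " one -- two "):
        document = "<p><!--" + body + "--></p>"
        print(repr(body), comment_fits_xml10(body), parses_as_xml(document))
```

The first helper applies the rule directly to the comment text; the second simply asks an XML parser whether a document containing such a comment is well-formed.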

> As a user, my reaction would be along the lines of "thanks for sharing".  At no point in any scenario that this user cares about is an XML 1.0 serializer involved.

I'm now confused about your position on polyglot documents. I thought you wanted more validator warnings on constructs that aren't permitted in both HTML5 and XHTML5. Did you want them only optionally?

As for the error, wouldn't the spec move away from--not closer to--what the Super Friends (and maybe at one point or another the TAG) have asked for if the consecutive hyphens weren't an error? (I'm not suggesting that you'd need to agree with the Super Friends or anyone else. I'm just pointing out that aligning the spec with your wishes more may end up aligning it with someone else's wishes less.)

I also note that in this general area, there lurks an actual interop issue with Gecko's old HTML parser:
https://bugzilla.mozilla.org/show_bug.cgi?id=214476

When you talk about interop issues, do you mean actual interop issues with software deployed today (even if that software might fade away in the future) or interop issues in a future scenario where every piece of software conforms to the spec?

> Now consider site #5 on the internet: live.com.  I'm also pretty sure that this site was not authored using Emacs.  It, too, is served as text/html.  It contains an attribute that validator.nu asserts can't be serialized as XML 1.0.  The statement that validator.nu makes is somewhat incomplete and arguably misleading.

How so? In XML 1.0 plus Namespaces, you can't represent an attribute whose local name is "xmlns:web" and that has no namespace.
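
The difference in the resulting trees can be sketched with Python's standard library. This is only an approximation: html.parser is a tokenizer, not an HTML5 tree builder, and the namespace URI below is a made-up placeholder rather than the value live.com uses. The helper names are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Placeholder markup; the URI is invented for this example.
MARKUP = '<div xmlns:web="http://example.com/web"></div>'


class AttributeCollector(HTMLParser):
    """Collect every attribute seen on start tags."""

    def __init__(self):
        super().__init__()
        self.attributes = []

    def handle_starttag(self, tag, attrs):
        self.attributes.extend(attrs)


def attributes_as_html(source: str) -> dict:
    """Attributes seen when the source is tokenized as HTML."""
    collector = AttributeCollector()
    collector.feed(source)
    collector.close()
    return dict(collector.attributes)


def attributes_as_xml(source: str) -> dict:
    """Attributes on the root element when the source is parsed as XML."""
    return dict(ET.fromstring(source).attrib)


if __name__ == "__main__":
    print(attributes_as_html(MARKUP))
    print(attributes_as_xml(MARKUP))
```

On the HTML side, "xmlns:web" shows up as an ordinary attribute. On the XML side, the same characters are consumed as a namespace declaration, so the element ends up with no attributes at all.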

Declining to say this on the grounds that the source stream would be well-formed XML, albeit with a different document tree, seems to me to be about scoring political points among people who like the appearance of using XML more than they care about the document being actually polyglot. However, making various presentational elements and attributes conforming would lose a lot of political points with another (but non-trivially overlapping) constituency.

How should we decide which political points to go for?

> I'll also note that the xml:lang attribute that is also present in this same page does not meet the criteria of producing a DOM when parsed using an HTML parser that can also be produced using an XML parser.

True. However, for conforming documents, this doesn't alter the meaning of the document, because xml:lang in text/html is only allowed when accompanied by lang with the same value.
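
As a rough sketch of that conformance condition, the check below takes an element's attributes as a plain dict. The function name is hypothetical, and comparing the two values ASCII case-insensitively is an assumption of this sketch; the point above only requires that they be the same.

```python
def xml_lang_is_conforming(attrs: dict) -> bool:
    """Check the xml:lang rule for an element in text/html.

    `attrs` maps attribute names to values. xml:lang is accepted only
    when a lang attribute is also present with a matching value.
    Case-insensitive matching is an assumption of this sketch.
    """
    xml_lang = attrs.get("xml:lang")
    if xml_lang is None:
        # Nothing to check when xml:lang is absent.
        return True
    lang = attrs.get("lang")
    return lang is not None and lang.lower() == xml_lang.lower()


if __name__ == "__main__":
    print(xml_lang_is_conforming({"lang": "en", "xml:lang": "en"}))
    print(xml_lang_is_conforming({"xml:lang": "en"}))
    print(xml_lang_is_conforming({"lang": "fi", "xml:lang": "en"}))
```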

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Monday, 22 March 2010 10:55:51 UTC
