[Bug 15831] validator prevents XHTML5 from containing XML declaration

https://www.w3.org/Bugs/Public/show_bug.cgi?id=15831

--- Comment #3 from Michael[tm] Smith <mike@w3.org> 2012-02-04 07:17:14 UTC ---
(In reply to comment #2)
> What I don't get is that my document is arguably more compliant with W3C
> specifications than any of the other documents that validate just fine. If
> someone were to ask me, "if I were to follow W3C best practices as much as
> possible," what would I tell them---wouldn't it be to make your document HTML5
> compliant *and* XML compliant?

No, not necessarily. While it may be the case that the W3C organizationally
took the position in the past that it was a best practice to make your
documents XML-compliant, I don't think the W3C takes that position now. I don't,
at least. And the HTML5 spec does not take any position on it either. Documents
can be fully valid and "good" according to the HTML5 spec without also needing
to be XML compliant.

In fact, because most documents on the Web are served with a text/html MIME type
and not with an XML MIME type, a more realistic best practice to encourage
authors in general to follow is to make sure that their documents are valid
text/html documents. But making a document that is both a valid text/html
document and also XML compliant can actually be difficult. You may already be
familiar with the guide we have published on how to do that:

http://dev.w3.org/html5/html-xhtml-author-guide/

If you've read through that document you know there are a lot of "gotchas" that
can cause problems in how your documents are processed when you author them as
well-formed XML but serve them as text/html.

The case of authoring documents as XML and also serving them with an XML MIME
type is of course a lot less error-prone. But the reality of the Web is that
far fewer people actually do that.

So at the time when the HTML5-checking feature was added to the current
validator, I guess it made more sense to have that option be for HTML5 and not
for XHTML5. But I don't know because I was not involved in that decision and in
fact I'm not really involved at all with work on the current validator. I only
work on it indirectly, by maintaining the part of it that provides the
HTML5-checking feature.

> Why is it, then, that the documents that most closely follow W3C
> recommendations are the last ones to validate correctly on the W3C validator?
> And I still don't understand why it's so hard to validate---XML is not a new
> technology by any stretch of the imagination.
> 
> Shouldn't documents that most closely follow W3C recommendations be the first
> ones to validate properly? Isn't HTML5 with XML compliance better than HTML5
> without XML compliance?

No, it's not better. It's not worse either. But it's also not what most people
are doing. That is, most documents on the Web are not well-formed XML
documents. Many documents on the Web that claim to be XHTML documents are in
fact not well-formed XML documents. The only reason they work correctly in
browsers is that they're being served with a text/html MIME type. Given that,
it makes some sense to focus on providing text/html checking as the first
choice.
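
As a rough illustration of the well-formedness point (not how the validator
itself works), here is a minimal Python sketch that checks whether markup
parses as XML at all. The function name and both sample documents are
hypothetical, made up for this example; the only real API used is the standard
library's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML, else False."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Hypothetical sample: uses the XHTML namespace, but the <br> is never
# closed, which text/html parsers accept and XML parsers reject.
TAG_SOUP = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head>"
    "<body><p>line one<br>line two</p></body></html>"
)

# Same document with the empty element closed the XML way.
WELL_FORMED = TAG_SOUP.replace("<br>", "<br/>")

if __name__ == "__main__":
    print(is_well_formed_xml(TAG_SOUP))
    print(is_well_formed_xml(WELL_FORMED))
```

This only tests well-formedness, not validity against any HTML5 or XHTML5
rules; a document can pass this check and still fail the validator.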

But anyway, we really don't need for the service to take sides either way, and
the current validator mostly does not. What I mean is, the current validator
does actually already do the right thing for XHTML5 documents if, instead of
using the "Validate by direct input" option, you just give it the URL of an
XHTML5 document that's being served with an XML MIME type. That is, it
correctly recognizes your document as XHTML5. So the support is already there;
the only thing that's missing is that it doesn't expose that option for the
"Validate by direct input" case.
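
To make the MIME-type distinction concrete, here is a small Python sketch of
how a checker might choose between XML and text/html processing based on the
Content-Type header. This is an assumption-laden illustration, not the
validator's actual logic: the function name, the set of MIME types, and the
"+xml" suffix rule are all choices made for this example.

```python
# Hypothetical list of MIME types treated as XML for this sketch.
XML_MIME_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parser_mode(content_type: str) -> str:
    """Map a Content-Type header value to 'xml', 'html', or 'unknown'."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in XML_MIME_TYPES or mime.endswith("+xml"):
        return "xml"
    if mime == "text/html":
        return "html"
    return "unknown"


if __name__ == "__main__":
    for header in ("application/xhtml+xml; charset=utf-8", "text/html"):
        print(header, "->", parser_mode(header))
```

With direct input there is no Content-Type header to inspect, which is why that
case needs an explicit option instead.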

The history behind the HTML5-checking feature in the current validator is that
it was kind of just bolted on to the existing service as a way to make HTML5
checking available through the same user interface in the same place as the
current validator. And it has served that purpose OK. And while they could also
have bolted on XHTML5 checking for direct input at the time when HTML5 checking
was added, they didn't, and here we are now. We could now also bolt on XHTML5
checking for direct input but I don't think that's the right way forward. The
better way is to provide an additional service that exposes all the right
options in the right way. And that is what I have been working on and what we
will be launching very soon. So please wait for the announcement about that.

In the meantime, we have a pre-production version of that service available
here:

http://www.w3.org/html/check

That gives you all the same options as the validator.nu UI does. In fact, the
core part of it is exactly the same UI as validator.nu -- just with some W3C
branding wrapped around it.


Received on Saturday, 4 February 2012 07:17:19 UTC