[whatwg] Allow trailing slash in always-empty HTML5 elements?

Michel Fortin wrote:
> It seems I was mistaken about that. I was pretty sure that it'd be a 
> parse error in XML, but I now look at the [DTD construct in the XML 
> spec][1] and I cannot see why. Apparently this is a valid DTD for an XML 
> document where the root element is <html>:
> 
>     <!DOCTYPE html>

It's a well-formed DOCTYPE (unfortunately), not a valid DTD.  If only it 
weren't, perhaps this nonsense about treating HTML as XHTML and vice 
versa would stop.

> These wouldn't since XML is case-sensitive:
> 
>     <!DOCTYPE HTML>

That would only be a validity error, because the root element is not 
<HTML>.  It would not be a well-formedness error.

>     <!doctype html>

That would be a well-formedness error in XML.
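For anyone who wants to check the three cases, here is a minimal sketch 
using Python's standard ElementTree parser.  That parser is 
non-validating, so it can only show the well-formedness half of the 
picture; the validity error for <!DOCTYPE HTML> will not be reported. 
The helper name is_well_formed is just something made up for this 
example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if a non-validating XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Each sample pairs one of the DOCTYPEs discussed above with a
# lowercase <html/> root element.
samples = {
    "<!DOCTYPE html>": "<!DOCTYPE html><html/>",
    "<!DOCTYPE HTML>": "<!DOCTYPE HTML><html/>",
    "<!doctype html>": "<!doctype html><html/>",
}

results = {name: is_well_formed(doc) for name, doc in samples.items()}

if __name__ == "__main__":
    for name, ok in results.items():
        print(name, "well-formed" if ok else "NOT well-formed")
```

Only the lowercase "doctype" keyword should be rejected here.  The 
uppercase HTML name gets through because a mismatch with the root 
element name is something only a validating parser checks.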

>  [1]: http://www.w3.org/TR/REC-xml/#dtd
> 
> So it appears after all that if HTML allows "/>", it would be possible 
> and practical to have a single document which is valid for both HTML and 
> XHTML at the same time.

It would be theoretically possible, but totally impractical in the real 
world. You can do whatever you like in your own authoring environment, 
where you have control over exactly what goes into your documents, but 
XML parsing for HTML on the web is unworkable and I really don't 
understand the desire to do so.

HTML and XML have significantly different parsing requirements and they 
absolutely must be treated as significantly different file formats.  Any 
attempt to treat them as the same format is an extremely bad idea.

> That doesn't mean the document will behave in the same way in the two
> cases however.

Exactly, that's one of the problems!  This is why the spec is defined in 
terms of the DOM, so that there can be both HTML and XHTML 
serialisations of the same document, rather than defining that both 
serialisations are the same syntax.
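To illustrate the idea of one DOM with two serialisations, here is a 
rough sketch in Python.  It builds a small tree with xml.dom.minidom, 
writes it out as XML with the library's own toxml(), and writes it out 
as HTML with a hand-written to_html() function.  The to_html() function 
and the VOID_ELEMENTS set are assumptions made for this example: the 
set is only a partial list, attributes are ignored, and the XHTML 
namespace declaration is left out to keep it short.  It is not the 
spec's serialisation algorithm.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape

# Partial, illustrative list of elements that are always empty in HTML.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


def build_dom():
    """Build <html><body><p>one, <br>, two</p></body></html> as a DOM."""
    doc = minidom.Document()
    html = doc.createElement("html")
    doc.appendChild(html)
    body = doc.createElement("body")
    html.appendChild(body)
    p = doc.createElement("p")
    body.appendChild(p)
    p.appendChild(doc.createTextNode("one"))
    p.appendChild(doc.createElement("br"))
    p.appendChild(doc.createTextNode("two"))
    return doc


def to_xml(doc):
    """XML serialisation: empty elements come out as <br/>."""
    return doc.documentElement.toxml()


def to_html(node):
    """Simplified HTML serialisation (elements and text only)."""
    if node.nodeType == node.TEXT_NODE:
        return escape(node.data)
    name = node.tagName
    if name in VOID_ELEMENTS:
        # No end tag and no trailing slash in the HTML syntax.
        return "<" + name + ">"
    inner = "".join(to_html(child) for child in node.childNodes)
    return "<" + name + ">" + inner + "</" + name + ">"


doc = build_dom()
xml_text = to_xml(doc)
html_text = to_html(doc.documentElement)
```

The tree is the same in both cases.  Only the text written out differs, 
and each output is meant for its own parser.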

> I wonder if allowing "/>" in HTML couldn't, on the opposite of some 
> other arguments, help authors and developers to grasp the real 
> difference between the two markups.

No.  The fact that people already wish to blur the distinction between 
HTML and XHTML by having a fully compatible syntax suggests that it 
would only confuse the issue more.  They are separate syntaxes with 
separate parsing requirements and it makes no sense whatsoever to treat 
one like the other.

> they'll just take note that "/>" doesn't necessarily mean XHTML anymore

That has never been a reliable indication of XHTML.  Many authors use 
that XML syntax regardless of the DOCTYPE, namespace declaration or 
MIME type.  Many of them don't have a clue that it's XML syntax and 
that it is absolutely meaningless in HTML.

-- 
Lachlan Hunt
http://lachy.id.au/

Received on Friday, 1 December 2006 06:47:32 UTC