Re: html5 syntax - why not use xml syntax? from Robert Burns on 2007-07-07 (public-html@w3.org from July 2007)

From: Robert Burns <rob@robburns.com>
Date: Sat, 7 Jul 2007 06:59:13 -0500
To: Ben Boyle <benjamins.boyle@gmail.com>
Cc: "Mynthon Gmail" <mynthon1@gmail.com>, public-html@w3.org
Message-Id: <88ABE1F4-125E-49B4-BEF1-2890331B4355@robburns.com>

On Jul 7, 2007, at 6:26 AM, Ben Boyle wrote:

> Isn't it possible to have compatible syntax already?
> Is there any XHTML syntax that is invalid in a HTML document?
>
> Do any of these cause problems in HTML? Is this valid?
> <input type="radio" name="foo" value="bar" checked="checked"/>
>
> What about <?xml prolog, @xmlns, @xml:lang?
>
> I have noticed the W3C HTML validator is confused by <link ... /> and
> <meta ... /> empty tags, but had assumed it to be a validator bug.

Well, the main thing that comes to mind is including explicit
closing tags for canonically empty elements, like <img></img> or
<meta></meta>. This is perfectly acceptable in XHTML, but in HTML it's
a quirky error. It might not necessarily lead to rendering issues, but
it does sometimes add superfluous elements to the DOM. (I think IE
adds an actual empty </img><//img> element to the DOM.)
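
As a rough illustration of the empty-element case, here is a minimal
sketch that flags explicit closing tags on canonically empty elements.
It assumes Python's standard html.parser tokenizer; the VOID set and
the class name are my own illustrative choices, not anything taken from
a spec or from an existing checker, and it only looks at tokens, not at
the DOM a browser would build.

```python
from html.parser import HTMLParser

# Illustrative (incomplete) list of canonically empty elements.
VOID = {"img", "meta", "link", "input", "br", "hr"}


class VoidEndTagFinder(HTMLParser):
    """Collects explicit end tags such as </img> on empty elements."""

    def __init__(self):
        super().__init__()
        self.stray = []

    def handle_startendtag(self, tag, attrs):
        # <img ... /> arrives here. The default implementation would also
        # call handle_endtag, so override it to avoid false positives.
        pass

    def handle_endtag(self, tag):
        # Only an explicit </img>, </meta>, etc. reaches this method.
        if tag in VOID:
            self.stray.append(tag)


def stray_void_end_tags(markup):
    finder = VoidEndTagFinder()
    finder.feed(markup)
    finder.close()
    return finder.stray
```

With this sketch, stray_void_end_tags('<img src="a.png"></img>') reports
the img end tag, while the <img src="a.png"/> form reports nothing.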

I would say the differences are mostly minor. The XHTML 1.0 (I  
accidentally said HTML 4.01) appendix C covers most of it:

<http://www.w3.org/TR/xhtml1/#guidelines>

A big factor that could help here would be an HTML5 conformance
checker that actually accepted, or even enforced, this XML-like
syntax.
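
To make the "enforce XML-like syntax" idea concrete, here is a very
small sketch of one possible check: accept a fragment only if a generic
XML parser considers it well-formed. It assumes Python's standard
xml.etree.ElementTree; the function name is hypothetical. It is nowhere
near a real conformance checker: the fragment must have a single root
element, and HTML named entities such as &nbsp; would be rejected
because a plain XML parser does not know them.

```python
import xml.etree.ElementTree as ET


def is_xml_like(fragment):
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        # Unquoted or minimized attributes, unclosed elements, etc.
        return False
    return True
```

The radio button example quoted above passes this check, while the
same element written as <input type=radio checked> does not.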

The only other issue would be the resistance many here exhibit
towards XML syntax. Authors in the know would read the UA
conformance criteria and know that they could optimize their source
by leaving off many opening and closing tags, leaving off quotes in
certain circumstances, etc. However, we could have the conformance
checkers enforce the XML-like syntax anyway for forward-compatibility
reasons. Some have expressed concerns about other subtle differences
if an author tried to repurpose XML-like serializations into genuine  
XML processing. We would want to be sure to provide authors with  
information about this and maybe even encourage changes to XML  
processing where needed to minimize those problems.

> On 7/7/07, Robert Burns <rob@robburns.com> wrote:
>>
>>
>> On Jul 7, 2007, at 3:59 AM, Mynthon Gmail wrote:
>> > My idea is to have compatible syntax, but xhtml is xhtml with its
>> > own parser and html is html with its own parser. Only syntax is
>> > unified.
>>
>> That does seem like the right thing to do for authoring conformance.
>> I have a hard time thinking of any cons for that. Of course there
>> would still be HTML 4.01, HTML 4, HTML 3.2, etc. (all handled by the
>> same HTML parser) along with HTML5. But it's hard for me to think of
>> downsides to just requiring of authors a very XML-like syntax for
>> HTML5's non-SGML / non-XML serialization. We would still need to deal
>> with issues of implied elements (e.g., <colgroup> and <tbody>) and
>> perhaps some escaping issues when moving between XML and HTML5
>> serializations.
>>
>> What do others think about this proposal?
>>

Take care,
Rob

Received on Saturday, 7 July 2007 11:59:21 UTC