RE: [Parsing] When/how flagged as being HTML

From: DESCHAMPS Stephane ROSI/SI CLIENT <stephane.deschamps@orange-ftgroup.com>
Date: Tue, 10 Jul 2007 11:21:15 +0200
To: "'Karl Dubost'" <karl@w3.org>, "'HTMLWG WG'" <public-html@w3.org>
Message-ID: <008901c7c2d3$aa634930$4111b30a@stquentiny.francetelecom.fr>


> -----Original Message-----
> On behalf of Karl Dubost
> Sent: Tuesday, 10 July 2007 03:47

> But it isn't related to when the document or the input is 
> flagged as being HTML. So back to the sentence
> 
>      Document objects are assumed to be XML documents
>      unless they are flagged as being HTML documents when
>      they are created.
> 
> * When an input stream is actually flagged as being HTML?
> * How do we flag an input stream as being an HTML document?
>      * HTTP text/html
>      * local filesystem?
> 
> Related question:
> A document sent with application/xhtml+xml must be treated by 
> an XML parser.
> What an HTML parser does when receiving such a document. ignores it?  
> (in the case I have built an application which has only an 
> HTML parser and not an XML Parser.)

All your questioning brings me back to a question I didn't dare ask thus
far, for fear of reopening what looks like a huge can of worms, but why was
the "5" dropped in the doctype?

I'm back-reading on that, but all the threads are very long, so could
anybody summarize why the "5" was dropped?
(This refers to Chris Wilson's message "Versioning and html[5]" [1]; even
before reading it, I thought it was sound to include a version number.)

Side thought related to your question: maybe we could have two doctypes, one
for HTML5 parsed as HTML, one for HTML5 parsed as XML. (After all, we've got
three DTDs for HTML4 [2], and apart from Transitional, the other two are for
separate types of HTML with separate validation criteria.)
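
To make the flagging question concrete, here is a minimal sketch of how an
application might pick a parser from the media type alone, following the
text/html and application/xhtml+xml cases Karl lists above. It is not spec
text: the function name choose_parser and its return labels are hypothetical,
and treating every "+xml" type as XML is an assumption. The local filesystem
case, where no media type is available, is left undecided.

```python
def choose_parser(content_type):
    """Return 'html', 'xml', or None for a Content-Type header value.

    Hypothetical helper for illustration only; not taken from any spec.
    """
    if not content_type:
        # No media type at all (e.g. a local file): left undecided here.
        return None

    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()

    if media_type == "text/html":
        return "html"

    # Assumption: any explicit XML type, or any "+xml" suffix, goes to XML.
    xml_types = ("application/xhtml+xml", "application/xml", "text/xml")
    if media_type in xml_types or media_type.endswith("+xml"):
        return "xml"

    return None
```

An application with only an HTML parser would get "xml" back for
application/xhtml+xml and would then have to decide whether to refuse the
document or fall back, which is exactly the open question in the quoted text.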

Again, my apologies if people find my question offensive or repetitive. I
have a few more candid points coming up ;)

[1] http://lists.w3.org/Archives/Public/public-html/2007Apr/0612.html
[2] http://www.w3.org/TR/html4/sgml/intro.html

-- 
Best regards,
Stéphane Deschamps
  Web HCI expert
  orange / france telecom group


Received on Tuesday, 10 July 2007 09:21:48 GMT
