Re: ISSUE-54: doctype-legacy-compat from Sam Ruby on 2009-01-25 (public-html@w3.org from January 2009)

From: Sam Ruby <rubys@intertwingly.net>
Date: Sun, 25 Jan 2009 12:31:37 -0500
To: Henri Sivonen <hsivonen@iki.fi>
CC: HTML WG <public-html@w3.org>
Message-ID: <497CA1F9.4000500@intertwingly.net>

Henri Sivonen wrote:
> 
> On Jan 25, 2009, at 16:00, Sam Ruby wrote:
> 
>> Henri Sivonen wrote:
>>> On Jan 25, 2009, at 12:48, Sam Ruby wrote:
>>>> Julian Reschke wrote:
>>>>> Henri Sivonen wrote:
>>>>>> ...
>>>>>> Thus, "about:sgml-compat" is *not* interpreted as a URI by any 
>>>>>> conforming HTML5 consumer. In my opinion, it is therefore 
>>>>>> unnecessary for it to be of the form of a URI in a registered scheme.
>>>>
>>>> What about XHTML5?
>>> XHTML5 doesn't need a doctype and the best practice is not to use one.
> 
>> Can you cite *any* issues with placing a DOCTYPE in XHTML5 that back 
>> up your claim that not placing a DOCTYPE in such documents is a best 
>> practice?
> 
> <!DOCTYPE html> doesn't have any useful effect in XML, so it is 
> completely pointless for *XML* purposes. It would be weird to 
> characterize something pointless as "best practice". (It does have a 
> point for non-XML purposes in the Venus case, though.)

Again, it is worth repeating that Venus produces a file.  Whether that 
file is later served as text/html or as application/xhtml+xml is 
something the person who uses Venus decides.  As such, <!DOCTYPE html> 
serves a very valid purpose for this application.

It would be weird to characterize <!DOCTYPE html> as anything other than 
best practice merely because the output *might* be served as 
application/xhtml+xml.

> With any doctype that points to an external subset, there are the 
> following problems, the creation of which would be weird to characterize 
> as "best practice":
>  1) If a URI scheme that involves network traffic to dereference is used, 
> there exists an XML parser that generates wasteful network traffic. 
> (These requests have a dramatic performance impact on XML parsing. They 
> are wasteful, because XHTML5 doesn't need a DTD.)
>  2) If a non-dereferenceable URI scheme is used, there can be assumed to 
> exist a parser that fails when it tries to dereference the URI.
>  3) If a locally dereferenceable URI scheme (data:) is used, there can be 
> assumed to exist a parser that doesn't support the scheme and fails.
>  4) If there is a public id, there exists a parser that doesn't 
> recognize it and falls back onto the system id reducing the problem to 
> points 1-3 above.
> (These problems aren't specific to XHTML5--they are general XML problems.)
> 
> A doctype with an internal subset would not address your dual media type 
> concern, since cruft would be rendered in existing text/html agents.

I am not suggesting any of the above.  I was talking about either of the 
two DOCTYPEs that are defined for HTML5.  I would agree that using 
anything other than those two may happen to work as XHTML5, but would 
not be best practice.
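
As a rough illustration of the point about the two DOCTYPEs, here is a 
minimal sketch that assumes Python's standard xml.dom.minidom, a 
non-validating parser that does not fetch external subsets.  The helper 
name doctype_ids and the sample markup are hypothetical, and the long 
form is assumed to be spelled with the "about:sgml-compat" system id 
quoted earlier in this thread.  It only shows what one XML parser 
reports for each DOCTYPE; it says nothing about parsers that do try to 
resolve the system id.

```python
from xml.dom.minidom import parseString

XHTML_NS = "http://www.w3.org/1999/xhtml"


def doctype_ids(doctype_line):
    """Parse a tiny XHTML document and return (name, publicId, systemId).

    doctype_line is the DOCTYPE declaration to place before the root
    element; the rest of the document is fixed sample markup.
    """
    source = (
        doctype_line
        + '<html xmlns="' + XHTML_NS + '">'
        + "<head><title>t</title></head><body/></html>"
    )
    doc = parseString(source)  # raises ExpatError if not well-formed
    dt = doc.doctype
    return (dt.name, dt.publicId, dt.systemId)


# Short form: no public id and no system id, so there is nothing
# for an entity resolver to dereference.
short_form = doctype_ids("<!DOCTYPE html>")

# Long form (spelling assumed from the thread): the system id is
# exposed as a string; this parser does not try to resolve it.
long_form = doctype_ids('<!DOCTYPE html SYSTEM "about:sgml-compat">')

print(short_form)
print(long_form)
```

Both inputs are well-formed XML for this parser; the difference is that 
only the long form carries a system id that a differently configured 
XML client might attempt to resolve.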

>>> The concern of using a long doctype with XHTML5 only arises if one
>>> * is generating markup with a legacy serializer
>>> AND
>>> * is caching only one sequence of resulting bytes per URI
>>> AND
>>> * is serving the same cached bytes as application/xhtml+xml to non-IE 
>>> clients and as text/html to IE
>>> AND
>>> * wants to support (non-browser) XML clients that are configured to 
>>> process the DTD and fail if the entity resolver fails to resolve the 
>>> system id of the external subset.
>>> To me, this looks like fringe cases combined with AND--i.e. something 
>>> very improbable to be concerned with. (I do realize that as 
>>> improbable as it is, Planet Intertwingly happens to hit this exact 
>>> combination. But it's already addressed by deploying a workaround at 
>>> the first point.)
>>
>> To me an existence proof trumps what might otherwise seem (logically) 
>> to be improbable.  But in any case, given the size of the internet, 
>> the spec needs to (and does) consider "improbable" cases.
> 
> Can you suggest any doctype that isn't <!DOCTYPE html> and that doesn't 
> have any of the issues I outlined above on the existence proof level? As 
> far as I can tell, using a doctype that can be produced by legacy XSLT 
> serializers in XHTML5 and supporting all conceivable XML clients are 
> conflicting requirements.

I am capable of producing <!DOCTYPE html> with the XSLT processors I 
happen to use, though I recognize I'm depending on behavior that may 
not be interoperable.

I did not comment on this previously, but as you have persisted, I 
object to you continuing to refer to XSLT as "legacy".  Such a 
pejorative adjective would be appropriate if there were anything else 
that could be employed to transform a series of Atom entries into a 
document that can be rendered by browsers.  I say that knowing that I 
have implemented a number of transformations involving converting Atom 
entries into a series of hash tables and post processing the same with 
templating processors.  While I would love to say that they replace 
XSLT, sadly, this is not the case.

- Sam Ruby
Received on Sunday, 25 January 2009 17:32:19 UTC