
Re: ISSUE-54: doctype-legacy-compat

From: Henri Sivonen <hsivonen@iki.fi>
Date: Sun, 25 Jan 2009 16:40:50 +0200
Cc: HTML WG <public-html@w3.org>
Message-Id: <D91C2785-8F36-4424-8F67-B057E8F47A48@iki.fi>
To: Sam Ruby <rubys@intertwingly.net>

On Jan 25, 2009, at 16:00, Sam Ruby wrote:

> Henri Sivonen wrote:
>> On Jan 25, 2009, at 12:48, Sam Ruby wrote:
>>> Julian Reschke wrote:
>>>> Henri Sivonen wrote:
>>>>> ...
>>>>> Thus, "about:sgml-compat" is *not* interpreted as a URI by any  
>>>>> conforming HTML5 consumer. In my opinion, it is therefore  
>>>>> unnecessary for it to be of the form of a URI in a registered  
>>>>> scheme.
>>>
>>> What about XHTML5?
>> XHTML5 doesn't need a doctype and the best practice is not to use  
>> one.

> Can you cite *any* issues with placing a DOCTYPE in XHTML5 that
> back up your claim that not placing a DOCTYPE in such documents is
> a best practice?

<!DOCTYPE html> doesn't have any useful effect in XML, so it is  
completely pointless for *XML* purposes. It would be weird to  
characterize something pointless as "best practice". (It does have a  
point for non-XML purposes in the Venus case, though.)

With any doctype that points to an external subset, there are the  
following problems, the creation of which would be weird to  
characterize as "best practice":
  1) If a URI scheme that involves network traffic to dereference is
used, there exists an XML parser that generates wasteful network
traffic. (These requests have a dramatic performance impact on XML
parsing. They are wasteful, because XHTML5 doesn't need a DTD.)
  2) If a non-dereferenceable URI scheme is used, there can be assumed to
exist a parser that fails when it tries to dereference the URI.
  3) If a locally dereferenceable URI scheme (data:) is used, there can
be assumed to exist a parser that doesn't support the scheme and fails.
  4) If there is a public id, there exists a parser that doesn't
recognize it and falls back on the system id, reducing the problem to
points 1-3 above.
(These problems aren't specific to XHTML5--they are general XML  
problems.)
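
To make point 2 concrete, here is a minimal sketch using Python's
bundled expat bindings as a stand-in for "a parser that processes the
DTD". The names try_parse and refuse are hypothetical, the resolver is
deliberately written to fail every time, and the behavior shown assumes
an expat build with DTD support. It is an illustration of the failure
mode, not a claim about any particular deployed client.

```python
from xml.parsers import expat


def try_parse(document):
    """Parse with a DTD-processing parser whose entity resolver always fails.

    Returns (system ids the parser asked to resolve, whether parsing succeeded).
    """
    requested = []

    def refuse(context, base, system_id, public_id):
        # Hypothetical resolver: record the request, then report failure.
        requested.append(system_id)
        return 0  # 0 tells expat the external entity could not be handled

    parser = expat.ParserCreate()
    # Ask expat to process parameter entities, including the external subset.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    parser.ExternalEntityRefHandler = refuse
    try:
        parser.Parse(document, True)
    except expat.ExpatError:
        return requested, False
    return requested, True


BODY = ('<html xmlns="http://www.w3.org/1999/xhtml">'
        '<head><title>t</title></head><body/></html>')

# No external subset: the resolver is never consulted.
short_result = try_parse('<!DOCTYPE html>' + BODY)

# External subset with a system id the resolver cannot dereference.
long_result = try_parse('<!DOCTYPE html SYSTEM "about:sgml-compat">' + BODY)
```

With <!DOCTYPE html> the resolver is never called, so there is nothing
to fail. With a system id present, the same parser configuration asks
the resolver for it and aborts when the resolver gives up.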

A doctype with an internal subset would not address your dual media
type concern, since the cruft would be rendered in existing text/html
agents.

>> The concern of using a long doctype with XHTML5 only arises if one
>> * is generating markup with a legacy serializer
>> AND
>> * is caching only one sequence of resulting bytes per URI
>> AND
>> * is serving the same cached bytes as application/xhtml+xml to non- 
>> IE clients and as text/html to IE
>> AND
>> * wants to support (non-browser) XML clients that are configured to  
>> process the DTD and fail if the entity resolver fails to resolve  
>> the system id of the external subset.
>> To me, this looks like a set of fringe cases combined with AND--i.e.
>> something very improbable to be concerned with. (I do realize that
>> as improbable as it is, Planet Intertwingly happens to hit this  
>> exact combination. But it's already addressed by deploying a  
>> workaround at the first point.)
>
> To me an existence proof trumps what might otherwise seem  
> (logically) to be improbable.  But in any case, given the size of  
> the internet, the spec can (and does) need to consider "improbable"  
> cases.

Can you suggest any doctype that isn't <!DOCTYPE html> and that  
doesn't have any of the issues I outlined above on the existence proof  
level? As far as I can tell, using a doctype that can be produced by  
legacy XSLT serializers in XHTML5 and supporting all conceivable XML  
clients are conflicting requirements.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Sunday, 25 January 2009 14:41:34 UTC
