Re: validator.w3.org and application/xhtml+xml

Gerald Oskoboiny <gerald@w3.org> writes:

> On Wed, Jun 26, 2002 at 06:47:40PM -0400, William F Hammond wrote:
...
> Hi,
> 
> Could you please send this to www-validator so it doesn't get
> lost? thanks!
> -- 
> Gerald Oskoboiny     http://www.w3.org/People/Gerald/
> World Wide Web Consortium (W3C)    http://www.w3.org/
> tel:+1-613-261-6630             mailto:gerald@w3.org

Here it is:

  Amaya honors xhtml when served as text/html, while the HTML WG
  advocates application/xhtml+xml for xhtml.  See
  http://www.w3.org/MarkUp/ and, in particular, RFC 3236, which,
  according to the former reference, does not in any way supersede
  RFC 2854 (informational).

  When I submit an xhtml document to your validator, I get:

  ------
     HTML Validation Service Results

    Sorry, I am unable to validate this document because its returned
    content-type was application/xhtml+xml, which is not currently
    supported by this service.
    
    Valid HTML 4.01! Gerald Oskoboiny
    Last modified: Date: 2001/09/14 04:13:13 
  ------

If I submit the same content as "text/xml", it sails through cleanly.

If I submit it again as "text/html" after (optional) small tweaks for
XHTML 1.0 backward compatibility, it validates apart from a comment
about the absence of an encoding, which is a consequence of (optional)
exclusion of the XML declaration.

Shouldn't the W3C validator attempt to parse any content submitted as
text/html (RFC 2854), text/xml (RFC 3023), application/xml (RFC 3023),
or application/xhtml+xml (RFC 3236)?
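
To make the question concrete, here is a rough sketch in Python of the
kind of dispatch I have in mind.  It is not the validator's actual
code; the function name and the parser labels ("sgml-or-xml", "xml")
are made up for illustration.

```python
from email.message import Message

# Media types named in the question; the parser labels are hypothetical.
PARSER_FOR_TYPE = {
    "text/html": "sgml-or-xml",      # may carry HTML 4.01 or XHTML 1.0
    "text/xml": "xml",
    "application/xml": "xml",
    "application/xhtml+xml": "xml",
}

def choose_parser(content_type_header):
    """Return a parser label for a Content-Type value, or None if unsupported."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_type() drops parameters and lowercases the type/subtype.
    return PARSER_FOR_TYPE.get(msg.get_content_type())
```

With something like this, choose_parser("application/xhtml+xml") would
select the XML parser instead of producing the "not currently
supported" message.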

Isn't it assumed for text/html transfer that any necessary non-default
encoding information is to be derived from a "charset" spec in the
Content-Type transfer header?

Thanks.

                                    -- Bill

P.S.

I think it would be a good thing if there were an update of RFC 2854,
or just a separate re-registration of text/html, with the purpose of
making transfer charset information robust for use with XHTML.

I suggest that this simply involves providing profile=xhtml (as
opposed to a default profile=classic) to give proper context for
charset interpretation since the SGML character set has different
defaults for classic HTML (i.e., version 4.01) and for XML versions of
HTML.  I doubt that complicated namespace profile settings, parallel to
application/xhtml+xml profile values, make sense in the near future
for text/html.
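
A sketch of how a receiver might read such a parameter, in Python.
Note that "profile" is not a registered parameter of text/html; the
values "xhtml" and "classic" are only the ones proposed above, and the
function name is invented for this example.

```python
from email.message import Message

def text_html_profile(content_type_header):
    """Return the proposed profile for a text/html Content-Type value.

    Returns None for any other media type.  "classic" is the proposed
    default when no profile parameter is present.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    if msg.get_content_type() != "text/html":
        return None
    # Hypothetical parameter: falls back to "classic" when absent.
    profile = msg.get_param("profile", failobj="classic")
    return profile.lower()
```

A receiver could then choose its charset defaults from the returned
profile rather than guessing from the document content.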

Received on Friday, 28 June 2002 09:19:30 UTC