Re: validator not doing application/xhtml+xml

Terje Bless <link@pobox.com> wrote:

> [ BTW, mimasa, I'm looking at this squarely from the Validator ]
> [ POV.  The RFC in question may well solve all problems in the ]
> [ general case, it's just that we have some odd things to take ]
> [ into account where the Validator is concerned.  Lots of old  ]
> [ baggage in the implementation not least of all! :-(          ]

Noted.  Likewise, I'm trying to limit the discussion to validator-
related issues only here.

> >An important piece of information for the validator is that the body of a
> >MIME entity sent as 'application/xhtml+xml' is syntactically XML.  That is,
> >the validator can switch to the "XML mode" without sniffing the actual
> >content.  That's a big difference from 'text/html'.
> 
> It's a big difference from 'text/html', but is it useful?
> 
> Ok, so we're now to the point where what we get is known to be generic XML.
> Now what? Can I go ahead and assume SGML-type semantics for the XML
> Application in question? Will it have a nice easy flattened DTD I can
> expect SP to handle? Or does this particular brand of application/xhtml+xml
> require XML Schema Validation? Namespaces? Do I need something that groks
> M12N? What are the Character Encoding semantics? What are the higher-level
> semantics so I can implement pretty-but-not-formal features (aka.
> "linting")?

Most of these points are not specific to this particular media type.
The validator already handles XHTML documents sent as text/xml and
application/xml.  I don't see any big difficulty in supporting
application/xhtml+xml as well.  On one specific point:

> What are the Character Encoding semantics?

The same semantics as for application/xml; RFC 3236 defines it that way.
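
As a rough illustration of what "the same semantics as application/xml"
means in practice, here is a minimal sketch in Python.  The function name
and the regular expression are hypothetical and are not taken from the
validator.  It assumes the application/xml rules: a charset parameter on
the Content-Type header wins if present; otherwise the XML rules apply
(byte order mark, then the encoding declaration, then UTF-8 by default).
It does not cover text/xml, whose default differs, and it does not detect
UTF-16 without a byte order mark.

```python
import re
from email.message import Message

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the entity (ASCII-compatible encodings only).
_XML_DECL_ENCODING = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def xml_entity_encoding(content_type, body):
    """Guess the encoding of an XML entity served with an XML media type.

    content_type: the raw Content-Type header value (str).
    body: the first bytes of the entity (bytes).
    """
    # Let the standard library parse the header parameters.
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lowercased, or None if absent
    if charset:
        return charset  # an explicit charset parameter is authoritative

    # No charset parameter: fall back to what the entity itself says.
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    match = _XML_DECL_ENCODING.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # XML default when nothing else is stated
```

For example, a document served as 'application/xhtml+xml' with no charset
parameter and an XML declaration naming Shift_JIS would be read as
Shift_JIS, while the same document served with 'charset=utf-8' would be
read as UTF-8 regardless of its declaration.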

> Are we even at a point where I would avoid all these problems by using a
> real Validating XML Processor instead of the half-baked hack that SP is in
> relation to XML? Because as a practical matter, real XML Validation is
> beyond us at the moment so SP-based hacks are what we can do so far.

I do understand that, and I'm not going to ask more than what
the validator can do for XHTML documents sent as application/xml.

> >In the absence of a DOCTYPE declaration, the validator may only perform
> >a well-formedness check, just like it does for XML documents sent as
> >'text/xml' or 'application/xml' at the moment.
> 
> Yes, we can implement that much right now, but I'm worried that
> application/xhtml+xml will need to cater to the same crowd that makes
> validating text/html such a joy. IOW that it'll need to be pragmatic rather
> than formal and strict in some key aspects. Doing just WF checking and then
> suddenly switching to full blown Validation is a sure way to get the
> Besserwissers to come crawling out of the woodwork (and I'm not even sure I
> blame them).

At the moment the validator completely stops validation when an HTML
document sent as text/html lacks a DOCTYPE declaration, and does
well-formedness checking when it receives an XHTML document without
a DOCTYPE declaration.  I don't see much difference.
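
To make the comparison concrete, the behaviour described above could be
sketched roughly as follows.  This is only an illustration in Python: the
function name and the returned labels are hypothetical, it is not how the
validator is actually written, and it leaves out the content sniffing that
text/html requires.

```python
# Media types whose body is known to be XML without looking at the content.
XML_TYPES = {"text/xml", "application/xml", "application/xhtml+xml"}


def choose_check(media_type, has_doctype):
    """Pick what kind of check to run for a document.

    media_type: the media type without parameters, e.g. "text/html".
    has_doctype: whether the document carries a DOCTYPE declaration.
    """
    media_type = media_type.strip().lower()
    if media_type in XML_TYPES:
        # XML mode: validate against the DTD if one is declared,
        # otherwise fall back to a well-formedness check.
        return "validate" if has_doctype else "well-formedness"
    if media_type == "text/html":
        # Without a DOCTYPE there is nothing to validate against.
        return "validate" if has_doctype else "stop"
    return "unsupported"
```

Under this sketch, adding application/xhtml+xml amounts to one more entry
in the set of XML media types; the DOCTYPE handling stays as it is.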

Regards,
-- 
Masayasu Ishikawa / mimasa@w3.org
W3C - World Wide Web Consortium

Received on Wednesday, 13 February 2002 15:08:42 UTC