Re: XHTML 1.0, section C14 from Benjamin Hawkes-Lewis on 2006-11-27 (www-html@w3.org from November 2006)

From: Benjamin Hawkes-Lewis <bhawkeslewis@googlemail.com>
Date: Mon, 27 Nov 2006 18:48:10 +0000
To: www-html <www-html@w3.org>
Cc: Mikko Rantalainen <mikko.rantalainen@peda.net>
Message-Id: <1164653290.4786.73.camel@galahad>

On Mon, 2006-11-27 at 14:26 +0200, Mikko Rantalainen wrote:

> If the content I'm serving is mathematically oriented and the 
> *source* format for that content is XHTML+MathML how on earth am I 
> supposed to convert that to HTML of equivalent quality? The best I 
> can think is HTML+PNG images: user loses at least font scaling, 
> baseline alignment and formula source.

1) Nobody is saying that the HTML variant would be of equivalent
quality.

2) I am not a mathematician, but I would suggest something like the
following. Embed a MathML document into the HTML document with an OBJECT
element. As a fallback for the MathML, have an SVG OBJECT. As a fallback
to the SVG OBJECT, have a PNG or GIF image. As the fallback or ALT text
to the image, serve a text serialization of the equation(s). Such text
serializations do exist; they are used in spreadsheets, email, IRC,
programming, and for screen reading and braille displays. See for
example:

http://mega.ist.utl.pt/~pocm/math-in-ascii/index.html

http://www.math-atlas.org/collection/how-to-read

http://groups.google.co.uk/group/sci.math/browse_thread/thread/cb48fc0d292aebe2/

http://www1.chapman.edu/~jipsen/mathml/asciimath.html

http://mathforum.org/typesetting/ascii.guidelines.html

http://en.wikipedia.org/wiki/User:Merge/Drafts/Plaintext_mathematics

http://www.dotlessbraille.org/NemethIntro.htm

It would be helpful if W3C (or some other standards organization)
adopted one of these.

This sounds like a lot of work, but it should be possible to set up a
system to generate MathML, SVG, GIF, and text serializations
automatically from a single source. You could also skip steps and jump
from MathML straight to the text serialization.
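
The fallback chain described above could be generated along these
lines. This is only a minimal sketch: the file names, the equation text
and the nested_fallback helper are hypothetical, and for simplicity it
wraps the PNG in an OBJECT too, rather than an IMG with ALT text.

```python
from html import escape


def nested_fallback(variants, text_alternative):
    """Build nested OBJECT markup; each OBJECT's content is its fallback."""
    # The innermost fallback is the plain-text serialization.
    markup = escape(text_alternative)
    # Wrap from the least preferred variant outwards to the most preferred.
    for url, media_type in reversed(variants):
        markup = '<object data="{}" type="{}">{}</object>'.format(
            escape(url, quote=True), escape(media_type, quote=True), markup
        )
    return markup


# Hypothetical variants, most preferred first.
variants = [
    ("eq1.mml", "application/mathml+xml"),
    ("eq1.svg", "image/svg+xml"),
    ("eq1.png", "image/png"),
]
html = nested_fallback(variants, "x = (-b +- sqrt(b^2 - 4ac)) / (2a)")
print(html)
```

A user agent that cannot render the outer OBJECT falls through to the
next one, and finally to the text.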

> There's no point to use XHTML over HTML unless one is going to use some
> other XML based language so I consider this a meaningful example.

That's mostly true now; but it will be less true when Web Forms 2.0, Web
Applications 1.0, and XHTML 2.0 begin to appear.

> Note that if the user agent supports even basics of XHTML+MathML 
> it's usually much better than a perfect HTML user agent in this use 
> case.

Since MathML can be placed in an OBJECT, HTML doesn't limit MathML at
all.

> Yes, this is only one example but I hope it illustrates the need for 
> quality parameter. Only one variant can be the *source* format, all 
> the other variants that the server is able to provide a more or less 
> perfect approximations.

There's no particular reason to assume the source format is a web media
type at all. It might be serialized out of a database somewhere. 

> I'd ban the "*/*" in the Accept header unless it had a quality less 
> than one. If the intent is to hint the server that UA is willing to 
> download any binary file that choice should be considered a fallback 
> and it's quality can never be 1. Or if UA's "download source 
> variant" action has been triggered, then the UA shouldn't sent 
> Accept header at all.

What is a "download source variant" action?

Also, according to the HTTP/1.1 specification (RFC 2616, section 14.1),
sending no Accept header is equivalent to "Accept: */*".
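
To make the negotiation concrete, here is a rough sketch of how a
server might combine the client's q values with a per-variant source
quality. The function names, the variant list and the 0.7 source
quality are assumptions for illustration only, and media type
parameters other than q are ignored.

```python
def parse_accept(header):
    """Return (media_range, q) pairs from an Accept header value."""
    if header is None:
        # No Accept header: the client is assumed to accept all media types.
        header = "*/*"
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((parts[0].lower(), q))
    return ranges


def quality_for(media_type, ranges):
    """Quality the client assigns to media_type; most specific range wins."""
    best_specificity, best_q = -1, 0.0
    main = media_type.split("/")[0]
    for media_range, q in ranges:
        if media_range == media_type:
            specificity = 2
        elif media_range == main + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def choose_variant(header, variants):
    """Pick the variant with the highest client q times source quality."""
    ranges = parse_accept(header)
    scored = [(quality_for(mt, ranges) * qs, mt) for mt, qs in variants]
    best_score, best_type = max(scored, key=lambda s: s[0])
    return best_type if best_score > 0 else None


# Hypothetical variants: the XHTML source and an approximate HTML rendering.
variants = [("application/xhtml+xml", 1.0), ("text/html", 0.7)]
print(choose_variant(None, variants))
print(choose_variant("text/html, */*;q=0.1", variants))
```

With no Accept header, the source variant wins on its own quality; with
"*/*" demoted to q=0.1, the explicitly listed text/html wins instead.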

> Why? Imagine a perfect user agent that supports both XHTML and HTML 
> without a flaw. Why should it prefer one over another? The server 
> should send the *source* variant if it is able to provide any. (In 
> the *best* case XHTML and HTML variants really are interchangeable 
> and in that case it doesn't matter which variant the user agent gets).

The best case is a very limited subset of the potential of
application/xhtml+xml, as you've just demonstrated with your MathML
example. UAs can never know they are requesting a resource that falls
into that category, so assuming they can handle application/xhtml+xml as
well as text/html, they should (probably) prefer application/xhtml+xml.
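
As an illustration, a UA with that preference might build its Accept
header as sketched below. The build_accept helper and the particular q
values are assumptions, not anything a specification mandates.

```python
def build_accept(preferences):
    """Serialize (media_range, q) pairs into an Accept header value."""
    parts = []
    for media_range, q in preferences:
        if q >= 1.0:
            # q=1 is the default, so it can be omitted.
            parts.append(media_range)
        else:
            # HTTP/1.1 allows at most three decimal places in a q value.
            parts.append("{};q={:g}".format(media_range, round(q, 3)))
    return ", ".join(parts)


# Hypothetical preferences: XHTML first, HTML close behind, anything else last.
accept = build_accept([
    ("application/xhtml+xml", 1.0),
    ("text/html", 0.9),
    ("*/*", 0.1),
])
print("Accept: " + accept)
```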

> IMHO, the UA should describe its support for various MIME types in 
> its Accept header

What does "support" mean? Rendering (say HTML)? Opening with a plugin
(say QuickTime)? Opening automatically in another program (say Microsoft
Word)?

> I agree. At least recommend that the *default* setup makes it 
> immediately obvious that the content provider is sending invalid 
> content. That way web "designers" cannot just ignore those errors 
> and pretend that it hurts only a few.

That's right, the default setup MUST trigger a high priority warning.
The warning dialogue MAY include the option to relegate such irritating
warnings to the status bar (or equivalent) for this site or all sites.
(This is important if end-users are to be dissuaded from switching back
to more broken user agents.) Most browsers do something similar as you
move between HTTP and HTTPS locations, or when a site tries to set
cookies. The security benefits of such warnings are extremely dubious:

http://www.cs.auckland.ac.nz/~pgut001/pubs/phishing.pdf

http://people.deas.harvard.edu/~rachna/papers/why_phishing_works.pdf

Given these problems, the idea that error messages should be reserved
for security reasons is untenable. Gobbledygook security warnings worry
site creators much more than most site users. If you'll forgive the
cynicism, I predict that gobbledygook MIME errors would have a similar
effect, and encourage site creators to fix how content is served. And
at least they don't risk encouraging a false sense of security (except
to any extent that MIME errors represent a security problem in
themselves).

--
Benjamin Hawkes-Lewis
Received on Monday, 27 November 2006 18:56:48 UTC