Re: IBM Position Statement on XForms and Web Forms 2.0

Mark Birbeck wrote:
> but there are lots of things you can do with a C++ plug-in that you 
> cannot do efficiently in script. (If there weren't, why would you bother
> implementing WF 2.0 natively?)

Of course.  Any JS implementation is going to be limited and a plugin or 
native support in the browser is obviously going to be superior.

> [...] the different XForms implementations reflect this--you can go 
> from a server-hosted, zero-install solution like Orbeon or Chiba,

As far as I can tell from my brief look at those two today, they convert 
XForms on the server side to HTML 4 forms on the front end.  Is that 
correct?  If so, that fits the model described in WF2.

http://www.whatwg.org/specs/web-forms/current-work/#r-to-xforms

> FormsFaces

For that JS implementation to function in IE, it requires serving as 
text/html, which (as I've already said) is unacceptable.

> and facileXForms in between.

I couldn't find that one.  I found "FacileForms" for Mambo and Joomla, 
but that didn't appear to have anything to do with XForms.

>> XForms is also required to be used within an XML document served with an
>> XML MIME type, such as application/xhtml+xml...
> 
> That's not true. XForms not only does *not* mandate a MIME type, but
> it doesn't even mandate a specific host language!

XForms is defined as an application of XML and defines a host language
as "An XML vocabulary, such as XHTML, into which XForms is embedded."

http://www.w3.org/TR/xforms/slice13.html#def-host-language

Since both XForms and the host language are XML, both should be parsed 
as XML.  When sent over the wire, this is achieved by using an XML MIME 
type.  Browsers do *not* use XML parsing when a document is served as
text/html; tag soup parsing is used instead.
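
To illustrate the point, here is a minimal sketch of how a user agent might
choose a parser from the Content-Type header alone.  The function name and
the returned labels are hypothetical; the rule for recognising XML MIME
types (text/xml, application/xml, or any type ending in +xml) is an
assumption based on the usual convention, not on any browser's source.

```python
def parsing_mode(content_type: str) -> str:
    """Return which kind of parser the MIME type selects (illustrative only)."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()

    # Assumed convention for XML MIME types.
    if media_type in ("text/xml", "application/xml") or media_type.endswith("+xml"):
        return "xml"

    # text/html is handled by the tag soup parser, whatever the markup says.
    if media_type == "text/html":
        return "tag-soup"

    return "other"


print(parsing_mode("application/xhtml+xml; charset=utf-8"))
print(parsing_mode("text/html"))
```

Note that the DOCTYPE, xmlns attribute and XML declaration never enter into
the decision; only the MIME type does.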

>>  In fact, formsPlayer seems to add
>> support for XForms in text/html documents, which is obviously
>> non-conformant, because text/html is not XML!
> 
> Non-conformant to what, exactly? See previous point about XForms not
> mandating any particular host language.

See previous comment about the host language being defined as XML.

>> I believe he meant that XForms cannot be used correctly in text/html
>> documents (despite what the formsPlayer plugin does), and thus documents
>> using XForms would not be compatible with the most prevalent user agent (IE)
>> and cannot be used in the vast majority of HTML on the web.
> 
> I don't really understand this, although I'm trying. Just so that
> people aren't getting the wrong impression here, an XML document does
> not cease to be XML just because it is delivered with the wrong MIME
> type, even though that is of course bad practice.

Perhaps, theoretically, it may be XML.  But practically speaking, using 
an XHTML DOCTYPE, xmlns attribute and/or XML declaration is virtually 
irrelevant, and documents served as text/html will be handled as tag 
soup, not XML.  So for all intents and purposes, such documents should 
be considered to be HTML.

> [...] and obviously the type of browser being used by your users
> will have an effect on what solution you choose.

Precisely.  WF2 is aimed at authors writing web applications for users 
who are predominantly using browsers like IE, Firefox, Safari, Opera,
etc.  The vast majority of users around the world are not using browsers 
that support XForms yet (either natively or by a plugin); they're using 
browsers that support HTML 4 forms.  Of course, this situation may 
change in the future, but a good solution will take the current 
situation into account and provide reasonable graceful degradation now.

>> And in the past 6 years, XHTML has failed to take off.  The vast
>> majority of authors don't use XHTML, they use HTML.  Why should we
>> attempt to force authors to move to XHTML, when there is significant
>> evidence to show that it won't work?  Just compare the number of pages
>> served as text/html with the number served as XML, and consider the fact
>> that the most widely used UA doesn't even support XHTML, thus making it
>> virtually pointless to publish XHTML in the real world today.
> 
> I'm not sure that this is true. I'll keep an open mind of course, but
> factors that make me think that what you say is difficult to prove,
> and probably incorrect, are:
> 
>  * most web pages will be served by automated systems, not
>    hand authored;

How is that relevant to the question of whether authors use HTML or XHTML?

>  * even of those that are hand authored, a significant number
>    will be authored with XML tools, not 'pure' HTML ones;

And a significant number will not use real XML tools on the back end 
(see below).

>  * even XHTML pages will be served with the text/html MIME
>    type, since that gives the widest browser reach;

Such pages are to be considered HTML, not XHTML (see above).

>  * so to know how many pages are delivered as XHTML you'd
>    need to know what systems are doing the delivering--the
>    MIME type tells you nothing.

No, the MIME type tells you everything about whether or not a page is 
HTML or XHTML.  The systems used on the back end are irrelevant, as is 
the markup itself.

If the MIME type really does tell you nothing, would you consider the
following invalid, but well-formed, markup to be HTML or XHTML?

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
    ...
</html>

If it's XHTML, it's certainly well formed, though invalid.  If it's 
HTML, it's just invalid due to the xmlns attribute.  <?xml ... ?> can 
either be the XML declaration or an SGML PI.

(I have seen variations of that in the wild.)
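
As a rough check of the well-formedness claim, the markup above can be fed
unchanged to an XML parser.  This sketch uses Python's standard-library
ElementTree, which (as far as I know) does not fetch the external DTD, so it
tests well-formedness only, not validity.  The "..." is kept as literal text
content rather than filled in.

```python
import xml.etree.ElementTree as ET

# The sample exactly as written, with the elided content left as "...".
sample = b'''<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
    ...
</html>
'''

# Raises ET.ParseError if the markup is not well-formed XML.
root = ET.fromstring(sample)

# An XML parser puts the root element in the XHTML namespace,
# despite the HTML 4.01 DOCTYPE.
print(root.tag)
print(root.get("lang"))
```

The markup itself cannot settle the question: an XML parser accepts it, and
a tag soup parser would too.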

> My guess is that a lot of systems are generating pages on the fly on
> the server, and are using XML-related tools to do so,

What evidence do you have to support that claim?  Here's my evidence 
against it:

* In all the projects I've taken on over the last few years, only one of 
them used XML processing on the back end.  In that case, it was used to 
generate a small fragment of XHTML that was inserted into hand-authored 
markup which was mostly not well-formed.

* Many popular CMSs (like WordPress, MovableType, etc.) which output 
markup with XHTML DOCTYPEs *do not* use XML processing.

* XML tools should guarantee well-formed output, but (as can be easily 
assessed by validating a sample of pages) the majority of XHTML as 
text/html is *not* well-formed.
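
The kind of assessment mentioned in the last point could be sketched as
follows.  This is only an illustration: the two sample documents are made up
for the example, is_well_formed is a hypothetical helper, and a real survey
would fetch pages over HTTP instead.  It checks well-formedness, which is a
weaker test than validation.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: bytes) -> bool:
    """Return True if the markup parses as XML (well-formedness only)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Made-up samples standing in for pages collected from the web.
samples = {
    "closed": b'<html xmlns="http://www.w3.org/1999/xhtml">'
              b'<body><p>ok<br/></p></body></html>',
    "unclosed": b'<html xmlns="http://www.w3.org/1999/xhtml">'
                b'<body><p>oops<br></body></html>',
}

results = {name: is_well_formed(markup) for name, markup in samples.items()}
for name, ok in results.items():
    print(name, "well-formed" if ok else "NOT well-formed")
```

One caveat: because this parser does not load the XHTML DTD, pages that use
named entities such as &nbsp; may be reported as errors here even though
they would be fine with the DTD available.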

-- 
Lachlan Hunt
http://lachy.id.au/

Received on Friday, 1 September 2006 02:15:40 UTC