Re: xml:base

John Cowan wrote:

> Note:  I'm a member of the XML Core WG, which owns the XML Base spec,
> and I may speak in accordance with my best recollection of things
> discussed there when making statements about intentions.  However,
> I don't speak for the WG.

I'd love to know what the WG smoked when they decided the part about
the <SystemLiteral>.  The version in XML 1.0 2nd ed. was still KISS:

| The SystemLiteral is called the entity's system identifier. It is a
| URI reference (as defined in [IETF RFC 2396], updated by [IETF RFC
| 2732]), meant to be dereferenced to obtain input for the XML
| processor to construct the entity's replacement text.

After that it deteriorated rapidly to "a SystemLiteral is gibberish,
and you're supposed to know how to convert it to a URI (preferably
by not reading RFC 3987)" - admittedly not reading RFC 3987 was easy
before it was published:

| The SystemLiteral is called the entity's system identifier. It is
| meant to be converted to a URI reference (as defined in [IETF RFC
| 3986]), as part of the process of dereferencing it to obtain input
| for the XML processor to construct the entity's replacement text.

This means worldwide punycode as specified in RFC 3987 for the mere
purpose of resolving an external entity.  RFC 3987 is a "proposed
standard"; IIRC the complete IDNA stuff is also at PS.  And the IETF
definition of a "proposed standard" in RFC 2026 contains:

# Implementors should treat Proposed Standards as immature
# specifications.  It is desirable to implement them in order to gain
# experience and to validate, test, and clarify the specification.
# However, since the content of Proposed Standards may be changed if
# problems are found or better solutions are identified, deploying
# implementations of such standards into a disruption-sensitive
# environment is not recommended.

What fun, XML 1.0 4th edition is an "immature specification".  Or
rather its <SystemLiteral> is "immature", and the given algorithm
doesn't even reference or summarize RFC 3987 correctly... :-(
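
For reference, a minimal sketch of the escaping step as the later XML 1.0
editions describe it.  The disallowed set below (control characters, space,
the delimiters < > ", the "unwise" characters, and everything above #x7F) is
paraphrased from memory of section 4.2.2 and is an assumption, not a quote;
the function name is made up for illustration:

```python
# Assumed disallowed ASCII set: #x0-#x20, #x7F, and < > " { } | \ ^ `
# (paraphrased from XML 1.0 section 4.2.2, not quoted from the spec).
DISALLOWED_ASCII = (
    set(range(0x00, 0x21)) | {0x7F} | {ord(c) for c in '<>"{}|\\^`'}
)


def escape_system_literal(literal: str) -> str:
    """Percent-escape disallowed characters in a system identifier.

    Each disallowed character is converted to its UTF-8 bytes and every
    byte is written as %HH.  Existing '%' and '#' are left untouched.
    """
    out = []
    for ch in literal:
        cp = ord(ch)
        if cp > 0x7F or cp in DISALLOWED_ASCII:
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(escape_system_literal("http://example.org/my file.dtd"))
    print(escape_system_literal("http://example.org/caf\u00e9.dtd"))
```

Note that a sketch like this percent-escapes non-ASCII characters wherever
they occur, host part included, and never touches Punycode; whether that
matches what RFC 3987 wants for host names is exactly the open question.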

Do they have no "downref" procedures here?  Optionally 3987 IRIs
can contain _spaces_ and other horrors.  The chance of RFC 3987
going to draft standard unmodified is IMO zero.  The chance of IDNA
surviving as is is also lousy, they're already working on IDNAbis.

And I doubt that the XML WG or anybody else would like to get an
indirect reference to Unicode 3.2 (sic!) with IDNA as it still is.

Frank

Received on Saturday, 21 April 2007 00:50:38 UTC