Recent ERB Work on URL addressing

This report covers the last two ERB meetings, on March 8 and 12.

The majority of the time was spent on the issues (the discussion items
numbered 4.*). One issue is central: can we afford to assert that all
XML locators are URLs?  The advantages are huge:
 - no need for separate locator-language machinery
 - web conformance.  On the WWW, locators are URLs, and that's that
 - existing understanding and machinery

There is one substantial downside: pure XML IDREFs cannot in this
framework be XML Links; a URL consisting only of a Name is interpreted
in URL-talk as a relative URL - you'd have to put a leading '#' in order
to get IDREF behavior.  You could rely on the declaration - if it's
declared IDREF, then you know - but this probably requires DTD processing
in order to recognize links, and adds complexity; neither is good.
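
To make the bare-Name problem concrete, here is a minimal sketch using
Python's standard urllib.parse module.  The base document URL and the
Name "intro" are hypothetical, chosen only for illustration; the point
is how generic relative-URL resolution treats the two spellings.

```python
from urllib.parse import urljoin, urldefrag

# Hypothetical URL of the document that contains the link.
base = "http://example.org/docs/spec.xml"

# A bare Name is resolved as a relative URL: it names a sibling resource,
# not an element inside the current document.
bare = urljoin(base, "intro")

# With a leading '#', the same Name becomes a fragment identifier on the
# current document, which is the IDREF-like behavior we want.
frag = urljoin(base, "#intro")

print(bare)                      # http://example.org/docs/intro
print(frag)                      # http://example.org/docs/spec.xml#intro
print(urldefrag(frag).fragment)  # intro
```

So under pure URL rules the two forms point at different things, which
is why an undecorated IDREF value cannot double as a link.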

Another issue arises, unavoidably.  When a URL points into an XML
doc, what does the part after the '#' mean?  We have to define this.
And if we are going to assert that we want to have all our pointers 
be URLs, then we have to explain how to squeeze TEI Xpointers into
these things.

This leads to the #1 problem that makes the ERB unhappy about the 
pure-URL idea: internationalization.  We have gone to great lengths
to allow the use of any sane Unicode encoding in data and markup -
and yet the rules as to what can be in a URL are very restrictive;
URL-encoded UTF-8 is going to be massively non-human readable.  On
the other hand, it may be the case that browsers de facto do the
right thing with the part after the '#' - for sure this doesn't get
sent out over the network, so why can't it be internationalized?  So
maybe we could just assert that the URL-encoding is not required 
after the '#'.  We have some action items to check out what the specs
say, what people think they mean, and what the de facto behavior of
popular browsers is.
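
As a rough illustration of the readability cost, the sketch below
URL-encodes a non-ASCII name as UTF-8 with Python's standard
urllib.parse.quote.  The name "résumé" and the document name are
hypothetical examples, not anything the ERB has settled on.

```python
from urllib.parse import quote, unquote

# Hypothetical element ID containing non-ASCII characters.
name = "résumé"

# quote() encodes as UTF-8 by default and %-escapes every non-ASCII byte.
encoded = quote(name, safe="")

# What the locator would look like under strict URL-encoding,
# versus leaving the part after the '#' unencoded.
strict = "spec.xml#" + encoded
relaxed = "spec.xml#" + name

print(strict)   # spec.xml#r%C3%A9sum%C3%A9
print(relaxed)  # spec.xml#résumé

# The encoding is lossless, just not readable.
print(unquote(encoded) == name)  # True
```

Each accented letter costs six characters here; a name in a script
with no ASCII letters at all would be nothing but %XX escapes.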

Summary: the ERB is leaning *very* strongly to asserting that all
locators are to be URLs - and will almost certainly go this way, if
we aren't thereby throwing away our nice clean international 
interoperability.  Input welcome. - Tim


Cheers, Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-708-9592

Received on Friday, 14 March 1997 11:49:33 UTC