
Re: Recent ERB Work on URL addressing



Tim Bray wrote:
> 
> This report is on the last couple of ERB meetings, on March 8 and 12th.
> 
> The majority of the time was spent on the issues (the discussion items
> numbered 4.*). One issue is central: can we afford to assert that all
> XML locators are URLs?  The advantages are huge:
>  - no need for separate locator-language machinery
>  - web conformance.  On the WWW, locators are URLs, and that's that
>  - existing understanding and machinery
> 
> There is one substantial downside: pure XML IDREFs cannot in this
> framework be XML Links; a URL consisting only of a Name is interpreted
> in URL-talk as a relative URL - you'd have to put a leading '#' in order
> to get IDREF behavior.  You could rely on the declaration - if it's
> declared IDREF, then you know - but this probably requires DTD processing
> in order to recognize links, and adds complexity; neither is good.
> 
> Another issue arises, unavoidably.  When a URL points into an XML
> doc, what does the part after the '#' mean?  We have to define this.
> And if we are going to assert that we want to have all our pointers
> be URLs, then we have to explain how to squeeze TEI Xpointers into
> these things.
> 
> This leads to the #1 problem that makes the ERB unhappy about the
> pure-URL idea: internationalization.  We have gone to great lengths
> to allow the use of any sane Unicode encoding in data and markup -
> and yet the rules as to what can be in a URL are very restrictive;
> URL-encoded UTF-8 is going to be massively non-human readable.  On
> the other hand, it may be the case that browsers de facto do the
> right thing with the part after the '#' - for sure this doesn't get
> sent out over the network, so why can't it be internationalized?  So
> maybe we could just assert that the URL-encoding is not required
> after the '#'.  We have some action items to check out what the specs
> say, what people think they mean, and what the de facto behavior of
> popular browsers is.
> 
> Summary: the ERB is leaning *very* strongly to asserting that all
> locators are to be URLs - and will almost certainly go this way, if
> we aren't thereby throwing away our nice clean international
> interoperability.  Input welcome. - Tim

If the Web architecture cannot support clean internationalization, 
then making all locators URLs is only an option for the first 
release.  My guess is, this is a hole that can be exploited in the 
competition between the two commercial vendors of web browsers and 
should, therefore, be left open in some way to take advantage 
of the one who is able to devise a *beyond URL* solution.
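
To make the readability cost concrete, here is a minimal Python sketch 
using the standard urllib.parse module.  The document address and the 
element name are made up for illustration; the point is only what 
URL-encoded UTF-8 looks like, and that the part after the '#' can be 
split off before anything goes over the network.

```python
from urllib.parse import quote, unquote, urldefrag

# Hypothetical element name in Japanese; any non-ASCII name would do.
name = "日本語"

# Percent-encode the UTF-8 bytes of the name, as strict URL rules require.
encoded = quote(name)

# Hypothetical locator pointing into an XML document.
locator = "http://example.org/doc.xml#" + encoded

# Split the locator into the part a client would request and the fragment
# that stays on the client side.
url_part, fragment = urldefrag(locator)

print(encoded)             # %E6%97%A5%E6%9C%AC%E8%AA%9E
print(url_part)            # http://example.org/doc.xml
print(unquote(fragment))   # 日本語
```

Three characters become 27, none of them recognizable to a human 
reader, which is the concern raised in the report.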

If IDREFs can't be links, we are invalidating a lot of existing 
work in SGML as well in order to accommodate a weak addressing 
system.
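
The ambiguity Tim describes can be seen with a small Python sketch, 
again using urllib.parse.  The base address and the name "intro" are 
assumptions chosen for illustration, and resolve() is a hypothetical 
helper, not part of any proposal.

```python
from urllib.parse import urljoin

# Hypothetical address of the document that contains the link.
base = "http://example.org/docs/spec.xml"

def resolve(reference):
    """Resolve a link value against the containing document's address."""
    return urljoin(base, reference)

# A bare Name, as an SGML IDREF would be written, is read as a relative URL
# and so points at a different resource altogether.
bare = resolve("intro")

# With a leading '#', the same Name is read as a fragment of this document.
with_hash = resolve("#intro")

print(bare)       # http://example.org/docs/intro
print(with_hash)  # http://example.org/docs/spec.xml#intro
```

Under URL rules only the second form stays inside the current 
document, so existing IDREF values would all need the '#' added.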

It looks like URLs need some work.  You should make it clear 
that URLs are a first version limitation only and that XML 
developers should be working on a stronger, more flexible, 
and more international solution.  If they can't do this 
within the URL framework, then it should be replaced.

len

