Re: Base

(I hold no brief for the mess Dave is concerned about, and have no
intention of defending the URL RFCs ...)

Dave Hollander writes:
| Lee mentions the SoftQuad use of PI to contain a URL. Great for them, 
| but I can not rely on that across vendors. Since concepts like these 
| are so useful, can we not work to establish a consistent way to identify 
| them to vendors so that they can create richer universal tools?

It may be out of scope, but we can figure that out as we go along.

[ ... ]
| I will try again to list the two possible interpretations of 
| RFC 1808 in regards to base. Note, while Roy and others deny that the 
| spec can be interpreted in multiple ways, many others did see the 
| unambiguity. 
| 
| 1) (traditional) the value of the base is added to all relative URLs
| 
| 2) the url value for the "resource" (the resource may be reached by 
| 	multiple routes and may have multiple instances)

One of the problems here is the wooly, Hydra-headed concept of "resource."
But for the case of a particular document,

| The problem arises when a content provider is trying to give a better
| URL for use in hotlists than one of the many possible routes to the
| document. This may be done for maintenance reasons (a file spec is easier
| to maintain than a imagemap coordinate, or, as some on the list noted,
| for deception).

URNs offer a way to communicate "better URLs," and to avoid binding
that information with the document as originally served.  I think
any attempt to get the full effect with URLs will be unsatisfactory,
but it should be possible to get a partial effect, which is all you're
asking for.

| > Terry Allen writes:
| > | Areas of conflict:
| > |  - Should base value *always* be displayed as the address of the document
| > | 	in browsers, hotlists, and other processing applications?
| > 
| > conflict between what?  which document?  do you mean base value or
| > base value+relative URL as arrived at per RFC 1808?
| 
| Conflict in the discussion list. The spec is ambiguous as to how the
| base value should be applied beyond relative URL completion. The
| alternative roles for base in an application is not clear.

RFC 1808 isn't ambiguous on the use of base URLs beyond what it describes,
it just doesn't deal at all with them.  (Or am I missing something?)
Not that that observations helps answer your (legitimate) questions.

| > |  - Should the base be used to identify an alias for the document?
| > | 	If not, how?
| > what is an "alias for the document"?  which document?
| 
| There can be multiple paths to a resource. The owner should not be held
| responsible for maintaining all of them, particularly since many may not
| even be under their control. How can the owner identify a prefered
| (aka maintained) identifier/url for the resource?

This is a matter of defining semantics, and it might be best
to do it at the level of MIME type.  For any given DTD, you should
be able to specify that an element holds this information; for
HTML it could have been one of the values of META or LINK, but
that horse is out of the barn and the barn's burned down.  For
XML, this could be one particular link semantic communicated by
means of an attribute the value of which is (one idea) an FPI
or URN pointing to a well known description of the meaning of the
attribute.  It doesn't have to be done with BASE, and if I look
out into the driving rain long enough I can begin to think of 
circs under which I would want different BASE and PREFERRED ADDRESS
values  (I say preferred addresses because the same issue arises
for URNs.) 

| > 
| > |  - Should the application of the base value to a partial URL always imply
| > | 	a new entity? 
| > 
| > I think the rather involved algorithm in RFC 1808 does not prohibit
| > relative URLs that in fact point to the document containing them, and
| > in fact I don't see how it can.  But perhaps I misunderstand
| > "imply a new entity."
| 
| The RFC does not specify if the object identified by 
| BASE + RelativeURL + #loc (my external network connection is down and 
| I can find the correct term for "#loc") is indeed the same resource 

fragment identifier

| as just using #loc in the current resource. This can result in a 
| different entity if an explicit base value is not the same as the 
| implicit base value (how the resource was reached).

RFC 1808 does define precedence of BASEs derived by different
means, so at least there is no ambiguity here.  But here's where
that wooly monster "resource" bites me hard.  Are you concerned
that a different document will be fetched when the user clicks
on <a href="#loc"> than the one he already has, if BASE is supplied?
If so, why?  the combination of base and fragment ID was created
by the author, perhaps deliberately.  Or are you concerned that 
the same document is identified by the browser as two different
ones?  

NB:  RFC 1866 appears to indicate that in the above case
(<a href="#loc">) one doesn't use BASE:

7.4. Fragment Identifiers

   Any characters following a `#' character in a hypertext address
   constitute a fragment identifier. In particular, an address of the
   form `#fragment' refers to an anchor in the same document.
                       


Regards, Terry

Received on Wednesday, 22 January 1997 14:37:17 UTC