Re: URIs vs. URIviews (was: Agenda for RDFCore WG Telecon 2002-02-15) from Dan Brickley on 2002-02-19 (w3c-rdfcore-wg@w3.org from February 2002)

From: Dan Brickley <danbri@w3.org>
Date: Tue, 19 Feb 2002 01:52:56 -0500 (EST)
To: Pat Hayes <phayes@ai.uwf.edu>
cc: Aaron Swartz <me@aaronsw.com>, <w3c-rdfcore-wg@w3.org>
Message-ID: <Pine.LNX.4.30.0202190014170.21004-100000@tux.w3.org>
On Mon, 18 Feb 2002, Pat Hayes wrote:
>
> >On 2002-02-18 3:20 AM, "Brian McBride" <bwm@hplb.hpl.hp.com> wrote:
> >
> >>>  Of course! I don't think anyone disagrees with that. The issue at hand is
> >>>  whether we can define it as naming an abstract resource. See my message
> >>>  "URIs vs. URIviews (core issue)".
> >>  Hmmm, that seems like a question about the nature of resources and what
> >>  names them.  These are questions we have kicked to the tag.
> >
> >No, the issue is what URI-references name. I think it's pretty unambiguously
> >clear
>
> I very much doubt it.  Everything I've read about what URI-references
> name or mean has been almost impenetrably murky and so ambiguous as
> to be meaningless, if taken literally.
>
> >  and I've seen Roy Fielding, Al Gilman, and many others say the same
> >thing. I guess we could take it to the TAG if we wanted to be absolutely
> >sure, but I'm not sure how they can say anything different than what the
> >spec says.
>
> RFC2396 seems to be pretty clear that frags, while not technically
> part of the URI, are expected to be used with URIs (why else exclude
> '#' from the URI BNF ?.)  I really do not understand what the problem
> is here, from reading RFC2396. It says quite clearly that urirefs CAN
> contain fragIds.
> I would also observe that all the web browsers I use seem to be able
> to handle fragIds without any problems.
>
> >
> >>  Have I understood you correctly?  You are arguing, not that we should
> >>  answer this question, but that we should discourage folks from using uri's
> >>  with frag id's until this has been cleaned up?
> >
> >I think the question is answered (feel free to look at the text in RFC2396
> >and decide for yourself).
>
> That text certainly does not say that such use is discouraged or
> deprecated. To me it gives a very strong impression in the other
> direction, eg section 4.3
> "   A URI reference is typically parsed according to the four main
>     components and fragment identifier in order to determine what
>     components are present and whether the reference is relative or
>     absolute. "
> which seems to assume that parsing a URI reference should take into
> account any fragIds, rather than ignore them.
>
> >  But yes, I think we should discourage their use as
> >a way to stop things from getting worse.
>
> I still fail to follow exactly in what way the situation is bad here.

Couple of things:

First (apologies if this URL already circulated here), Sean Palmer
collected up and summarised a bunch of prior discussion on this issue.
http://lists.w3.org/Archives/Public/www-archive/2001Nov/0083.html


Second, my view in a medium-sized nutshell:

There *is* a problem with using URI-references as if they were just a
particularly internet-friendly way of writing down referring expressions
as short textual strings. The problem is that RFC2396 includes various
claims that appear to provide rules constraining the meaning of
those URI-references that include a frag-ID portion so that the
meaning/reference of a URI-ref-with-frag-ID becomes context-sensitive.

(I'm not citing details from RFC2396 here; I've dug out references before
and could do again if anyone is interested, outraged etc by any of this.)

One way of couching the problem as I see it is to describe how RFC2396
makes URI-refs with frag-IDs importantly different from those that don't
include the #blahblah suffix.

For URI(-references) *without* a frag-ID, it looks like we can get away
with assuming one or both of the following:

(i) at any particular point in time, each URI-reference denotes at most
    one thing-in-the-world (aka 'resource').

(ii) (stronger claim) across time, each URI reference denotes at most
    one thing-in-the-world, ever.

(i) (and perhaps (ii)) make it possible to reason about resources named
and described in the Web environment. If (i) and (ii) are threatened, some
very useful simplifying assumptions become endangered. Users of Web
languages which depend upon RFC2396 URI-references are less likely to
miscommunicate if (i) (and ideally (ii)) are universally accepted.
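
A minimal sketch of assumption (i), in Python: a table of denotations that
refuses to bind one URI-reference to two different things. The class name
`Denotations` and its methods are invented for illustration; nothing like
them appears in RFC2396 or the RDF specs.

```python
class Denotations:
    """Toy table enforcing assumption (i): at most one referent per URI-reference."""

    def __init__(self):
        self._referent = {}

    def bind(self, uri_ref, thing):
        # Re-binding to the same thing is harmless; a different thing breaks (i).
        if uri_ref in self._referent and self._referent[uri_ref] != thing:
            raise ValueError(
                f"{uri_ref} already denotes {self._referent[uri_ref]!r}"
            )
        self._referent[uri_ref] = thing

    def denotes(self, uri_ref):
        # None means the URI-reference denotes nothing we know of ("at most one").
        return self._referent.get(uri_ref)
```

Assumption (ii) would amount to never resetting or replacing such a table
over time; the sketch does not try to model time at all.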

The problem I have with RFC2396 and fragIDs is that the spec (on my
reading; I believe DanC disagrees) seems to say that when we
consider URI-references that include frag-IDs, the *meaning* of these as
referring expressions is context sensitive.

Typical scenario is that the URI-ref minus the frag-ID names some resource
(eg. an Image, abstractly conceived), and that resource has multiple
(mime-typeable) renderings into bytes (eg. as image/jpeg, image/png,
image/svg). RFC2396 makes it hard to consider
'http://example.com/image#xyzabc' as a context-neutral referring
expression, since it stresses the importance of 'http://example.com/image'
having different renderings, each of which provides (possibly competing)
accounts of what the frag-ID #xyzabc means.  For a PNG image, it might be
a metadata field, for SVG, it might be taken as an XPointer, for a JPEG,
it might be the colour of the xyzabc'th pixel (counting in base-36 or
something).
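
To make the context-sensitivity concrete, here is a rough Python sketch.
`urldefrag` is the standard-library call for splitting off the frag-ID; the
three `*_reading` functions and the `FRAGMENT_RULES` table are hypothetical,
made up from the examples in the paragraph above, and do not reflect what any
real PNG, SVG or JPEG specification says about fragments.

```python
from urllib.parse import urldefrag


# Hypothetical per-format readings of a frag-ID (illustration only).
def png_reading(frag):
    return f"PNG metadata field named '{frag}'"


def svg_reading(frag):
    return f"XPointer '{frag}' into the SVG document"


def jpeg_reading(frag):
    # Treat the frag-ID as a base-36 pixel index; raises ValueError otherwise.
    return f"colour of pixel number {int(frag, 36)}"


FRAGMENT_RULES = {
    "image/png": png_reading,
    "image/svg": svg_reading,
    "image/jpeg": jpeg_reading,
}


def interpret(uri_ref, mime_type):
    """Return (URI without frag-ID, format-specific reading of the frag-ID)."""
    uri, frag = urldefrag(uri_ref)
    return uri, FRAGMENT_RULES[mime_type](frag)
```

Note that `interpret` cannot be called without a `mime_type` argument: the
same string 'http://example.com/image#xyzabc' yields three different
readings depending on which rendering is assumed, which is the problem in
miniature.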

RFC2396's account of fragment IDs has a load of baggage that comes from
the (pretty sane) HTTP design (ie. concern for representation and transfer
of format-negotiable resource renderings). It relativises the meaning of
#blahblah fragment-IDs to different categories of the resource's
bytestream renderings, taking us away from an environment in which all
URI-references can plausibly be treated as independently meaningful. From
the RFC2396 perspective, there is no neutral account of what
http://example.com/image#xyzabc refers to (...ie.  *means*), since
meaning/reference of frag-IDs is specified by one (of possibly various)
format-specific rules. Considered as a URI-ref for a PNG image, it might
mean one thing, as a URI-ref for an SVG image, it might mean another.

I'm not sure what, if anything, to do about this. It might be a storm in a teacup.

One idea I'm toying with:

(aside: you can skip the rest of this email. possibly crackpot design
sketch follows)

we might consider http://example.com/image#abcxyz as a Web identifier
(URI-ref) naming / referring to a specific view (very loosely conceived) of
an image. This 'view' wouldn't be mime-type specific.

we might consider the pair 'http://example.com/image#abcxyz', 'image/svg'
or the pair 'http://example.com/image#abcxyz', 'image/png' as picking out
rather more precise views of the image, with the exact details controlled
by the mime-type-specific rules for interpreting frag-IDs within SVGs and
PNGs.

So a list of all the resources (er, things) mentioned so far, along with a
scribble of how we might model their interrelations. (As with datatyping,
there are several ways this situation could be projected into classes and
properties.)

1. http://example.com/image
   some resource, in this case, an image.

2. http://example.com/image#abcxyz
   another resource, in this case the resource that is the woolly-generic
   view of http://example.com/image whose rfc2396:fragID property has the
   value "#abcxyz".
   Informally 'the #abcxyz view of/into our image'


3. [no-URI-given-or-known]
   *another* resource, in this case the resource that is the
   view of http://example.com/image whose
   rfc2396:fragID_contextualised_to_image/png property has the value
   "#abcxyz".
   Informally 'the #abcxyz view of/into our image' considered as a PNG.


4. [no-URI-given-or-known]
   *another* *another* resource, in this case the resource that is the
   view of http://example.com/image whose
   rfc2396:fragID_contextualised_to_image/jpeg property has the value
   "#abcxyz".
   Informally 'the #abcxyz view of/into our image' considered as a JPEG.

5. [no-URI-given-or-known]
   *another* *another* resource, in this case the resource that is the
   view of http://example.com/image whose
   rfc2396:fragID_contextualised_to_image/svg property has the value
   "#abcxyz".
   Informally 'the #abcxyz view of/into our image' considered as an SVG.


...etc. for each mimetype / fragid pair
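
The list above could be sketched as data along these lines (Python). This is
only one of the several possible projections into classes and properties; the
`View` class and `enumerate_views` function are invented names, and the
property strings simply echo the made-up `rfc2396:` properties used in the
list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class View:
    of: str                            # URI of the resource being viewed
    frag_id: str                       # e.g. "#abcxyz"
    media_type: Optional[str] = None   # None = the format-neutral view

    def property_name(self):
        # Name of the (hypothetical) property carrying the frag-ID value.
        if self.media_type is None:
            return "rfc2396:fragID"
        return f"rfc2396:fragID_contextualised_to_{self.media_type}"


def enumerate_views(uri, frag_id, media_types):
    """Return the format-neutral view plus one specific view per media type."""
    generic = View(uri, frag_id)
    specific = [View(uri, frag_id, mt) for mt in media_types]
    return generic, specific
```

Calling `enumerate_views("http://example.com/image", "#abcxyz",
["image/png", "image/jpeg", "image/svg"])` gives four `View` values that all
compare unequal, i.e. four distinct things competing for one URI-reference.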

Notes:
We are assuming, for a simple test case, an image resource that (always) has
just three unchanging content-negotiable renderings available: PNG, JPEG and
SVG.  We distinguish between the format-neutral frag-ID named views
into/of the resource and their one-per-format more specific neighbours. We
might use a bunch of new RDF properties (one per format) to relate the
former to the latter, though I don't want to get bogged down in that
detail.

I'm not sure this sketch helps, being pretty half-baked, except
perhaps to point at some material for a more careful test case.

But I'm confident of the following:

There are several distinct things-in-the-world associated with (but *not*
named by) the URI-reference 'http://example.com/image#xyzabc'.  Loosely, a
particular PNG-view into the image, a particular JPEG-view, and a
particular SVG-view. Those three things-in-the-world are not *named* by
the URI-reference 'http://example.com/image#xyzabc', since they are
distinct and different individuals. They are not named by the
URI-reference 'http://example.com/image' either; that is in our scenario a
name for the resource (regardless of any specific format) that these are
views of. Those three things are presumably related to each other, and to
the image itself, and to the string '#xyzabc' in ways that we haven't got
around to creating RDF properties for.

Coming to a decontextualised reading of RFC2396 fragIDs seems to me to
require us to have the 4 resources mentioned above be accompanied by a
rather nebulous 5th, which gets to be the (rather useless, featureless)
thing always named by the URI-ref-with-fragID.

Quite enough from me on this topic. Next time I'll just send test cases.

Dan


-- 
mailto:danbri@w3.org
http://www.w3.org/People/DanBri/
Received on Tuesday, 19 February 2002 01:52:57 UTC