Re: Naming from Dan Brickley on 2000-05-08 (www-rdf-interest@w3.org from May 2000)

From: Dan Brickley <danbri@w3.org>
Date: Mon, 8 May 2000 13:59:40 -0400 (EDT)
To: Dan Connolly <connolly@w3.org>
cc: www-rdf-interest@w3.org
Message-ID: <Pine.LNX.4.20.0005081043380.27405-100000@tux.w3.org>
On Mon, 8 May 2000, Dan Connolly wrote:

> Dan Brickley wrote:
> > On Mon, 8 May 2000, Godfrey Rust wrote:
> > 
> > > At 09:35 AM 5/8/00 +0100, McBride, Brian wrote:
> > > >What resource does the URI http://www.w3.org/2000/01/rdf-schema# name?
> > > >
> > > >Is it the namespace, the RDF Model, the XML serialization or some abstract
> > > >composition of all of them?
> 
> It identifies
> 	the empty-fragment ("") view of
> 	the resource accessible by (among other mechanisms, such
> 		as proxies, caches, and phoning your neighbor to
> 		ask him to ask him what's available there just now)
> 		HTTP to a host called www.w3.org
> 		using the path /2000/02/rdf-schema
> 
> 
> The term "resource" is pretty much opaque, like the term "point" in
> geometry. Consider:
> 
> 	What point does (10, 15) identify?
> 
> except that URI-space is very much non-euclidean ;-)
> 
> From a less formal perspective, you can consider URIs as noun phrases;
> they identify a "person, place, or thing".
> 
> > > Or the particular manifestation or format of one of them as it appears at
> > > this location?
> > 
> > Great questions. Here's a meta-question:
> > 
> > Is this a question about the RDF schema namespace URI, about XML namespace
> > URIs generally, or about Web architecture and URI naming in the general
> > case? Unfortunately I don't think there's a clear answer, but I'm inclined
> > towards the latter.
> > 
> > Here's an analogy:
> > 
> >         http://www.w3.org/Icons/WWW/w3c_main
> > 
> > What resource is this? When I dereferenced it just now I got a resource
> > of type image/png.
> 
> Careful; you got an *entity body* of type image/png, not a resource.

In the RDF world, and on my reading of Web architecture, we can think of
the entity body either as a literal chunk of data, or as a resource for
which we lack (knowledge of) a URI name. I'm inclined to think of it as a
resource, anonymous in the sense of being un-named in this context, but
possessing properties such as sizeInBytes, checksum etc.

So on this view: there are some resources that can provide representations
of themselves as other resources. The resource
http://www.w3.org/Icons/WWW/w3c_main and the resource it emits
[some-unnamed-entity-body] are related, and shouldn't be conflated. If we
want to exchange metadata about the latter, we either get used to talking
about it as a resource, or consider it a "literal", in the RDF sense.
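
(A rough sketch of this view, in Python. The EntityBody class, the
"_:body1" label and the mimeType property name are made up for
illustration, and SHA-256 is just one assumed choice of checksum; only
sizeInBytes and checksum come from the text above.)

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityBody:
    """The un-named thing received in one HTTP transaction."""
    content: bytes
    mime_type: str

    @property
    def size_in_bytes(self) -> int:
        return len(self.content)

    @property
    def checksum(self) -> str:
        # Assumed checksum algorithm; any stable digest would do.
        return hashlib.sha256(self.content).hexdigest()


def describe(body, node_id="_:body1"):
    """Return (subject, property, value) triples about an anonymous node."""
    return [
        (node_id, "mimeType", body.mime_type),
        (node_id, "sizeInBytes", body.size_in_bytes),
        (node_id, "checksum", body.checksum),
    ]


# Stand-in bytes (just the PNG signature), not the real logo.
png = EntityBody(b"\x89PNG\r\n\x1a\n", "image/png")
triples = describe(png)
```

The subject of these statements is the thing that came back over the
wire, not http://www.w3.org/Icons/WWW/w3c_main itself, which is the
distinction I'm trying to keep hold of.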

> The resource identified by http://www.w3.org/Icons/WWW/w3c_main
> has state that's subject to change by the W3C webmaster, and
> you didn't get all that; you cannot, for example, locally simulate
> its behaviour indefinitely.

Appreciated. That was my point really, that the resource we're discussing
here 'w3c_main' is not something we can know directly, only exchange
messages with over HTTP. And that the things we get back in such
transactions are themselves the sorts of thing/entity/resource that we
want to use metadata to describe.

> 
> What you get back from a GET request to a resource is not, in general, a
> resource identified by any URI that you can discover; you get back some
> content;
> an 'entity body' in the HTTP spec terminology.
> 	http://www.w3.org/Protocols/rfc2616/rfc2616-sec1.html#sec1.3
> 
> An entity body is directly observable:
> you can compare it, locally, for identity with another entity
> body. In general, a resource is not directly observable;
> you can only observe it by bouncing messages off it and
> seeing how it responds.

Yep. 

> 
> In specific cases, a publisher can say "this resource *is*
> a particular piece of content; it has no other state than
> what's in this entity body: ...". For example, the W3C
> says this about
> 	http://www.w3.org/TR/1999/REC-xml-names-19990114/Overview.html
> 
> Once you have successfully done a GET on that resource,
> you can completely simulate its behaviour locally.

(It would be good to have a mechanical representation of this assertion,
eg. for caching, disconnected use etc...)

> 
> 
> Hmm... I think we also guarantee that
> 	http://www.w3.org/2000/01/rdf-schema
> doesn't change over time, though we don't guarantee that it
> doesn't vary by MIME type.

It's not clear where we guarantee that. RDFS says that the model
corresponding to an RDF Schema shouldn't be changed, but is silent on
whether alternate syntactic representations might, for example, become available
via content negotiation. This can of course be over-ridden by W3C
Publication Rules for specs. But I'm inclined to agree: "it" shouldn't
change.


> 
> [...]
> 
> You make the same point below:
> 
> > So one lesson here is that the resource http://www.w3.org/Icons/WWW/w3c_main
> > is itself not the thing that is transferred across the wire in an HTTP
> > session.
> 
> but in somewhat misleading terminology, I think.
> 
> You can speak of the content returned from an HTTP GET as a resource,
> since you can refer to it with a noun phrase, and you can issue a URI
> to correspond to any natural language noun phrase. (There are even
> interesting design issues around naming requests, responses,
> and their content; for example, adding
> a Message-ID header to POST requests and responses, or using
> a Content-ID: header on a GET reply)

> But to do so just confuses the issue, in most cases.

I disagree. The issue is confused already - the conceptual model of a
URI-identified Web-GETtable resource and its various accessible
representations over time hasn't yet been accessibly documented in a way
that enough people understand. My use of 'resource' in this context will
doubtless be puzzling, but the alternative seems to be inventing yet
another synonym for 'thingy'...

RDF uses 'resource' in a very general way, and makes good use of this
generality by showing how we can use this general notion of a resource to
describe properties, relationships, attributes etc of Web content. When
we pick apart some of these scenarios (eg. image, in 3 formats, changed
over time etc etc) we find lots of different
things/entities/resources/objects that are easily
confused with one another. But just as RDF models the creator of the image
as a 'resource' (aka thing) with properties and relationships, RDF models
the different public representations of an image as 'resources'. While
these may (like people...) lack public URIs, the situation is not very
different from the general problem of 'anonymous resource mentions' we've
discussed here recently.

"the person who created the image/png representation of the W3C logo"
"the image/png representation of the W3C logo"
"the literal content/state of that logo"

At least the first two seem to me to be equally deserving of the label
'resource'.

> > It's the Web's name for _something_. What you get when you ask
> > the Web about that thing, that resource, might well vary according to
> > the kind of message you send it, the time of day, or other properties of
> > that resource. HTTP transports representations not the resources
> > themselves
> 
> 'representations' is clear in most cases, but not all: what
> you get back from a GET to http://www.altavista.com/ is
> hardly a representation of all of altavista. You could say
> it's a representation of the particular service interface
> or something, but being clear when you do so is quite a challenge.

I'm not sure where you got "all of altavista" from, nor why you're
assuming that the representation of something can't be a concise summary,
eg. front pages serving as table of contents. http://www.altavista.com/ is
just some thing known to the Web; you're alluding to it being the same
thing as Altavista Inc's huge web index.

" but being clear when you do so is quite a challenge."

Quite. This is one of my concerns with RDF. However squeaky clean the
model, unless the Web has a clear conceptual model (here's how to identify a Web
_Page_, a Web _Service_, a Web _Site_ etc) the richness of the formal model
is wasted.


> Using 'entity body' per the HTTP spec is fairly straightforward
> and usually clear. [hmm... checking my sources, I see
> that 'representation' is HTTP standard terminology too.]
> 
> > (although those representations might be considered
> > (anonymous?) resources too and have properties/attributes etc).
> 
> Yes, but as I say, that discussion is usually pretty confusing.

I can't see any other way to go without bloating the RDF model. If we
don't model these as 'resources' because we think the word is confusing,
what are we supposed to do instead?

If you accept that people will want to use metadata to talk about these
kinds of scenarios (images in different formats, versions etc -- pretty
mainstream problem in eg Digital Library community) we'll need some
way of using RDF without having a "pretty confusing" discussion. 


We've had this discussion before, eg. representing people. You've
previously suggested that [dc:creator.PersonalEmail] and
[dc:creator.PersonalPhoneNumber] properties might be used to relate a
document to details of its creator, without requiring an RDF resource
representing that person (aka an 'anonymous node'). This strategy
doesn't work when multiple creators are involved, since the extra node in
the graph is needed to bind together their properties.

This leads to a strategy where you model the creator of a document as an
un-named resource, which serves to hook together properties such as
email/phone/age. As you've often pointed out, some of these properties
(eg. personalMailbox) can be uniquely identifying, a workaround for the
lack of URIs for people. 
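
(To make the multiple-creator point concrete, here is a small sketch in
Python using plain tuples as triples. The document URI, the people, the
"phone" property and the "_:a"/"_:b" node labels are all invented for
the example; it isn't meant as anyone's actual RDF API.)

```python
DOC = "http://example.org/report"  # hypothetical document URI

# Flattened properties hung directly off the document: with two creators
# nothing says which email belongs with which phone number.
flat = [
    (DOC, "dc:creator.PersonalEmail", "mailto:alice@example.org"),
    (DOC, "dc:creator.PersonalEmail", "mailto:bob@example.org"),
    (DOC, "dc:creator.PersonalPhoneNumber", "555-0100"),
    (DOC, "dc:creator.PersonalPhoneNumber", "555-0199"),
]

# One un-named node per creator binds that person's properties together.
graph = [
    (DOC, "dc:creator", "_:a"),
    ("_:a", "personalMailbox", "mailto:alice@example.org"),
    ("_:a", "phone", "555-0100"),
    (DOC, "dc:creator", "_:b"),
    ("_:b", "personalMailbox", "mailto:bob@example.org"),
    ("_:b", "phone", "555-0199"),
]


def creators(triples, document):
    """Collect the properties of each creator node of a document."""
    nodes = [o for s, p, o in triples if s == document and p == "dc:creator"]
    return [{p: o for s, p, o in triples if s == node} for node in nodes]


def find_by_mailbox(triples, mailbox):
    """Use a uniquely identifying property in place of a URI for the person."""
    for s, p, o in triples:
        if p == "personalMailbox" and o == mailbox:
            return s
    return None


people = creators(graph, DOC)
bob_node = find_by_mailbox(graph, "mailto:bob@example.org")
```

In the flat list the pairing of mailbox and phone number is simply lost;
in the graph each creator comes back as its own bundle of properties.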

I believe the same thing holds for entity bodies. We want metadata about
them (for all sorts of purposes, caching, searching, rights etc). We don't
usually have URIs for them. We don't (like people) usually think of them
as "Web" resources. But as far as RDF is concerned (and, I thought, Web
architecture generally) they are resources.
 
> > There are at least two other resources involved here: one is a thing that
> > has a mime type of image/gif and size-in-bytes of 5684, the other has a
> > mime type of image/png and a size-in-bytes of 5904.
> > It happens that these other two resources also have Web URIs, ie.
> > http://www.w3.org/Icons/WWW/w3c_main.gif
> > http://www.w3.org/Icons/WWW/w3c_main.png
> 
> Nope... we don't use http://www.w3.org/Icons/WWW/w3c_main.gif
> to name a particular stateless piece of content. We use it
> to name the image/gif variant of http://www.w3.org/Icons/WWW/w3c_main ,
> whatever that may be at the time.

OK, fine, yep.

> 
> > So, back to the XML/RDF namespace URI thing.
> > 
> > From one perspective (bare XML namespaces with no additional conventions
> > layered on top), "http://www.w3.org/2000/01/rdf-schema#" is simply a
> > string that can be used to compose URI references such as
> > http://www.w3.org/2000/01/rdf-schema#Class
> > 
> > Per the URI RFC, the URI reference
> > http://www.w3.org/2000/01/rdf-schema#Class is composed
> > of a URI proper, http://www.w3.org/2000/01/rdf-schema and a fragment or
> > view identifier, "Class". The interpretation of the latter is relative to
> > the mime type of the former.
> 
> careful... 'the MIME type of a URI' is, in general, an ill-formed
> definite description: what's *the* MIME type
> of http://www.w3.org/Icons/WWW/w3c_main ?
> 
> cf http://www.w3.org/Architecture/Terms#definite-description
> 
> rephrased carefully: the interpretation of a fragment identifier w.r.t.
> an entity body received in response to a GET request to a resource is
> relative to the MIME type of the entity body.

Thanks for the extra precision: this is exactly the problem I'm concerned
with. The mime type may differ from day to day and client to client, which
is a problem from an RDF perspective as we treat URI References as if they
made sense when lifted out of such contexts.

<rdf:Description about="http://www.w3.org/Icons/WWW/w3c_main#foo">
<s:size>1000</s:size>
</rdf:Description>

...is my concern. What does it mean to make such assertions against a URI
reference that includes the #foo, in the absence of some context such as a
particular HTTP transaction? The 'abstract' resource doesn't have a mime
type, but is associated with both image/gif and image/png representations.
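
(A sketch of the difficulty, in Python. The two sizes are the ones
quoted earlier in this message; the VARIANTS table and the function
names are assumptions standing in for whatever a server actually
offers. urldefrag is the standard library call for splitting off the
fragment.)

```python
from urllib.parse import urldefrag

# Sizes as quoted in this thread; the table itself is illustrative only.
VARIANTS = {
    "http://www.w3.org/Icons/WWW/w3c_main": {
        "image/gif": 5684,
        "image/png": 5904,
    },
}


def mime_types_of(uri_reference):
    """All MIME types associated with the URI part of a reference."""
    uri, _fragment = urldefrag(uri_reference)
    return sorted(VARIANTS.get(uri, {}))


def size_of(uri_reference, mime_type=None):
    """Size in bytes, or None when the question has no single answer."""
    uri, _fragment = urldefrag(uri_reference)
    sizes = VARIANTS.get(uri, {})
    if mime_type is not None:
        return sizes.get(mime_type)
    # Without a transaction context we can only answer if there is
    # exactly one variant to talk about.
    return next(iter(sizes.values())) if len(sizes) == 1 else None


ref = "http://www.w3.org/Icons/WWW/w3c_main#foo"
uri, fragment = urldefrag(ref)
```

Asking for size_of(ref) with no MIME type gives no answer, which is
roughly the position a bare s:size assertion leaves us in.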

> > As I've shown above, in the general case a
> > resource may use the Web to expose multiple mime-typed representations of
> > a resource. Access control, personalisation etc also affect the
> > representations that are available in different contexts.

[...]
> > So back to my original question:
> > 
> > Do folks here think the issues around URIs and Resources, and around
> > identification of fragment identified views of representations of those
> > resources, deserve a general treatment, or should we try to figure out a
> > perspective for those cases where the resource is (in some sense) an RDF
> > model?
> 
> I hope you're not trying to reach consensus on that question. That's

No, I don't believe I understand the issue overlaps well enough to
propose a question worth seeking a "consensus" answer on. 

I was testing the water hoping for an informal response to a
hard-to-formalise question.

> a question about how to organize the most effective book on the subject
> or something.
> 
> I recommend you pursue questions that have black-and-white answers,
> testable by software.

Fair point. My attempt below was a bit more concrete but needs tightening
up.

> > In other words, do we expect there to be anything special about the
> > answers we give Brian and Godfrey for
> > 
> > http://www.w3.org/Icons/WWW/w3c_main#foobar
> > 
> > versus for
> > 
> > http://www.w3.org/2000/01/rdf-schema#foobar
> 
> Asked exactly that way, yes, I expect a different answer, because
> W3C has said that RDF Schemas don't change; i.e. if you use a URI
> in the syntactic role of RDF schema, you promise not to change
> its content over time. So W3C has promised that
> http://www.w3.org/2000/01/rdf-schema doesn't vary over time.

Asked slightly differently: how about if the latter was just some RDF
instance data, and didn't have the additional rules imposed through being
a representation of an RDF Schema.

Or: are URI references with fragment identifiers ever meaningful when
divorced from the context of some HTTP transaction? (and hence mimetype)
If so, can we specify to software the cases in which such fragment IDs
"make sense"? 

> 
> > My inclination is to say that the Web needs a fix for both cases,
> 
> A fix? I don't see any problem that you have identified.
> 
> > and that
> > attempting an RDF-specific clarification here would be unhelpful. Consider
> > this scenario:
> > 
> > http://www.w3.org/Icons/WWW/w3c_main
> > represents the W3C logo. Currently available in image/png and
> > image/gif. How do we deal with
> > "http://www.w3.org/Icons/WWW/w3c_main#foobar" if for example an SVG
> > representation (XML vector image format) were to also be made available by
> > HTTP content negotiation? This (RDF-free) problem seems to me to be very
> > close to the issues we're coming up against in an RDF context.
> 
> Problem? I don't see any problem in this scenario.

The problem:

I don't mind admitting that I find the problem hard to articulate. Here
are some of its component parts...

1. we want to have metadata about the mime-type specific
  representations of (for example) some image resource.

2. This metadata may need to make use of mime-type specific semantics for
  the # fragment/view component of a URI reference. eg. an image that was
  available in (say) SVG as one content-negotiable representation. If SVG
  docs have fragment semantics (eg. allowing us to pick out some
  sub-component of the image encoded in an entity body) we'll likely want to
  exploit this in metadata descriptions.

3. Since there may be multiple mime-typed representations of any HTTP
  GETtable resource, there can be no context-free interpretation of
  URI references to these resources that use the '#foo' fragments.
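
(The three points above can be put as a toy simulation in Python. The
FRAGMENT_RULES table is pure assumption: it supposes that an SVG
fragment names an element by id and that the bitmap formats define
nothing. The negotiate function is a crude stand-in for HTTP content
negotiation, not a description of any real server.)

```python
from urllib.parse import urldefrag

# Hypothetical per-MIME-type fragment semantics (assumed, not normative).
FRAGMENT_RULES = {
    "image/svg+xml": lambda frag: f"the element with id '{frag}'",
    "image/png": lambda frag: None,
    "image/gif": lambda frag: None,
}


def negotiate(available, accept):
    """Pick the first acceptable MIME type, in the client's preference order."""
    for mime_type in accept:
        if mime_type in available:
            return mime_type
    return None


def interpret(uri_reference, available, accept):
    """Return (mime_type, meaning of the fragment) for one transaction."""
    _uri, fragment = urldefrag(uri_reference)
    mime_type = negotiate(available, accept)
    if mime_type is None:
        return None, None
    rule = FRAGMENT_RULES.get(mime_type, lambda frag: None)
    return mime_type, rule(fragment)


ref = "http://www.w3.org/Icons/WWW/w3c_main#foobar"
available = ["image/png", "image/gif", "image/svg+xml"]

svg_client = interpret(ref, available, ["image/svg+xml", "image/png"])
png_client = interpret(ref, available, ["image/png"])
```

The same URI reference means something for the first client and nothing
for the second, and a metadata record that merely quotes the reference
records neither context.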

Dan


--
mailto:danbri@w3.org
Received on Monday, 8 May 2000 13:59:41 UTC