Re: RDF from Roy T. Fielding on 2003-02-03 (www-archive@w3.org from February 2003)

From: Roy T. Fielding <fielding@apache.org>
Date: Mon, 3 Feb 2003 12:04:40 -0800
To: Tim Berners-Lee <timbl@w3.org>
Cc: www-archive@w3.org, Mark Baker <distobj@acm.org>
Message-Id: <BA7614F0-37B2-11D7-AA3D-000393753936@apache.org>
>> No, neither describes the state of the car.  They do fall into the 
>> category
>> of "random things associated with the car", and I suppose someone 
>> could
>> define a different URI for that purpose, but they cannot provide those
>> as representations *and* claim the URI identifies the car.
>
> I see. So when you talk about the resource being a car, then
> representations must be descriptions of the the car.

Yes.

> That makes sense.  I would just say that the tim:Resource
> is a description of the roy:Resource.

Right, which works fine for GET and PUT, but not very well for POST.
If I say that an HTML form submit is converted to a representation
of url-encoded key=value pairs and sent via an HTTP message to the
resource identified by the URI for processing, then it doesn't make
sense if I say resource==description.  It does make sense if I say
that messages contain representations that are descriptions of
resources (regardless of origin) and that URIs identify resources.

>> Why?  If we define an architecture in which the URI always denotes
>> one thing, and GET on that URI through the WWW interface always 
>> results
>> in a representation of that one thing (or 404 not found), then the
>> architecture conforms to the implementations of both the Web and the
>> Semantic Web regardless of how individual publishers choose to define
>> their own resources.
>
> That's exactly what I am describing - with tim:representation.

Yes, except that tim:representation is too restricted to describe
all resources that are identified by http URI.  I cannot use it not
because it isn't self-consistent, but because it isn't descriptive
of the entire system.

> You see how we could agree on an HTTP spec without resolving this?
> By message I meant here "abstract resource", like
>  "most recent media depiction of a BMW 530i", or
>   "some description of my robot".

"conceptual work" is better.  Message is a protocol element for which
only one instance exists.

>> I don't understand why you are insisting on this conceptual work view
>> of http resources.  It doesn't seem to have any value for RDF because
>> RDF cannot assume anything about what it receives as representations.
>> If you disagree, please let me know what an RDF processor can assume
>> from nothing more than observing GET results?
>
> When an RDF processor does an HTTP GET on a URI
> and parses it, then it gets a bunch of data back. Call it a formula.
> Its a set of statements.
>
> Example 1. A simple example, it finds a reference to 
> weather.rdf#boston,
> downloads it, and finds that is says 4 statements:
> {
> 	#boston  a :City.
> 	 #boston has wea:temperature 12.
>   	#nyc  a :City.
> 	#nyc has wea:tempterature 14 .
> }
> If it wants to tell anyone where it found the data, it refers to
> <http:/.../weather.rdf>. No problem.  It might write the statement:
>
>     #boston   :weatheravailableFrom <http://...weather.rdf>.
>
> This means (say) that weather.rdf is some resource which you can
> expect to give you the weather for boston.

Yes.

> Now the data might have actually been transferred in N3 or RDF/XML.
> When I talking about the source of the data, the normal thing
> is not to quote the representation, but the resource.

That is correct -- the URI identifies the resource.  You would only
be quoting the representation if you wanted to talk about that specific
result (the bits above).

> There is no point in content negotiation if one has to actually
> refer to a representation.

Well, client-side content negotiation works better, but that's a
separate issue.

> Example 2.  Now suppose the resource acutally uses its name
> as the name of a city. in <http://...weather/boston.rdf>,
> there is {
>
> 	<boston.rdf> a :City.
> 	<boston.rdf> has :temperature 12.
> }
>
> This is what RDF people will do if we don't tell them not to.
> The RDF concepts document tells them not to, but they quote
> you Roy as saying no problem, an HTTP URI can identify a city.

An http URI can identify a city if it does so consistently.  Claiming
that the identifier is a conceptual work doesn't make it any less
susceptible to poorly expressed assertions.  For example, the above
leaves out such issues as measurement method, units, time, and location.

> Now the web agent does a HEAD request, and finds out
>
>    <http://...weather/ boston.rdf>   Last-Modifed  "2003-01-01".
> (Actually, the HTTP spec has in 7.1 the line
> "Entity-header fields define metainformation about the entity-body or, 
> of no body is present, the resource itself" which doesn't help us a  
> lot.)

I guess not.  There are hundreds of errors like that in 2616
that were not fixed because of the editorial process, not because
anyone would disagree they are errors.  HEAD is defined to be the same
as GET but without the body, so it also follows that the semantics of
the header field would still refer to the entity.

In any case, it does clearly define

    14.29 Last-Modified

      The Last-Modified entity-header field indicates the date and time 
at
      which the origin server believes the variant was last modified.

and

    variant
       A resource may have one, or more than one, representation(s)
       associated with it at any given instant. Each of these
       representations is termed a `varriant'.  Use of the term `variant'
       does not necessarily imply that the resource is subject to content
       negotiation.

which can be easily demonstrated by looking at the HTTP behavior of
Last Modified for negotiated resources and seeing that the server
will send the last modification date for the representation chosen,
rather than the latest time found after looking at all representations.

> The same agent does a fetch against an annotation server and finds out
>
>  <http://...weather/ boston.rdf>  wea:certified  
> org:WeatherUnderground.
>
> It checks a Dublin Core repository and finds
>
> <http://...weather/boston.com>  dc:creator   "Massachusetts Port 
> Authority".
>
> All these systems are giving information about the
> resource as resource about Boston but not as the city of Boston.

Actually, no, they are not.  dc:creator is a targeted assertion -- the
notion of conceptual work and representations are specifically defined
as part of its semantics.  I was present in Dublin when it was defined.
Those were librarians, and library catalog-style metadata, which always
define the target of metadata as representations because that is what
they are cataloguing -- physical books on shelves.  They don't have a
card per conceptual work -- they have a card per publisher edition
which is both date and format dependent.  Look for your book on
<http://www.dbs.cdlib.org/> to see what I mean.

I'm not sure what wea:certified means, but my guess is they have
certified the authenticity of past representations from that resource
since there is no such thing as certifying future results.

> That's the way URIs *are* used.
> (Their use to identify cities is only starting now with RDF.
> So we have to stop RDF people using it differently
> to the rest of the web)
> What, in your model, do you say
> someone is refering to when they say:
>  <http://...weather/boston.com>  dc:creator   "Massachusetts Port 
> Authority"?
> It isn't the specific representation in bits.
> It isn't the city.

In my model, invalid assertions are invalid -- they do not effect the
validity of the identifier in any way.  I would simply say that any
identifier for which dc:creator is a valid assertion must be identifying
a specific edition of a conceptual work.

> But now, if we believe that the <boston.rdf> is a city, and combine 
> this
> with some more public information about that city,
> we find formally that boston was created by Massport, founded in 1722, 
> has a length
> was last modified on 1st Jan 2003, has a population of 678987 and
> is a source of weather information  as certified by the weather 
> underground.
>
> That doesn't work.

Of course not.  Why do you think that dublin core has an xmlns?  To
disambiguate dc:created from English:created.  Surely RDF doesn't allow
names to be merged without respecting the namespace.  An English
translation would have to take into account the semantics of the
assertions as they were described, which in this case is more like:

    Massport has created a description of Boston that was last modified
    on 1st Jan 2003, claiming that Boston is a city with a temperature
    of 12.  Boston is also believed to have a population of 678987 and
    was founded in 1722.  Boston temperature descriptions by Massport
    have been certified by the weather underground.

> The individual systems all work - you probably noticed
> the OWL folks don't see the problem with it being a city
> just as you haven't come across the contradiction from actually using 
> the
> URI for two things at the same time.
>
> That is the main point.  I've responded to yours below, but the main 
> point is above.

Okay, let's leave it at that.  I still see no evidence of any problem
that was caused due to the http identifier referring to the city, but
I do continue to see problems with how RDF uses both named semantics
and URIs to target assertions.  I think RDF could be more
self-descriptive if it were differentiated syntactically, whereas
you claim they are differentiated by requiring all http identifiers
to identify a conceptual work.

If you replace the "http" URI by a "city" URI, and provide
a name resolution service for returning information about cities
by accepting a GET city:... HTTP/1.1 and responding with information
in the form of a web page, then why is "city" less ambiguous in
regards to assertions than is "http"?

....Roy
Received on Monday, 3 February 2003 15:08:32 UTC