Re: RDF's Mixed-Mode Identifiers

From: Charles McCathieNevile <charles@w3.org>
Date: Thu, 26 Dec 2002 16:21:21 -0500 (EST)
To: Sandro Hawke <sandro@w3.org>
cc: <www-rdf-interest@w3.org>
Message-ID: <Pine.LNX.4.30.0212261601370.17763-100000@tux.w3.org>

Sandro, thanks for the explanation.

I wonder if the answer is simpler. If EARL is talking about whether a web
page conforms to some content requirement then what you get at that web page
for a fragment is the thing under discussion.

The problem case is when I want to talk about a thing that doesn't exist on
the web, such as my car, and make a claim in EARL that it conforms to the
roadworthiness standards of the state of Victoria (which also doesn't exist
on the Web).

The approach we have taken is to say that a web page cannot conform to
roadworthiness requirements, only a thing that is of type vehicle, and then
assume that we will infer that in this use of a URI it is merely identifying
a thing that can meet the requirements. Of course this falls down if we want
to talk about who created it - the web page was made by me, but the car by
Ferrari (or more likely Matchbox...).

FOAF solves this problem by talking about "the thing that has a homepage at
http://example.net/foo" where 'has a homepage' is unambiguous. This can be
used both for my car and the state of Victoria (the publisher of a standard,
although the standard itself might actually be on the web).

The problem then becomes one of identifying statements that were made using
the assumptions of the original EARL approach and asserting that these
statements can be relied on if they are transformed in some way (e.g.
replace the subject URI of a triple with some RDF that says 'the thing with
a homepage at this URI').
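
A minimal sketch of that transformation in Python, assuming triples are plain
(subject, predicate, object) tuples. The foaf:homepage URI is the real FOAF
property; the conformsTo predicate, the standards URI, the blank-node naming
and the function name rewrite_subjects are hypothetical, chosen only for
illustration.

```python
from itertools import count

FOAF_HOMEPAGE = "http://xmlns.com/foaf/0.1/homepage"

# Hypothetical predicate and object, for illustration only.
CONFORMS_TO = "http://example.org/vocab#conformsTo"
ROADWORTHY = "http://example.org/standards/roadworthiness"


def rewrite_subjects(triples, pages):
    """Replace each subject URI found in `pages` with a blank node that
    'has a homepage at' that URI, leaving all other triples untouched."""
    ids = count(1)
    bnodes = {}  # page URI -> blank node label
    out = []
    for s, p, o in triples:
        if s in pages:
            if s not in bnodes:
                bnodes[s] = f"_:thing{next(ids)}"
                # Say once that the blank node has its homepage at the URI.
                out.append((bnodes[s], FOAF_HOMEPAGE, s))
            s = bnodes[s]
        out.append((s, p, o))
    return out


original = [("http://example.net/foo", CONFORMS_TO, ROADWORTHY)]
rewritten = rewrite_subjects(original, {"http://example.net/foo"})
for triple in rewritten:
    print(triple)
```

The caller still has to decide which URIs go into `pages`, that is, which
statements were made under the original EARL assumptions. The sketch does not
address that part of the problem.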



(I have oversimplified - there are no "things" on the web, just
representations of things. In some cases we have more or less general
instinctive agreement that the representation the Web gives us really is the
thing - such as the page at http://www.w3.org/ - while for other things
represented on the Web (the person who has a homepage at
http://www.w3.org/People/Charles - a statement that is already addressable on
the web as a bit of FOAF RDF/XML) the 'real' version isn't the one on the
web. But I don't think it is that important.)

On Mon, 23 Dec 2002, Sandro Hawke wrote:

>Let me try again to explain what I now think is broken in RDF's use of
>URI-References (and how to fix it with very little pain).  Forgive me
>for starting with the obvious stuff, but it seems necessary.
>
>3.  Identifying Things
>
>In RDF (and other knowledge representation languages) we want to
>formally convey information about all sorts of things: people, places,
>times, mathematical functions, numbers, emotions, qualities, prices,
>and (of course) books.  We also want to talk about web sites and web
>pages.  How should we use the web's existing infrastructure to help us
>identify all the things we want to talk about?
>
>3.1.  Non-clickable links
>
>The simplest answer is to use strings which look like web addresses
>but don't really lead to web pages.  These could be UUIDs, tag: URIs,
>
>3.2  Reusing the Fragment Syntax
>
>Another approach is to generalize the fragment syntax.  The semantics
>of address#fragment are not fully specified in existing standards,
>mostly because the meaning of a fragment depends logically on the
>language in which the information-chunk is being conveyed.  Pointing
>into a text document is different from pointing into an audio
>recording or a 3-D image.  To leave the door open for new formats, RFC
>2396 says the semantics of an address with a fragment part depend on
>the media-type of the content served at that address.
>This open door allows us to define an RDF media type (application/rdf+xml)
>where "fragments" are not fragments, but rather arbitrary things.  When
>we say "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" we do not
>mean some part of the document at that address; we mean some abstract
>concept of a type-relation, because that document is an RDF one.
>Do we need to know the media type of the document?  Some people say
>not, that the use of that string as an RDF node or arc label is not
>governed by RFC 2396; RDF stands on its own and can use URI-like
>strings in its own way.  This may work, but as with UUIDs, it fails to
>use the web very well.  Moreover, it dilutes the power of URIs: any
>string on the planet which starts with "http://" and does not work as
>a web address is a wasted opportunity for communication and another
>chance to confuse and disappoint people.   We can do better.
>Reusing the fragment syntax also causes a few technical problems.
>What happens if the content at the given address is NOT only
>application/rdf+xml?  Maybe that's just a misconfigured system, but it
>could be a useful one.  I think it would be nice for existing browsers
>to get human-readable HTML at the same address where an RDF-capable
>client gets its information.  Like other forms of content negotiation,
>this allows all the forms of addressing (links, advertising, search
>engines, etc) to index the information itself, regardless of its
>presentation format.
>
>3.3  Using Descriptive Web-Content
>
>A third approach is to say that when a web page is about one thing, we
>can use the page's address as a kind of identifier for that thing.
>If you visit
>   http://www.w3.org/Consortium/
>you'll see it is clearly a page about the W3C.  We can use that to
>identify the W3C itself, calling the W3C "the subject of
>This is not the same as saying "http://www.w3.org/Consortium/ is a
>Consortium."  That's like pointing at a photographic image of the
>Eiffel Tower and telling someone "that's the Eiffel Tower!": it works
>perfectly well with humans, but it introduces more ambiguity than we
>want in machine processing.  (Some humans, of course, might take the
>opportunity to be pedantic and point out "No, it's a PICTURE of the
>Eiffel Tower."  Some of us try hard not to be like that.)
>
>4.  Node Labels (Subject, Container, Overloaded, and Distinguished)
>
>The challenge to using descriptive web-content to identify things is
>that we risk confusing the page with its subject.   If we just label
>an RDF node "http://www.w3.org/Consortium/" who knows if we are
>talking about a web location or an industry consortium?
>I suggest that ideally we would have two kinds of labels, which I'll
>call "Subject" and "Container" labels.  A node with the Subject label
>of "http://www.w3.org/Consortium/" represents a consortium; we would
>expect to see arcs from it saying, perhaps, that its director is Tim
>Berners-Lee.  A node with a Container label with the same text
>represents the web location itself, a container for some information;
>from it we might find arcs saying its last-modify date was "Wed, 13
>Nov 2002 21:57:38 GMT".
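
The Subject/Container distinction quoted above can be sketched in Python by
making the kind of label part of the node's identity. This is only an
illustration under stated assumptions: the Label type, the "ex:" predicate
names and the arcs_from helper are hypothetical and are not part of RDF or of
Sandro's proposal.

```python
from typing import NamedTuple


class Label(NamedTuple):
    kind: str  # "subject" or "container"
    uri: str


URI = "http://www.w3.org/Consortium/"

consortium = Label("subject", URI)  # the consortium the page is about
page = Label("container", URI)      # the web location itself

# Hypothetical predicates; the values come from the quoted example.
graph = {
    (consortium, "ex:director", "Tim Berners-Lee"),
    (page, "ex:lastModified", "Wed, 13 Nov 2002 21:57:38 GMT"),
}


def arcs_from(node):
    """Return the (predicate, object) pairs leaving `node`."""
    return {(p, o) for s, p, o in graph if s == node}


print(arcs_from(consortium))
print(arcs_from(page))
```

Because the two labels compare unequal even though they carry the same text,
arcs about the consortium and arcs about the web location never end up on the
same node.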
Received on Thursday, 26 December 2002 16:21:22 GMT