
RE: URI for language identifiers

From: <Patrick.Stickler@nokia.com>
Date: Wed, 2 Apr 2003 09:28:12 +0300
Message-ID: <A03E60B17132A84F9B4BB5EEDE57957B5FBB5C@trebe006.europe.nokia.com>
To: <algermissen@acm.org>
Cc: <www-rdf-interest@w3.org>


> > At present, a representation can be anything. 
> 
> Hmm, I thought a representation is at least 'bits and bytes' (data),
> what do you mean by 'anything'?

Yes. At least bits and bytes. But any arbitrary sequence of bits
and bytes that the owner of the resource chooses to call a representation.

> > So if I have a URI that denotes a dog, I may HTTP GET a representation
> > of that dog which is in fact a digital image or perhaps a video stream,
> > or maybe an encoding of its DNA. Whatever. I may have dozens of different
> > representations I can choose from.
> 
> Sure, but what I don't understand is how RDF solves the problem that
> some people might use http://www.w3.org/Consortium/ to identify the W3C
> while others might 'make DC statements about it'. In fact, it's
> exactly the coexistence of those two things that I think is powerful,
> because it allows me to use ordinary (e.g. HTML) web pages to
> identify subjects.

But if the URI denotes two things, how do you differentiate
between statements made about one versus the other?

> > 
> > And *each* of those representations is a resource in its own right, which
> > may (IMO should) be denoted by a URI that is distinct from that denoting
> > the resource of which it is a representation -- and servers returning
> > a representation may (IMO should) specify the identity of the representation
> > in the response.
> 
> Hmm, but in REST a URI cannot denote a representation, you cannot address
> the 'bits and bytes'. ?!

Sure you can. Why not?

REST allows you to use a URI to denote anything, including
representations.

What REST sorely lacks is a concept of "canonical" or "literal"
representation of digital resources which corresponds to a bit-equal
copy of the resource. One would expect that representations of
representations would be "literal" in that sense. But I agree
that this is a bit vague in REST (though still supported IMO).
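
As a rough illustration of the point above (a sketch only: the ex: URIs,
the byte strings, and the get() helper are hypothetical and not part of
any real server or REST API), a representation can carry its own URI,
and a GET on that URI can return a bit-equal, "literal" copy:

```python
# Hypothetical stored content, keyed by the URI of each digital resource.
CONTENT = {
    "ex:Dog.jpg": b"\xff\xd8 placeholder image bytes",
    "ex:Dog.dna": b"GATTACA",
}

# Which representation the resource owner chooses to serve for a URI.
# ex:Dog denotes the dog itself, which can never be transferred.
DEFAULT_REPRESENTATION = {
    "ex:Dog": "ex:Dog.jpg",
}


def get(uri):
    """Return (representation_uri, body) for a GET on `uri`.

    A URI with no explicit mapping is assumed to denote a digital
    resource, which is served as a bit-equal copy of itself.
    """
    rep_uri = DEFAULT_REPRESENTATION.get(uri, uri)
    return rep_uri, CONTENT[rep_uri]


rep_uri, body = get("ex:Dog")
print(rep_uri)                          # identity of the representation
print(get(rep_uri) == (rep_uri, body))  # same bytes when fetched directly
```

The returned URI plays the role of the server "specifying the identity
of the representation in the response"; how a real server would convey
that is left open here.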

> > Still, you *never* can GET a resource itself, even a digital resource.
> > You always get a representation. And that representation (unless a bit-equal
> > copy) is then a distinct resource.
> > 
> > > But I guess that is just
> > > something to accept as part of the (re-)definition of resource if I
> > > want to use RDF.
> > 
> > Again. This has nothing to do with RDF. This is the web architecture.
> 
> But in what way is the redefinition of resource related to Web architecture,
> I don't get it.
> 
> [Of course the concept of resource must include services (volume control etc.)
> and PUTing etc. to such resources makes a lot of sense to me, but abstract
> concepts.....why is it part of the Web architecture that those can also be
> resources??]

It comes down to whether there is one web or many. Most folks
want there to be one web, not e.g. a REST Web and a Semantic
Web. In order for the Semantic Web to be "part of" the one
web, we need to be able to refer to anything whatsoever using
URIs, and that includes abstract concepts and other non-web
accessible resources.

Now some, including TimBL, would prefer to make a key distinction
between URIs and URIrefs, where URIs only denote web-accessible
resources, and URIrefs must be used to denote non-web accessible
resources. Others, including myself, see no need for such a
distinction.

> > 
> > As far as RDF is concerned, one need never dereference any URI and never
> > get any representation. URIs denote resources and one may use RDF to
> > make statements about resources. Representations are entirely outside
> > the scope of RDF proper.
> > 
> > However, where RDF and the web architecture agree is on the fundamental
> > principle that URIs should have globally consistent, unambiguous, and
> > immutable meaning.
> 
> But you don't need URIs for that, nor do you need the Web. Maybe that is
> what I don't understand: what is the idea of how RDF interacts/intermingles
> with Web architecture (beyond the idea of 'controlled vocabulary') ?

Well, fair enough. If you want your TMs to remain disjunct from the
web, then fine, but there is then no point to this discussion. Since
RDF is intended to operate effectively within the Web architecture
it may simply not be useful to compare RDF and TMs.

So, even though XTM may use URIs, it is perhaps not using them the same
way as the rest of the world (web) and thus there is no basis whatsoever
for interoperability between XTM and RDF.

> > A URI always denotes the same thing, no matter where you encounter it, and
> > no matter how many representations might be associated with it, etc.
> > 
> > URIs are the global constants, the atomic elements, of the web and semantic
> > web.
> > 
> > > Furthermore as it prohibits an author to use
> > > "http://www.w3.org/Consortium/"
> > > as an identifier for the W3C (since it is a Web page).
> > 
> > Is it? How do you know? Because you did an HTTP GET and got back
> > an HTML instance?
> 
> Ok,ok...but we are on the Web, so this assumption seems pretty
> normal to me.

It's a common mis-assumption. I used to make it regularly.

> > 
> > That does not prove that it denotes a web page. The representation
> > you got from the server may very well be a web page. Without
> > authoritative knowledge about the resource itself, you can't know
> > for sure.
> 
> But I can use the URI to assert that
> 
> http://www.w3.org/Consortium/ dc:Title  "About the W3C"
> 
> Is that wrong or right or what ?

If it's a web page, it seems correct. But of course, one must have some
idea of what some URI denotes before one can make statements about the
resource denoted.

The key is knowing what it denotes, and that the denotation is consistent.

If a given URI can denote more than one thing, then one can never be sure
what a given statement is actually describing.

Ambiguity of denotation is anathema to the semantic web. Yes, it will
occur, but that doesn't mean it is acceptable and should be easily
tolerated.
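
A minimal sketch of why this matters, assuming for illustration that one
author treats the URI as a web page and another as the organization
(the ex: names and the describe() helper are hypothetical):

```python
from collections import defaultdict

W3C = "http://www.w3.org/Consortium/"


def describe(triples, subject):
    """Collect every predicate/object pair stated about `subject`."""
    description = defaultdict(set)
    for s, p, o in triples:
        if s == subject:
            description[p].add(o)
    return dict(description)


# One URI used by two authors for two different things.
ambiguous = [
    (W3C, "rdf:type", "ex:WebPage"),
    (W3C, "dc:title", "About the W3C"),
    (W3C, "rdf:type", "ex:Organization"),
]

# The same statements with a distinct (hypothetical) URI for the organization.
distinct = [
    (W3C, "rdf:type", "ex:WebPage"),
    (W3C, "dc:title", "About the W3C"),
    ("ex:W3C", "rdf:type", "ex:Organization"),
]

# Merged description: nothing says which thing the dc:title describes.
print(describe(ambiguous, W3C))

# With distinct URIs, each description stays separate.
print(describe(distinct, W3C))
print(describe(distinct, "ex:W3C"))
```

Once the statements are merged, no amount of querying recovers which
author meant which thing; with two URIs the question never arises.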

> > And getting such authoritative knowledge about resources from the
> > web authority based on the URI denoting the resource is what I've been
> > working on for some time and am in the final stages of completing some
> > open source software for accomplishing in a global, scalable fashion.
> > 
> > (and, yes, RDF is at the heart of it ;-)
> > 
> > > > > No, in TM land, a URI always is the address of 'the web page', a URI
> > > > > *never* addresses an abstract concept.
> > > >
> > > > Well, that wasn't my understanding. But if that's true, then TMs and
> > > > RDF are even farther apart than I thought.
> > >
> > > Yes.
> > > >
> > > > > Then in TMs URIs can be used as subject indicators, referring to
> > > > > arbitrary subjects.
> > > >
> > > > And how do you then make statements about the 'web page' versus
> > > > the subject? If you are using the same URI?
> > > >
> > > > > A key concept is that when the URI of a subject indicator
> > > > > is dereferenced and the retrieved information resource is
> > > > > rendered for human perception it should be clear what subject
> > > > > the URI indicates.
> > > >
> > > > But how do you differentiate between dereferencing the URI as
> > > > a subject indicator versus dereferencing the URI as a web page,
> > > > and is there any logical relationship between the web page
> > > > denoted by the URI versus the subject indicator denoted by the
> > > > same URI?
> > >
> > > > Having this ambiguity seems to make the core machinery a lot more
> > > > complicated.
> > >
> > > Here is how we "see the world":
> > >
> > > There are subjects (anything you want to talk about).
> > 
> > OK, so TM subject = RDF resource
> > 
> > > Subjects are
> > > represented as topics (the topics are the nodes of the graph that is
> > > 'produced' from a topic map). Topics have properties that say what
> > > the subject of the topic is.
> > >
> > > Topic Maps are not tied to the Web or URIs conceptually, but it is
> > > the most known application of them at the moment. So, when
> > > applying topic maps to the Web world, there are two properties that
> > > handle the use of URIs: SubjectIndicators and SubjectAddress.
> > >
> > > The value (if any) of the SubjectAddress property is a URI
> > > and if a given topic exhibits a value for this property, then the
> > > topic is a surrogate for the subject that is the resource (in the
> > > sense of Web page, never abstract concept).
> > >
> > > The value (if any) of the SubjectIndicators property is a
> > > list of URIs, and each Web resource (again: in the sense of Web page)
> > > addressed by the URIs is called a subject indicator (or "subject
> > > indicating resource") for the subject that the topic represents.
> > 
> > topic  == subject    ?
> > topic  -> subject    ?
> > topic  -> subject+   ?
> > topic+ -> subject    ?
> 
> In a topic map, a subject is always represented by a single topic,
> that is the heart of topic maps, actually.

OK. So subject -> topic
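
The two properties described in the quoted text might be sketched as
follows. This is an illustration only, not XTM syntax or any real topic
map API: the Topic class, its field names, and topics_using() are
hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Topic:
    name: str  # hypothetical label, only for readability
    # URI of the web page that *is* the subject, if any.
    subject_address: Optional[str] = None
    # URIs of web pages that merely *indicate* the subject.
    subject_indicators: List[str] = field(default_factory=list)


PAGE = "http://www.w3.org/Consortium/"

# A topic whose subject is the web page itself.
page_topic = Topic(name="W3C web page", subject_address=PAGE)

# A topic whose subject is indicated by that same page.
org_topic = Topic(name="W3C", subject_indicators=[PAGE])


def topics_using(uri, topics):
    """Return (topic name, role) pairs for every topic that uses `uri`."""
    uses = []
    for topic in topics:
        if topic.subject_address == uri:
            uses.append((topic.name, "address"))
        if uri in topic.subject_indicators:
            uses.append((topic.name, "indicator"))
    return uses


print(topics_using(PAGE, [page_topic, org_topic]))
```

The same URI string turns up in both roles, so what it stands for
depends on which property holds it.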

> > 
> > > So, the core machinery is actually as simple as "nodes with properties
> > > that 'say' what the node represents".
> > >
> > > This is not the whole story of course, but I hope you get the idea.
> > 
> > Well, I found the original TM spec pretty simple and straightforward,
> > but XTM has left me continually confused, and I've read through it
> > numerous times.
> 
> Oh? I think it is much simpler than the former ISO13250
> > 
> > How about a simple example. Here are two URIs, the first denotes
> > the person John Doe and the second denotes an image of the
> > person John Doe.
> > 
> >    ex:John     rdf:type ex:Person .
> >    ex:John.jpg rdf:type ex:Image .
> > 
> > Then, I set my web server up so that if one does an HTTP GET on
> > ex:John, it returns a copy of ex:John.jpg as a representation
> > of ex:John. If one does an HTTP GET on ex:John.jpg, it returns
> > a copy of ex:John.jpg as a (bit-equal) representation of
> > ex:John.jpg.
> > 
> > Now, I can differentiate between the person John Doe and the image
> > of the person in various statements I make:
> > 
> >    ex:John ex:firstName "John" .
> >    ex:John ex:lastName "Doe" .
> >    ex:John dc:created "1966-03-31"^^xsd:date .
> >    ex:John ex:hasRepresentation ex:John.jpg .
> > 
> >    ex:John.jpg dc:title "Image of John Doe" .
> >    ex:John.jpg dc:format "image/jpeg" .
> >    ex:John.jpg dc:created "2003-03-10"^^xsd:date .
> >    ex:John.jpg ex:representationOf ex:John .
> > 
> > The fact that ex:John is not a "web page" in no way prevents me from
> > getting a representation of the resource (person). The URI ex:John
> > does not denote both a person and a web page. It only denotes a person.
> > The fact that HTTP GET on ex:John returns a web document in no way
> > changes the denotation of ex:John from being the person.
> > 
> > Thus, there is no restriction against a URI denoting a non-web-accessible
> > resource. And it seems to me that XTM presumes that such a restriction
> > exists, and that is the motivation for having the subject/address dichotomy
> > and introducing ambiguity into the denotation of URIs.
> 
> The denotation is not ambiguous but there are two *ways to use* URIs

If a single URI can denote both a web resource and an abstract subject,
then it is ambiguous. Period. If the interpretation of a given URI depends
on context, then it is ambiguous. Period. XTM has ambiguous use of URIs.

> > And I say XTM, not TM, because this dichotomy is an XTM invention not
> > present in the original TM model. 
> 
> That is not true, the use of addresses (not only URIs) as addresses for
> information resources *and* as identifiers for subjects is at the core
> of topic maps.

I don't read the ISO spec that way. I think that XTM reads a lot into
the ISO spec.

> > XTM first did the right thing by
> > adopting URIs, and then broke everything by not preserving globally
> > consistent, unambiguous, and immutable denotation.
> 
> Hmm, what exactly do TMs break?

I said XTM, not TMs. And if it's not clear from what I've said thus far,
I see little point in saying it in yet another way.

Patrick
Received on Wednesday, 2 April 2003 01:28:17 GMT
