Re: URI for language identifiers from Jan Algermissen on 2003-04-01 (www-rdf-interest@w3.org from April 2003)

From: Jan Algermissen <algermissen@acm.org>
Date: Tue, 01 Apr 2003 22:42:47 +0200
To: Patrick.Stickler@nokia.com
CC: www-rdf-interest@w3.org
Message-ID: <3E89F9C7.293837CD@acm.org>
Patrick.Stickler@nokia.com wrote:

> > So, it would make sense for the DC folks to make these things
> > explicit,
> > to publish them as an RDF document?
> 
> Well, if they are in fact intending that subjects have a domain
> only of web pages

Well, what I mean is that the subjects of DC statements are intended
to be information resources, but I am willing to be corrected here.


> (and I don't think they are) then yes, it's
> optimal if that knowledge is expressed explicitly (whether or not
> in RDF is another matter).
> 
> However, I don't think that the domain of dc:subject is restricted
> to web pages. I think you're reading a lot into the spec there.

What do you think is the domain of dc:subject?



> > > If a URI denotes an abstract concept, you may be able to GET a
> > > representation of that resource. Why not.
> >
> > This is a thing I just don't get about RDF.
> 
> This has nothing to do with RDF specifically. This is the way the
> REST (Web) architecture works. If XTM is to operate on the Web, then
> it must also do so in a way that is compatible with the web architecture,
> and that includes the relationship between URIs, resources, and
> representations.

Well, I don't see that the idea that a resource can also 'be' an abstract
concept is an idea inherent in REST. I know that resource has been redefined
to 'cover' also abstract concepts, but I never understood why that is part
of the REST architecture....


> > I find it VERY
> > strange that
> > a document can be a representation of a dog.
> 
> It's a slippery slope, and there is no clearly drawn line. There
> are many who would agree with you. There are many who wouldn't.
> 
> At present, a representation can be anything. 

Hmm, I thought a representation is at least 'bits and bytes' (data),
what do you mean by 'anything'?


> And there is no clear
> definition of how a representation must relate to the resource itself.
> All that is stated is that, given a URI denoting some resource, an
> HTTP GET can return a representation of that resource.

> 
> I simply consider a representation as some form of content which in
> some way reflects the nature of the resource in some useful manner.
> 
> Some representations will be able to reflect resources much more
> precisely -- and for digital resources, representations may even
> be bit-equal copies.
> 
> For abstract or otherwise non-digital resources, representations
> (which *will* themselves be digital resources) will reflect the
> nature of the resource less precisely.
> 
> And not all resources will have representations available.
> 
> So if I have a URI that denotes a dog, I may HTTP GET a representation
> of that dog which is in fact a digital image or perhaps a video stream,
> or maybe an encoding of its DNA. Whatever. I may have dozens of different
> representations I can choose from.

Sure, but what I don't understand is how RDF solves the problem that
some people might use http://www.w3.org/Consortium/ to identify the W3C
while others might 'make DC statements about it'. In fact, it's
exactly the coexistence of those two things that I think is powerful, because
it allows me to use ordinary (e.g. HTML) web pages to identify subjects.

> 
> And *each* of those representations is a resource in its own right, which
> may (IMO should) be denoted by a URI that is distinct from that denoting
> the resource of which it is a representation -- and servers returning
> a representation may (IMO should) specify the identity of the representation
> in the response.

Hmm, but in REST a URI cannot denote a representation; you cannot address the
'bits and bytes'?!
> 
> Still, you *never* can GET a resource itself, even a digital resource.
> You always get a representation. And that representation (unless a bit-equal
> copy) is then a distinct resource.
> 
> > But I guess that is just
> > something to accept as part of the (re-)definition of resource if I
> > want to use RDF.
> 
> Again. This has nothing to do with RDF. This is the web architecture.

But in what way is the redefinition of resource related to Web architecture?
I don't get it.

[Of course the concept of resource must include services (volume control etc.)
and PUTing etc. to such resources makes a lot of sense to me, but abstract
concepts.....why is it part of the Web architecture that those can also be
resources??]

> 
> As far as RDF is concerned, one need never dereference any URI and never
> get any representation. URIs denote resources and one may use RDF to
> make statements about resources. Representations are entirely outside
> the scope of RDF proper.
> 
> However, where RDF and the web architecture agree is on the fundamental
> principle that URIs should have globally consistent, unambiguous, and
> immutable meaning.

But you don't need URIs for that, nor do you need the Web. Maybe that is
what I don't understand: what is the idea of how RDF interacts/intermingles
with Web architecture (beyond the idea of 'controlled vocabulary')?
> 
> A URI always denotes the same thing, no matter where you encounter it, and
> no matter how many representations might be associated with it, etc.
> 
> URIs are the global constants, the atomic elements, of the web and semantic
> web.
> 
> > Furthermore as it prohibits an author to use
> > "http://www.w3.org/Consortium/"
> > as an identifier for the W3C (since it is a Web page).
> 
> Is it? How do you know? Because you did an HTTP GET and got back
> an HTML instance?

OK, OK... but we are on the Web, so this assumption seems pretty
normal to me.

> 
> That does not prove that it denotes a web page. The representation
> you got from the server may very well be a web page. Without
> authoritative knowledge about the resource itself, you can't know
> for sure.

But I can use the URI to assert that

http://www.w3.org/Consortium/ dc:title  "About the W3C"

Is that wrong or right, or what?
> 
> And getting such authoritative knowledge about resources from the
> web authority based on the URI denoting the resource is what I've been
> working on for some time and am in the final stages of completing some
> open source software for accomplishing in a global, scalable fashion.
> 
> (and, yes, RDF is at the heart of it ;-)
> 
> > > > No, in TM land, a URI always is the address of 'the web page',
> > > > a URI *never* addresses an abstract concept.
> > >
> > > Well, that wasn't my understanding. But if that's true, then TMs and
> > > RDF are even farther apart than I thought.
> >
> > Yes.
> > >
> > > > Then in TMs URIs can be used as subject indicators, referring to
> > > > arbitrary subjects.
> > >
> > > And how do you then make statements about the 'web page' versus
> > > the subject? If you are using the same URI?
> > >
> > > > A key concept is that when the URI of a
> > > > subject indicator
> > > > is dereferenced and the retrieved information resource is
> > > > rendered for human
> > > > perception it should be clear what subject the URI indicates.
> > >
> > > But how do you differentiate between dereferencing the URI as
> > > a subject indicator versus dereferencing the URI as a web page,
> > > and is there any logical relationship between the web page
> > > denoted by the URI versus the subject indicator denoted by the
> > > same URI?
> >
> > > Having this ambiguity seems to make the core machinery a lot more
> > > complicated.
> >
> > Here is how we "see the world":
> >
> > There are subjects (anything you want to talk about).
> 
> OK, so TM subject = RDF resource
> 
> > Subjects are
> > represented as topics (the topics are the nodes of the graph that is
> > 'produced' from a topic map). Topics have properties that say what
> > the subject of the topic is.
> >
> > Topic Maps are not tied to the Web or URIs conceptually, but it is
> > their best-known application at the moment. So, when applying topic
> > maps to the Web world, there are two properties that handle the use
> > of URIs: SubjectIndicators and SubjectAddress.
> >
> > The value (if any) of the SubjectAddress property is a URI
> > and if a given
> > topic exhibits a value for this property, then the topic is a
> > surrogate
> > for the subject that is the resource (in the sense of Web page, never
> > abstract concept).
> >
> > The value (if any) of the SubjectIndicators property is a
> > list of URIs,
> > and each Web resource (again: in the sense of Web page) addressed by
> > the URIs is called a subject indicator (or "subject
> > indicating resource")
> > for the subject that the topic represents.
> 
> topic  == subject    ?
> topic  -> subject    ?
> topic  -> subject+   ?
> topic+ -> subject    ?

In a topic map, a subject is always represented by a single topic;
that is the heart of topic maps, actually.
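
A minimal sketch of the model described above, in Python. The class and
function names are hypothetical, and the merging rule (two topics stand for
the same subject when they share a subject address or at least one subject
indicator) is a simplifying assumption used here for illustration, not a
quotation from the XTM specification. It ignores transitive merges.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Topic:
    """A node standing in for exactly one subject (hypothetical model)."""
    names: Set[str] = field(default_factory=set)
    # URI of the Web page that *is* the subject.
    subject_address: Optional[str] = None
    # URIs of Web pages that merely *indicate* the subject to a human reader.
    subject_indicators: Set[str] = field(default_factory=set)


def same_subject(a: Topic, b: Topic) -> bool:
    """Assumed rule: equal addresses or a shared indicator mean one subject."""
    if a.subject_address is not None and a.subject_address == b.subject_address:
        return True
    return bool(a.subject_indicators & b.subject_indicators)


def add_topic(topic_map: List[Topic], new: Topic) -> Topic:
    """Add a topic, merging it into an existing topic for the same subject."""
    for existing in topic_map:
        if same_subject(existing, new):
            existing.names |= new.names
            existing.subject_indicators |= new.subject_indicators
            if existing.subject_address is None:
                existing.subject_address = new.subject_address
            return existing
    topic_map.append(new)
    return new


URI = "http://www.w3.org/Consortium/"
tm: List[Topic] = []
# The URI used as a subject indicator: the topic is about the W3C itself.
w3c = add_topic(tm, Topic({"W3C"}, subject_indicators={URI}))
# The same URI used as a subject address: the topic is about the Web page.
page = add_topic(tm, Topic({"About the W3C"}, subject_address=URI))
# A second topic indicating the W3C merges into the first one.
again = add_topic(tm, Topic({"World Wide Web Consortium"}, subject_indicators={URI}))
```

In this sketch the one URI yields two distinct topics, one for the
organisation and one for the page, while the third topic collapses into the
first, so each subject still ends up with a single topic.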

> 
> > So, the core machinery is actually as simple as "nodes with properties
> > that 'say' what the node represents".
> >
> > This is not the whole story of course, but I hope you get the idea.
> 
> Well, I found the original TM spec pretty simple and straightforward,
> but XTM has left me continually confused, and I've read through it
> numerous times.

Oh? I think it is much simpler than the former ISO 13250.
> 
> How about a simple example. Here are two URIs, the first denotes
> the person John Doe and the second denotes an image of the
> person John Doe.
> 
>    ex:John     rdf:type ex:Person .
>    ex:John.jpg rdf:type ex:Image .
> 
> Then, I set my web server up so that if one does an HTTP GET on
> ex:John, it returns a copy of ex:John.jpg as a representation
> of ex:John. If one does an HTTP GET on ex:John.jpg, it returns
> a copy of ex:John.jpg as a (bit-equal) representation of
> ex:John.jpg.
> 
> Now, I can differentiate between the person John Doe and the image
> of the person in various statements I make:
> 
>    ex:John ex:firstName "John" .
>    ex:John ex:lastName "Doe" .
>    ex:John dc:created "1966-03-31"^^xsd:date .
>    ex:John ex:hasRepresentation ex:John.jpg .
> 
>    ex:John.jpg dc:title "Image of John Doe" .
>    ex:John.jpg dc:format "image/jpeg" .
>    ex:John.jpg dc:created "2003-03-10"^^xsd:date .
>    ex:John.jpg ex:representationOf ex:John .
> 
> The fact that ex:John is not a "web page" in no way prevents me from
> getting a representation of the resource (person). The URI ex:John
> does not denote both a person and a web page. It only denotes a person.
> The fact that HTTP GET on ex:John returns a web document in no way
> changes the denotation of ex:John from being the person.
> 
> Thus, there is no restriction against a URI denoting a non-web-accessible
> resource. And it seems to me that XTM presumes that such a restriction
> exists, and that is the motivation for having the subject/address dichotomy
> and introducing ambiguity into the denotation of URIs.

The denotation is not ambiguous, but there are two *ways to use* URIs.
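
For reference, the quoted ex:John example can be sketched in Python to keep
denotation and dereferencing apart. This is only an illustration under stated
assumptions: the expansion of the ex: prefix, the SERVED_AS table standing in
for the web server configuration, and the helper names describe() and get()
are all hypothetical, and only a subset of the quoted statements is included.

```python
from typing import Dict, List, Tuple

EX = "http://example.org/"  # hypothetical expansion of the ex: prefix

Triple = Tuple[str, str, str]

# A subset of the statements from the quoted example.
TRIPLES: List[Triple] = [
    (EX + "John", "rdf:type", EX + "Person"),
    (EX + "John", "ex:firstName", "John"),
    (EX + "John", "ex:hasRepresentation", EX + "John.jpg"),
    (EX + "John.jpg", "rdf:type", EX + "Image"),
    (EX + "John.jpg", "dc:title", "Image of John Doe"),
]

# Hypothetical server configuration: which representation a GET returns.
SERVED_AS: Dict[str, str] = {
    EX + "John": EX + "John.jpg",
    EX + "John.jpg": EX + "John.jpg",
}


def describe(uri: str) -> Dict[str, str]:
    """Collect the statements whose subject is `uri` (what the URI denotes)."""
    return {p: o for s, p, o in TRIPLES if s == uri}


def get(uri: str) -> str:
    """Return the URI of the representation served when `uri` is dereferenced."""
    return SERVED_AS[uri]


person = describe(EX + "John")        # statements about the person
image = describe(get(EX + "John"))    # statements about what GET returned
```

Here describe() never consults SERVED_AS: the statements about ex:John stay
statements about a person even though a GET on the same URI hands back an
image.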
> 
> And I say XTM, not TM, because this dichotomy is an XTM invention not
> present in the original TM model. 

That is not true, the use of addresses (not only URIs) as addresses for
information resources *and* as identifiers for subjects is at the core
of topic maps.

> XTM first did the right thing by
> adopting URIs, and then broke everything by not preserving globally
> consistent, unambiguous, and immutable denotation.

Hmm, what exactly do TMs break?


Jan


> 
> Thus, there is no *need* to make any distinction between the resource
> denoted by a URI and some "subject" which that resource ambiguously
> also denotes. If you want to talk about a subject, give it a URI and
> just talk about the subject. Simple.


> 
> Patrick

-- 
Jan Algermissen                           http://www.topicmapping.com
Consultant & Programmer	                  http://www.gooseworks.org
Received on Tuesday, 1 April 2003 16:43:07 UTC