Re: What if an URI also is a URL from Oskar Welzl on 2007-08-30 (semantic-web@w3.org from August 2007)

From: Oskar Welzl <lists@welzl.info>
Date: Fri, 31 Aug 2007 00:47:34 +0200
To: Reto Bachmann-Gmür <rbg@talis.com>
Cc: wangxiao@musc.edu, semantic-web@w3.org
Message-Id: <1188514054.18844.47.camel@jupiter.hormayrgasse>
On Thursday, 30.08.2007, at 22:28 +0200, Reto Bachmann-Gmür wrote:
> But talking about standards, why is this discussion on a list which has
> been replaced by semantic-web@w3.org?

dumb boy hit [reply] again; changed it now. 
Maybe we'll have to change the topic, too, soon: This is going to be somewhat
like "What's the content of an information resource"?

On Thursday, 30.08.2007, at 22:30 +0200, Reto Bachmann-Gmür wrote:
> Oskar Welzl wrote:
> > Pity, though, that there hardly seems to be an agreement on how to
> > handle this issue, so simply by choosing the above URI myself I will not
> > prevent *others* making statements like 
> > <#thismail> mail:sender <http://oskar.twoday.net>
> > when they refer to an update-notification they received from the weblog.
> >   
> Reading this, I think I misunderstood what you mean by "blog": I was
> referring to a blog as a changing collection of articles not as
> something that sends email. If we agree that an information resource
> can't be the mail:sender of a mail then the statement
> 
> <#thismail> mail:sender <http://oskar.twoday.net>
> 
> is necessarily wrong, as a GET request to http://oskar.twoday.net is
> answered with a 2XX response, and with this response the resource is
> unambiguously an information resource[1].
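
For readers who want the rule in the quote spelled out, here is a minimal
sketch in Python. The 2XX case is the one stated in the quote; the 303 and
4XX cases follow my reading of the W3C TAG httpRange-14 finding. The
function name and the returned labels are made up for illustration, and
the last branch is an assumption about status codes the rule does not
mention.

```python
def classify_by_status(status: int) -> str:
    """Say what a GET response status lets us conclude about the resource
    named by the requested URI (rough reading of httpRange-14)."""
    if 200 <= status < 300:
        # The case from the quote: a 2XX answer means an information resource.
        return "information resource"
    if status == 303:
        # "See Other": the URI may name anything, e.g. a person or a blog.
        return "any kind of resource"
    if 400 <= status < 500:
        # Client error: the nature of the resource stays unknown.
        return "unknown"
    # Assumption: other status codes are not addressed by the rule.
    return "not covered by the rule"


# A plain page fetch, such as a GET on http://oskar.twoday.net answering 200:
kind = classify_by_status(200)
```

Under this reading a URI that answers 200 cannot also name a mail sender,
which is the point the quote makes.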

Well, the "sending mail"-example was certainly the outer limit of
nonsense I could possibly construct to get the message through, but I
meanwhile think my confusion has a different cause (and it was you who
pointed me to it):

Let's forget for a minute that a blog is more than just a collection of
posts and usually has properties like "allowsCommentsFrom",
"offersFeedType", "Blogroll" etc.
Assume that it *is* a mere collection of posts, sorted by date, latest
first, 10 per page. Period. You type http://my.blog.tld in your browser
to go there, subsequent pages can be reached with
http://my.blog.tld/?start=11 etc.

In one of your previous posts you wrote:
"A Blog is an Information Resource which could be described as
an ordered collection of posts, the HTML returned by the webserver is
(or should be) a suitable representation of that thing."
I didn't like this idea at first (and said so, IIRC ;) ...), but it seems
logical to me now. *If* we think of a collection of posts and nothing
else, it would probably fit the concept of an "Information resource".
And what URI other than http://my.blog.tld would we have to name it?

On the other hand, the very content of the 10-posts-list returned by the
server (as what could be seen as the HTML-representation of the
information resource "blog") is an information resource in its own
right. It's "The 10 latest posts from my blog". No other way to refer to
it than via http://my.blog.tld again. Even in this simple construct, I
can make statements about http://my.blog.tld  in one RDF-document that
contradict each other, like (in OTN, Oskar's triple notation):

http://my.blog.tld  dc:coverage a period from 2003-2007
(this was about the blog)

http://my.blog.tld  dc:coverage a period from July-August 2007
(this is about the 1st page of the blog)

Same for statements about who commented there etc. - many can be true
for only one of the two information resources that are addressed by
http://my.blog.tld
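
The clash can be made concrete with a small sketch in plain Python, with
triples as tuples and no RDF library. The helper name and the literal
strings are only illustrative; the point is that both values land on the
same subject/predicate pair.

```python
from collections import defaultdict

BLOG = "http://my.blog.tld"

# Two statements made with different resources in mind, but the same URI.
triples = [
    (BLOG, "dc:coverage", "2003-2007"),         # meant: the whole blog
    (BLOG, "dc:coverage", "July-August 2007"),  # meant: the first page
]


def values_by_subject_predicate(statements):
    """Group object values by (subject, predicate)."""
    index = defaultdict(set)
    for subject, predicate, obj in statements:
        index[(subject, predicate)].add(obj)
    return index


index = values_by_subject_predicate(triples)
coverage = index[(BLOG, "dc:coverage")]
# coverage now holds both periods; nothing in the data records which of the
# two information resources each value was meant for.
```

Once the statements are merged, a consumer has no way to tell the
blog-wide period from the first-page period.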

To get around this, my original assumption was that before using a URI
to name something, I should check if it's suitable by narrowing the
"information resource" as much as possible: take the representation you
get, take all possible interpretations of what it represents (a blog,
the first 10 postings, the author himself) and always take the
narrowest. What you end up with is, almost always, only a little more
than "the document". I like this approach for its simplicity, but it
breaks a lot. Take SIOC as an example. sioc:forum/sioc:site is exactly
what we're talking about here; they always refer to it via a URI that
is, in fact, "the first page of the collection". This is not wrong as
such, it just creates ambiguity, which URIs should not have.

(In fact it was my current work on a SIOC-export that confronted me with
this boring question again after so many years.)

Now I take the hard road and say that http://my.blog.tld, the blog, should
not be confused with http://my.blog.tld, the most recent posts. The blog
should have its own URI, as "10 most recent posts" is the narrower
construct. Next question:
I plan to use http://my.blog.tld/ID/names#thisblog as sioc:site and have
an RDF/XML-document at ../ID/names to further define #thisblog. Now how
do I point to the preferred link/bookmark/"entry point" (which is, of
course, http://my.blog.tld/) with a well-known vocabulary? I was tempted
to use rss:link, but am very unsure about it... (Not finding a usable
hint on Google made me even more uneasy with the whole topic, as this
suggests nobody on this planet ever thought of *not* using the URI of
the main page as the URI for the whole site.)
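
A rough sketch of that plan in plain Python, only to show how the two names
separate. The predicate "ex:entryPoint" is a made-up placeholder for the
property I am still looking for, not a term from any real vocabulary, and
the triples are again simple tuples.

```python
from urllib.parse import urldefrag

BLOG_URI = "http://my.blog.tld/ID/names#thisblog"  # names the blog as a whole
FRONT_PAGE = "http://my.blog.tld/"                 # names the "latest posts" page

# A client drops the fragment before sending the HTTP request, so looking up
# BLOG_URI fetches the RDF/XML document at .../ID/names.
document_uri, fragment = urldefrag(BLOG_URI)

triples = [
    (BLOG_URI, "dc:coverage", "2003-2007"),
    (FRONT_PAGE, "dc:coverage", "July-August 2007"),
    # Hypothetical placeholder predicate linking the blog to its entry point.
    (BLOG_URI, "ex:entryPoint", FRONT_PAGE),
]

coverage_subjects = {s for s, p, o in triples if p == "dc:coverage"}
```

With two subjects, the two dc:coverage statements no longer collide; the
open question is only which existing property should replace the
placeholder.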

So you see, even though there might have been a misunderstanding about
the concept of a "blog", this wasn't the cause of my problems. Even when
following your 'collection of posts'='information resource' definition,
I get deeper and deeper into trouble.

You already got me on a better track once by pointing out the somewhat
vague definition of information resource - maybe you have some new input
for me to chew on ;)

Thanks,

Oskar
Received on Thursday, 30 August 2007 22:47:54 UTC