
Re: What if an URI also is a URL

From: Richard Cyganiak <richard@cyganiak.de>
Date: Fri, 14 Sep 2007 19:38:52 +0200
Message-Id: <6F128705-A320-45F3-AA46-0AE666726685@cyganiak.de>
Cc: semantic-web@w3.org, Edward Bryant <edward.bryant@gmail.com>
To: Oskar Welzl <lists@welzl.info>

On 13 Sep 2007, at 23:36, Oskar Welzl wrote:
> What's a site, anyway? There's no such concept on the web. In fact, a
> so-called 'site' is only a collection of documents

A collection of resources, strictly speaking. And yes, the concept of  
a “site” does not exist in the web architecture.

> Right now I make up arbitrary URIs for sites, using sioc:/rss:link to
> point to the main page. While this works well in my own little universe,
> it's just a mess when combined with, say, foaf-data that expects
> foaf:weblog to be a foaf:document, not a madeup:Site that sioc:links to
> a foaf:document.

I think this is the right approach. Mint a new URI for the site, and  
make it a hash URI or have it 303-redirect to some sitemap-style  
document listing its constituent resources.
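
A minimal sketch of the two options in Python, for illustration only. The
paths in SITEMAP are hypothetical and are not taken from Oskar's setup; the
hash URI in the usage note is the one he proposes further down the thread.

```python
from typing import Dict, Tuple
from urllib.parse import urldefrag

# Hypothetical mapping from a minted, slash-style site URI to the
# sitemap-style document that describes the site.
SITEMAP = {"/id/site": "/doc/sitemap"}


def document_for_hash_uri(uri: str) -> str:
    """Return the document a client fetches when dereferencing a hash URI.

    The fragment is stripped by the client before the HTTP request is made,
    so the URI of the site and the URI of the document stay distinct.
    """
    document, _fragment = urldefrag(uri)
    return document


def respond(path: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a request to a slash-style URI."""
    if path in SITEMAP:
        # 303 See Other: the requested thing is not itself a document,
        # but a description of it can be found at the Location.
        return 303, {"Location": SITEMAP[path]}
    return 404, {}
```

With the hash variant, document_for_hash_uri("http://my.blog.tld/ID/names#thisblog")
yields the document URI without the fragment; with the 303 variant, the
server has to answer the redirect itself.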

Why would you want to link to an entire *site* in foaf:weblog? That  
seems useless to me. If I stumble upon your FOAF file, and see that  
there is a foaf:weblog link, then I don't care about the entire  
abstract collection of resources that make up your weblog; I care  
about the homepage.

(Granted, the documentation for foaf:weblog could be more specific  
about this, but you are not forbidden from applying common sense  
while reading a spec.)
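
Roughly, the modelling I have in mind could be sketched like this, with
triples written as plain Python tuples. The site URI and the sioc:link
pointer follow Oskar's own description; the person URI and the helper
function are made up for the example.

```python
# Triples as (subject, predicate, object); prefixes are left unexpanded.
SITE = "http://my.blog.tld/ID/names#thisblog"  # minted URI for the site
HOMEPAGE = "http://my.blog.tld/"               # the document people visit
ME = "http://my.blog.tld/ID/names#me"          # hypothetical person URI

triples = [
    (SITE, "rdf:type", "sioc:Site"),
    (SITE, "sioc:link", HOMEPAGE),             # site points to its main page
    (HOMEPAGE, "rdf:type", "foaf:Document"),
    (ME, "foaf:weblog", HOMEPAGE),             # weblog is the page, not the site
]


def weblog_targets(graph):
    """Return every object of a foaf:weblog statement in the graph."""
    return [o for (_s, p, o) in graph if p == "foaf:weblog"]
```

The point of the sketch is only that foaf:weblog targets the homepage
document, while the minted site URI is kept for statements about the site
as a whole.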

> Best thing would be a well-established vocabulary that defines terms
> like web site (for collections of documents that somehow belong together
> logically), web service (for services that are web sites, but have some
> interactive functionality on top), main page for both web service and
> web site and a "belongsTo" to express that
> our.seconddomain.tld/products/ad45ffh.htm belongs to a web site that has
> a main page of our.firstdomain.tld/

I think [1] does most of this.


[1] http://www.w3.org/TR/powder-grouping/

> Oskar
>> Best,
>> Richard
>>> I just started learning this myself, so someone please correct me
>>> if I am off base here.
>>> Ed
>>> On 8/30/07, Oskar Welzl <lists@welzl.info> wrote:
>>> On Thursday, 30 Aug 2007, at 22:28 +0200, Reto Bachmann-Gmür wrote:
>>>> But talking about standards, why is this discussion on a list which has
>>>> been replaced by semantic-web@w3.org?
>>> dumb boy hit [reply] again; changed it now.
>>> Maybe we'll have to change the topic, too, soon: This is going to be
>>> somewhat like "What's the content of an information resource"?
>>> On Thursday, 30 Aug 2007, at 22:30 +0200, Reto Bachmann-Gmür wrote:
>>>> Oskar Welzl wrote:
>>>>> Pity, though, that there hardly seems to be an agreement on how to
>>>>> handle this issue, so simply by choosing the above URI myself I will not
>>>>> prevent *others* making statements like
>>>>> <#thismail> mail:sender <http://oskar.twoday.net>
>>>>> when they refer to an update-notification they received from the weblog.
>>>> Reading this I think I misunderstood what you mean by "blog". I was
>>>> referring to a blog as a changing collection of articles, not as
>>>> something that sends email. If we agree that an information resource
>>>> can't be the mail:sender of a mail then the statement
>>>> <#thismail> mail:sender <http://oskar.twoday.net>
>>>> is necessarily wrong, as a GET request to http://oskar.twoday.net is
>>>> answered with a 2XX response, and with this response the resource is
>>>> unambiguously an information resource [1].
>>> Well, the "sending mail"-example was certainly the outer limit of
>>> nonsense I could possibly construct to get the message through, but I
>>> meanwhile think my confusion has a different cause (and it was you who
>>> pointed me to it):
>>> Let's forget for a minute that a blog is more than just a collection of
>>> posts and usually has properties like "allowsCommentsFrom",
>>> "offersFeedType", "Blogroll" etc.
>>> Assume that it *is* a mere collection of posts, sorted by date, latest
>>> first, 10 per page. Period. You type http://my.blog.tld in your browser
>>> to go there, subsequent pages can be reached with
>>> http://my.blog.tld/?start=11 etc.
>>> In one of your previous posts you wrote:
>>> "A Blog is an Information Resource which could be described as
>>> an ordered collection of posts, the HTML returned by the webserver is
>>> (or should be) a suitable representation of that thing."
>>> I didn't like this idea at first (and said so, IIRC ;) ...), but it seems
>>> logical to me now. *If* we think of a collection of posts and nothing
>>> else, it would probably fit the concept of an "Information resource".
>>> And what URI other than http://my.blog.tld would we have to name it?
>>> On the other hand, the very content of the 10-posts-list returned by the
>>> server (as what could be seen as the HTML-representation of the
>>> information resource "blog") is an information resource in its own
>>> right. It's "The 10 latest posts from my blog". No other way to refer to
>>> it than via http://my.blog.tld again. Even in this simple construct, I
>>> can make statements about http://my.blog.tld in one RDF-document that
>>> contradict each other, like (in OTN, Oskar's triple notation):
>>> http://my.blog.tld  dc:coverage a period from 2003-2007
>>> (this was about the blog)
>>> http://my.blog.tld  dc:coverage a period from July-August 2007
>>> (this is about the 1st page of the blog)
>>> Same for statements about who commented there etc. - many can be true
>>> for only one of the two information resources that are addressed by
>>> http://my.blog.tld
>>> To get around this, my original assumption was that before using a URI
>>> to name something, I should check if it's suitable by narrowing the
>>> "information resource" as much as possible: take the representation you
>>> get, take all possible interpretations of what it represents (a blog,
>>> the first 10 postings, the author himself) and always take the
>>> narrowest. What you end up with is, almost always, only a little more
>>> than "the document". I like this approach for its simplicity, but it
>>> breaks a lot. Take SIOC as an example. sioc:forum/sioc:site is exactly
>>> what we're talking about here; they always refer to it via a URI that
>>> is, in fact, "the first page of the collection". This is not wrong as
>>> such, it just creates ambiguity, which URIs should not have.
>>> (In fact it was my current work on a SIOC-export that confronted me with
>>> this boring question again after so many years.)
>>> Now I go the steep way and say that http://my.blog.tld, the blog, should
>>> not be confused with http://my.blog.tld, the most recent posts. The blog
>>> should have its own URI, as "10 most recent posts" is the narrower
>>> construct. Next question:
>>> I plan to use http://my.blog.tld/ID/names#thisblog as sioc:site and have
>>> an RDF/XML-document at ../ID/names to further define #thisblog. Now how
>>> do I point to the preferred link/bookmark/"entry point" (which is, of
>>> course, http://my.blog.tld/) with a well-known vocabulary? I was tempted
>>> to use rss:link, but am very unsure about it... (Not finding a usable
>>> hint on Google made me even more uneasy with the whole topic, as this
>>> suggests nobody on this planet ever thought of *not* using the URI of
>>> the main page as the URI for the whole site.)
>>> So you see, even though there might have been a misunderstanding about
>>> the concept of a "blog", this wasn't the cause of my problems. Even when
>>> following your 'collection of posts'='information resource' definition,
>>> I get deeper and deeper into trouble.
>>> You already got me on a better track once by pointing out the somewhat
>>> vague definition of information resource - maybe you got some new input
>>> for me to chew on ;)
>>> Thanks,
>>> Oskar
Received on Friday, 14 September 2007 17:39:20 UTC
