Re: Role of URI and HTTP in Linked Data from Jiří Procházka on 2010-11-10 (public-lod@w3.org from November 2010)

From: Jiří Procházka <ojirio@gmail.com>
Date: Wed, 10 Nov 2010 22:26:01 +0100
To: nathan@webr3.org
CC: public-lod@w3.org
Message-ID: <4CDB0DE9.6090601@gmail.com>

On 11/10/2010 11:44 AM, Nathan wrote:
> Hi Jiří,
> 
> Jiří Procházka wrote:
>> Hi,
>> having read all of the past week's, and still ongoing, discussion about HTTP
>> status codes, URIs and most importantly their meaning from the Linked Data
>> perspective, I want to share my thoughts on this topic.
>>
>> I don't mean to downplay anyone's work but I think the role of URI and
>> HTTP specifications (especially semantics) in Linked Data is
>> overemphasized, which unnecessarily complicates things.
> 
> The URI is what makes Linked Data, Linked Data, it's the only hook to
> the real world, and via the domain name system + domain registration
> process gives us a hook on accountability, which is critically
> important. 

I am by no means giving up these utilities by what I suggest.

> "#bar, as described by <http://example.com/foo>" resolves in
> two ways:
> (1) <http://example.com/foo> as a name for the literal description/graph
> (2) <http://example.com/foo> as a way of saying "the author of the
> description available at <http://example.com/foo>, stated X, and was
> responsible as delegated by the owners of example.com", where X is (1)
> and provable by the HTTP messages and logs. A status code of 200 vs 303
> to some other domain or URI vs 4xx or 5xx plays a big part in that chain
> of accountability / validity / trust.

I don't think Linked Data consumers should *have to* care about which
status codes an HTTP request returns - that shouldn't be part of the core
Linked Data semantics. Of course it can be beneficial for clients to
pay attention to them to get more information, but treating the HTTP
library as a simple function should be allowed (either it returns data or
it doesn't). If someone 303s (nice verb) to a different domain, it obviously
means they trust that domain to maintain the description of their URI.
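
A minimal sketch of that "simple function" view, assuming Python's standard
urllib; the function name fetch_description, the opener parameter and the
Accept header value are illustrative assumptions, not part of any Linked Data
specification:

```python
import urllib.error
import urllib.request


def fetch_description(uri, opener=urllib.request.urlopen):
    """Return the bytes published at `uri`, or None if nothing usable came back."""
    request = urllib.request.Request(
        uri, headers={"Accept": "text/turtle, application/rdf+xml;q=0.9"}
    )
    try:
        # urlopen follows redirects (including 303) on its own, so the caller
        # never sees the intermediate status codes.
        with opener(request) as response:
            return response.read()
    except OSError:
        # URLError and HTTPError (4xx/5xx) are OSError subclasses, as are
        # most low-level network failures: all of them collapse to "no data".
        return None
```

A client that does want the status codes can still inspect them; this version
just shows that a consumer can get by without them.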

> Also never forget that Linked Data is just Links with literals, a Link
> as in a hyperlink, it's the description of a relationship between two
> things (names or literals) which make a link a link, thus each link is a
> statement, statements form descriptions, descriptions are literal
> things. Triples are statements, Graphs are descriptions.
> 
> There's a lot more to the simple triple with http URIs than many
> realise, sure it makes a nice RDF data bus for us and gives us an almost
> universal data format, which we can exploit and bring to the fore via
> linked data, but that's just the tip of the iceberg, and ultimately of
> very little use without the URI and HTTP.
> 
> a few notes..
> 
>> I think we can all agree, that the core idea of Linked Data is that
>> information is expressed using unique identifiers (URIs) I can simply
>> use to get useful information about the thing the identifier represents
>> (thus mandated relatively simple, widely supported transfer protocol
>> HTTP).
> 
> as above, that's not the core of linked data, that's the surface.
> 
>> So let's stick with this. Let's just treat URIs as RDF does - as simple
>> names. When we dereference a URI we get back some useful data and
>> that's it.
> 
> So, that'll be like mailto: or pop: or tel: then..

I don't follow here. I don't know of any standardized ways of getting
structured data out of such URIs.

>> If we want to express that the data fetched is in fact a
>> document, we use the wdrs:isDefinedBy property. The data fetched is
>> just data, and any info about it should be contained in it.
> 
> Expressing that the data fetched is in fact a document is indeed
> optional, but any response is always a message, a description, a
> /literal/ thing, you can't pretend it doesn't exist, it does - to say a
> description is anything other than that is like me saying you're an
> apple and insisting everybody believe me. Literals are self identifying,
> self naming, things.

I don't get what you mean here either. Are you talking about RDF
semantics or general ontological philosophy? If you are talking
about RDF, then be aware that literals can have names - URIs assigned to
literals. If you mean the latter, then I don't follow you at all.
I am advocating making Linked Data as simple as possible, avoiding
abstract ontological definitions (among which I count the notion of a
literal). The fact that what you say is incomprehensible to me only
strengthens my opinion.

>> Why? Why no Content-Location? There is no reason to require additional
>> complexity, building extra information layers. Publishing the document
>> information in the data itself most probably would be simpler for both
>> the publishing and the consuming party. Treating HTTP as a simple
>> blackbox is what is mostly done in practice anyway.
> 
> Read only world then?

Not really, writing can be simple too, but we probably would want to
draw the line somewhere unless we want Linked Data to require a
universal RPC framework specification.

>> What if someone doesn't publish the document data? Would it mean the URI
>> we dereferenced refers both to the thing described and the description
>> of it? Kind of.
> 
> There is no kind of. The description is a literal thing all of its own,
> it's the same thing regardless of media type or whether you write it on
> a bit of paper, it's a self identifying literal thing.
> 
>> What I mean is the consumer side can add additional
>> information to the data about the document (when and how fast it was
>> fetched etc) and if the data doesn't contain info about the document
>> already, it could add it:
>>   <uri> wdrs:isDefinedBy [ wdrs:location "uri" ] . # or something like
>> this
>> Non-RDF data should use their equivalents.
>> Those are the most important things I had to say - let's keep semantics in
>> the data.
>>
>> I believe it is quite important that the range of wdrs:isDefinedBy is a
>> document class, which should be the domain of wdrs:location.
> 
> so one location / graph / description is a document, and the other isn't!?

Take for example foaf:Agent, which you dereference and get back the
data amended with:
  foaf:Agent wdrs:isDefinedBy [ wdrs:location "http://xmlns.com/foaf/0.1/Agent" ] .

If the data already contains:
  foaf:Agent wdrs:isDefinedBy foaf: .
  foaf: wdrs:location "http://xmlns.com/foaf/0.1/" .
great, we get better info.
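
The consumer-side amendment could be sketched roughly as follows, assuming
triples are plain Python tuples rather than objects from a real RDF library.
The helper name amend_with_document_info and the blank node label "_:doc" are
made up for illustration, and wdrs:isDefinedBy / wdrs:location are the
tentative property names used above, not established vocabulary terms:

```python
# Namespace commonly bound to the wdrs: prefix; the two property names
# below are the tentative ones from this message (assumptions).
WDRS = "http://www.w3.org/2007/05/powder-s#"
IS_DEFINED_BY = WDRS + "isDefinedBy"
LOCATION = WDRS + "location"


def amend_with_document_info(triples, uri, fetched_from):
    """Add document info for `uri` unless the fetched data already has it."""
    graph = set(triples)
    already_described = any(
        s == uri and p == IS_DEFINED_BY for s, p, _ in graph
    )
    if already_described:
        # The publisher supplied (presumably better) document info: keep it.
        return graph
    doc = "_:doc"  # hypothetical blank node standing for the fetched document
    graph.add((uri, IS_DEFINED_BY, doc))
    graph.add((doc, LOCATION, fetched_from))
    return graph
```

For the foaf:Agent case, calling it with an empty set of triples adds the two
statements; calling it on data that already says which document defines
foaf:Agent leaves the data untouched.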

Best,
Jiri

>> I am going to explain why I think so, but beware, at this point I get a
>> bit philosophical :)
>>
>> What is pretty awesome about RDF, which is something Linked Data could
>> learn, is how it dabbled the ontological (used as philosophical term)
>> issues - existence, being and reality. In order to support maximum
>> expressiveness and compatibility with various world-views it says the
>> least about it. Big part of that is dealing with identity - if a
>> caterpillar turns into butterfly, is it still the same thing? Am I still
>> I when I get older and change? RDF doesn't offer any answers to such
>> questions, neither if there are only information resources and other
>> resources. There are just names which identify objects or concepts,
>> which we describe with names and the final description matches some
>> number of objects or concepts we know, while the better the description
>> is, the lower the number is.
>>
>> RDFS classes are used to describe various aspects of objects or
>> concepts, which allow us to express ourselves much less ambiguously,
>> using properties with defined domain and range. On the other hand we can
>> describe those aspects separately if we consider them a separate entity.
>> For example someone can say I am averagely skilled as an English
>> speaker, or that my English skill is mediocre, or that I am one of
>> averagely skilled English speakers. Similarly one could say <book> is
>> 30000 characters long as its content, or that <book> is 20
>> characters long as its title, or that <book> is 3000 characters long as the
>> description received on dereferencing. It shouldn't matter if I consider
>> a book name as part of it or not, if I use as unambiguously defined
>> properties as possible. However vocabularies with not very well defined
>> terms (consider an example "length" property), which generally mimic
>> natural language properties, are used widely, which is why we should
>> have wdrs:isDefinedBy.
>> The point of this philosophical exercise was to say that we shouldn't be
>> saying "a URI represents one resource" or trying to define what
>> resources are or what existence is, but should recognize the context of the
>> original information when modifying it (especially amending).
> 
> indeed, we should just realise that all we can do is describe things by
>  making statements about them, and then provide a way to say how one
> described thing relates to another.
> 
> Best,
> 
> Nathan
>
Received on Wednesday, 10 November 2010 21:26:41 UTC