
Re: Hash vs Hashless URIs

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Tue, 20 Nov 2012 10:05:48 -0500
Message-ID: <50AB9C4C.9060700@openlinksw.com>
To: public-webid@w3.org
On 11/20/12 9:11 AM, Henry Story wrote:
> On 20 Nov 2012, at 00:36, Stéphane Corlosquet <scorlosquet@gmail.com 
> <mailto:scorlosquet@gmail.com>> wrote:
>> I don't deny the fact that hash URIs have their advantages and I 
>> personally prefer them too for WebID, but I don't see the need to set 
>> that in stone wrt to WebID URIs. Like I said before, who knows what 
>> new mechanism will come out of the TAG or elsewhere 2 years down the 
>> road? Mandating hash URIs means that any kind of innovation in the 
>> realm of WebID will be impossible without breaking the spec.
>> Can't we agree on the following compromise? => only use hash URIs in 
>> the non-normative examples. This leaves room for innovation down the 
>> road; in the meantime most people can follow the hash route unless 
>> they prefer some other way.
>> Does mandating "hash URIs only" provide any advantage in terms of 
>> implementing a WebID verifier? A verifier would still rely on HTTP to 
>> dereference the WebID URI, and follow any redirect if necessary. What 
>> are the advantages from a verifier standpoint? How does it make it 
>> simpler than just any kind of URI?
> I think Tim Berners-Lee had a number of objections about tracking 
> 303-redirected URIs. He said it massively slowed down the Tabulator code.

The issues aren't new. Now, to TimBL's credit (please note what follows 
carefully), he pointed out the issues and implementation challenges. At 
no point did he imply a mandate to make this part of the WebID 
definition. Again, he explained his concerns, backed up by specific 
experience with Tabulator.
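For context, the cost TimBL observed can be stated concretely: every 303 hop is an extra HTTP round trip. A toy, offline sketch (REDIRECTS stands in for real 303 responses, and the URIs are hypothetical):

```python
# Simulated 303 responses: concept URI -> document URI.
REDIRECTS = {
    "http://example.org/id/alice": "http://example.org/doc/alice",
}

def round_trips(uri: str, max_hops: int = 5) -> int:
    """Count the HTTP round trips needed to reach a document."""
    trips = 1                      # the initial GET
    while uri in REDIRECTS and trips <= max_hops:
        uri = REDIRECTS[uri]
        trips += 1                 # one more GET per 303 hop
    return trips

assert round_trips("http://example.org/id/alice") == 2   # GET + 303 follow-up
assert round_trips("http://example.org/doc/alice") == 1  # direct hit
```

A hash URI never incurs the extra hop, since the client simply strips the fragment before the (single) GET.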

> It also makes the explanation in the spec more complex.

It doesn't.

Is TimBL's Linked Data meme more complex? Is DBpedia more complex?

http://dbpedia.org/resource/Linked_Data is the HTTP URI that denotes the 
concept 'Linked Data'.
http://dbpedia.org/page/Linked_Data is the HTTP URI that denotes the 
document that describes the concept 'Linked Data'.

You can bookmark either URI and end up with the same document comprised 
of content that describes the concept 'Linked Data'.
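A toy, offline illustration of that pattern (no network; SEE_OTHER stands in for DBpedia's real 303 responses): a client that follows redirects lands on the same describing document whichever of the two URIs it starts from.

```python
# Simulated HTTP 303 See Other responses.
SEE_OTHER = {
    "http://dbpedia.org/resource/Linked_Data":
        "http://dbpedia.org/page/Linked_Data",
}

def dereference(uri: str) -> str:
    """Follow simulated 303 redirects until a document URI is reached."""
    while uri in SEE_OTHER:
        uri = SEE_OTHER[uri]
    return uri

# Both the concept URI and the document URI resolve to the same document.
assert dereference("http://dbpedia.org/resource/Linked_Data") \
       == dereference("http://dbpedia.org/page/Linked_Data")
```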

> As I see it, we all have consensus that #uris are WebIDs, and there is 
> a lack of consensus on whether
> 303 (hashless) URIs should also be.

Untrue. When did you actually poll the participants in this project? Why 
not run a poll right now so that a broad audience of Linked Data savvy 
folks can participate?

> The new principle has to be: wherever we can make things simpler we do 
> so - as long as we don't close options further down.

That's a contradictory statement. This isn't what you are doing right 
now by pushing this effort.

I understand where TimBL is coming from, and I understand how HTTP URIs 
denoting real-world entities got mucked up, etc. The trouble is there 
has been consensus within the W3C re. HTTP URIs. The cat is out of the 
bag, and we can't use this effort to simply add more problems down the 
line. Hashless URIs are in broad use re. Linked Data; simply look at the 
LOD cloud. Do you seriously want to consider dislocating the LOD cloud 
from this endeavor? If you are unclear re. the last sentence, it means: 
Facebook doesn't become a nice Linked Data source (data space) for 
exploitation via WebID over TLS.

> By defining WebIDs to be just #uris, we don't close options further 
> down, and we leave it to other people who want to join to make their 
> case when we go to W3C WG.

What a waste of time. They join the WG to start debating instead of 
enjoying the euphoria of compatibility via dexterous architecture. Come 
on now!

> The idea is to make it simple for people to join.

No, the idea is to stay true to the "deceptively simple" principle that 
makes the Web what it is.

> My problem with 303 vocabs such as foaf is that you don't know where 
> a URI is defined
> until you dereference each URI. For example, each of
> http://xmlns.com/foaf/0.1/knows
> http://xmlns.com/foaf/0.1/mbox
> http://xmlns.com/foaf/0.1/Person
> http://xmlns.com/foaf/0.1/Agent
> is defined by
> http://xmlns.com/foaf/spec/
> but you don't know that until you have done an HTTP GET for each of 
> those resources.

You seek and find knowledge. Basically, the act of dereferencing an 
identifier that denotes something. Knowledge doesn't have to be 
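For what it's worth, the discovery difference Henry describes can be sketched offline. The foaf URIs are real; TERM_303 is a stand-in for actual HTTP responses, and the example.org URI is hypothetical:

```python
from urllib.parse import urldefrag

# Simulated 303 responses for hashless vocabulary terms.
TERM_303 = {
    "http://xmlns.com/foaf/0.1/knows": "http://xmlns.com/foaf/spec/",
    "http://xmlns.com/foaf/0.1/Person": "http://xmlns.com/foaf/spec/",
}

def defining_document(term_uri: str) -> str:
    """Locate the document that defines a term."""
    doc, frag = urldefrag(term_uri)
    if frag:                       # hash URI: computable offline
        return doc
    return TERM_303[term_uri]      # hashless: one GET per term

assert defining_document("http://xmlns.com/foaf/0.1/knows") \
       == "http://xmlns.com/foaf/spec/"
assert defining_document("http://example.org/card#me") \
       == "http://example.org/card"
```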

> This is important because understanding which document is defining a 
> term, is key to
> WebID-TLS authentication.

No, the entity relationship semantics from the retrieved profile graph 
is what's important.

> A document that defines a term is making something close
> to a necessarily true assertion, as you know from mathematics.

I don't think that's relevant at all. The authentication protocol is 
about machine and human comprehensible entity relationship semantics and 
the logical reasoning they facilitate.
> That means that even though you may already have fetched all the foaf 
> definitions,
> you still have to interact with the server to find out if you have the 
> definitions.

We are using HTTP URIs. Cache invalidation is baked into HTTP.
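A sketch of that point (URL and ETag are hypothetical): HTTP already gives a verifier cache revalidation. A conditional GET with If-None-Match costs one cheap round trip, and a 304 Not Modified reply means the cached document, and every definition in it, is still current.

```python
import urllib.request

def revalidation_request(url: str, etag: str) -> urllib.request.Request:
    """Build a conditional GET for a previously cached document."""
    req = urllib.request.Request(url)
    # Server answers 304 Not Modified if the ETag still matches.
    req.add_header("If-None-Match", etag)
    return req

req = revalidation_request("https://example.org/card", '"abc123"')
```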

> Notice the security hole that opens up if you are not careful:
> I create a WebID for </ppl/joe>
> this redirects to
>    /ppl
> which describes (and seems to define) also
>   </ppl/smith>
>   </people/smith>
> You can now imagine a server implementing the WebID protocol badly that 
> on the
> authentication of /people/smith thinks it already has the definition 
> for that WebID in
> store...

Anything can be implemented badly. You know that. A spec isn't supposed 
to teach engineering.
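Still, the hole Henry describes has a simple guard, sketched below with hypothetical URIs and a toy cache: key material for a WebID must come from a dereference of that WebID itself, never from a document cached while verifying someone else.

```python
# Maps a WebID to the document URI actually obtained by dereferencing it.
cache: dict[str, str] = {}

def record_dereference(webid: str, doc_uri: str) -> None:
    """Remember which document a WebID's dereference resolved to."""
    cache[webid] = doc_uri

def may_use_cached_definition(webid: str) -> bool:
    # Wrong question: "do we hold *a* document mentioning this WebID?"
    # Right question: "did dereferencing *this* WebID resolve to a document?"
    return webid in cache

# Verifying joe fetched /ppl, which also *mentions* /people/smith...
record_dereference("https://example.org/ppl/joe", "https://example.org/ppl")

assert may_use_cached_definition("https://example.org/ppl/joe")
# ...but /people/smith was never dereferenced, so it must be fetched.
assert not may_use_cached_definition("https://example.org/people/smith")
```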
>   There are a lot of things that end up needing explanation suddenly 
> which we can
> cut out by keeping our definition rigorously simple.

HTTP URIs are "deceptively simple" enough; if they weren't, then why 
didn't TimBL mandate the hash style of HTTP URI when minting the Linked 
Data meme?

> Henry
>> Steph.
>> On Mon, Nov 19, 2012 at 6:16 PM, Melvin Carvalho 
>> <melvincarvalho@gmail.com <mailto:melvincarvalho@gmail.com>> wrote:
>>     On 19 November 2012 23:58, Kingsley Idehen
>>     <kidehen@openlinksw.com <mailto:kidehen@openlinksw.com>> wrote:
>>         All,
>>         To understand this old problem please read:
>>         http://www.w3.org/TR/2007/WD-cooluris-20071217/#hashuri .
>>         Important point to note, this matter ultimately becomes a
>>         permathread whenever a spec attempts to pick one style over
>>         the other.
>>         The solution to these kinds of problems stems back to biblical
>>         stories, such as the one illustrating the wisdom of Solomon
>>         re. splitting a disputed baby in half.
>>         HTTP URIs are "horses for courses" compliant. It is always
>>         best to keep them that way when designing specs for HTTP
>>         based solutions.
>>     Thanks
>>     "Conclusion.
>>         Hash URIs should be preferred for rather small and stable
>>         sets of resources that evolve together. An ideal case are RDF
>>         Schema vocabularies and OWL ontologies, where the terms are
>>         often used together, and the number of terms is unlikely to
>>         grow much in the future.
>>         Hash URIs without content negotiation can be implemented by
>>         simply uploading static RDF files to a Web server, without
>>         any special server configuration. This makes them popular for
>>         quick-and-dirty RDF publication.
>>         303 URIs should be used for large sets of data that are, or
>>         may grow, beyond the point where it is practical to serve all
>>         related resources in a single document.
>>         If in doubt, it's better to use the more flexible 303 URI
>>         approach.
>>     "
>>     I will try to digest this a bit more.  I may still be missing
>>     something, but if you have a paradigm of one data item per page
>>     and call it #, like Facebook does, I'm still trying to see the
>>     advantage of 303s.  As pointed out, Facebook is not a small data
>>     set.
>>         -- 
>>         Regards,
>>         Kingsley Idehen
>>         Founder & CEO
>>         OpenLink Software
>>         Company Web: http://www.openlinksw.com
>>         <http://www.openlinksw.com/>
>>         Personal Weblog: http://www.openlinksw.com/blog/~kidehen
>>         <http://www.openlinksw.com/blog/%7Ekidehen>
>>         Twitter/Identi.ca <http://Identi.ca> handle: @kidehen
>>         Google+ Profile:
>>         https://plus.google.com/112399767740508618350/about
>>         LinkedIn Profile: http://www.linkedin.com/in/kidehen
>> -- 
>> Steph.
> Social Web Architect
> http://bblfish.net/



Kingsley Idehen	
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen

Received on Tuesday, 20 November 2012 15:06:17 UTC
