
Re: Hash vs Hashless URIs

From: Henry Story <henry.story@bblfish.net>
Date: Tue, 20 Nov 2012 22:58:26 +0100
Cc: public-webid <public-webid@w3.org>
Message-Id: <3E93D6B7-681B-4205-9039-3C33010C8454@bblfish.net>
To: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

On 20 Nov 2012, at 22:49, Ted Thibodeau Jr <tthibodeau@openlinksw.com> wrote:

> 
> On Nov 20, 2012, at 10:05 AM, Kingsley Idehen wrote:
> 
>> On 11/20/12 9:11 AM, Henry Story wrote:
>>> 
>>> On 20 Nov 2012, at 00:36, Stéphane Corlosquet <scorlosquet@gmail.com> wrote:
>>> 
>>>> I don't deny the fact that hash URIs have their advantages and I personally prefer them too for WebID, but I don't see the need to set that in stone wrt WebID URIs. Like I said before, who knows what new mechanism will come out of the TAG or elsewhere 2 years down the road? Mandating hash URIs means that any kind of innovation in the realm of WebID will be impossible without breaking the spec.
>>>> 
>>>> Can't we agree on the following compromise? => only use hash URIs in the non-normative examples. This leaves room for innovation down the road; in the meantime most people can follow the hash route unless they prefer some other way.
>>>> 
>>>> Does mandating "hash URIs only" provide any advantage in terms of implementing a WebID verifier? A verifier would still rely on HTTP to dereference the WebID URI, and follow any redirect if necessary. What are the advantages from a verifier standpoint? How does it make it simpler than just any kind of URI?
>>> 
>>> I think Tim Berners-Lee had a number of objections to tracking 303-redirected URIs. He said it massively slowed down the Tabulator code.
>> 
>> The issues aren't new. Now to TimBL's credit (please note what follows carefully) he pointed out the issues and implementation challenges. At no point did he imply a mandate to make this part of the WebID definition. Again, he explained his concerns backed up with specific experience with Tabulator. 
>> 
>>> It also makes the explanation in the spec more complex.
>> 
>> It doesn't. 
>> 
>> Is TimBL's Linked Data meme more complex? Is DBpedia more complex? 
>> 
>> http://dbpedia.org/resource/Linked_Data is the HTTP URI that denotes the concept 'Linked Data' . 
>> http://dbpedia.org/page/Linked_Data is the HTTP URI that denotes the document that describes the concept 'Linked Data'. 
>> 
>> You can bookmark either URI and end up with the same document comprised of content that describes the concept 'Linked Data' . 
>> 
>> 
>>> 
>>> As I see it, we all have consensus that #uris are WebIDs, and there is a lack of consensus on whether
>>> 303 non-hash URIs should also be.
>> 
>> Untrue. When did you actually poll the participants in this project? Why not run a poll right now so that a broad audience of Linked Data savvy folks can participate? 
>> 
>>> The new principle has to be: wherever we can make things simpler we do so - as long as we don't close options further down.
>> 
>> That's a contradictory statement. This isn't what you are doing right now by pushing this effort. 
>> 
>> I understand where TimBL is coming from, I understand how HTTP URI denoting real world entities got mucked up etc.. The trouble is there has been consensus within the W3C re. HTTP URIs. 
> 
> I think Kingsley meant, "The trouble is there has *not* been 
> consensus within the W3C re. HTTP URIs" here.
> 
> 
>> The cat is out of the bag and we can't use this effort to simply add more problems down the line. Hashless URIs are in broad use re. Linked Data, simply look at the LOD cloud. Do you seriously want to consider dislocating the LOD cloud from this endeavor? If you are unclear re. the last sentence it means: Facebook doesn't become a nice Linked Data source (data space) for exploitation via WebID over TLS. 
>> 
>>> By defining WebIDs to be just #uris, we don't close options further down,
> 
> Henry, do you really not see the explicit contradiction in your own sentence?
> 
> Do you really mean to say, "By [closing options down], we don't 
> close options [...] down"?

Yes. We don't close off the option of later extending the meaning of a WebID. We
would be closing options down if we did something now that made it impossible to
define WebID more broadly later. Clearly, if everybody now uses WebID
to mean the #URI form and that works, then everything they do will still
work later if WebID is extended to encompass more.
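The round-trip difference the thread keeps returning to can be sketched in a few lines of Python (a hypothetical illustration, not from any spec; the URIs are made up): for a hash WebID the defining document is computable locally by fragment stripping, while for a hashless WebID a verifier cannot know which document defines it until the server has answered, e.g. with a 303.

```python
from urllib.parse import urldefrag

def defining_document(webid):
    """Return the URI of the document that defines this WebID, or None
    when it cannot be known without a network round trip.

    Hash URI: the fragment is stripped client-side (RFC 3986), so the
    defining document is known before any HTTP request is made.
    Hashless URI: the answer depends on the server (e.g. a 303
    redirect), so it cannot be computed locally.
    """
    doc, frag = urldefrag(webid)
    return doc if frag else None

# Hash WebID: profile document known immediately, one GET suffices.
print(defining_document("https://example.org/ppl/joe#me"))
# → https://example.org/ppl/joe
# Hashless WebID: an extra GET is needed just to learn the 303 target.
print(defining_document("https://example.org/ppl/joe"))
# → None
```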

> 
> 
>>> and we leave it to other people who want to join to make their case when we go to W3C WG.
>> 
>> What a waste of time. They join the WG to start debating instead of enjoying the euphoria of compatibility via dexterous architecture. Come on now!
>> 
>>> The idea is to make it simple for people to join.
> 
> We make it simpler to join by leaving more paths open, and by
> saying "this path is easier to walk in these cases, and that
> path is easier to walk in those cases."

Does every path you leave open make things simpler to join?
Every twist you add means more work writing the spec, writing test
cases, and writing implementations.

> 
> In other words, we make it easier to join by saying "Hashed URIs 
> have these pros and cons, and may be better and/or easier to use 
> than hashless when you're starting from xyz.  Hashless URIs have 
> these pros and cons, and may be better and/or easier to use than
> hashed URIs when your situation is thus and such." 

Documents that discuss the pros and cons of hash URIs already exist.
We are defining WebID here, and so we can choose it to mean one or
the other or both.

Everybody agrees that #uris are WebIDs. Not everybody agrees
that the others should be. Consensus for the moment is therefore
that #uris are WebIDs.
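The /ppl cache-confusion hole described in the quoted text below also suggests what a careful verifier has to check. A minimal, hypothetical sketch (the function name and URIs are illustrative, not from any spec): for a hash WebID, only the document obtained by stripping the fragment may be accepted as its definition, however many cached documents happen to describe it. For a hashless WebID no such local check exists; the verifier would instead have to record and trust the actual 303 chain, which is exactly the extra state being argued about.

```python
from urllib.parse import urldefrag

def may_define(document_uri, webid):
    """Hypothetical guard: accept document_uri as defining a hash
    WebID only if it equals the WebID minus its fragment. A cached
    document that merely *describes* the WebID is not enough."""
    doc, frag = urldefrag(webid)
    return bool(frag) and document_uri == doc

# /ppl may describe </ppl/smith> and </people/smith>, but it is only
# authoritative for WebIDs whose fragment-stripped form is /ppl:
print(may_define("https://example.org/ppl",
                 "https://example.org/people/smith#me"))   # False
print(may_define("https://example.org/people/smith",
                 "https://example.org/people/smith#me"))   # True
```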

> 
> Ted
> 
> 
>> No, the idea is to stay true to the "deceptively simple" principle that makes the Web what it is. 
>> 
>>> My problem with 303 vocabs such as foaf is that you don't know in which document a URI is defined until you dereference each URI. For example, each of
>>> 
>>>   http://xmlns.com/foaf/0.1/knows
>>>   http://xmlns.com/foaf/0.1/mbox
>>>   http://xmlns.com/foaf/0.1/Person
>>>   http://xmlns.com/foaf/0.1/Agent
>>> 
>>> are all defined by 
>>> 
>>>  http://xmlns.com/foaf/spec/
>>> 
>>> but you don't know that until you have done an HTTP GET for each of those resources. 
>> 
>> You seek and find knowledge. Basically, the act of dereferencing an identifier that denotes something. Knowledge doesn't have to be presumptuous. 
>> 
>>> This is important because understanding which document is defining a term, is key to WebID-TLS authentication.
>> 
>> No, the entity relationship semantics from the retrieved profile graph is what's important. 
>> 
>>> A document that defines a term is making something close to a necessarily true assertion, as you know from mathematics.
>> 
>> I don't think that's relevant at all. The authentication protocol is about machine and human comprehensible entity relationship semantics and the logical reasoning they facilitate. 
>> 
>>> That means that even though you may already have fetched all the foaf definitions, you still have to interact with the server to find out if you have the definitions. 
>> 
>> We are using HTTP URIs. Cache invalidation is baked into HTTP. 
>> 
>>> Notice the security hole that opens up if you are not careful:
>>> 
>>> I create a WebID+ for </ppl/joe> 
>>> this redirects to 
>>> 
>>>   /ppl
>>> 
>>> which describes ( and seems to define) also 
>>> 
>>>  </ppl/smith>
>>>  </people/smith>
>>> 
>>> You can now imagine a server implementing the WebID protocol badly that, on authentication of /people/smith, thinks it already has the definition for that WebID in store...
>> 
>> Anything can be implemented badly. You know that. A spec isn't supposed to teach engineering. 
>> 
>>>  There are a lot of things that end up needing explanation suddenly which we can cut out by keeping our definition rigorously simple.
>> 
>> HTTP URI is "deceptively simple" enough; if it wasn't, then why didn't TimBL mandate the hash style of HTTP URIs when minting the Linked Data meme? 
>> 
>> Kingsley 
>>> 
>>> Henry
>>> 
>>>> 
>>>> Steph.
>>>> 
>>>> On Mon, Nov 19, 2012 at 6:16 PM, Melvin Carvalho <melvincarvalho@gmail.com> wrote:
>>>> 
>>>> 
>>>> On 19 November 2012 23:58, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>>>>> All,
>>>>> 
>>>>> To understand this old problem please read: http://www.w3.org/TR/2007/WD-cooluris-20071217/#hashuri .
>>>>> 
>>>>> Important point to note, this matter ultimately becomes a permathread whenever a spec attempts to pick one style over the other.
>>>>> 
>>>>> The solution to these kinds of problems goes back to biblical stories, such as the one illustrating the wisdom of Solomon re. splitting a disputed baby in half.
>>>>> 
>>>>> HTTP URIs are "horses for courses" compliant. It is always best to keep them that way when designing specs for HTTP-based solutions.
>>>> 
>>>> Thanks
>>>> "Conclusion.
>>>> Hash URIs should be preferred for rather small and stable sets of resources that evolve together. An ideal case are RDF Schema vocabularies and OWL ontologies, where the terms are often used together, and the number of terms is unlikely to grow much in the future.
>>>> Hash URIs without content negotiation can be implemented by simply uploading static RDF files to a Web server, without any special server configuration. This makes them popular for quick-and-dirty RDF publication.
>>>> 
>>>> 303 URIs should be used for large sets of data that are, or may grow, beyond the point where it is practical to serve all related resources in a single document.
>>>> 
>>>> If in doubt, it's better to use the more flexible 303 URI approach.
>>>> 
>>>> "
>>>> Will try and digest this a bit more. I may still be missing something but if you have a paradigm of one data item per page and call it #, like facebook do, I'm still trying to see the advantage of 303s. As pointed out, facebook is not a small data set. 
>>>> 
>>>> -- 
>>>> 
>>>> Regards,
>>>> 
>>>> Kingsley Idehen 
>>>> Founder & CEO
>>>> OpenLink Software
>>>> Company Web: http://www.openlinksw.com
>>>> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
>>>> Twitter/Identi.ca handle: @kidehen
>>>> Google+ Profile: https://plus.google.com/112399767740508618350/about
>>>> LinkedIn Profile: http://www.linkedin.com/in/kidehen
>>>> 
>>>> 
>>>> -- 
>>>> Steph.
>>> 
>>> Social Web Architect
>>> http://bblfish.net/
>>> 
>> 
>> 
>> -- 
>> 
>> Regards,
>> 
>> Kingsley Idehen	      
>> Founder & CEO 
>> OpenLink Software     
>> Company Web: 
>> http://www.openlinksw.com
>> 
>> Personal Weblog: 
>> http://www.openlinksw.com/blog/~kidehen
>> 
>> Twitter/Identi.ca handle: @kidehen
>> Google+ Profile: 
>> https://plus.google.com/112399767740508618350/about
>> 
>> LinkedIn Profile: 
>> http://www.linkedin.com/in/kidehen
>> 
>> 
>> 
>> 
>> 
>> 
> 
> --
> A: Yes.                      http://www.guckes.net/faq/attribution.html
> | Q: Are you sure?
> | | A: Because it reverses the logical flow of conversation.
> | | | Q: Why is top posting frowned upon?
> 
> Ted Thibodeau, Jr.           //               voice +1-781-273-0900 x32
> Senior Support & Evangelism  //        mailto:tthibodeau@openlinksw.com
>                             //              http://twitter.com/TallTed
> OpenLink Software, Inc.      //              http://www.openlinksw.com/
>         10 Burlington Mall Road, Suite 265, Burlington MA 01803
>     Weblog   -- http://www.openlinksw.com/blogs/
>     LinkedIn -- http://www.linkedin.com/company/openlink-software/
>     Twitter  -- http://twitter.com/OpenLink
>     Google+  -- http://plus.google.com/100570109519069333827/
>     Facebook -- http://www.facebook.com/OpenLinkSoftware
> Universal Data Access, Integration, and Management Technology Providers
> 
> 
> 
> 
> 
> 
> 

Social Web Architect
http://bblfish.net/


Received on Tuesday, 20 November 2012 21:59:00 UTC
