Re: Address Bar URI from Kingsley Idehen on 2011-10-18 (public-lod@w3.org from October 2011)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Tue, 18 Oct 2011 07:28:05 -0400
To: public-lod@w3.org
Message-ID: <4E9D62C5.6040500@openlinksw.com>
On 10/18/11 1:53 AM, Michael Smethurst wrote:
>
> Hi Richard
>
> (Again top post courtesy of webmail. sorry)
>
> I'm saying dbpedia is missing the concept of a *generic* information 
> resource URI and it's that URI that should show up in the address bar 
> and be used in link targets. Ignoring the linked data aspect for a 
> moment if you publish your data in various serialisations like:
>
> - /foo.html
> - /foo.xhtml-mp (mobile profile xhtml for feature (non-smart) phones)
> - /foo.json
> - /foo.xml
>

As I said, it depends on your context lenses re. DBpedia.

If you are looking via the Information Space dimension, where Name and 
Address disambiguation doesn't matter, then simply stick to: 
http://dbpedia.org/page/Linked_Data. The resource at that address is 
HTML+RDFa based, it leverages Web Linking patterns, and it also puts 
<link/> to use for discovery of alternative addresses that return 
alternative representations.

Put http://dbpedia.org/page/Linked_Data in your browser and patterns for 
URL cut and paste or bookmarking remain intact, since no indirection 
manifests in the address bar.
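
As a rough sketch of what <link/> based discovery looks like from the 
user agent side, the snippet below pulls the alternative addresses out of 
a page. The markup in SAMPLE_HTML is made up for illustration (it is not 
copied from DBpedia), and the names AlternateLinkParser and 
find_alternates are hypothetical.

```python
from html.parser import HTMLParser


class AlternateLinkParser(HTMLParser):
    """Collects (href, type) pairs from <link rel="alternate"> elements."""

    def __init__(self):
        super().__init__()
        self.alternates = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags also arrive here by default.
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "alternate" in rels and attributes.get("href"):
            self.alternates.append((attributes["href"], attributes.get("type")))


# Illustrative markup only; the hrefs and types are assumptions.
SAMPLE_HTML = """
<html><head>
<title>About: Linked Data</title>
<link rel="alternate" type="application/rdf+xml" href="http://dbpedia.org/data/Linked_Data.rdf" />
<link rel="alternate" type="text/n3" href="http://dbpedia.org/data/Linked_Data.n3" />
<link rel="stylesheet" href="/statics/style.css" />
</head><body></body></html>
"""


def find_alternates(html):
    """Return the alternative addresses advertised in the page head."""
    parser = AlternateLinkParser()
    parser.feed(html)
    parser.close()
    return parser.alternates


if __name__ == "__main__":
    for href, media_type in find_alternates(SAMPLE_HTML):
        print(media_type, href)
```

The address bar keeps showing the /page/ URL throughout; the alternative 
addresses are only followed by agents that go looking for them.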

> you want to allow people to copy and paste the address bar into email 
> / twitter etc and for someone clicking the resulting link to get back 
> an appropriate representation (depending on their accept headers + a 
> bit of messy device detection in the case of the html and xhtml-mp)
>
> So you need a generic IR URI that does the conneg / device detection 
> and sends back the appropriate serialisation without a redirect. The 
> generic IR URI (/foo) stays in the address bar and the full location 
> (/foo.json etc) is only exposed in the content location header (not in 
> the address bar)
>
> All links then target the generic IR resource (not the NIR and NOT the 
> specific representation (.html etc))
>

Why do you need a generic resource URL when the publisher handles all 
the heuristics for discovering alternatives, subject to the desires of 
the user agent? Remember, DBpedia URIs are slash terminated. Of course, 
if using hash terminated URIs there is a benefit to the generic resource 
URL pattern you describe.
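
For reference, the generic URL pattern from the quoted text (conneg at 
/foo, no redirect, specific address only in Content-Location) could be 
sketched roughly as below. This is a simplified illustration, not 
anyone's production setup: the VARIANTS table and function names are 
hypothetical, Accept parsing ignores media range specificity, and the 
device detection for xhtml-mp is left out.

```python
# Hypothetical variants of the generic resource /foo, keyed by media type.
VARIANTS = {
    "text/html": "/foo.html",
    "application/json": "/foo.json",
    "application/xml": "/foo.xml",
}


def parse_accept(header):
    """Return the media ranges of an Accept header, highest q first."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0
        for field in fields[1:]:
            if field.lower().startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        ranges.append((q, media))
    # list.sort is stable, so ranges with equal q keep header order.
    ranges.sort(key=lambda item: -item[0])
    return [media for q, media in ranges if q > 0]


def negotiate(accept):
    """Pick a media type from VARIANTS, or None if nothing is acceptable."""
    for media in parse_accept(accept):
        if media in VARIANTS:
            return media
        if media == "*/*":
            return "text/html"  # assumed default representation
        if media.endswith("/*"):
            prefix = media[:-1]  # e.g. "text/"
            for candidate in VARIANTS:
                if candidate.startswith(prefix):
                    return candidate
    return None


def respond(path, accept):
    """Return (status, headers) for a GET on the generic URL, no redirect."""
    if path != "/foo":
        return 404, {}
    media = negotiate(accept or "*/*")
    if media is None:
        return 406, {"Vary": "Accept"}
    return 200, {
        "Content-Type": media,
        "Content-Location": VARIANTS[media],  # specific address, header only
        "Vary": "Accept",  # lets caches key on the negotiated header
    }


if __name__ == "__main__":
    print(respond("/foo", "application/json"))
    print(respond("/foo", "text/html;q=0.5, application/json"))
```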
>
>
> So link targets are to generic ir uri and the address bar always shows 
> the generic ir uri. Which gives you two benefits:
> - you only expose one set of uris to crawlers (google etc)
> - the uri in the address bar becomes universally sharable with copy + 
> paste
>
> It's reasonable / necessary to expect publishers to take a conneg / 
> device detection hit for every request because you want your content 
> shared and the ability to send back an appropriate representation and 
> it's all nicely cachable (even in cdn mode) with varies
>
> It's not reasonable / necessary to expect publishers to take an 
> uncachable 303 hit for every request
>

If a publisher chooses to use a slash terminated URI as an Object ID 
(Name) then said publisher has two choices:

1. 303 as indirection mechanism that deals with Name and Address 
disambiguation

2. Assume that user agents grok introspection i.e., they can live with a 
200 and then leverage the self-describing nature of a Data Object (e.g. 
an RDF resource) en route to Name and Address disambiguation.

Every choice comes with its own set of consequences.
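
To make the two choices concrete, here is a minimal sketch of a server 
answering a GET on a slash terminated Name. Everything in it is an 
assumption for illustration: the example.org host, the /resource/ and 
/page/ layout, the function name, and the use of wdrs:describedby as the 
self-describing statement in the second branch (one possible reading of 
"introspection", not the only one).

```python
# Hypothetical URI layout, loosely modelled on the /resource/ and /page/
# split discussed in this thread.
BASE = "http://example.org"
NAME_PREFIX = "/resource/"
PAGE_PREFIX = "/page/"


def respond_to_name(path, strategy):
    """Return (status, headers, body) for a GET on a Name URI.

    strategy "303": indirection, the server redirects to the Address.
    strategy "200": the server answers directly and relies on the data
    itself to tell the agent which URI is the Name and which the Address.
    """
    if not path.startswith(NAME_PREFIX):
        return 404, {}, ""
    local = path[len(NAME_PREFIX):]

    if strategy == "303":
        return 303, {"Location": PAGE_PREFIX + local}, ""

    if strategy == "200":
        # Assumed self-description: a Turtle statement relating the Name
        # to the Address of the document that describes it.
        body = (
            "@prefix wdrs: <http://www.w3.org/2007/05/powder-s#> .\n"
            f"<{BASE}{NAME_PREFIX}{local}> "
            f"wdrs:describedby <{BASE}{PAGE_PREFIX}{local}> .\n"
        )
        return 200, {"Content-Type": "text/turtle"}, body

    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    print(respond_to_name("/resource/Linked_Data", "303"))
    print(respond_to_name("/resource/Linked_Data", "200"))
```

The first branch costs an extra round trip per request; the second saves 
it but only works for agents that actually read the data they get back.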
>
>
> When you start writing rdf you just need the ability to talk about 
> something that can't be sent down the wires. So you add in the nir 
> uri. If someone requests the nir then:
>
> nir > 303 > *generic* ir > conneg > ir representation (url only 
> exposed as location header)
>
> lots of linked data seems to do the 303 and conneg as one step but 
> they're not happening for the same reason. the job of the conneg is to 
> return an appropriate representation from the ir; the job of the 303 
> is to say "i can't send you that but here's some information that will 
> hopefully be useful". conneg is needed regardless of whether you're 
> doing linked data and linked data only adds in the 303 when the nir is 
> requested. i think the two steps tend to get conflated in linked data 
> publishing patterns and we should attempt to separate them
>

I think conflation has arisen from warped narratives. There is nothing 
confusing (when you look outside Semantic Web and Linked Data 
literature) about data access by reference whereby the following items 
are distinct:

1. Object ID -- your choice of identifier style (re. Name) affects data 
access mechanics e.g. if slash then 303 comes into play.
2. Object Address -- ditto; in your case, if you choose to have a 
generic address that's slash terminated then you can 303 here too.
3. Object Representation -- target of the de-reference irrespective of 
levels of indirection above.
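
Seen from the user agent, the three items can be kept apart with 
something like the sketch below. It simulates the network with an 
in-memory table, so the URIs, the RESPONSES table and the dereference 
function are all hypothetical.

```python
# Simulated server behaviour: URI -> (status, Location, body).
RESPONSES = {
    "http://example.org/resource/Linked_Data":
        (303, "http://example.org/page/Linked_Data", None),
    "http://example.org/page/Linked_Data":
        (200, None, "<html>...</html>"),
}


def dereference(name, max_hops=5):
    """Follow 303 indirection from a Name to a Representation.

    Returns a dict that keeps the Object ID (name), the Object Address
    that finally answered, and the Object Representation it returned.
    """
    address = name
    for _ in range(max_hops):
        if address not in RESPONSES:
            raise LookupError(f"no response configured for {address}")
        status, location, body = RESPONSES[address]
        if status == 303:
            address = location  # one level of indirection
            continue
        if status == 200:
            return {"name": name, "address": address, "representation": body}
        raise RuntimeError(f"unexpected status {status} for {address}")
    raise RuntimeError("too many levels of indirection")


if __name__ == "__main__":
    print(dereference("http://example.org/resource/Linked_Data"))
```

Starting from the /page/ URL instead, name and address come back 
identical, which is the Information Space view where the distinction 
doesn't matter.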


Again, re. DBpedia, if you put http://dbpedia.org/page/Linked_Data in 
your address bar, everything is fine i.e., the Linked Data heuristics 
don't manifest as anti-patterns.

There are many dimensions to the Web, that's the nature of its 
architecture.

Kingsley
>
>
> hth
> michael
>
>
> -----Original Message-----
> From: Richard Cyganiak [mailto:richard@cyganiak.de]
> Sent: Mon 10/17/2011 7:58 PM
> Cc: Kingsley Idehen; public-lod@w3.org
> Subject: Re: Address Bar URI
>
> To: Michael Smethurst <Michael.Smethurst@bbc.co.uk>
>
> Hi Michael,
>
> I take it your proposed design is:
>
> - .html representations only link to other .html representations
> - entity identifiers 303-redirect to .html/.ttl based on conneg
>
> You say this is better because it avoids the extra 303 roundtrip that 
> you'd get by pointing the HTML links straight at the entity 
> identifier. Right?
>
> What would you do in the .rdf representations? If you have a triple 
> that links Alice to Bob using some property, would that triple connect 
> the entity identifiers? (I take it that the answer has to be yes, 
> since the triple connects Alice and Bob, and not the two RDF documents 
> that describe them.)
>
> I think a reason why so much LOD data uses the entity URIs (rather 
> than redirect-avoiding .html URIs) in their .html representations is 
> for symmetry. The RDF triples in the .rdf representations connect 
> entity URIs. So it's kinda natural to have the HTML links in the .html 
> representations do the same.
>
> Now whether symmetry is a good reason for taking the 303 redirect hit, 
> is another question.
>
> Best,
> Richard
>
>
>
> On 17 Oct 2011, at 18:51, Michael Smethurst wrote:
> > for the record the correct answers were:
> >
> > - because the conneg is broke
> > - no
> > - yes
> > - because the links are broke
> > - you wouldn't, the links are broke
> > - yes
> > - good lord, no, the links are broke
> >
> > > - why do the URblahs that end up in the address bar for dbpedia
> > > contain /page/ or /data/?
> > > - do you think the address bar should ever show .html when browsing
> > > myexperiment.org?
> > > - isn't the question of whether i want / get html or data dependent on
> > > what I accept and not on the URblah I request?
> > > - why do dbpedia links target /resource/?
> > > - why would you ever use a href to point to something you can't GET?
> > > - isn't that what 'about' is for?
> > > - do you expect any sane publisher to take a 303 hit for every
> > > request?
>
>


-- 

Regards,

Kingsley Idehen	
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Tuesday, 18 October 2011 11:28:43 UTC