RE: Address Bar URI from Michael Smethurst on 2011-10-18 (public-lod@w3.org from October 2011)

From: Michael Smethurst <Michael.Smethurst@bbc.co.uk>
Date: Tue, 18 Oct 2011 06:53:58 +0100
To: "Richard Cyganiak" <richard@cyganiak.de>
Cc: "Kingsley Idehen" <kidehen@openlinksw.com>, <public-lod@w3.org>
Message-ID: <7A44633A0AA27A4A98B94B10BDF0AC3554C444@bbcxues27.national.core.bbc.co.uk>
Hi Richard

(Again top post courtesy of webmail. sorry)

I'm saying DBpedia is missing the concept of a *generic* information resource (IR) URI, and it's that URI that should show up in the address bar and be used in link targets. Ignoring the linked data aspect for a moment, if you publish your data in various serialisations like:

- /foo.html
- /foo.xhtml-mp (mobile profile xhtml for feature (non-smart) phones)
- /foo.json
- /foo.xml

you want to allow people to copy and paste the address bar URI into email / twitter etc, and for someone clicking the resulting link to get back an appropriate representation (depending on their Accept headers, plus a bit of messy device detection in the case of the html and xhtml-mp).

So you need a generic IR URI that does the conneg / device detection and sends back the appropriate serialisation without a redirect. The generic IR URI (/foo) stays in the address bar and the full location (/foo.json etc) is only exposed in the Content-Location header (not in the address bar).
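A minimal sketch of that behaviour, assuming a made-up table of representations and a deliberately simplified Accept parser (device detection is left out, and none of the names come from a real site). The point is only that the generic IR URI answers with a 200 and names the specific representation in Content-Location rather than redirecting to it:

```python
# Hypothetical media-type-to-extension table for one resource; not from any real site.
REPRESENTATIONS = {
    "text/html": ".html",
    "application/json": ".json",
    "application/xml": ".xml",
}
DEFAULT_TYPE = "text/html"


def parse_accept(header):
    """Return (media_type, q) pairs, highest q first (simplified Accept parsing)."""
    prefs = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep their order in the header
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def negotiate(generic_uri, accept_header):
    """Answer a request for the generic IR URI directly: 200 on a match, never a redirect."""
    for media_type, q in parse_accept(accept_header or "*/*"):
        if q <= 0:
            continue
        if media_type == "*/*":
            media_type = DEFAULT_TYPE
        if media_type in REPRESENTATIONS:
            headers = {
                "Content-Type": media_type,
                # The specific representation URI is only exposed here.
                "Content-Location": generic_uri + REPRESENTATIONS[media_type],
                "Vary": "Accept",
            }
            return 200, headers
    return 406, {"Vary": "Accept"}
```

A request for /foo with "Accept: application/json" comes back as a 200 with "Content-Location: /foo.json" and no Location header, so the address bar keeps showing /foo.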

All links then target the generic IR, not the NIR (non-information resource) and NOT the specific representation (.html etc).

So links target the generic IR URI and the address bar always shows the generic IR URI. Which gives you two benefits:
- you only expose one set of uris to crawlers (google etc)
- the uri in the address bar becomes universally sharable with copy + paste

It's reasonable / necessary to expect publishers to take a conneg / device detection hit for every request, because you want your content shared and you want the ability to send back an appropriate representation, and it's all nicely cacheable (even in CDN mode) with Vary.
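To illustrate why negotiated responses stay cacheable, here is a rough sketch of how a shared cache could key its entries on the request headers named in the Vary response header. The function name and key layout are assumptions for illustration; real caches normalise header values more carefully:

```python
def cache_key(uri, request_headers, vary):
    """Build a cache key from the URI plus the request headers listed in Vary."""
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    lowered = {key.lower(): value for key, value in request_headers.items()}
    # One (header, value) pair per varied header, in a stable order.
    return (uri,) + tuple((name, lowered.get(name, "")) for name in sorted(names))


# Two clients asking for the same generic URI with different Accept headers
# get separate cache entries, so each can be served its own representation.
html_key = cache_key("/foo", {"Accept": "text/html"}, "Accept")
json_key = cache_key("/foo", {"Accept": "application/json"}, "Accept")
```

Both entries live under the one URI /foo, so the cache (or CDN) never needs to see /foo.html or /foo.json as request targets.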

It's not reasonable / necessary to expect publishers to take an uncacheable 303 hit for every request.

When you start writing RDF you just need the ability to talk about something that can't be sent down the wires. So you add in the NIR URI. If someone requests the NIR then:

NIR > 303 > *generic* IR > conneg > IR representation (URL only exposed in the Content-Location header)

lots of linked data seems to do the 303 and conneg as one step but they're not happening for the same reason. the job of the conneg is to return an appropriate representation from the ir; the job of the 303 is to say "i can't send you that but here's some information that will hopefully be useful". conneg is needed regardless of whether you're doing linked data and linked data only adds in the 303 when the nir is requested. i think the two steps tend to get conflated in linked data publishing patterns and we should attempt to separate them
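The two steps can be sketched separately like this. The URIs, the NIR-to-IR table and the handler below are all hypothetical, and the conneg is reduced to an exact match on the Accept value to keep the example short:

```python
# Hypothetical NIR URI mapped to the generic IR that describes it.
NIR_TO_GENERIC_IR = {"/things/foo": "/foo"}
EXTENSIONS = {"text/html": ".html", "text/turtle": ".ttl"}


def handle(path, accept="text/html"):
    """Step 1 (303) only for the NIR; step 2 (conneg) only for the generic IR."""
    if path in NIR_TO_GENERIC_IR:
        # "I can't send you that, but here's something useful": no conneg here.
        return 303, {"Location": NIR_TO_GENERIC_IR[path]}
    if path in NIR_TO_GENERIC_IR.values():
        media_type = accept if accept in EXTENSIONS else "text/html"
        return 200, {
            "Content-Type": media_type,
            "Content-Location": path + EXTENSIONS[media_type],
            "Vary": "Accept",
        }
    return 404, {}


def fetch(path, accept):
    """Follow 303s the way a client would, recording each hop."""
    hops = []
    while True:
        status, headers = handle(path, accept)
        hops.append((status, path))
        if status != 303:
            return hops, headers
        path = headers["Location"]
```

Requesting the NIR costs one 303 plus one 200; requesting the generic IR directly, which is what ordinary links would do, costs a single 200 and never touches the redirect.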

hth
michael


-----Original Message-----
From: Richard Cyganiak [mailto:richard@cyganiak.de]
Sent: Mon 10/17/2011 7:58 PM
Cc: Kingsley Idehen; public-lod@w3.org
Subject: Re: Address Bar URI
 
To: Michael Smethurst <Michael.Smethurst@bbc.co.uk>

Hi Michael,

I take it your proposed design is:

- .html representations only link to other .html representations
- entity identifiers 303-redirect to .html/.ttl based on conneg

You say this is better because it avoids the extra 303 roundtrip that you'd get by pointing the HTML links straight at the entity identifier. Right?

What would you do in the .rdf representations? If you have a triple that links Alice to Bob using some property, would that triple connect the entity identifiers? (I take it that the answer has to be yes, since the triple connects Alice and Bob, and not the two RDF documents that describe them.)

I think a reason why so much LOD data uses the entity URIs (rather than redirect-avoiding .html URIs) in their .html representations is for symmetry. The RDF triples in the .rdf representations connect entity URIs. So it's kinda natural to have the HTML links in the .html representations do the same.
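The asymmetry can be made concrete with a small sketch. Everything here is made up for illustration (the example.org URIs, the "knows" property, the helper names): the triple always connects the entity URIs, while the HTML link can point either at the entity (symmetric, but costs a 303) or straight at a document URI (no redirect):

```python
# Hypothetical entity URIs and the documents that describe them.
ALICE = "http://example.org/id/alice"
BOB = "http://example.org/id/bob"
KNOWS = "http://example.org/vocab/knows"  # made-up property
DOCUMENT_FOR = {
    ALICE: "http://example.org/doc/alice",
    BOB: "http://example.org/doc/bob",
}


def ntriple(subject, predicate, obj):
    """Serialise one triple whose terms are all URIs, N-Triples style."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def html_link(entity, label, redirect_free=True):
    """Link to the describing document, or to the entity URI itself."""
    href = DOCUMENT_FOR[entity] if redirect_free else entity
    return f'<a href="{href}">{label}</a>'


triple = ntriple(ALICE, KNOWS, BOB)            # connects the two entities
direct = html_link(BOB, "Bob")                 # document URI, no 303
symmetric = html_link(BOB, "Bob", redirect_free=False)  # entity URI, 303 on click
```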

Now whether symmetry is a good reason for taking the 303 redirect hit, is another question.

Best,
Richard



On 17 Oct 2011, at 18:51, Michael Smethurst wrote:
> for the record the correct answers were:
> 
> - because the conneg is broke
> - no
> - yes
> - because the links are broke
> - you wouldn't, the links are broke
> - yes
> - good lord, no, the links are broke
> 
> > - why do the URblahs that end up in the address bar for dbpedia
> > contain /page/ or /data/?
> > - do you think the address bar should ever show .html when browsing
> > myexperiment.org?
> > - isn't the question of whether i want / get html or data dependent on
> > what I accept and not on the URblah I request?
> > - why do dbpedia links target /resource/?
> > - why would you ever use a href to point to something you can't GET?
> > - isn't that what 'about' is for?
> > - do you expect any sane publisher to take a 303 hit for every request?


Received on Tuesday, 18 October 2011 05:57:16 UTC