Re: Address Bar URI from Hugh Glaser on 2011-10-19 (public-lod@w3.org from October 2011)

From: Hugh Glaser <hg@ecs.soton.ac.uk>
Date: Wed, 19 Oct 2011 23:35:42 +0000
To: Michael Smethurst <Michael.Smethurst@bbc.co.uk>
CC: Linking Open Data <public-lod@w3.org>
Message-ID: <EMEW3|0730dab79bc211a1a467725705d98293n9J0a702hg|ecs.soton.ac.uk|B680846F-18E4->
On 18 Oct 2011, at 14:49, Michael Smethurst wrote:

> 
> 
> 
> On 18/10/2011 11:30, "Hugh Glaser" <hg@ecs.soton.ac.uk> wrote:
> 
<snip>
>> So can I infer from this?:
>> In a world where I only have one of animals (1) and (2) (despite this possibly
>> or definitely in your view being the wrong way to do it),
>> I should not expose animal 3 anywhere other than in content location headers.
>> Which means the only thing I can expose is, in the Linked Data 303 world,
>> animal 1.
> 
> Personally that makes me quite queasy. It doesn't make sense to have the uri
> of a 'thing' in the address bar because the browser's job is to show
> information not things
Yes, the browser's job is to show information in the page.
So it makes me queasy too.
However, the address bar (should the browser options choose to show it) is there to give some idea of where the information came from, or of what the page I am looking at is about.
I have only recently found the following; it is not just a Linked Data issue, but it does relate to Linked Data very much:
http://www.w3.org/QA/2010/04/why_does_the_address_bar_show.html
If the "Address Bar" had been called the "Topic Bar", or whatever, or we changed its name to that, it might be harder to have this bit of the discussion.
> 
>> So using the "NIR" in the address bar is the best of a bad job.
> 
> Or you could make the generic information resource. It's just some code ;-)
I could do that, but then the users would still need to understand the NIR/IR distinction.
But I finessed that - I said you only had the choice of animal 3 or one of animal 1 or 2 :-)
>> 
>> By the way, in my world the html associated with the NIRs is not really of
>> interest; I would quite happily dispense with it and just serve RDF.
> 
> Aye, that's not really a luxury we have. Don't think Eastenders would sign
> off the RDF. Unless we could squeeze a big colourful banner in there :-)
True.
But actually the html you present as the result of resolving the Eastenders URI is seriously different from the RDF you deliver (I think).
It is much more like the result of a Semantic Web service that gives information about Eastenders, when given the URI as an argument.
Your html page must give at least the labels of actors, and at least the Director and other programmes, and yes, colourful banners.
The idea that these (the rdf and the html) are equivalent representations (IRs) of the same NIR is to me a Big Lie, at least for most sites.
Eg.
http://www.bbc.co.uk/programmes/b0074tnd.html
and
http://www.bbc.co.uk/programmes/b0074tnd.rdf
In fact it is the Big Lie of conneg, almost the elephant in the room that people don't talk about.
Conneg was meant to be about equivalent representations.
In rdf v. html they are definitely not.
Oh dear, I seem to have strayed from the topic, but that hasn't stopped others :-)

Very best
Hugh

> 
> Cheers
> michael
> 
> 
>> Developers want/need it so they can find out what is in the store when they
>> are building things.
>> Then the question of address bar does not arise at all (although your
>> questions do still arise).
>> In general, interesting things for users are not at the end of NIRs - I see
>> the value of Linked Data being delivered as the results of lots of smarts over
>> it being packaged up as services delivered into conventional web interactions,
>> or possibly smarter web applications.
>> Of course, the conventional web interactions should have their own NIRs, but
>> that is another story.
>> 
>> (Well you did throw in the dbpedia bit at the end of yours!)
>> 
>> Best
>> Hugh
>> 
>>> 
>>> ===
>>> 
>>> A couple more thoughts to save me the trouble of writing a blog post:
>>> 
>>> I think (and I might be wrong) that some linked data people see conneg (in
>>> the accept header sense) as being a peculiarity particular to linked data.
>>> But it's no more a linked data peculiarity than HTTP
>>> 
>>> Because it's seen as a peculiarity it tends to get lumped in with the usual
>>> linked data talking points around http-range-14 and 303s. And because it gets
>>> talked about as one thing it tends to get implemented as one thing
>>> 
>>> But http-range-14 / nir and conneg are doing completely different jobs. The
>>> first one is just about saying, "this thing i've been talking about can't be
>>> sent down the wires but here's some information." And the second is about
>>> sending back a representation that's appropriate to the needs of the user (as
>>> specified in their accept headers). Or saying, "Sorry, I don't have / can't
>>> generate a representation that suits your needs" (406). (Again, in our case,
>>> with some messy device detection to cope with feature phones and smart phones
>>> and twonkPads and laptops and possibly TV set top boxes). There’s a real
>>> separation of concerns that a lot of linked data publishers aren’t
>>> acknowledging. Which imo is just storing up trouble for the future
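
The conneg step described above (pick a representation from the Accept header, or answer 406) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the `AVAILABLE` list and the function names are hypothetical, and the matching is simplified (it ranks by q-value only and ignores the rule that more specific media ranges take precedence).

```python
from typing import Optional

# Hypothetical list of media types the server can produce for a resource.
AVAILABLE = ["text/html", "application/rdf+xml", "application/json"]


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_range, q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def matches(media_range: str, media_type: str) -> bool:
    """True if media_type falls inside media_range (supports */* and type/*)."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def negotiate(accept: str, available: list[str]) -> tuple[int, Optional[str]]:
    """Return (200, media_type) for the best match, or (406, None)."""
    if not accept.strip():
        accept = "*/*"  # a missing Accept header means anything is acceptable
    for media_range, q in parse_accept(accept):
        if q <= 0:
            continue  # q=0 means "not acceptable"
        for media_type in available:
            if matches(media_range, media_type):
                return 200, media_type
    return 406, None
```

Note that nothing here involves a 303 or a NIR: the same negotiation applies to any site serving more than one representation, which is the separation of concerns being argued for.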
>>> 
>>> All of the problems mentioned in this thread could be solved with the
>>> addition of a *generic* information resource URI that does the conneg
>>> separately from the 303. Target the *generic* information resource in your
>>> links and expose that in the address bar, keep the details of the specific
>>> representation URL tucked away in content location headers and just use the
>>> non-information resource as something to talk about. So you don't split the
>>> URIs you expose to the web and don't bounce every request through a 303 and
>>> don't need to use replaceState to replace the representation URL with
>>> something more sharable
>>> 
>>> In the absence of a generic information resource URI you've only got two
>>> choices about what ends up in the address bar: the NIR URI or the specific
>>> representation URL. IMO it should be none of the above. The latter breaks
>>> sharing and the former doesn’t make sense
>>> 
>>> Also to note that the dbpedia publishing pattern is problematic for consumers
>>> as well as publishers [1]. NOTE: it's not the 303 that's actually harmful
>>> here; it's the lack of a *generic* information resource URI that leads to
>>> being constantly and unnecessarily bounced through a 303 for every request
>>> 
>>> Have to say that if we had implemented linked data following the dbpedia
>>> pattern and exposed a URL per serialisation / language in the address bar /
>>> to the web AND made our content unshareable AND inadvertently caused a 303
>>> hit for every request to bbc.co.uk... we'd probably have lost our jobs by
>>> now. And I tend to consider anything that loses me my job an anti-pattern :-/
>>> 
>>> 
>>> 
>>> 
>>> ps. Talking about dbpedia URIs I should probably also bring up the more
>>> harmful problem. Basing dbpedia URI slugs on wikipedia URI slugs which are in
>>> turn based on wikipedia page titles means URIs change every time someone
>>> changes the wikipedia page title. Which is definitely *the* major problem
>>> when working with dbpedia. Every time I see the LOD cloud diagram with all
>>> those links pointing to dbpedia I wonder how many of those links will still
>>> work today / tomorrow / etc. Is there any likelihood of dbpedia moving to /
>>> supporting something more like dbpedia lite [2], with URI slugs based on
>>> wikipedia row numbers (which we're told are guaranteed stable)? Probably a
>>> question for another thread...
>>> 
>>> [1] http://nevali.net/post/11228142010/303-considered-harmful
>>> [2] http://dbpedialite.org/
>>> 
>>> 
>>> On 18/10/2011 09:51, "Bernard Vatant" <bernard.vatant@mondeca.com> wrote:
>>> 
>>>> Hi Michael
>>>> 
>>>> Let me try to write down your case as I understand it, trying to avoid
>>>> Capitalized Buzzwords ;-)
>>>> Seems a good idea to me, although it introduces yet another level of
>>>> indirection in the picture, but maybe we need it.
>>>> 
>>>> We have three different types of animals to identify by URI
>>>> 
>>>> 1. Something known as 'foo' in the "real" (or not) world :
>>>> http://example.org/thing/foo
>>>> 2. A generic information resource binding the various representations of
>>>> 'foo' on my server(s) : http://example.org/resource/foo
>>>> 3. Representations/renderings of 'foo' in various formats (html, rdf, xml,
>>>> json, ...) / languages etc : http://example.org/resource/foo.html
>>>> 
>>>> The first URI is used in RDF descriptions of the thing, that I get for
>>>> example at http://example.org/resource/foo.rdf
>>>> The second URI is not used in the RDF descriptions whatsoever. It's a webby
>>>> trick enabling easy copy-paste, caching, display in the address bar, and whatever
>>>> else deals with Web conversation interested only in information resources. It's my
>>>> IR proxy to 1.
>>>> 
>>>> The conneg for 1 is a systematic 303 to 2, whatever the query.
>>>> The conneg for 2 indirects to the desired type of representation.
>>>> 
>>>> Using 2 in Web dialogue avoids confusion : the URI in the browser is not
>>>> misleading. You've asked for an IR, here it is, and in the format you've
>>>> asked. 
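
The three kinds of URI listed above, and the rule that animal 1 always answers with a 303 to animal 2, could be written down as a small Python sketch. The `example.org` paths come from the message; the function names and the `EXTENSIONS` table are hypothetical, and the media types chosen for each extension are an assumption.

```python
BASE = "http://example.org"

# Assumed mapping from media type to the file extension used by animal 3.
EXTENSIONS = {
    "text/html": "html",
    "application/rdf+xml": "rdf",
    "application/xml": "xml",
    "application/json": "json",
}


def thing_uri(slug: str) -> str:
    """Animal 1: the thing itself (a non-information resource)."""
    return f"{BASE}/thing/{slug}"


def generic_uri(slug: str) -> str:
    """Animal 2: the generic information resource that does the conneg."""
    return f"{BASE}/resource/{slug}"


def representation_uri(slug: str, media_type: str) -> str:
    """Animal 3: one specific representation, e.g. the .html or .rdf file."""
    return f"{BASE}/resource/{slug}.{EXTENSIONS[media_type]}"


def redirect_for_thing(uri: str) -> tuple[int, str]:
    """Animal 1 answers every request with 303 See Other pointing at animal 2."""
    prefix = f"{BASE}/thing/"
    if not uri.startswith(prefix):
        raise ValueError(f"not a thing URI: {uri}")
    return 303, generic_uri(uri[len(prefix):])
```

For example, `redirect_for_thing(thing_uri("foo"))` yields a 303 to `http://example.org/resource/foo`, and only that second URI ever needs to appear in links or in the address bar.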
>>>> 
>>>> Do I get your point correctly?
>>>> 
>>>> Bernard
>>>> 
>>>> 2011/10/18 Michael Smethurst <Michael.Smethurst@bbc.co.uk>
>>>>> Hi Richard
>>>>> 
>>>>> (Again top post courtesy of webmail. sorry)
>>>>> 
>>>>> I'm saying dbpedia is missing the concept of a *generic* information
>>>>> resource URI and it's that URI that should show up in the address bar and
>>>>> be used in link targets. Ignoring the linked data aspect for a moment if
>>>>> you publish your data in various serialisations like:
>>>>> 
>>>>> - /foo.html
>>>>> - /foo.xhtml-mp (mobile profile xhtml for feature (non-smart) phones)
>>>>> - /foo.json
>>>>> - /foo.xml
>>>>> 
>>>>> you want to allow people to copy and paste the address bar into email /
>>>>> twitter etc and for someone clicking the resulting link to get back an
>>>>> appropriate representation (depending on their accept headers + a bit of
>>>>> messy device detection in the case of the html and xhtml-mp)
>>>>> 
>>>>> So you need a generic IR URI that does the conneg / device detection and
>>>>> sends back the appropriate serialisation without a redirect. The generic IR
>>>>> URI (/foo) stays in the address bar and the full location (/foo.json etc)
>>>>> is only exposed in the content location header (not in the address bar)
>>>>> 
>>>>> All links then target the generic IR resource (not the NIR and NOT the
>>>>> specific representation (.html etc))
>>>>> 
>>>>> So link targets are to generic ir uri and the address bar always shows the
>>>>> generic ir uri. Which gives you two benefits:
>>>>> - you only expose one set of uris to crawlers (google etc)
>>>>> - the uri in the address bar becomes universally sharable with copy + paste
>>>>> 
>>>>> It's reasonable / necessary to expect publishers to take a conneg / device
>>>>> detection hit for every request because you want your content shared and
>>>>> the ability to send back an appropriate representation and it's all nicely
>>>>> cachable (even in cdn mode) with Vary headers
>>>>> 
>>>>> It's not reasonable / necessary to expect publishers to take an uncachable
>>>>> 303 hit for every request
>>>>> 
>>>>> When you start writing rdf you just need the ability to talk about
>>>>> something that can't be sent down the wires. So you add in the nir uri. If
>>>>> someone requests the nir then:
>>>>> 
>>>>> nir > 303 > *generic* ir > conneg > ir representation (url only exposed as
>>>>> content location header)
>>>>> 
>>>>> lots of linked data seems to do the 303 and conneg as one step but they're
>>>>> not happening for the same reason. the job of the conneg is to return an
>>>>> appropriate representation from the ir; the job of the 303 is to say "i
>>>>> can't send you that but here's some information that will hopefully be
>>>>> useful". conneg is needed regardless of whether you're doing linked data
>>>>> and linked data only adds in the 303 when the nir is requested. i think the
>>>>> two steps tend to get conflated in linked data publishing patterns and we
>>>>> should attempt to separate them
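
A minimal Python sketch of the two separate steps, assuming hypothetical `/thing/` and `/resource/` path prefixes for the NIR and the generic IR, and a hypothetical `REPRESENTATIONS` table. It is not any publisher's actual implementation, and the Accept handling is deliberately crude (exact matches and `*/*` only, q-values ignored).

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical table: media types on offer and the suffix of each
# specific representation URL.
REPRESENTATIONS = {
    "text/html": ".html",
    "application/json": ".json",
    "application/xml": ".xml",
}


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)


def pick_media_type(accept: str) -> Optional[str]:
    """Return the first offered media type named in the Accept header."""
    for part in accept.split(","):
        wanted = part.split(";")[0].strip().lower()
        if wanted == "*/*":
            return next(iter(REPRESENTATIONS))  # fall back to the first offer
        if wanted in REPRESENTATIONS:
            return wanted
    return None


def handle(path: str, accept: str = "*/*") -> Response:
    # Step 1, linked data only: the NIR cannot be sent down the wires,
    # so it 303s to the generic IR. No conneg happens here.
    if path.startswith("/thing/"):
        slug = path[len("/thing/"):]
        return Response(303, {"Location": f"/resource/{slug}"})

    # Step 2, needed with or without linked data: the generic IR negotiates
    # and answers directly, with no redirect. The specific representation
    # URL appears only in Content-Location.
    if path.startswith("/resource/"):
        slug = path[len("/resource/"):]
        media_type = pick_media_type(accept)
        if media_type is None:
            return Response(406, {"Vary": "Accept"})
        return Response(200, {
            "Content-Type": media_type,
            "Content-Location": f"/resource/{slug}{REPRESENTATIONS[media_type]}",
            "Vary": "Accept",  # lets caches key the response on the Accept header
        })

    return Response(404)
```

A request for `/resource/foo` with `Accept: application/json` returns 200 directly, with `Content-Location: /resource/foo.json`, so the address bar keeps showing `/resource/foo`; only a request for `/thing/foo` ever passes through the 303.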
>>>>> 
>>>>> hth
>>>>> michael
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>>> http://www.bbc.co.uk
>>> This e-mail (and any attachments) is confidential and may contain personal
>>> views which are not the views of the BBC unless specifically stated.
>>> If you have received it in error, please delete it from your system.
>>> Do not use, copy or disclose the information in any way nor act in reliance
>>> on it and notify the sender immediately.
>>> Please note that the BBC monitors e-mails sent or received.
>>> Further communication will signify your consent to this.
> 
> 

-- 
Hugh Glaser,  
              Web and Internet Science
              Electronics and Computer Science,
              University of Southampton,
              Southampton SO17 1BJ
Work: +44 23 8059 3670, Fax: +44 23 8059 3045
Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
http://www.ecs.soton.ac.uk/~hg/
Received on Wednesday, 19 October 2011 23:36:34 UTC