Re: Address Bar URI from Michael Smethurst on 2011-10-18 (public-lod@w3.org from October 2011)

From: Michael Smethurst <michael.smethurst@bbc.co.uk>
Date: Tue, 18 Oct 2011 14:54:35 +0100
To: Hugh Glaser <hg@ecs.soton.ac.uk>
CC: Bernard Vatant <bernard.vatant@mondeca.com>, Linking Open Data <public-lod@w3.org>
Message-ID: <CAC343AB.2921A%michael.smethurst@bbc.co.uk>

On 18/10/2011 11:30, "Hugh Glaser" <hg@ecs.soton.ac.uk> wrote:

> Hi.
> On 18 Oct 2011, at 10:57, Michael Smethurst wrote:
> 
>> Hi Bernard
>> 
>> Glad to hear I'm finally making sense to someone... :-/
> I think I might be still with you ;-)
> And finding the discussion very helpful - thanks.
> 
> And I'm not disagreeing - I have lots of concerns about how we do things, as
> we have discussed in the past.
> 
>>> 1. Something known as 'foo' in the "real" (or not) world :
>>> http://example.org/thing/foo
>>> 2. A generic information resource binding the various representations of
>>> 'foo' on my server(s) : http://example.org/resource/foo
>>> 3. Representations/renderings of 'foo' in various formats (html, rdf, xml,
>>> json, ...) / languages etc : http://example.org/resource/foo.html
> 
>> 
>> What you said. Only additions would be:
>> 
>>> The first URI is used in RDF descriptions of the thing, that I get for
>>> example at http://example.org/resource/foo.rdf
>> 
>> For completeness: and / or in rdfa at http://example.org/resource/foo.html
>> :-)
>> 
>>> The second URI is not used in the RDF descriptions whatsoever. It's a webby
>>> trick enabling easy copy-paste, caching, display in address bar, whatever
>>> deal with Web conversation only interested in information resources. It's my
>>> IR proxy to 1.
>> 
>> It could be used in the rdf but not to talk about 'things'. But for something
>> like seeAlso to point to more information (which might happen to be available
>> as rdf). It's kind of a trick but it's a common trick many publishers are
>> already doing regardless of whether they publish linked data (so their links
>> can be shared)
>> 
>>> The conneg for 1 is a systematic 303 to 2, whatever the query.
>> 
>> I guess you need to check that nir is a "thing" known to your system before
>> 303ing but yes. Tho I wonder if "conneg" is the right label for that?!?
>> 
>>> The conneg for 2 indirects to the desired type of representation.
>> 
>> Yes, and the representation URblah (.html, .rdf, .json) is only exposed in
>> content location headers (unless forced by the user into the address bar).
>> Which was your indirects. But just to be clear :-)
> So can I infer from this?:
> In a world where I only have one of animals (1) and (2) (despite this possibly
> or definitely in your view being the wrong way to do it),
> I should not expose animal 3 anywhere in other than content location headers.
> Which means the only thing I can expose is, in the Linked Data 303 world,
> animal 1.

Does feel a bit wrong tho. The browser's job is about showing information, not
'things'

> So using the "NIR" in the address bar is the best of a bad job.

Or you could make generic information resource uris? It's just some code :-)
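
For illustration, a minimal Python sketch of what that code might look like.
It assumes Bernard's /thing/ and /resource/ layout from example.org; the
names KNOWN_THINGS, REPRESENTATIONS, parse_accept and respond are made up for
this sketch and are not how any real site does it. It ignores device
detection, partial wildcards like text/*, and direct requests for
/resource/foo.html.

```python
# Hypothetical layout, borrowed from the example.org URIs in this thread:
#   /thing/foo          the thing itself (non-information resource)
#   /resource/foo       the generic information resource
#   /resource/foo.rdf   one specific representation

KNOWN_THINGS = {"foo"}  # assumption: the slugs this system knows about

# Media types this sketch can serve, mapped to representation suffixes.
REPRESENTATIONS = {
    "text/html": ".html",
    "application/rdf+xml": ".rdf",
    "application/json": ".json",
}


def parse_accept(header):
    """Return the media types in an Accept header, highest q-value first."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if q > 0:
            entries.append((q, media_type))
    # The sort is stable, so equal q-values keep their header order.
    entries.sort(key=lambda entry: -entry[0])
    return [media_type for _, media_type in entries]


def respond(path, accept="*/*"):
    """Return (status, headers) for a GET on path."""
    if path.startswith("/thing/"):
        slug = path[len("/thing/"):]
        if slug not in KNOWN_THINGS:
            return 404, {}
        # Step 1: the thing can't be sent down the wires, so 303 to the
        # generic information resource, whatever the Accept header says.
        return 303, {"Location": "/resource/" + slug}

    if path.startswith("/resource/"):
        slug = path[len("/resource/"):]
        if slug not in KNOWN_THINGS:
            return 404, {}
        # Step 2: conneg on the generic IR, with no redirect. The specific
        # representation URL only shows up in Content-Location.
        for media_type in parse_accept(accept):
            if media_type == "*/*":
                media_type = "text/html"  # assumption: html is the default
            suffix = REPRESENTATIONS.get(media_type)
            if suffix is not None:
                return 200, {
                    "Content-Type": media_type,
                    "Content-Location": path + suffix,
                    "Vary": "Accept",
                }
        # Nothing acceptable on offer.
        return 406, {"Vary": "Accept"}

    return 404, {}
```

A request for /thing/foo gets the 303 once; every link and the address bar
then carry /resource/foo, which answers directly with whatever
representation suits the Accept header.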
> 
> By the way, in my world the html associated with the NIRs is not really of
> interest; I would quite happily dispense with it and just serve RDF.

Aye, but that's not really an option for us. Don't think Eastenders would
sign off on just rdf unless we could squeeze a big colourful banner in there
:-)

Cheers
Michael

 
> Developers want/need it so they can find out what is in the store when they
> are building things.
> Then the question of address bar does not arise at all (although your
> questions do still arise).
> In general, interesting things for users are not at the end of NIRs - I see
> the value of Linked Data being delivered as the results of lots of smarts over
> it being packaged up as services delivered into conventional web interactions,
> or possibly smarter web applications.
> Of course, the conventional web interactions should have their own NIRs, but
> that is another story.
> 
> (Well you did throw in the dbpedia bit at the end of yours!)
> 
> Best
> Hugh
> 
>> 
>> ===
>> 
>> A couple more thoughts to save me the trouble of writing a blog post:
>> 
>> I think (and I might be wrong) that some linked data people see conneg (in
>> the accept header sense) as being a peculiarity particular to linked data.
>> But it's no more a linked data peculiarity than HTTP
>> 
>> Because it's seen as a peculiarity it tends to get lumped in with the usual
>> linked data talking points around http-range-14 and 303s. And because it gets
>> talked about as one thing it tends to get implemented as one thing
>> 
>> But http-range-14 / nir and conneg are doing completely different jobs. The
>> first one is just about saying, "this thing i've been talking about can't be
>> sent down the wires but here's some information." And the second is about
>> sending back a representation that's appropriate to the needs of the user (as
>> specified in their accept headers). Or saying, "Sorry, I don't have / can't
>> generate a representation that suits your needs" (406). (Again, in our case,
>> with some messy device detection to cope with feature phones and smart phones
>> and twonkPads and laptops and possibly TV set top boxes). There's a real
>> separation of concerns that a lot of linked data publishers aren't
>> acknowledging. Which imo is just storing up trouble for the future
>> 
>> All of the problems mentioned in this thread could be solved with the
>> addition of a *generic* information resource URI that does the conneg
>> separately from the 303. Target the *generic* information resource in your
>> links and expose that in the address bar, keep the details of the specific
>> representation URL tucked away in content location headers and just use the
>> non-information resource as something to talk about. So you don't split the
>> URIs you expose to the web and don't bounce every request through a 303 and
>> don't need to use replaceState to replace the representation URL with
>> something more sharable
>> 
>> In the absence of a generic information resource URI you've only got two
>> choices about what ends up in the address bar: the NIR URI or the specific
>> representation URL. IMO it should be none of the above. The latter breaks
>> sharing and the former doesn't make sense
>> 
>> Also to note that the dbpedia publishing pattern is problematic for consumers
>> as well as publishers [1]. NOTE: it's not the 303 that's actually harmful
>> here; it's the lack of a *generic* information resource URI that leads to
>> being constantly and unnecessarily bounced through a 303 for every request
>> 
>> Have to say that if we had implemented linked data following the dbpedia
>> pattern and exposed a URL per serialisation / language in the address bar /
>> to the web AND made our content unshareable AND inadvertently caused a 303
>> hit for every request to bbc.co.uk... we'd probably have lost our jobs by
>> now. And I tend to consider anything that loses me my job an anti-pattern :-/
>> 
>> 
>> 
>> 
>> ps. Talking about dbpedia URIs I should probably also bring up the more
>> harmful problem. Basing dbpedia URI slugs on wikipedia URI slugs which are in
>> turn based on wikipedia page titles means URIs change every time someone
>> changes the wikipedia page title. Which is definitely *the* major problem
>> when working with dbpedia. Every time I see the LOD cloud diagram with all
>> those links pointing to dbpedia I wonder how many of those links will still
>> work today / tomorrow / etc. Is there any likelihood of dbpedia moving to /
>> supporting something more dbpedia lite [2] like with URI slugs based on
>> wikipedia row numbers (which we're told are guaranteed stable)? Probably a
>> question for another thread...
>> 
>> [1] http://nevali.net/post/11228142010/303-considered-harmful
>> [2] http://dbpedialite.org/
>> 
>> 
>> On 18/10/2011 09:51, "Bernard Vatant" <bernard.vatant@mondeca.com> wrote:
>> 
>>> Hi Michael
>>> 
>>> Let me try to write down your case as I understand it, trying to avoid
>>> Capitalized Buzzwords ;-)
>>> Seems a good idea to me, although it introduces yet another level of
>>> indirection in the picture, but maybe we need it.
>>> 
>>> We have three different types of animals to identify by URI
>>> 
>>> 1. Something known as 'foo' in the "real" (or not) world :
>>> http://example.org/thing/foo
>>> 2. A generic information resource binding the various representations of
>>> 'foo' on my server(s) : http://example.org/resource/foo
>>> 3. Representations/renderings of 'foo' in various formats (html, rdf, xml,
>>> json, ...) / languages etc : http://example.org/resource/foo.html
>>> 
>>> The first URI is used in RDF descriptions of the thing, that I get for
>>> example at http://example.org/resource/foo.rdf
>>> The second URI is not used in the RDF descriptions whatsoever. It's a webby
>>> trick enabling easy copy-paste, caching, display in address bar, whatever
>>> deal with Web conversation only interested in information resources. It's my
>>> IR proxy to 1.
>>> 
>>> The conneg for 1 is a systematic 303 to 2, whatever the query.
>>> The conneg for 2 indirects to the desired type of representation.
>>> 
>>> Using 2 in Web dialogue avoids confusion : the URI in the browser is not
>>> misleading. You've asked for an IR, here it is, and in the format you've
>>> asked. 
>>> 
>>> Do I get your point correctly?
>>> 
>>> Bernard
>>> 
>>> 2011/10/18 Michael Smethurst <Michael.Smethurst@bbc.co.uk>
>>>> Hi Richard
>>>> 
>>>> (Again top post courtesy of webmail. sorry)
>>>> 
>>>> I'm saying dbpedia is missing the concept of a *generic* information
>>>> resource URI and it's that URI that should show up in the address bar and
>>>> be used in link targets. Ignoring the linked data aspect for a moment if
>>>> you publish your data in various serialisations like:
>>>> 
>>>> - /foo.html
>>>> - /foo.xhtml-mp (mobile profile xhtml for feature (non-smart) phones)
>>>> - /foo.json
>>>> - /foo.xml
>>>> 
>>>> you want to allow people to copy and paste the address bar into email /
>>>> twitter etc and for someone clicking the resulting link to get back an
>>>> appropriate representation (depending on their accept headers + a bit of
>>>> messy device detection in the case of the html and xhtml-mp)
>>>> 
>>>> So you need a generic IR URI that does the conneg / device detection and
>>>> sends back the appropriate serialisation without a redirect. The generic IR
>>>> URI (/foo) stays in the address bar and the full location (/foo.json etc)
>>>> is only exposed in the content location header (not in the address bar)
>>>> 
>>>> All links then target the generic IR resource (not the NIR and NOT the
>>>> specific representation (.html etc))
>>>> 
>>>> So link targets are to generic ir uri and the address bar always shows the
>>>> generic ir uri. Which gives you two benefits:
>>>> - you only expose one set of uris to crawlers (google etc)
>>>> - the uri in the address bar becomes universally sharable with copy + paste
>>>> 
>>>> It's reasonable / necessary to expect publishers to take a conneg / device
>>>> detection hit for every request because you want your content shared and
>>>> the ability to send back an appropriate representation and it's all nicely
>>>> cachable (even in cdn mode) with a Vary header
>>>> 
>>>> It's not reasonable / necessary to expect publishers to take an uncachable
>>>> 303 hit for every request
>>>> 
>>>> When you start writing rdf you just need the ability to talk about
>>>> something that can't be sent down the wires. So you add in the nir uri. If
>>>> someone requests the nir then:
>>>> 
>>>> nir > 303 > *generic* ir > conneg > ir representation (url only exposed as
>>>> content location header)
>>>> 
>>>> lots of linked data seems to do the 303 and conneg as one step but they're
>>>> not happening for the same reason. the job of the conneg is to return an
>>>> appropriate representation from the ir; the job of the 303 is to say "i
>>>> can't send you that but here's some information that will hopefully be
>>>> useful". conneg is needed regardless of whether you're doing linked data
>>>> and linked data only adds in the 303 when the nir is requested. i think the
>>>> two steps tend to get conflated in linked data publishing patterns and we
>>>> should attempt to separate them
>>>> 
>>>> hth
>>>> michael
>>>> 
>>>> 
>>> 
>>  
>> 
>> http://www.bbc.co.uk
>> This e-mail (and any attachments) is confidential and may contain personal
>> views which are not the views of the BBC unless specifically stated.
>> If you have received it in error, please delete it from your system.
>> Do not use, copy or disclose the information in any way nor act in reliance
>> on it and notify the sender immediately.
>> Please note that the BBC monitors e-mails sent or received.
>> Further communication will signify your consent to this.


Received on Tuesday, 18 October 2011 13:54:49 UTC