Re: Address Bar URI from Michael Smethurst on 2011-10-20 (public-lod@w3.org from October 2011)

From: Michael Smethurst <michael.smethurst@bbc.co.uk>
Date: Thu, 20 Oct 2011 11:29:52 +0100
To: Hugh Glaser <hg@ecs.soton.ac.uk>
CC: Linking Open Data <public-lod@w3.org>
Message-ID: <CAC5B6B0.2949F%michael.smethurst@bbc.co.uk>
On 20/10/2011 00:35, "Hugh Glaser" <hg@ecs.soton.ac.uk> wrote:

> 
> On 18 Oct 2011, at 14:49, Michael Smethurst wrote:
> 
>> 
>> 
>> 
>> On 18/10/2011 11:30, "Hugh Glaser" <hg@ecs.soton.ac.uk> wrote:
>> 
> <snip>
>>> So can I infer from this?:
>>> In a world where I only have one of animals (1) and (2) (despite this
>>> possibly or definitely in your view being the wrong way to do it),
>>> I should not expose animal 3 anywhere other than in content location
>>> headers.
>>> Which means the only thing I can expose is, in the Linked Data 303 world,
>>> animal 1.
>> 
>> Personally that makes me quite queasy. It doesn't make sense to have the uri
>> of a 'thing' in the address bar because the browser's job is to show
>> information not things
> Yes, the browser's job is to show information in the page.
> So it makes me queasy too.
> However, the address bar (should the browser options choose to show it) is to
> give some idea of where the information came from or what I am looking at (the
> page) is about.

Which are very different things? At least if httprange-14 is true

> As I have only recently found it, this is not just a Linked Data issue, but it
> does relate to Linked Data very much:
> http://www.w3.org/QA/2010/04/why_does_the_address_bar_show.html
> If the "Address Bar" had been called the "Topic Bar", or whatever, or we
> changed its name to that, it might be harder to have this bit of the
> discussion.

Isn't that a chocolate bar :-)
>> 
>>> So using the "NIR" in the address bar is the best of a bad job.
>> 
>> Or you could make the generic information resource. It's just some code ;-)
> I could do that, but then the users would still need to understand the NIR/IR
> distinction.

Yup, or at least some set of users would need to understand it...

> But I finessed that - I said you only had the choice of animal 3 or one of
> animal 1 or 2 :-)

Which is cheating :-)

>>> 
>>> By the way, in my world the html associated with the NIRs is not really of
>>> interest; I would quite happily dispense with it and just serve RDF.
>> 
>> Aye, that's not really a luxury we have. Don't think Eastenders would sign
>> off the RDF. Unless we could squeeze a big colourful banner in there :-)
> True.
> But actually the html you present as the result of resolving the Eastenders
> URI is seriously different from the RDF you deliver (I think).

Yes. And no

I can only really speak for bbc.co.uk/programmes cos she's my baby :-)

But the new /programmes (being released in dribs and drabs) is built on top
of the data views from the old /programmes. So it's pretty much an exercise in
dogfooding

The data views (rdf-xml, xml, json, yaml) should be pretty much one for one.
If they're not that's probably just a bug rather than a design difference.
HTML views should also be one for one with data views except:

1. there'll always be ephemeral stuff around radio / tv that just isn't
worth modelling because it's too unstructured and will be gone next week. (A
radio competition to "rate my pumpkin" springs to mind). So we do inject
some additional stuff into the html, colloquially known as "random crap". But
like I say... ephemeral

2. we try to make <definition of done> include all representations (for all
platforms / data views). This gets a little easier as more people care more
about mobile etc. But sometimes <definition of done> slips for expedience /
to ship something to a broadcast deadline. So there are some views on
/programmes that are just data with no html. And some views that are html
with no data representation. Although many fewer of those these days

3. for "user experience" reasons you tend to transclude some subsidiary
resources into their "parent" resources (sometimes inline, sometimes as
ajax). So a programme page might be made up of a (limited) list of upcoming
programmes (../broadcasts/upcoming) and a (limited) list of vod available
programmes (../episodes/player) etc. But all the transcluded views are
available as data. So there may be some differences between html rdfa and
rdf but they're mainly as a result of transclusion. It would be good if
there was some way to indicate in html / rdfa that this branch of the html
was a (item limited) transclusion of that resource over there. Which might
already exist and I should just read the manual :-)

4. As Dave Reynolds pointed out, the licensing. I'm not going to go there
but it does bring up "open" rdfa in a copyrighted page....
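
On point 3, a rough sketch of what "this branch is an (item limited)
transclusion of that resource over there" might look like as data. This is
not how /programmes does it; the Transclusion class, its field names and the
example episode ids are all made up for illustration. Only the two relative
paths come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Transclusion:
    """A limited slice of a subsidiary resource embedded in a parent view."""
    source: str  # relative URI of the resource the items were taken from
    limit: int  # maximum number of items shown inline
    items: list = field(default_factory=list)
    truncated: bool = False  # True when the source holds more than `limit`


def transclude(source: str, all_items: list, limit: int) -> Transclusion:
    """Embed at most `limit` items and record where they came from."""
    return Transclusion(
        source=source,
        limit=limit,
        items=list(all_items[:limit]),
        truncated=len(all_items) > limit,
    )


def build_programme_view(title, upcoming, available, limit=3) -> dict:
    """Assemble a parent view from the programme plus limited child lists."""
    return {
        "title": title,
        "transclusions": [
            transclude("../broadcasts/upcoming", upcoming, limit),
            transclude("../episodes/player", available, limit),
        ],
    }


# Hypothetical data: four upcoming broadcasts, two episodes available.
view = build_programme_view(
    "Example programme",
    upcoming=["ep5", "ep6", "ep7", "ep8"],
    available=["ep3", "ep4"],
)
for t in view["transclusions"]:
    print(t.source, t.items, "truncated" if t.truncated else "complete")
```

Because each branch carries its source and a truncated flag, a consumer
could tell that the html shows only part of the resource and go and fetch
the full list from the source URI.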

But mainly, given some xslt, some css and some javascript you could take the
/programmes data feeds and make an html version without team:eastenders even
noticing. Random crap aside.


> It is much more like the result of a Semantic Web service that gives
> information about Eastenders, when given the URI as an argument.
> Your html page must give at least the labels of actors, and at least the
> Director and other programmes,

Yes, contribution data should be in all views (or transcluded into them). Eg
on mobile you might want smaller chunked views so you put contributors a
click away but make them inline on the desktop episode page

But all the modelling we do (including the upcoming replacement of the
contribution model) is done with all views in mind

> and yes, colourful banners.

Possibly a bit of a red herring. In the new designs all that stuff is just
css which doesn't really count :-)

> The idea that these (the rdf and the html) are equivalent representations
> (IRs) of the same NIR is to me a Big Lie, at least for most sites.
> Eg.
> http://www.bbc.co.uk/programmes/b0074tnd.html
> and
> http://www.bbc.co.uk/programmes/b0074tnd.rdf
> In fact it is the Big Lie of conneg, almost the elephant in the room that
> people don't talk about.
> Conneg was meant to be about equivalent representations.
> In rdf v. html they are definitely not.

Not sure it's a lie. The transclusion stuff in particular makes it an
approximation perhaps, but isn't everything :-)
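
For what it's worth, here's a rough sketch of the separation I described
earlier in the thread (nir > 303 > *generic* ir > conneg > representation).
It's only an illustration, not the /programmes code. The /thing/ and
/resource/ prefixes are borrowed from Bernard's example, and the respond()
and parse_accept() functions and the list of media types are assumptions.
Device detection and response bodies are left out.

```python
# Hypothetical URI layout, borrowed from Bernard's example in this thread.
NIR_PREFIX = "/thing/"
GENERIC_IR_PREFIX = "/resource/"

# Media types this imaginary server can produce, mapped to file extensions.
REPRESENTATIONS = {
    "text/html": ".html",
    "application/rdf+xml": ".rdf",
    "application/json": ".json",
}


def parse_accept(header: str) -> list:
    """Return the media ranges in an Accept header, highest q-value first."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        ranges.append((fields[0], q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return [m for m, q in sorted(ranges, key=lambda r: -r[1]) if q > 0]


def matches(media_range: str, media_type: str) -> bool:
    """True if a media range such as */*, text/* or text/html covers a type."""
    if media_range in ("*/*", media_type):
        return True
    return media_range.endswith("/*") and media_type.startswith(media_range[:-1])


def respond(path: str, accept: str = "*/*"):
    """Return (status, headers) for a request; bodies are left out."""
    if path.startswith(NIR_PREFIX):
        # Step 1: the thing itself can't be sent down the wires, so 303
        # to the generic information resource. No conneg happens here.
        slug = path[len(NIR_PREFIX):]
        return 303, {"Location": GENERIC_IR_PREFIX + slug}
    if path.startswith(GENERIC_IR_PREFIX):
        # A specific representation URL is served as-is.
        for media_type, ext in REPRESENTATIONS.items():
            if path.endswith(ext):
                return 200, {"Content-Type": media_type}
        # Step 2: conneg on the generic IR, answered without a redirect.
        for media_range in parse_accept(accept):
            for media_type, ext in REPRESENTATIONS.items():
                if matches(media_range, media_type):
                    return 200, {
                        "Content-Type": media_type,
                        "Content-Location": path + ext,
                        "Vary": "Accept",
                    }
        return 406, {"Vary": "Accept"}
    return 404, {}


print(respond("/thing/foo"))
print(respond("/resource/foo", "application/rdf+xml"))
print(respond("/resource/foo", "image/png"))
```

The point is that the 303 branch never looks at the accept header and the
conneg branch never redirects, so the generic URI is the only one that ever
needs to appear in links or the address bar.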

> Oh dear, I seem to have strayed from the topic, but that hasn't stopped others
> :-)

Mailing lists exist to head off topic ;-)
> 
> Very best
> Hugh
> 
>> 
>> Cheers
>> michael
>> 
>> 
>>> Developers want/need it so they can find out what is in the store when they
>>> are building things.
>>> Then the question of address bar does not arise at all (although your
>>> questions do still arise).
>>> In general, interesting things for users are not at the end of NIRs - I see
>>> the value of Linked Data being delivered as the results of lots of smarts
>>> over it being packaged up as services delivered into conventional web
>>> interactions, or possibly smarter web applications.
>>> Of course, the conventional web interactions should have their own NIRs, but
>>> that is another story.
>>> 
>>> (Well you did throw in the dbpedia bit at the end of yours!)
>>> 
>>> Best
>>> Hugh
>>> 
>>>> 
>>>> ===
>>>> 
>>>> A couple more thoughts to save me the trouble of writing a blog post:
>>>> 
>>>> I think (and I might be wrong) that some linked data people see conneg (in
>>>> the accept header sense) as being a peculiarity particular to linked data.
>>>> But it's no more a linked data peculiarity than HTTP
>>>> 
>>>> Because it's seen as a peculiarity it tends to get lumped in with the usual
>>>> linked data talking points around http-range-14 and 303s. And because it
>>>> gets talked about as one thing it tends to get implemented as one thing
>>>> 
>>>> But http-range-14 / nir and conneg are doing completely different jobs. The
>>>> first one is just about saying, "this thing i've been talking about can't
>>>> be sent down the wires but here's some information." And the second is
>>>> about sending back a representation that's appropriate to the needs of the
>>>> user (as specified in their accept headers). Or saying, "Sorry, I don't
>>>> have / can't generate a representation that suits your needs" (406).
>>>> (Again, in our case, with some messy device detection to cope with feature
>>>> phones and smart phones and twonkPads and laptops and possibly TV set top
>>>> boxes). There's a real separation of concerns that a lot of linked data
>>>> publishers aren't acknowledging. Which imo is just storing up trouble for
>>>> the future
>>>> 
>>>> All of the problems mentioned in this thread could be solved with the
>>>> addition of a *generic* information resource URI that does the conneg
>>>> separately from the 303. Target the *generic* information resource in your
>>>> links and expose that in the address bar, keep the details of the specific
>>>> representation URL tucked away in content location headers and just use the
>>>> non-information resource as something to talk about. So you don't split the
>>>> URIs you expose to the web and don't bounce every request through a 303 and
>>>> don't need to use replaceState to replace the representation URL with
>>>> something more sharable
>>>> 
>>>> In the absence of a generic information resource URI you've only got two
>>>> choices about what ends up in the address bar: the NIR URI or the specific
>>>> representation URL. IMO it should be none of the above. The latter breaks
>>>> sharing and the former doesn't make sense
>>>> 
>>>> Also to note that the dbpedia publishing pattern is problematic for
>>>> consumers as well as publishers [1]. NOTE: it's not the 303 that's actually
>>>> harmful here; it's the lack of a *generic* information resource URI that
>>>> leads to being constantly and unnecessarily bounced through a 303 for
>>>> every request
>>>> 
>>>> Have to say that if we had implemented linked data following the dbpedia
>>>> pattern and exposed a URL per serialisation / language in the address bar /
>>>> to the web AND made our content unshareable AND inadvertently caused a 303
>>>> hit for every request to bbc.co.uk... we'd probably have lost our jobs by
>>>> now. And I tend to consider anything that loses me my job an anti-pattern
>>>> :-/
>>>> 
>>>> 
>>>> 
>>>> 
>>>> ps. Talking about dbpedia URIs I should probably also bring up the more
>>>> harmful problem. Basing dbpedia URI slugs on wikipedia URI slugs which are
>>>> in turn based on wikipedia page titles means URIs change every time someone
>>>> changes the wikipedia page title. Which is definitely *the* major problem
>>>> when working with dbpedia. Every time I see the LOD cloud diagram with all
>>>> those links pointing to dbpedia I wonder how many of those links will still
>>>> work today / tomorrow / etc. Is there any likelihood of dbpedia moving to /
>>>> supporting something more like dbpedia lite [2], with URI slugs based on
>>>> wikipedia row numbers (which we're told are guaranteed stable)? Probably a
>>>> question for another thread...
>>>> 
>>>> [1] http://nevali.net/post/11228142010/303-considered-harmful
>>>> [2] http://dbpedialite.org/
>>>> 
>>>> 
>>>> On 18/10/2011 09:51, "Bernard Vatant" <bernard.vatant@mondeca.com> wrote:
>>>> 
>>>>> Hi Michael
>>>>> 
>>>>> Let me try to write down your case as I understand it, trying to avoid
>>>>> Capitalized Buzzwords ;-)
>>>>> Seems a good idea to me, although it introduces yet another level of
>>>>> indirection in the picture, but maybe we need it.
>>>>> 
>>>>> We have three different types of animals to identify by URI
>>>>> 
>>>>> 1. Something known as 'foo' in the "real" (or not) world :
>>>>> http://example.org/thing/foo
>>>>> 2. A generic information resource binding the various representations of
>>>>> 'foo' on my server(s) : http://example.org/resource/foo
>>>>> 3. Representations/renderings of 'foo' in various formats (html, rdf, xml,
>>>>> json, ...) / languages etc : http://example.org/resource/foo.html
>>>>> 
>>>>> The first URI is used in RDF descriptions of the thing, that I get for
>>>>> example at http://example.org/resource/foo.rdf
>>>>> The second URI is not used in the RDF descriptions whatsoever. It's a
>>>>> webby trick enabling easy copy-paste, caching, display in address bar,
>>>>> whatever deals with Web conversation only interested in information
>>>>> resources. It's my IR proxy to 1.
>>>>> 
>>>>> The conneg for 1 is a systematic 303 to 2, whatever the query.
>>>>> The conneg for 2 indirects to the desired type of representation.
>>>>> 
>>>>> Using 2 in Web dialogue avoids confusion : the URI in the browser is not
>>>>> misleading. You've asked for an IR, here it is, and in the format you've
>>>>> asked for.
>>>>> 
>>>>> Do I get your point correctly?
>>>>> 
>>>>> Bernard
>>>>> 
>>>>> 2011/10/18 Michael Smethurst <Michael.Smethurst@bbc.co.uk>
>>>>>> Hi Richard
>>>>>> 
>>>>>> (Again top post courtesy of webmail. sorry)
>>>>>> 
>>>>>> I'm saying dbpedia is missing the concept of a *generic* information
>>>>>> resource URI and it's that URI that should show up in the address bar and
>>>>>> be used in link targets. Ignoring the linked data aspect for a moment if
>>>>>> you publish your data in various serialisations like:
>>>>>> 
>>>>>> - /foo.html
>>>>>> - /foo.xhtml-mp (mobile profile xhtml for feature (non-smart) phones)
>>>>>> - /foo.json
>>>>>> - /foo.xml
>>>>>> 
>>>>>> you want to allow people to copy and paste the address bar into email /
>>>>>> twitter etc and for someone clicking the resulting link to get back an
>>>>>> appropriate representation (depending on their accept headers + a bit of
>>>>>> messy device detection in the case of the html and xhtml-mp)
>>>>>> 
>>>>>> So you need a generic IR URI that does the conneg / device detection and
>>>>>> sends back the appropriate serialisation without a redirect. The generic
>>>>>> IR URI (/foo) stays in the address bar and the full location (/foo.json
>>>>>> etc) is only exposed in the content location header (not in the address
>>>>>> bar)
>>>>>> All links then target the generic IR resource (not the NIR and NOT the
>>>>>> specific representation (.html etc))
>>>>>> 
>>>>>> So link targets are to the generic ir uri and the address bar always
>>>>>> shows the generic ir uri. Which gives you two benefits:
>>>>>> - you only expose one set of uris to crawlers (google etc)
>>>>>> - the uri in the address bar becomes universally sharable with copy +
>>>>>>   paste
>>>>>> 
>>>>>> It's reasonable / necessary to expect publishers to take a conneg /
>>>>>> device detection hit for every request because you want your content
>>>>>> shared and the ability to send back an appropriate representation and
>>>>>> it's all nicely cachable (even in cdn mode) with Vary headers
>>>>>> 
>>>>>> It's not reasonable / necessary to expect publishers to take an
>>>>>> uncachable 303 hit for every request
>>>>>> 
>>>>>> When you start writing rdf you just need the ability to talk about
>>>>>> something that can't be sent down the wires. So you add in the nir uri.
>>>>>> If someone requests the nir then:
>>>>>> 
>>>>>> nir > 303 > *generic* ir > conneg > ir representation (url only exposed
>>>>>> as content location header)
>>>>>> 
>>>>>> lots of linked data seems to do the 303 and conneg as one step but
>>>>>> they're not happening for the same reason. the job of the conneg is to
>>>>>> return an appropriate representation from the ir; the job of the 303 is
>>>>>> to say "i can't send you that but here's some information that will
>>>>>> hopefully be useful". conneg is needed regardless of whether you're doing
>>>>>> linked data and linked data only adds in the 303 when the nir is
>>>>>> requested. i think the two steps tend to get conflated in linked data
>>>>>> publishing patterns and we should attempt to separate them
>>>>>> 
>>>>>> hth
>>>>>> michael
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> http://www.bbc.co.uk
>>>> This e-mail (and any attachments) is confidential and may contain personal
>>>> views which are not the views of the BBC unless specifically stated.
>>>> If you have received it in error, please delete it from your system.
>>>> Do not use, copy or disclose the information in any way nor act in reliance
>>>> on it and notify the sender immediately.
>>>> Please note that the BBC monitors e-mails sent or received.
>>>> Further communication will signify your consent to this.
>> 
>> 
>> 


Received on Thursday, 20 October 2011 10:30:32 UTC