RE: Address Bar URI from Michael Smethurst on 2011-10-17 (public-lod@w3.org from October 2011)

From: Michael Smethurst <Michael.Smethurst@bbc.co.uk>
Date: Mon, 17 Oct 2011 16:22:40 +0100
To: "Kingsley Idehen" <kidehen@openlinksw.com>, <public-lod@w3.org>
Message-ID: <7A44633A0AA27A4A98B94B10BDF0AC3554C43D@bbcxues27.national.core.bbc.co.uk>
Hi Kingsley, he sighed

I have to admit I'm not sure what any of this means. I've tried reading through several times but it just seems to be the same set of random buzzwords assembled in fairly random order.

As a relative outsider / lurker on this list it does seem that your main contribution is an attempt to change the language of every mail sent to match a language that only you seem to speak. And I know you'll reply with something like, "in the wider IT industry the language I use is common." I do wonder if you're confusing the set of people that make up the wider IT industry with the set of people who know / care about the internal workings of odbc. I'm not sure they're the same thing...

I know you're only trying to be helpful and add clarity but I'm not convinced that tends to be the end result.

That aside some questions:

- why do the URblahs that end up in the address bar for dbpedia contain /page/ or /data/?

- do you think the address bar should ever show .html when browsing myexperiment.org?

- isn't the question of whether I want / get html or data dependent on what I accept and not on the URblah I request?

- why do dbpedia links target /resource/?

- why would you ever use a href to point to something you can't GET?

- isn't that what 'about' is for?

- do you expect any sane publisher to take a 303 hit for every request?

cheers
michael
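
A minimal sketch of the conneg point in the third question above: the URblah requested stays the same and only the Accept header decides what comes back. It loosely follows the BBC pattern quoted further down (one information resource uri, with representation urls surfacing only in Content-Location). The `parseAccept`, `matchType` and `negotiate` helpers and the media-type table are assumptions made up for illustration, not anyone's production code, and the q-value handling is deliberately simplified.

```javascript
// Hypothetical table: media type -> representation suffix for one information resource.
const REPRESENTATIONS = new Map([
  ["text/html", ".html"],
  ["application/rdf+xml", ".rdf"],
  ["application/json", ".json"],
]);
const DEFAULT_TYPE = "text/html";

// Parse an Accept header into media ranges ordered by q-value (highest first).
// Simplified: real HTTP also ranks specific ranges above wildcards of equal q.
function parseAccept(header) {
  return header
    .split(",")
    .map((part, index) => {
      const [range, ...params] = part.split(";").map((s) => s.trim());
      const qParam = params.find((p) => p.startsWith("q="));
      const q = qParam ? parseFloat(qParam.slice(2)) : 1;
      return { range: range.toLowerCase(), q: Number.isNaN(q) ? 0 : q, index };
    })
    .filter((entry) => entry.range !== "" && entry.q > 0)
    .sort((a, b) => b.q - a.q || a.index - b.index);
}

// Map one media range onto a media type we can actually serve, or null.
function matchType(range) {
  if (range === "*/*") return DEFAULT_TYPE;
  if (REPRESENTATIONS.has(range)) return range;
  if (range.endsWith("/*")) {
    const prefix = range.slice(0, -1); // "text/*" -> "text/"
    for (const type of REPRESENTATIONS.keys()) {
      if (type.startsWith(prefix)) return type;
    }
  }
  return null;
}

// Answer a request for the information resource uri with no redirect:
// the representation url only appears in the Content-Location header.
function negotiate(irPath, acceptHeader) {
  for (const { range } of parseAccept(acceptHeader || "*/*")) {
    const type = matchType(range);
    if (type !== null) {
      return {
        status: 200,
        headers: {
          "Content-Type": type,
          "Content-Location": irPath + REPRESENTATIONS.get(type),
          Vary: "Accept",
        },
      };
    }
  }
  return { status: 406, headers: { Vary: "Accept" } };
}

console.log(negotiate("/programmes/b015v0nh", "application/rdf+xml"));
```

Nothing here redirects, so the address bar keeps showing the uri that was requested whichever serialisation is returned.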


-----Original Message-----
From: public-lod-request@w3.org on behalf of Kingsley Idehen
Sent: Mon 10/17/2011 12:50 PM
To: public-lod@w3.org
Subject: Re: Address Bar URI
 
On 10/17/11 1:48 AM, Michael Smethurst wrote:
>
> Hi Kingsley
>
> I've heard you make this argument several times in the past. But I 
> don't understand why. How does it benefit publishers to expose the 
> representation address?
>

I am saying that a data representation is what a browser retrieves from an 
Address. A URL != representation; it is the address from which you 
access data in a representation negotiated between the client and the 
server.

In the Web's information space dimension, hyperlinks are an Address / 
Name conflation. Ambiguity isn't harmful.
Browsers work with hyperlink based Addresses / Locators (URLs).

This is how everyone publishes data on the Web. Basically, there is only 
a single level of indirection between a hyperlink based address and the 
HTML based data objects (resources) to which they resolve.

It's safe to conclude that if this is what everyone does on the 
massively successful WWW then the benefit is self-evident.

> How does it benefit consumers?
>

A publisher exposes data via an Address, with one level of indirection 
via a hyperlink based address as outlined above. A consumer simply 
de-references the hyperlink or places it via cut&paste (if known and 
remembered) into a Browser's address bar, again via single level of 
indirection, and they have their data -- typically an HTML based Web Page.
>
>
> Ignoring linked data for a moment.... If I ask for some information 
> (an ir uri) over HTTP what I get back depends on what I ask for 
> and what I choose to accept (serialisation, language). There are 
> benefits to this:
>
> - the same links work for everyone (or as wide a set of people as 
> possible)
> - you don't expose multiple uris to the web
> - you don't split your google juice
>

I don't believe anything I am saying (or said) contradicts any of the 
items above. Google juice is driven primarily by hyperlinks that have a 
single level of indirection re. actual resource (data object) access. 
Google and everyone else grab resources (data objects) from addresses, 
use those addresses (without penalty) as prime identifiers for the 
resources, then index them and apply their own ranking algorithms etc.

>
> This is just the way http works and what leads to its most important 
> aspect: universal access to information.
>

Yes, that's why I said: Information Space dimension at the top of this post.

>
> (It's also won as an argument. The "one web" approach won as soon as 
> the first product manager eagerly clicked a link in twitter and got 
> bounced to the homepage of their "product" because different 
> serialisations sat on different uris)
>
> The only thing linked data adds to this is the realisation that people 
> want to talk about things that can't be sent over http. So we end up 
> with nir uris to give us something to talk about (in all senses :-) ) 
> and 303s in case anyone is silly enough to request them
>

There is a cleaner way (IMHO) of talking about URI abstraction and the 
indirection implicit in said abstraction. The information space 
dimension doesn't care about the distinction between hyperlink based 
resource (object) addresses and hyperlink based resource (object) names.

Put differently, in the Web's  information space dimension a URL (a 
Location Name or Address) serves as the Data Source Name (a DSN in 
ODBC/JDBC parlance) of focus. In the data space dimension, on the other 
hand, there is a distinction between a hyperlink based data object 
(resource) name and a hyperlink based data object (resource) address. 
This distinction is achieved, honored, protected, and exploited via 
indirection.

Whether you 303 or not, the fundamental act is one of indirection such 
that a data object (resource) name is distinct from its address. Thus, 
you can exploit two age-old operations that are fundamental to computer 
science:

1. de-reference
2. address-of.
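
A rough sketch of those two operations, assuming the DBpedia-style /resource/, /page/ and /data/ paths mentioned elsewhere in this thread. The `addressOf`, `get` and `dereference` functions, the in-memory document table and the crude Accept check are all hypothetical, chosen only to show a name being kept distinct from its address via a 303.

```javascript
// Hypothetical in-memory "web": document addresses (URLs) and their content.
const DOCUMENTS = new Map([
  ["/page/Simon_May", "HTML page describing Simon May"],
  ["/data/Simon_May", "RDF document describing Simon May"],
]);

// address-of: map an object name (/resource/...) to the address of a
// document about it. Returns null when the path is not a name at all.
// The Accept check is deliberately crude: HTML if asked for, data otherwise.
function addressOf(name, accept = "text/html") {
  const match = /^\/resource\/(.+)$/.exec(name);
  if (!match) return null;
  const wantsHtml = accept.includes("text/html") || accept.includes("*/*");
  return (wantsHtml ? "/page/" : "/data/") + match[1];
}

// Toy server: names answer with a 303 pointing at an address,
// addresses answer with the document itself.
function get(path, accept = "text/html") {
  const address = addressOf(path, accept);
  if (address !== null) {
    return { status: 303, headers: { Location: address }, body: "" };
  }
  if (DOCUMENTS.has(path)) {
    return { status: 200, headers: {}, body: DOCUMENTS.get(path) };
  }
  return { status: 404, headers: {}, body: "" };
}

// de-reference: follow the single level of indirection from name to document.
function dereference(name, accept = "text/html") {
  const first = get(name, accept);
  return first.status === 303 ? get(first.headers.Location, accept) : first;
}

console.log(addressOf("/resource/Simon_May", "application/rdf+xml"));
console.log(dereference("/resource/Simon_May").body);
```

The extra round trip in `dereference` is the "303 hit" being argued about: every request made by name costs two requests instead of one.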

>
> Agree completely that the address bar should never expose the nir uri 
> because that doesn't make any sense.
>

Yes. But even worse, the term Non Information Resource simply compounds 
all that's wrong with the use of Resource when we should be referring to 
Object. I say this with a much broader community in mind, one that is 
much larger than the Semantic Web or Linked Open Data communities (even 
when combined).

DBMS people (relational and object) find the term Resource, and its use, 
obscure at best. Folks from this community (plus their millions of 
horizontal and vertical application developers) understand indirection, 
but simply don't recognize it when they encounter typical Linked Data 
collateral and discourse.
>
> Web browsers are information browsers so should expose the uri of the 
> information. Just not the specific serialisation / language
>

Yes, but I prefer to say: Web browsers are information browsers so they 
should expose the URLs (Location as per "Location:" response header) of 
the documents they retrieve. Of course, a URL is a kindOf URI, but 
as stated at the top of this post, the distinction doesn't matter when 
discourse is all about the information space dimension. Not so when we 
move into the data space dimension :-)
>
>
>
> My 2p
>
> Michael
>
>
> -----Original Message-----
> From: public-lod-request@w3.org on behalf of Kingsley Idehen
> Sent: Sun 10/16/2011 2:41 PM
> To: public-lod@w3.org
> Subject: Re: Address Bar URI
>
> On 10/16/11 8:50 AM, Michael Smethurst wrote:
> >
> > Hi Hugh
> >
> > Apologies for top post; blame webmail :-/
> >
> > (Using labels as they appear in my head; feel free to translate to
> > labels as they appear in your head)
> >
> > If you're publishing linked data using 303s *and* the links in your
> > html are targeting the nir uri (as per dbpedia):
> >
> > <a class="uri" rel="dbpedia-owl:composer"
> > xmlns:dbpedia-owl="http://dbpedia.org/ontology/"
> > href="http://dbpedia.org/resource/Simon_May"><small>dbpedia</small>:Simon_May</a>
> >
> > then, yes, you  are already exposing 2 sets of uris to the web and to
> > google. A google bot crawling your pages is going to see all your
> > internal links pointing to .../thing/... whilst a user of your site is
> > going to see the result of the 303 (.../page/... or whatever) in their
> > address bar. If they want to blog about your stuff chances are they'll
> > copy and paste into their post the URblah they see in their address
> > bar. So when a google bot crawls their blog it'll see links pointing
> > to .../page/... Google doesn't consolidate pagerank for the two sets
> > of uris for 303s
> >
> > But if you're publishing with 303s and linking internally to nir uris
> > you've already got a problem. I can just about imagine attempting to
> > convince people that it's worthwhile having a .../thing/... and 303ing
> > it to a .../page/... for the .2% of people who care about the
> > distinction / consume rdf. Attempting to convince the BBC platform
> > people that every request a user makes as they browse round the site
> > should be routed through a 303 would probably get my coffee spiked with
> > arsenic. It just isn't going to happen. Ever. If the advice from this
> > community is to target html links to the nir uri I think that's going
> > to cause a lot of problems for a lot of publishers...
> >
> > For BBC linked data stuff we deal with 3 classes of URblah:
> >
> > 1. representation urls (http://www.bbc.co.uk/programmes/b015v0nh.html,
> > http://www.bbc.co.uk/programmes/b015v0nh.mp,
> > http://www.bbc.co.uk/programmes/b015v0nh.json,
> > http://www.bbc.co.uk/programmes/b015v0nh.rdf etc)
> >
> > 2. information resource uris (http://www.bbc.co.uk/programmes/b015v0nh)
> >
> > 3. nir uris (http://www.bbc.co.uk/programmes/b015v0nh#programme)
> >
> > The first set are never exposed except as content location headers.
> > And the links inside the html all *target* the second set:
> >
> > <a href="/programmes/b015j0ng" typeof="po:Episode"
> > about="/programmes/b015j0ng#programme"><span class="title"
> > property="dc:title">Episode 3</span></a>
> >
> > The second set just conneg (with some added device detection) to the
> > first set with no redirection. These are the only uris ever really
> > exposed. And they're the main benefits of doing any of this. Because
> > we don't expose (in the address bar) the representation urls, users
> > link to and share the ir uris. And the people they share with get back
> > a representation appropriate to their needs (for device,
> > serialisation, accessibility, language...)
> >
> > The third set is only there to be talked about in the rdf/rdfa, for
> > people in the world (possibly confined to this list :-) ) who think
> > the distinction between nirs and irs matters. But it's never exposed
> > in the address bar
> >
> > So our internal (html) links point to the second set. And the address
> > bar points to the second set. So other people link to us using the
> > second set and we don't leak google juice all over our shoes. And the
> > 3rd set is there for those folks who care
> >
> > Comparing that to dbpedia: all internal links point to the first set,
> > there's then a conneg / 303 dance and the thing that ends up in your
> > address bar is something that's half way between an ir uri and an ir
> > representation url (.../page/... or .../data/...). So definitely
> > exposing 2 sets of URblahs to the web and not exposing the most
> > important one: the information *resource* uri. Which is the most
> > important one because that's the one you want people to be able to
> > share without worrying about whether the representation is appropriate
> > to the needs of the people they're sharing with
> >
> > IMO anywhere where you end up with /html or .html or .rdf or .mp or .cy
> > or /page or /data exposed in the address bar is broken because the
> > representation returned should be dependent on your accept headers,
> > not the (information) resource you request. Or we're forgetting
> > everything uncle roy ever taught us. Exposing representation in the
> > address bar means your content / stuff can't be shared universally.
> > And seo is just a side effect of being able to share / link universally
> >
> > So at the risk of being controversial I think the dbpedia publishing
> > pattern is a bit of an anti-pattern and we shouldn't be encouraging
> > other publishers / developers to adopt it
> >
>
> In the Web's information space dimension, yes, you could say DBpedia's
> approach is an anti-pattern, that's basically another way of
> articulating what I tried to convey via the 1-3 sequence in one of my
> earlier posts. In the Web's data space dimension DBpedia's approach is
> natural and obvious. Trouble is, a majority of Web users and Developers
> are only gradually beginning to sense the aforementioned data space
> dimension.
>
> We can make the manifestation of the Web's data space dimension
> unobtrusive if indirection is introduced properly which ultimately means:
>
> 1. Addresses (URLs) stay in the Address Bar.
> 2. Actual Object (Resource) Identifiers (generic de-referencable URI
> based Names) are discovered by introspection (human or machine)
> 3. Accessing Data Objects (Resources) by Name or Address becomes
> optional and should be driven by UX patterns.
>
> We can never negate Name and Address disambiguation, once we are in the
> Web's data space dimension. The use of indirection to solve problems in
> computer science is as old as the subject matter itself :-)
>
>
> > [SNIP]
> >
>
> > Michael
> >
>
> Kingsley
> >
> >
> >
> > -----Original Message-----
> > From: Hugh Glaser [mailto:hg@ecs.soton.ac.uk]
> > Sent: Sat 10/15/2011 2:43 PM
> > To: Michael Smethurst
> > Cc: Norman Gray; Linking Open Data; Don Cruickshank
> > Subject: Re: Address Bar URI
> >
> > Thanks Michael.
> > Very helpful to bring in the SEO perspective, even on a Friday evening.
> >
> > On 14 Oct 2011, at 21:28, Michael Smethurst wrote:
> >
> > > Have to say from a pragmatic point of view that using replaceState
> > to switch between IR and NIR (or whatever we're supposed to call them)
> > URIs feels like bad advice for most developers
> > >
> > > Users in older browsers are going to see (and copy and paste) one
> > set of URIs whilst users of more modern browsers are going to see (and
> > copy and paste) another
> > Maybe now is not the time to do it - but always being backwards
> > compatible is not great.
> > Actually, users of the old browsers are currently disallowed from
> > copying and pasting the address bar, if what they are after is the NIR
> > or whatever we call it.
> > The myexperiment.org site has a real problem with this, and on a real
> > system.
> > I think currently they have to accept that users do it, and then patch
> > up afterwards (by removing the .html).
> > >
> > > So you end up exposing two sets of URIs to the web and to Google et
> > al. Google only consolidates page rank for inbound links on 301s (and
> > not 302s or 303s) so you'd end up throwing your findability away for
> > an esoteric distinction that no-one quite understands. Or understands
> > but doesn't quite agree with :-)
> > But I think we are currently exposing two sets of URIs.
> > If we do the rewrite we will only be exposing one set of URIs to the
> > users.
> >
> > At first I thought "Oh no", we mustn't compromise SEO, and you
> > describe how rewriting the address bar does.
> > But now I am afraid I don't understand why it does.
> > The only change is what the user sees in the Bar - so how would that
> > affect the SEO?
> > Can you elaborate on how it affects SEO please?
> > I see that, for example googling '"Hugh Glaser" site:semanticweb.org'
> > gets me
> > http://data.semanticweb.org/person/hugh-glaser as the top hit, and
> > seems to ignore
> > http://data.semanticweb.org/person/hugh-glaser/html
> > >
> > > For now cross browser support for pushState and replaceState is
> > pretty shonky [1]. It's useful when product managers demand an "app
> > like experience" because you can do all the shiny ajax stuff without
> > nasty ajax #s and it all looks good on their iDevices. They don't need
> > to know that's not what most people see :-)
> > >
> > > With apologies for bringing up S*E*O on a Friday evening. And that
> > aside it just feels like asking people to add more complexity to
> > sidestep existing complexity that they don't understand / see the need
> > for in the first place...
> > Remember it was a developer who asked me in the first place, who saw
> > it as an answer to a serious problem he has with the users' interactions.
> > We should always incline to pushing just a bit more complexity onto
> > the few developers, rather than onto the many, many more users, I think.
> >
> > Best
> > Hugh
> > >
> > >
> > > [1] http://caniuse.com/#search=replaceState
> > >
> > >
> > > -----Original Message-----
> > > From: public-lod-request@w3.org on behalf of Hugh Glaser
> > > Sent: Fri 10/14/2011 4:22 PM
> > > To: Norman Gray
> > > Cc: Linking Open Data; Don Cruickshank
> > > Subject: Re: Address Bar URI
> > >
> > > I am really no expert - really, so showing my ignorance here.
> > > I understand:
> > >
> > > JS:
> > > window.history.replaceState('Object', 'Title', '/another-new-url');
> > > will do it happily, but I guess HTML5 is required.
> > > You can use it to change path and search strings, but not protocol
> > or domain, I understand.
> > >
> > >
> > > On 14 Oct 2011, at 15:26, Norman Gray wrote:
> > >
> > > >
> > > > Hugh, greetings.
> > > >
> > > > On 2011 Oct 14, at 13:08, Hugh Glaser wrote:
> > > >
> > > >> My colleague, Don Cruickshank asked me if it was good practice to
> > rewrite the URI in the Address Bar to be the NIR, rather than the IR.
> > > >> I was surprised, but he tells me that it is permitted in HTML5.
> > > >
> > > > Can you expand on this a little?
> > > >
> > > > Is this some HTML5 cleverness that lets one declare in the HTML
> > what the address bar should display?  Or is it some Javascript
> > kludge^Wgadget that does it, in which case what is the sense in which
> > this is 'permitted' in HTML5 and wasn't before?
> > > >
> > > > All the best,
> > > >
> > > > Norman
> > > >
> > > >
> > > > --
> > > > Norman Gray  : http://nxg.me.uk
> > > > SUPA School of Physics and Astronomy, University of Glasgow, UK
> > > >
> > >
> > > --
> > > Hugh Glaser,
> > >               Web and Internet Science
> > >               Electronics and Computer Science,
> > >               University of Southampton,
> > >               Southampton SO17 1BJ
> > > Work: +44 23 8059 3670, Fax: +44 23 8059 3045
> > > Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
> > > http://www.ecs.soton.ac.uk/~hg/
> > >
> > >
> > >
> > >
> > >
> > >
> > > http://www.bbc.co.uk
> > > This e-mail (and any attachments) is confidential and may contain
> > personal views which are not the views of the BBC unless specifically
> > stated.
> > > If you have received it in error, please delete it from your system.
> > > Do not use, copy or disclose the information in any way nor act in
> > reliance on it and notify the sender immediately.
> > > Please note that the BBC monitors e-mails sent or received.
> > > Further communication will signify your consent to this.
> >
> > --
> > Hugh Glaser,
> >               Web and Internet Science
> >               Electronics and Computer Science,
> >               University of Southampton,
> >               Southampton SO17 1BJ
> > Work: +44 23 8059 3670, Fax: +44 23 8059 3045
> > Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
> > http://www.ecs.soton.ac.uk/~hg/
> >
> >
> >
> >
>
>
> --
>
> Regards,
>
> Kingsley Idehen
> President & CEO
> OpenLink Software
> Web: http://www.openlinksw.com
> Weblog: http://www.openlinksw.com/blog/~kidehen
> Twitter/Identi.ca: kidehen
>
>
>
>
>
>


-- 

Regards,

Kingsley Idehen	
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen







Received on Monday, 17 October 2011 15:24:26 UTC