- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Mon, 17 Oct 2011 12:27:11 -0400
- To: public-lod@w3.org
- Message-ID: <4E9C575F.7080000@openlinksw.com>
On 10/17/11 11:22 AM, Michael Smethurst wrote:
>
> Hi Kingsley, he sighed
>
> I have to admit I'm not sure what any of this means.

Okay.

> I've tried reading through several times but it just seems to be the
> same set of random buzzwords assembled in fairly random order.

Okay, so I'll try one more time to reduce the confusion. There is
nothing random about what I am saying. As for buzzwords, I can't control
your perceptions :-) Please read on.

> As a relative outsider / lurker on this list it does seem that your
> main contribution is an attempt to change the language of every mail
> sent to match a language that only you seem to speak. And I know
> you'll reply with something like, "in the wider IT industry the
> language I use is common." I do wonder if you're confusing the set of
> people that make up the wider IT industry with the set of people who
> know / care about the internal workings of odbc. I'm not sure they're
> the same thing...
>
> I know you're only trying to be helpful and add clarity but I'm not
> convinced that that tends to be the end result.
>
> That aside some questions:
>
> - why do the URblahs that end up in the address bar for dbpedia
> contain /page/ or /data/?

Because:

1. http://dbpedia.org/page/Linked_Data -- an HTML based data object
   (resource) that describes 'Linked Data'
2. http://dbpedia.org/data/Linked_Data.n3 -- an N3 based data object
   (resource) that describes 'Linked Data'

etc.

Note, a Document is a compound Object.

Now, bearing in mind my involvement with DBpedia Linked Data deployment
(i.e., ensuring the URIs deliver on Linked Data principles), you can
decode what's happening via:

1. http://goo.gl/hYBH9 -- using a URI debugger to decode what I am
   trying to explain here re. Data Object Names, Addresses, and actual
   Representation (what goes across the wire from server to client,
   courtesy of varying levels of URI indirection).

Note what is exposed via:

1. <head/> using <link/> based relations
2. HTTP response metadata (basically, what the response headers
   deliver).

> - do you think the address bar should ever show .html when browsing
> myexperiment.org?

The address bar should display whatever the server sends back as *its*
resource (object) address. This is user agent discernible via HTTP
response headers. In doing so, you have a bookmark friendly URL and
none of the confusion that arises via indirection (i.e., flipping stuff
around in the address bar).

> - isn't the question of whether i want / get html or data dependent on
> what I accept and not on the URblah I request?

Your user agent can negotiate the representation of data it seeks from
the server. The server will always send back an address of the data
object (resource) in question, irrespective of the actual data
representation.

> - why do dbpedia links target /resource/?

DBpedia is a linked data space grounded in the Web's data space
dimension. Thus, it uses http://dbpedia.org/resource/Linked_Data as a
Data Object ID (Name). This Name resolves to actual data via
indirection. Basically you have this route to actual data (the actual
data format sent to the client remains negotiable):

[Object Name e.g. /resource/]
  -->[Object Address e.g. /page/]
  -->[Actual Data Object in a variety of negotiable representations]

The above showcases the distinction between the following:

1. Data Object Name -- an indirect reference that serves as a name
2. Data Object Address -- a direct access reference that serves as an
   address
3. Actual Data Object Representation -- the EAV/SPO graph pictorial
   based data (representing the Object Description) that gets
   serialized from server to client.

DBpedia (as you stated in an earlier post) can be seen as an
anti-pattern when you look via the information space context lenses.
This is the very reason why I said: we could have emphasized /page/ as
the coherent segue to /resource/ since this association is both human
and machine discernible.
For humans, that's why we have the "About: XYZ" pattern in DBpedia HTML
pages, so a human can easily discern the subject of the description
(delivered in HTML format to the browser).

As stated earlier re. the data space dimension of the Web, DBpedia isn't
an anti-pattern. It simply showcases the use of URIs (hyperlinks) in a
manner that distinguishes an Object Name from an Object Address. You
refer to an Object by Name, but access it via an Address. What you
access is a chunk of structured data streamed across the wire from
server to client.

> - why would you ever use a href to point to something you can't GET?

An HTTP scheme URI is an Identifier first; it can be used to name
anything. Is it the most intuitive identifier in all naming scenarios?
Of course not. When constructing Linked Data, URIs have to resolve, and
the resolution has specific expectations, i.e., return a descriptive
representation of a URI's referent in structured form, i.e., an EAV/SPO
based graph pictorial. The ubiquity of HTTP and its core architectural
prowess make it a powerful and low cost identifier choice for InterWeb
scale Linked Data.

> - isn't that what 'about' is for?

Where did that come from?

> - do you expect any sane publisher to take a 303 hit for every request?

That statement is delving into the mechanics without first reaching
agreement on the fundamental concept. Best we question sanity post
agreement about core concepts :-)

Remember, you think I am tossing out buzzwords. That indicates concept
fogginess, since the terms I use predate the WWW or Semantic Web -- see
the reference section at the end of the post. Please do understand (and
btw. TimBL has never claimed otherwise, overtly or covertly): the WWW
and Semantic Web aren't the starting points for concepts such as
indirection, linked data structures, graphs etc.

Every time I bring up the issue of WWW and Objects, I provide links to
reference material.
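
To make the Name --> Address indirection described in this message a bit
more concrete, here is a minimal sketch. It is only an illustration and
not DBpedia's actual server logic: the function name `resolve` and the
mapping from the Accept header to /page/ or /data/ are assumptions made
for the example, modelled on the /resource/, /page/ and /data/ URIs
discussed in this thread.

```javascript
// Hypothetical sketch of Name -> Address indirection via a 303 redirect.
// Not DBpedia's actual implementation; the path layout follows the
// /resource/, /page/ and /data/ examples in this thread, and the
// Accept-header handling is deliberately simplistic.
function resolve(path, accept = "text/html") {
  const match = /^\/resource\/(.+)$/.exec(path);
  if (!match) {
    // Not a Name: treat it as an Address that is served directly.
    return { status: 200, location: null };
  }
  const id = match[1];
  // Assumption: a client asking for text/n3 wants the N3 document;
  // every other client is sent to the HTML page.
  const wantsN3 = accept
    .split(",")
    .some((type) => type.trim().toLowerCase().startsWith("text/n3"));
  return {
    status: 303,
    location: wantsN3 ? `/data/${id}.n3` : `/page/${id}`,
  };
}

console.log(resolve("/resource/Linked_Data"));
console.log(resolve("/resource/Linked_Data", "text/n3"));
```

In this sketch the Name (/resource/...) is never served directly; the
client is always pointed at an Address, and that Address is what a
browser would end up showing in its address bar.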
Thus, here are some links (cited in the past on a number of occasions)
that can at least help you put the buzzword misconceptions to bed:

1. http://goo.gl/Ez3CC -- a G+ post about this matter
2. http://www.cs.cmu.edu/afs/cs.cmu.edu/user/clamen/OODBMS/Manifesto/htManifesto/node4.html
   -- about Object Identity
3. http://www.w3.org/Addressing/rfc1630.txt -- Universal Resource
   Identifiers in WWW, by TimBL
4. http://lists.w3.org/Archives/Public/www-tag/2009Aug/0000.html --
   account (by TimBL) of how "Resource" became part of the URI narrative
5. http://www.w3.org/People/Connolly/9703-web-apps-essay.html -- old
   essay by Dan Connolly circa. 1994.

Kingsley

>
> cheers
> michael
>
>
> -----Original Message-----
> From: public-lod-request@w3.org on behalf of Kingsley Idehen
> Sent: Mon 10/17/2011 12:50 PM
> To: public-lod@w3.org
> Subject: Re: Address Bar URI
>
> On 10/17/11 1:48 AM, Michael Smethurst wrote:
> >
> > Hi Kingsley
> >
> > I've heard you make this argument several times in the past. But I
> > don't understand why. How does it benefit publishers to expose the
> > representation address?
>
> I am saying that data representation is what a browser retrieves from
> an Address. A URL != representation; it is the address from which you
> access data in a representation negotiated between the client and the
> server.
>
> In the Web's information space dimension, hyperlinks are an Address /
> Name conflation. Ambiguity isn't harmful.
> Browsers work with hyperlink based Addresses / Locators (URLs).
>
> This is how everyone publishes data on the Web. Basically, there is
> only a single level of indirection between a hyperlink based address
> and the HTML based data objects (resources) to which they resolve.
>
> It's safe to conclude that if this is what everyone does on the
> massively successful WWW then benefit is self evident.
>
> > How does it benefit consumers?
>
> A publisher exposes data via an Address, with one level of indirection
> via a hyperlink based address as outlined above. A consumer simply
> de-references the hyperlink or places it via cut&paste (if known and
> remembered) into a Browser's address bar, again via single level of
> indirection, and they have their data -- typically an HTML based Web
> Page.
>
> > Ignoring linked data for a moment.... If I ask for some information
> > (an ir uri) over HTTP what I get back depends on what I ask for
> > and what I choose to accept (serialisation, language). There are
> > benefits to this:
> >
> > - the same links work for everyone (or as wide a set of people as
> > possible)
> > - you don't expose multiple uris to the web
> > - you don't split your google juice
>
> I don't believe anything I am saying (or said) contradicts any of the
> items above. Google juice is driven primarily by hyperlinks that have
> a single level of indirection re. actual resource (data object)
> access. Google and everyone else grabs resources (data objects) from
> addresses, uses address (without penalty) as prime identifiers for the
> resources, then indexes, and applies its special ranking algorithms
> etc..
>
> > This is just the way http works and what leads to its most important
> > aspect: universal access to information.
>
> Yes, that's why I said: Information Space dimension at the top of this
> post.
>
> > (It's also won as an argument. The "one web" approach won as soon as
> > the first product manager eagerly clicked a link in twitter and got
> > bounced to the homepage of their "product" because different
> > serialisations sat on different uris)
> >
> > The only thing linked data adds to this is the realisation that
> > people want to talk about things that can't be sent over http.
> > So we end up with nir uris to give us something to talk about (in
> > all senses :-) ) and 303s in case anyone is silly enough to request
> > them
>
> There is a cleaner way (IMHO) of talking about URI abstraction and the
> indirection implicit in said abstraction. The information space
> dimension doesn't care about the distinction between hyperlink based
> resource (object) addresses and hyperlink based resource (object)
> names.
>
> Put differently, in the Web's information space dimension a URL (a
> Location Name or Address) serves as the Data Source Name (a DSN in
> ODBC/JDBC parlance) of focus. In the data space dimension, on the
> other hand, there is a distinction between a hyperlink based data
> object (resource) name and a hyperlink based data object (resource)
> address. This distinction is achieved, honored, protected, and
> exploited via indirection.
>
> Whether you 303 or not, the fundamental act is one of indirection such
> that a data object (resource) name is distinct from its address. Thus,
> you can exploit two age-old operations that are fundamental to
> computer science:
>
> 1. de-reference
> 2. address-of.
>
> > Agree completely that the address bar should never expose the nir
> > uri because that doesn't make any sense.
>
> Yes. But even worse, the term Non Information Resource simply
> compounds all that's wrong with the use of Resource when we should be
> referring to Object. I say this with a much broader community in mind,
> one that is much larger than the Semantic Web or Linked Open Data
> communities (even when combined).
>
> DBMS (relational and object) find the term: Resource and its use
> obscure at best. Folks from this community (plus their millions of
> horizontal and vertical application developers) understand
> indirection, but simply don't recognize it when they encounter typical
> Linked Data collateral and discourse.
>
> > Web browsers are information browsers so should expose the uri of
> > the information.
> > Just not the specific serialisation / language
>
> Yes, but I prefer to say: Web browsers are information browsers so
> they should expose the URLs (Location as per "Location:" response
> header) of the documents they retrieve. Of course, a URL is a kindOf
> URI, but as stated at the top of this post, the distinction doesn't
> matter when discourse is all about the information space dimension,
> not so when we move into the data space dimension :-)
>
> > My 2p
> >
> > Michael
> >
> >
> > -----Original Message-----
> > From: public-lod-request@w3.org on behalf of Kingsley Idehen
> > Sent: Sun 10/16/2011 2:41 PM
> > To: public-lod@w3.org
> > Subject: Re: Address Bar URI
> >
> > On 10/16/11 8:50 AM, Michael Smethurst wrote:
> > >
> > > Hi Hugh
> > >
> > > Apologies for top post; blame webmail :-/
> > >
> > > (Using labels as they appear in my head; feel free to translate to
> > > labels as they appear in your head)
> > >
> > > If you're publishing linked data using 303s *and* the links in your
> > > html are targeting the nir uri (as per dbpedia):
> > >
> > > <a class="uri" rel="dbpedia-owl:composer"
> > > xmlns:dbpedia-owl="http://dbpedia.org/ontology/"
> > > href="http://dbpedia.org/resource/Simon_May"><small>dbpedia</small>:Simon_May</a>
> > >
> > > then, yes, you are already exposing 2 sets of uris to the web and
> > > to google. A google bot crawling your pages is going to see all
> > > your internal links pointing to .../thing/... whilst a user of your
> > > site is going to see the result of the 303 (.../page/... or
> > > whatever) in their address bar. If they want to blog about your
> > > stuff chances are they'll copy and paste into their post the URblah
> > > they see in their address bar. So when a google bot crawls their
> > > blog it'll see links pointing to .../page/...
> > > Google doesn't consolidate pagerank for the two sets of uris for
> > > 303s
> > >
> > > But if you're publishing with 303s and linking internally to nir
> > > uris you've already got a problem. I can just about imagine
> > > attempting to convince people that it's worthwhile having a
> > > .../thing/... and 303ing it to a .../page/... for the .2% of people
> > > who care about the distinction / consume rdf. Attempting to
> > > convince the BBC platform people that every request a user makes as
> > > they browse round the site should be routed through a 303 would
> > > probably get my coffee spiked with arsenic. It just isn't going to
> > > happen. Ever. If the advice from this community is to target html
> > > links to the nir uri I think that's going to cause a lot of
> > > problems for a lot of publishers...
> > >
> > > For BBC linked data stuff we deal with 3 classes of URblah:
> > >
> > > 1. representation urls (http://www.bbc.co.uk/programmes/b015v0nh.html,
> > > http://www.bbc.co.uk/programmes/b015v0nh.mp,
> > > http://www.bbc.co.uk/programmes/b015v0nh.json,
> > > http://www.bbc.co.uk/programmes/b015v0nh.rdf etc)
> > >
> > > 2. information resource uris
> > > (http://www.bbc.co.uk/programmes/b015v0nh)
> > >
> > > 3. nir uris (http://www.bbc.co.uk/programmes/b015v0nh#programme)
> > >
> > > The first set are never exposed except as content location headers.
> > > And the links inside the html all *target* the second set:
> > >
> > > <a href="/programmes/b015j0ng" typeof="po:Episode"
> > > about="/programmes/b015j0ng#programme"><span class="title"
> > > property="dc:title">Episode 3</span></a>
> > >
> > > The second set just conneg (with some added device detection) to
> > > the first set with no redirection. These are the only uris ever
> > > really exposed. And they're the main benefits of doing any of this.
> > > Because we don't expose (in the address bar) the representation
> > > urls, users link to and share the ir uris.
> > > And the people they share with get back a representation
> > > appropriate to their needs (for device, serialisation,
> > > accessibility, language...)
> > >
> > > The third set is only there to be talked about in the rdf/rdfa, for
> > > people in the world (possibly confined to this list :-) ) who think
> > > the distinction between nirs and irs matters. But it's never
> > > exposed in the address bar
> > >
> > > So our internal (html) links point to the second set. And the
> > > address bar points to the second set. So other people link to us
> > > using the second set and we don't leak google juice all over our
> > > shoes. And the 3rd set is there for those folks who care
> > >
> > > Comparing that to dbpedia: all internal links point to the first
> > > set, there's then a conneg / 303 dance and the thing that ends up
> > > in your address bar is something that's half way between an ir uri
> > > and an ir representation url (.../page/... or .../data/...). So
> > > definitely exposing 2 sets of URblahs to the web and not exposing
> > > the most important one: the information *resource* uri. Which is
> > > the most important one because that's the one you want people to be
> > > able to share without worrying about whether the representation is
> > > appropriate to the needs of the people they're sharing with
> > >
> > > IMO anywhere where you end up with /html or .html or .rdf or .mp or
> > > .cy or /page or /data exposed in the address bar is broken because
> > > the representation returned should be dependent on your accept
> > > headers, not the (information) resource you request. Or we're
> > > forgetting everything uncle roy ever taught us. Exposing
> > > representation in the address bar means your content / stuff can't
> > > be shared universally.
> > > And seo is just a side effect of being able to share / link
> > > universally
> > >
> > > So at the risk of being controversial I think the dbpedia
> > > publishing pattern is a bit of an anti-pattern and we shouldn't be
> > > encouraging other publishers / developers to adopt it
> >
> > In the Web's information space dimension, yes, you could say
> > DBpedia's approach is an anti-pattern, that's basically another way
> > of articulating what I tried to convey via the 1-3 sequence in one of
> > my earlier posts. In the Web's data space dimension DBpedia's
> > approach is natural and obvious. Trouble is, a majority of Web users
> > and Developers are only gradually beginning to sense the
> > aforementioned data space dimension.
> >
> > We can make the manifestation of the Web's data space dimension
> > unobtrusive if indirection is introduced properly which ultimately
> > means:
> >
> > 1. Addresses (URLs) stay in the Address Bar.
> > 2. Actual Object (Resource) Identifiers (generic de-referencable URI
> > based Names) are discovered by introspection (human or machine)
> > 3. Accessing Data Objects (Resources) by Name or Address becomes
> > optional and should be driven by UX patterns.
> >
> > We can never negate Name and Address disambiguation, once we are in
> > the Web's data space dimension. The use of indirection to solve
> > problems in computer science is as old as the subject matter itself
> > :-)
> >
> > > [SNIP]
> > >
> > > Michael
> >
> > Kingsley
> >
> > >
> > > -----Original Message-----
> > > From: Hugh Glaser [mailto:hg@ecs.soton.ac.uk]
> > > Sent: Sat 10/15/2011 2:43 PM
> > > To: Michael Smethurst
> > > Cc: Norman Gray; Linking Open Data; Don Cruickshank
> > > Subject: Re: Address Bar URI
> > >
> > > Thanks Michael.
> > > Very helpful to bring in the SEO perspective, even on a Friday
> > > evening.
> > >
> > > On 14 Oct 2011, at 21:28, Michael Smethurst wrote:
> > >
> > > > Have to say from a pragmatic point of view that using
> > > > replaceState to switch between IR and NIR (or whatever we're
> > > > supposed to call them) URIs feels like bad advice for most
> > > > developers
> > > >
> > > > Users in older browsers are going to see (and copy and paste) one
> > > > set of URIs whilst users of more modern browsers are going to see
> > > > (and copy and paste) another
> > >
> > > Maybe now is not the time to do it - but always being backwards
> > > compatible is not great.
> > > Actually, users of the old browsers are currently disallowed from
> > > copying and pasting the address bar, if what they are after is the
> > > NIR or whatever we call it.
> > > The myexperiment.org site has a real problem with this, and on a
> > > real system.
> > > I think currently they have to accept that users do it, and then
> > > patch up afterwards (by removing the .html).
> > >
> > > > So you end up exposing two sets of URIs to the web and to Google
> > > > et al. Google only consolidates page rank for inbound links on
> > > > 301s (and not 302s or 303s) so you'd end up throwing your
> > > > findability away for an esoteric distinction that no-one quite
> > > > understands. Or understands but doesn't quite agree with :-)
> > >
> > > But I think we are currently exposing two sets of URIs.
> > > If we do the rewrite we will only be exposing one set of URIs to
> > > the users.
> > >
> > > At first I thought "Oh no", we mustn't compromise SEO, and you
> > > describe how rewriting the address bar does.
> > > But now I am afraid I don't understand why it does.
> > > The only change is what the user sees in the Bar - so how would
> > > that affect the SEO?
> > > Can you elaborate on how it affects SEO please?
> > > I see that, for example googling '"Hugh Glaser" site:semanticweb.org'
> > > gets me
> > > http://data.semanticweb.org/person/hugh-glaser as the top hit, and
> > > seems to ignore
> > > http://data.semanticweb.org/person/hugh-glaser/html
> > >
> > > > For now cross browser support for pushState and replaceState is
> > > > pretty shonky [1]. It's useful when product managers demand an
> > > > "app like experience" because you can do all the shiny ajax stuff
> > > > without nasty ajax #s and it all looks good on their iDevices.
> > > > They don't need to know that's not what most people see :-)
> > > >
> > > > With apologies for bringing up S*E*O on a Friday evening. And
> > > > that aside it just feels like asking people to add more
> > > > complexity to sidestep existing complexity that they don't
> > > > understand / see the need for in the first place...
> > >
> > > Remember it was a developer who asked me in the first place, who
> > > saw it as an answer to a serious problem he has with the users'
> > > interactions.
> > > We should always incline to pushing just a bit more complexity onto
> > > the few developers, rather than onto the many, many more users, I
> > > think.
> > >
> > > Best
> > > Hugh
> > >
> > > > [1] http://caniuse.com/#search=replaceState
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: public-lod-request@w3.org on behalf of Hugh Glaser
> > > > Sent: Fri 10/14/2011 4:22 PM
> > > > To: Norman Gray
> > > > Cc: Linking Open Data; Don Cruickshank
> > > > Subject: Re: Address Bar URI
> > > >
> > > > I am really no expert - really, so showing my ignorance here.
> > > > I understand:
> > > >
> > > > JS:
> > > > window.history.replaceState('Object', 'Title', '/another-new-url');
> > > > will do it happily, but I guess HTML5 is required.
> > > > You can use it to change path and search strings, but not
> > > > protocol or domain, I understand.
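
The replaceState behaviour described in the quoted message above
(path and query can change, protocol and domain cannot) can be sketched
as follows. This is an illustrative helper only: the function name
`rewriteAddressBar`, the example.org URLs and the stub history object
are assumptions made for the example, and the history object is passed
in as a parameter so the logic can be tried outside a browser. Whether
rewriting the address bar is advisable at all is exactly what this
thread is debating.

```javascript
// Hypothetical helper: swap the address bar to another same-origin URI
// without reloading the page. In a browser you would pass
// window.history and window.location.href.
function rewriteAddressBar(history, currentHref, targetUri) {
  // Older browsers may not implement the HTML5 History API at all.
  if (!history || typeof history.replaceState !== "function") {
    return false;
  }
  const current = new URL(currentHref);
  const target = new URL(targetUri, current); // resolves relative URIs
  // replaceState only accepts URLs on the same origin (scheme, host,
  // port), so refuse anything else instead of letting it throw.
  if (target.origin !== current.origin) {
    return false;
  }
  history.replaceState(null, "", target.pathname + target.search + target.hash);
  return true;
}

// Stand-in for window.history so the sketch runs outside a browser.
const fakeHistory = {
  urls: [],
  replaceState(state, title, url) {
    this.urls.push(url);
  },
};

console.log(rewriteAddressBar(fakeHistory, "http://example.org/thing/42.html", "/thing/42"));
console.log(rewriteAddressBar(fakeHistory, "http://example.org/thing/42.html", "https://other.example/x"));
console.log(fakeHistory.urls);
```

The same-origin check mirrors the restriction mentioned above; the
feature test on `replaceState` is there because support was patchy at
the time, as noted elsewhere in the thread.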
> > > >
> > > > On 14 Oct 2011, at 15:26, Norman Gray wrote:
> > > >
> > > > > Hugh, greetings.
> > > > >
> > > > > On 2011 Oct 14, at 13:08, Hugh Glaser wrote:
> > > > >
> > > > >> My colleague, Don Cruickshank asked me if it was good practice
> > > > >> to rewrite the URI in the Address Bar to be the NIR, rather
> > > > >> than the IR.
> > > > >> I was surprised, but he tells me that it is permitted in HTML5.
> > > > >
> > > > > Can you expand on this a little?
> > > > >
> > > > > Is this some HTML5 cleverness that lets one declare in the HTML
> > > > > what the address bar should display? Or is it some Javascript
> > > > > kludge^Wgadget that does it, in which case what is the sense in
> > > > > which this is 'permitted' in HTML5 and wasn't before?
> > > > >
> > > > > All the best,
> > > > >
> > > > > Norman
> > > > >
> > > > >
> > > > > --
> > > > > Norman Gray : http://nxg.me.uk
> > > > > SUPA School of Physics and Astronomy, University of Glasgow, UK
> > > > >
> > > >
> > > > --
> > > > Hugh Glaser,
> > > > Web and Internet Science
> > > > Electronics and Computer Science,
> > > > University of Southampton,
> > > > Southampton SO17 1BJ
> > > > Work: +44 23 8059 3670, Fax: +44 23 8059 3045
> > > > Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
> > > > http://www.ecs.soton.ac.uk/~hg/
> > > >
> > > >
> > > > http://www.bbc.co.uk
> > > > This e-mail (and any attachments) is confidential and may contain
> > > > personal views which are not the views of the BBC unless
> > > > specifically stated.
> > > > If you have received it in error, please delete it from your
> > > > system.
> > > > Do not use, copy or disclose the information in any way nor act
> > > > in reliance on it and notify the sender immediately.
> > > > Please note that the BBC monitors e-mails sent or received.
> > > > Further communication will signify your consent to this.
--

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Attachments
- application/pkcs7-signature attachment: S/MIME Cryptographic Signature
Received on Monday, 17 October 2011 16:28:01 UTC