- From: Mark Fallu <m.fallu@griffith.edu.au>
- Date: Wed, 30 Jul 2014 15:38:22 +1000
- To: Linked Data community <public-lod@w3.org>
- Message-ID: <CACbbyygSVYvjUz0CLP0ukvduOv51kZR5wTPsVN958MTf3bEBiw@mail.gmail.com>
Hi Michael,

You asked:

> How can URIs from sparql endpoints or OAI-PMH contribute to page rank?

If party A:

- produces a system that uses 303-based Cool URIs to describe its content and, in addition to web pages, exposes that content to the world via a SPARQL endpoint or OAI-PMH,

and party B:

- harvests information via the SPARQL endpoint or OAI-PMH and produces a public representation of that content that links back to party A,

then, if the link back is the Cool URI that resolves to a page via a 303 redirect and content negotiation, web spiders etc. will not be able to follow that inbound link.

This means that some of the advantage of being machine-harvestable is lost. Sure, your content is indexed, but the "authority" that comes from other people and systems citing and reusing your content is greatly diluted.

Cheers,

Mark

On Sat, Jul 19, 2014 at 1:52 AM, Michael Brunnbauer <brunni@netestate.de> wrote:
>
> Hello Mark,
>
> I cannot remember this important topic coming up earlier - which is a bit
> disturbing.
>
> The problem would be mitigated by people using the URI they see for
> linking.
>
> Why not use the HTML URLs in the HTML pages for internal page rank flow?
>
> How can URIs from sparql endpoints or OAI-PMH contribute to page rank?
>
> A real problem would be RDFa where href also sets the object of a triple.
>
> Regards,
>
> Michael Brunnbauer
>
> On Fri, Jul 18, 2014 at 10:05:17PM +1000, Mark Fallu wrote:
> > If the links we present to the outside world for harvesting, e.g. via
> > sparql endpoint, OAI-PMH or open social widget etc., are the canonical
> > "individual" URIs, clients will be able to get to the "display" URL, but
> > the Google page rank that would normally flow from these external links
> > will not.
> >
> > The specification of a 303 redirect describes it as:
> > http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
> >
> > > "The response to the request can be found under a different URI and
> > > SHOULD be retrieved using a GET method on that resource. This method
> > > exists primarily to allow the output of a POST-activated script to
> > > redirect the user agent to a selected resource. *The new URI is not a
> > > substitute reference for the originally requested resource*. The 303
> > > response MUST NOT be cached, but the response to the second
> > > (redirected) request might be cacheable.
> > >
> > > The different URI SHOULD be given by the Location field in the
> > > response. Unless the request method was HEAD, the entity of the
> > > response SHOULD contain a short hypertext note with a hyperlink to the
> > > new URI(s)."
> >
> > Google correctly implements the specification and does not assign the
> > page rank of the "individual" URI to the "display" URL, as it is "*not a
> > substitute reference for the originally requested resource*".
> >
> > The same is true of internal links: a high page rank home page will not
> > pass page rank on to "display" URLs if the pathway to those URLs is via
> > "individual" URI links.
> >
> > I am not sure what the solution is here, as it seems the realms of SEO
> > and the conventions of the web they are built on are not a good fit for
> > semantic web best practice.
> >
> > The most minimal compromise I can think of is to move away from the use
> > of a 303 redirect to a redirect that conserves the flow of Google page
> > rank.
> >
> > - "302 Found" redirect is the recommended replacement for 303 for
> >   clients that do not support HTTP 1.1, and it does allow a certain
> >   amount of Google page rank to flow.
> > - "301 Moved Permanently" is a poor fit for the Cool URI pattern, but
> >   passes on the full page rank of the links.
> > - rewriting all URIs to the URL would also work, but would break the
> >   Cool URI pattern.
> >
> > The pragmatist in me feels that if we are going to make a change for the
> > purposes of SEO, it might as well be the one with the best return, i.e.
> > a 301 redirect.
> >
> > Note: Indexing is not the problem here; content is indexed. The issue
> > relates to page rank not flowing through a 303 redirect.
> >
> > I have tested and can confirm that 303 redirects are an issue for a
> > number of reasons:
> >
> > - page rank does not flow through a 303 redirect
> > - page rank cannot be assigned from a URL to a URI with a rel=canonical
> >   tag if the URI does a 303 redirect (preventing aggregation of page
> >   rank from external links to the URL)
> > - URI and URL are indexed separately
> > - RDFa schema.org representations of URIs do not translate to the URL
> >   (i.e. a representation described at URL A, talking about URI B, does
> >   not get connected to the representation described at URL B)
> > - URL parameters are not passed by a 303 redirect
> > - impact on the functionality of Google Analytics tracking, e.g.
> >   traversing the site is seen as a series of direct page visits.
> >
> > Essentially - as far as search engines are concerned - every URL and URI
> > is an island, with no connections between them. At best a URL can
> > express a rel=canonical back to its corresponding URI, but no page rank
> > will flow through links.
> >
> > Any guidance you can provide would be appreciated.
> >
> > --
> >
> > o-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> > | Mark Fallu
> > | Manager, Research Data (Acting)
> > | Office for Research
> > | Bray Centre (N54) 0.10E
> > | Griffith University, Nathan Campus
> > | Queensland 4111 AUSTRALIA
> > |
> > | E-mail: m.fallu@griffith.edu.au
> > | Mobile: 04177 69778
> > | Phone: +61 (07) 373 52069
> > o-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>
> --
> ++ Michael Brunnbauer
> ++ netEstate GmbH
> ++ Geisenhausener Straße 11a
> ++ 81379 München
> ++ Tel +49 89 32 19 77 80
> ++ Fax +49 89 32 19 77 89
> ++ E-Mail brunni@netestate.de
> ++ http://www.netestate.de/
> ++
> ++ Sitz: München, HRB Nr.142452 (Handelsregister B München)
> ++ USt-IdNr. DE221033342
> ++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
> ++ Prokurist: Dipl. Kfm. (Univ.) Markus Hendel
>

--

o-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
| Mark Fallu
| Manager, Research Data (Acting)
| Office for Research
| Bray Centre (N54) 0.10E
| Griffith University, Nathan Campus
| Queensland 4111 AUSTRALIA
|
| E-mail: m.fallu@griffith.edu.au
| Mobile: 04177 69778
| Phone: +61 (07) 373 52069
o-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
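
For readers who have not seen the pattern under discussion, the following is a rough Python sketch of how an "individual" URI might be redirected to a "display" page or an RDF document depending on the Accept header, with the status code left as a parameter so the 303 versus 301 trade-off in this thread is visible. The path layout (/individual/, /display/, /rdf/) and the function name redirect_for are assumptions made up for illustration, not taken from any system mentioned here, and the Accept handling is deliberately crude (it ignores q-values).

```python
# Hypothetical URL layout, assumed for illustration only:
#   /individual/<id>  names the thing itself (the "individual" URI)
#   /display/<id>     is the HTML page about it (the "display" URL)
#   /rdf/<id>         is the RDF document about it
HTML_TYPES = ("text/html", "application/xhtml+xml")


def redirect_for(path, accept, status=303):
    """Return (status, location) for a request to an "individual" URI.

    status defaults to 303 See Other, as in the Cool URI pattern; passing
    301 models the alternative proposed in the thread.
    """
    prefix = "/individual/"
    if not path.startswith(prefix):
        raise ValueError("not an individual URI: " + path)
    ident = path[len(prefix):]

    # Crude content negotiation: substring match, no q-value parsing.
    wants_html = any(t in accept for t in HTML_TYPES) or accept in ("", "*/*")
    target = "/display/" if wants_html else "/rdf/"
    return status, target + ident


if __name__ == "__main__":
    print(redirect_for("/individual/n123", "text/html"))
    print(redirect_for("/individual/n123", "application/rdf+xml"))
    print(redirect_for("/individual/n123", "text/html", status=301))
```

A browser or crawler asking for text/html is sent to the display page and an RDF client to the RDF document; only the status code differs between the 303 setup and the 301 alternative, which is why the choice of code is the whole of the trade-off described above.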
Received on Wednesday, 30 July 2014 05:39:08 UTC