- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Mon, 4 Apr 2011 13:51:02 +0530
- To: Martin Hepp <martin.hepp@ebusiness-unibw.org>
- Cc: Giovanni Tummarello <giovanni.tummarello@deri.org>, Francisco Javier López Pellicer <fjlopez@unizar.es>, semantic-web <semantic-web@w3c.org>
Hi Martin,

On 4 Apr 2011, at 13:44, Martin Hepp wrote:

> Since Semantic Sitemaps don't validate in Google tools, it is hard to convince site-owners to use them.
>
> However, there is a work-around: You can publish BOTH a regular sitemap and a semantic sitemap for your site and list both in the robots.txt file.
>
> Google should accept the regular one (you could also submit this to them manually) and ignore the semantic sitemap. RDF-aware crawlers would find both and could prefer the semantic sitemap.

Yes, this works AFAIK. But this style of using Semantic Sitemaps loses their main advantage: being a simple extension of an established format that many webmasters already use.

Best,
Richard

>
> The downside of this approach is that you risk increasing the crawling load on your site. But I would assume you could minimize the overlap of URIs in both - e.g., you do not need to tell Google of your compressed RDF dump file resources.
>
> Best wishes
>
> Martin
>
> On Apr 4, 2011, at 8:53 AM, Richard Cyganiak wrote:
>
>> Hi Giovanni,
>>
>> Semantic Sitemaps seemed like a good idea because it was a very simple extension to standard XML Sitemaps, which are a widely adopted format supported by Google and other major search engines.
>>
>> What killed Semantic Sitemaps for me is the fact that adding *any* extension element, even a single line, makes Google reject the Sitemap.
>>
>> In practice, XML Sitemaps are not an extensible format.
>>
>> On the question of complexity of Sitemaps and VoID: Publishers will get it right if and only if there is a) some serious consumption of the data that publishers actually care about and b) a validator. At the moment neither a) nor b) is given, neither for Semantic Sitemaps nor for VoID.
>>
>> Best,
>> Richard
>>
>>
>> On 3 Apr 2011, at 18:16, Giovanni Tummarello wrote:
>>
>>> With the Sitemap extension called Semantic Web Sitemap we did indeed
>>> give a very simple alternative.
>>> It was also partially adopted
>>>
>>> http://www.arnetminer.org/viewpub.do?pid=190125
>>>
>>> but what breaks it for that protocol is the part about explaining (to
>>> a machine) how to go from a dump to "linked data publishing", which is
>>> a very fuzzy concept, as fuzzy as "describe"
>>>
>>> the chances of someone getting that file actually right were slim to
>>> begin with (we had to correct several times those who tried) and as
>>> far as my reports go the chances of getting VoID right
>>> (which is in RDF, therefore much less intuitive for human editing than
>>> a simple XML like sitemaps) can't get much better.
>>>
>>> i personally think a single line in the sitemap.xml file is really
>>> what's needed, so wrt this, this part of the extension really does its
>>> job. however until there is someone seriously consuming this there
>>> won't be a need to standardize.
>>>
>>> Gio
>>>
>>>
>>> On Sun, Apr 3, 2011 at 11:06 AM, Francisco Javier López Pellicer
>>> <fjlopez@unizar.es> wrote:
>>>>
>>>>>
>>>>> A related question is SPARQL endpoint fingerprinting... Which
>>>>> is not necessarily straightforward as often people put them
>>>>> behind HTTP reverse proxies that stomp on identifiable
>>>>> headers... In principle it would be interesting to do a
>>>>> survey to see the relative prevalence of different SPARQL
>>>>> implementations.
>>>>
>>>> Agree.
>>>>
>>>> SPARQL endpoint discovery and SPARQL endpoint fingerprinting could be two
>>>> research lines related to the architecture of SemWeb:
>>>>
>>>> - Indexing SPARQL endpoints (with/without the help of vocabularies such as
>>>> VoID) -> A hint for knowing the effective size of the SemWeb initiatives
>>>>
>>>> - SPARQL endpoint fingerprint identification -> "Market share" analysis of
>>>> SPARQL technology prevalence
>>>>
>>>> -- fjlopez
>>>>
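
The work-around Martin describes in the thread (two sitemaps listed in robots.txt, with RDF-aware crawlers preferring the extended one) can be sketched from the crawler's side roughly as follows. This is only an illustration under stated assumptions: the example.org URLs, the `ext:dataDump` element and its namespace are placeholders rather than the actual Semantic Sitemap vocabulary, and the helper names are hypothetical. The only fixed detail is the standard sitemaps.org namespace; any element outside it is treated as an extension element.

```python
import xml.etree.ElementTree as ET

# Namespace of the standard XML Sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_urls(robots_txt):
    """Return the URLs from all 'Sitemap:' lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


def has_extension_elements(sitemap_xml):
    """True if any element lies outside the standard sitemap namespace."""
    root = ET.fromstring(sitemap_xml)
    prefix = "{" + SITEMAP_NS + "}"
    return any(not el.tag.startswith(prefix) for el in root.iter())


def pick_sitemap(sitemaps, rdf_aware):
    """Choose one sitemap URL from a {url: xml_text} mapping.

    An RDF-aware crawler prefers a sitemap with extension elements and
    falls back to a plain one; any other crawler only takes a plain one.
    """
    semantic = [u for u, xml in sitemaps.items() if has_extension_elements(xml)]
    plain = [u for u in sitemaps if u not in semantic]
    candidates = (semantic or plain) if rdf_aware else plain
    return candidates[0] if candidates else None


# Hypothetical site publishing both sitemaps, as in the work-around.
ROBOTS_TXT = """User-agent: *
Disallow:
Sitemap: http://example.org/sitemap.xml
Sitemap: http://example.org/semantic-sitemap.xml
"""

PLAIN = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://example.org/</loc></url>"
    "</urlset>"
)

# 'ext' is a placeholder namespace, not the real Semantic Sitemap schema.
SEMANTIC = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" '
    'xmlns:ext="http://example.org/hypothetical-ext#">'
    "<url><loc>http://example.org/</loc></url>"
    "<ext:dataDump>http://example.org/dump.rdf.gz</ext:dataDump>"
    "</urlset>"
)

urls = sitemap_urls(ROBOTS_TXT)
docs = dict(zip(urls, [PLAIN, SEMANTIC]))  # stands in for fetching each URL
print(pick_sitemap(docs, rdf_aware=True))
print(pick_sitemap(docs, rdf_aware=False))
```

The same namespace check also illustrates Richard's objection: a consumer that validates strictly against the standard schema would reject the second document because of its single extra element.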
Received on Monday, 4 April 2011 08:21:48 UTC