Re: The king is dressed in void from Giovanni Tummarello on 2008-06-12 (public-lod@w3.org from June 2008)

From: Giovanni Tummarello <giovanni.tummarello@deri.org>
Date: Fri, 13 Jun 2008 00:56:12 +0100
To: "Peter Ansell" <ansell.peter@gmail.com>
Cc: public-lod@w3.org
Message-ID: <210271540806121656x4f165440j14777375bc9fc2a6@mail.gmail.com>
XML is a step forward. The thing started in RDF with something called
"semantic crawling ontology" (sorry the link is broken, will have it
fixed tomorrow http://www.sindice.com/semantic-crawling-ontology.html
) which had all the terms of the sitemap in RDF already. Originally we
wanted to propose a "srobots.rdf" :-)

Then I posted an example on the LOD list many months ago with a request for
comments. The reaction was very clear: "RDF is way too complicated for
hand editing, this is meta-metadata and will need to be hand edited,
sitemaps are MADE to do this and they're extensible". Thus, after many weeks
of more work, the XML version was done (it did take that time).

Please read the posts from way back (e.g. Stefano Mazzocchi's).
Giovanni

> Also, it would be an improvement to specify this information in RDF
> rather than in XML as sitemaps do if people are expecting to be able
> to utilise RDF stores for querying of the information. If we were just
> stuck with sitemaps.xml as the final point we also couldn't extend it
> easily to provide rdf vocabulary terms such as dc and skos. Have
> sitemaps.xml extensions been used for more than providing links to
> data dumps? Ie, has someone actually used the slicing and sparql graph
> parts to describe a data set?
>
> Cheers,
>
> Peter
>
> 2008/6/13 Giovanni Tummarello <giovanni.tummarello@deri.org>:
>> All of your described functionalities are a subset of what semantic
>> sitemaps are for [1].
>>
>> Specs aside, the paper [2] might be of interest to some, in that we
>> went to some length to explain conceptually what publishing and
>> retrieving RDF data on the web means, e.g. using the "linked open data"
>> paradigm, but not only that.
>>
>> Giovanni
>>
>> [1] http://sw.deri.org/2007/07/sitemapextension/
>> [2] http://www.springerlink.com/content/t607305788356537/
>>
>>
>>
>> On Thu, Jun 12, 2008 at 11:19 PM, Peter Ansell <ansell.peter@gmail.com> wrote:
>>>
>>> I see VOID as going past the need to have a search engine in order to
>>> decide which sparql endpoints you need to use to effectively make
>>> particular queries based on sample graphs provided by either SPARQL
>>> construct queries on the dataset sparql end point or by way of an
>>> example document specifying typical rdf statements that are contained
>>> in a data set.
>>>
>>> Your method does not currently enable someone to backtrack to which
>>> endpoints they should use to get more information about someone if
>>> they don't just want to do a text search, or rely on someone pre
>>> indexing billions of rdf statements for them. Linked Data should be
>>> about self-discovery. If someone ever finds a URI used on the semantic
>>> web they can now find a way to a sparql end point with more related
>>> information. Ie, if you found
>>> http://dbpedia.org/resource/Bioinformatics as a URI on the web and
>>> sindice.com had not yet indexed it for you then you could discover it
>>> by chopping off the path and accessing the robots.txt file provided by
>>> that domain, a method currently used by web 1.0 search engine
>>> crawlers. Then you can discover the end_point and a typical example,
>>> along with provenance and subject information by using the void method
>>> (robots.txt->sitemap.xml->void rdf->either SPARQL CONSTRUCT or example
>>> file). Of course, this mechanism will not work directly for purl.org
>>> users.... but it is a good start as far as I can tell for datasets
>>> which utilise their own domains for naming and hosting. Purl.org users
>>> could be supported if you acknowledge that the real provider is the
>>> one redirected to by the 302 redirect from purl.org and you could
>>> start the discovery process there.
>>>
>>> The method of discovering this information doesn't require an extra
>>> level of complexity as the void rdf simply adds the sparql endpoint
>>> information, example information, and provenance information.
>>>
>>> I don't see VOID as having more than one class and two or three
>>> properties when it is eventually created.
>>>
>>> Class: void:Dataset
>>>
>>> Property: void:sparql_end_point, void:example_file, void:example_uri
>>> (which you can use with a SPARQL construct on the
>>> void:sparql_end_point in order to get the same as a void:example_file)
>>>
>>> The rest of the descriptions seem to be allowed for by current
>>> vocabularies such as foaf and dc so the actual specification will be
>>> very highly modular and hence easy to implement and agree on IMO.
>>>
>>> Cheers,
>>>
>>> Peter
>>>
>>> 2008/6/12 Giovanni Tummarello <giovanni.tummarello@deri.org>:
>>>>
>>>> Wasn't RDF all about being self-describing?
>>>>
>>>>  If I say "giovanni works in research" .. do I really need a
>>>> vocabulary that says "this rdf contains information that describes
>>>> what people claim to be working on"? That's suicide. If this is the
>>>> case (which I totally don't believe) then the king is seriously naked
>>>> and there is no hope whatsoever that RDF is going to have any
>>>> relevance (and there, I've said it).
>>>>
>>>> To find one such file, instead of having to invent, agree and mark up,
>>>> I'd say it's much easier to do something like [1] or [2].
>>>> This is not marketing. It's a plea NOT to jump onto more layers of stuff
>>>> when the previous layers still have to show their value and
>>>> adoptability. Solve some simple use cases first, then jump to the
>>>> more complex ones.
>>>>
>>>> Giovanni
>>>>
>>>> [1] http://demo.sindice.com/search?q=*+%3Chttp%3A%2F%2Fwww.w3.org%2F2006%2Fvcard%2Fns%23title%3E+%27research%27&qt=advanced
>>>>
>>>> or http://sindice.com/search?q=http%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2Fknows&qv=http%3A%2F%2Frichard.cyganiak.de%2Ffoaf.rdf%23cygri&qt=ifp
>>>>  (documents which contain statements in which someone claims to
>>>> know Richard)
>>>>
>>>> [2] http://forum.sindice.com/showthread.php?t=10
>>>>
>>>>
>>>
>>>
>>
>
>
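
The first two hops of the discovery chain described in the quoted
message (resource URI -> robots.txt -> sitemap) can be sketched roughly
as below. This is only an illustration under stated assumptions: the
function names robots_url and sitemap_urls are made up for this sketch,
the example robots.txt body is hypothetical, and no network access is
performed. The later hops (sitemap -> void rdf -> SPARQL CONSTRUCT or
example file) are left out because the thread does not pin down how the
sitemap would point at the void description.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(resource_uri: str) -> str:
    """Chop the path off a resource URI and point at the host's robots.txt."""
    parts = urlsplit(resource_uri)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def sitemap_urls(robots_txt: str) -> list:
    """Collect the values of all 'Sitemap:' lines in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop robots.txt comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


if __name__ == "__main__":
    # URI taken from the quoted message; only the robots.txt location is derived.
    print(robots_url("http://dbpedia.org/resource/Bioinformatics"))

    # Hypothetical robots.txt body, NOT the real file of any site.
    sample = (
        "User-agent: *\n"
        "Disallow: /private/\n"
        "Sitemap: http://example.org/sitemap.xml\n"
    )
    print(sitemap_urls(sample))
```

A real crawler would fetch the robots.txt body over HTTP and, for
purl.org-style URIs, would first follow the redirect and start from the
host it lands on, as the quoted message suggests.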
Received on Thursday, 12 June 2008 23:56:52 UTC