Re: Possible choice for the RDFa default profile prefixes

On Mon, May 2, 2011 at 1:02 AM, Ivan Herman <ivan@w3.org> wrote:

> Guys,
>
> I have been working, on and off, with the Sindice guys the past few weeks
> to see if we can extract a suitable list for vocabularies for a default
> profile. The issue was to find a proper and objective way to determine what
> should be in the list of those vocabularies. Here is what we came up with:
>
> - Any vocabulary that is defined via a W3C Recommendation or a W3C WG/IG
> Note is automatically added to the list. This obviously includes rdf, rdfs,
> and skos, but also void.
> - For the rest we rely on search results, and some processing of those
> results, as performed by the Sindice search engine.
> (- If I can also get a similar crawl result from other sources, like Yahoo,
> then we would be able to merge those results somehow. But, at the moment,
> that is not the case...)
>
> I have collected the results in a table on [1], and have also given some
> details on how the crawl results were used and processed[2].
>
> Looking at the results, my proposal is, first of all, to rank along the last
> column (that is the default ranking on [1] and also the criterion used to
> choose the top 100), because that gives a measure of how widespread the
> usage of a particular vocabulary is. (There are some interesting cases like
> the bio/0.1 one, with a large number of domains and a low number of 2nd
> level domains: Giovanni's analysis is that this comes from places like My
> Opera, which provides a large number of blogs for users, each with a local
> domain.)
> Furthermore, the proposal is to draw a line after the
> rdf.data-vocabulary.org/# one (that is the vocabulary for Google's rich
> snippets), where there is indeed a drop in the numbers of the last column.
> One could argue the case for a number of other vocabularies, and the most
> notable issue is that the GoodRelations ontology, which has significant
> traction out there, falls outside the list. However, I would like to avoid
> arbitrary choices and stick to objective numbers; such a discussion in the
> community might go on for ages and that is not what we want. Also, in my
> view, the number of prefixes in the default profile should not be
> large...
>
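
The selection procedure described above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the last
column of [1] is modelled as a single integer called `spread`, the
`VocabStats` record and the `select_vocabularies` function are hypothetical
names, and the numbers in the sample are made up, not taken from the
Sindice crawl.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class VocabStats:
    uri: str
    spread: int                # assumed stand-in for the last column of the table
    w3c_defined: bool = False  # defined via a W3C Recommendation or WG/IG Note


def select_vocabularies(stats: Iterable[VocabStats],
                        cutoff_uri: str,
                        top_n: int = 100) -> List[str]:
    """Return W3C-defined vocabularies plus the ranked ones up to the cutoff."""
    stats = list(stats)
    # W3C-defined vocabularies are added regardless of crawl numbers.
    selected = [s.uri for s in stats if s.w3c_defined]
    # The rest are ranked by the last column, highest first, top N only.
    ranked = sorted((s for s in stats if not s.w3c_defined),
                    key=lambda s: s.spread, reverse=True)[:top_n]
    for s in ranked:
        selected.append(s.uri)
        if s.uri == cutoff_uri:
            break  # draw the line right after this vocabulary
    # If cutoff_uri never appears, all of the top N are kept.
    return selected


# Made-up numbers, for illustration only.
sample = [
    VocabStats("http://www.w3.org/2004/02/skos/core#", 10, w3c_defined=True),
    VocabStats("http://xmlns.com/foaf/0.1/", 500),
    VocabStats("http://purl.org/goodrelations/v1#", 50),
    VocabStats("http://rdf.data-vocabulary.org/#", 200),
]
chosen = select_vocabularies(sample, "http://rdf.data-vocabulary.org/#")
```

With this sample, the W3C-defined vocabulary is kept regardless of its
numbers, and anything ranked below the cutoff vocabulary is left out.
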
> The crawl results do not say what the prefix should be; they only give the
> vocabulary. The choice of the prefix should probably be based on the
> documentation of the vocabulary. In disputed cases we could of course
> contact the vocabulary authors.
>

prefix.cc should also help to decide on prefixes based on popular usage,
e.g. http://prefix.cc/?q=http://xmlns.com/foaf/0.1/
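
As a rough sketch of how the two sources could be combined: prefer the
prefix given in the vocabulary's documentation, and fall back to the most
commonly used one otherwise. The function names and the usage counts below
are hypothetical, and the lookup URL simply follows the form shown above;
nothing here calls prefix.cc itself.

```python
from typing import Mapping, Optional


def prefix_cc_query_url(vocab_uri: str) -> str:
    """Build a lookup URL in the same form as the example above."""
    return "http://prefix.cc/?q=" + vocab_uri


def choose_prefix(documented: Optional[str],
                  usage: Mapping[str, int]) -> Optional[str]:
    """Pick a prefix: documentation first, popular usage second."""
    if documented:
        return documented
    if not usage:
        return None  # disputed or unknown: ask the vocabulary authors
    # Most used prefix wins; ties go to the alphabetically first one.
    return max(sorted(usage), key=lambda p: usage[p])


# Made-up usage counts, for illustration only.
url = prefix_cc_query_url("http://xmlns.com/foaf/0.1/")
picked = choose_prefix(None, {"foaf": 10, "f": 2})
```
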

Steph.


>
> Opinions?
>
> Ivan
>
> P.S. I want to publicly express my thanks to the Sindice team. They did the
> work, I was, mostly, nagging only...:-)
>
> [1] http://www.w3.org/2010/02/rdfa/profile/Sindice-crawl.html
> [2] http://www.w3.org/2010/02/rdfa/profile/Sindice-crawl.html#method
>
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> PGP Key: http://www.ivan-herman.net/pgpkey.html
> FOAF: http://www.ivan-herman.net/foaf.rdf
>

Received on Monday, 2 May 2011 15:58:43 UTC