Re: Using schema.org Dataset metadata properties

From: Ivan Herman <ivan@w3.org>
Date: Sun, 14 Sep 2014 09:27:33 +0200
Cc: Jeni Tennison <jeni@jenitennison.com>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-Id: <AA34395B-9005-41B2-8434-DFCFEF9597CA@w3.org>
To: Yakov Shafranovich <yakov-ietf@shaftek.org>

On 14 Sep 2014, at 03:32 , Yakov Shafranovich <yakov-ietf@shaftek.org> wrote:

> Is there any way to know which one of the two has greater adoption?

Great question, Yakov. And I do not think there is a clear-cut answer.

Chris Bizer & co have made some crawls, although with different questions in mind:

http://bit.ly/1qTQrBj
http://webdatacommons.org/structureddata/2013-11/stats/stats.html

The results seem to concentrate on individual terms rather than vocabularies, and they also seem to show a large majority of schema.org terms. However, there is a caveat: the crawl is on HTML files; these are the primary 'target' of schema.org so far (I know this is changing). Webmasters follow the search engines' advice on that and use schema.org vocabulary terms. Also, the majority of HTML files still use microformats for markup; microformats have turned, essentially, into a schema.org syntax. One more reason for webmasters to favour schema.org.

However, CSV files are not HTML. Of course, if they are indexed by search engines through schema.org terms, this will become yet another incentive to use those terms as part of the metadata, but that is still some way off. But CSV is also data that may be used in different settings on the Web of Data, where the situation may be different. I.e., the comparison should also be with other data sets out there, regardless of format. For that purpose, another source of statistics may be what the LOV people have done:

http://lov.okfn.org/dataset/lov/stats/

which, though referring to terms again, seems to give DCTERM as a winner. But there is again a caveat: the data comes from, essentially, RDF data spread around the world where, so far, the schema.org vocabulary is not widely used. My personal prediction is that this may change a lot over the years as schema.org becomes larger.

Also, none of these data say anything about the usage of metadata vocabularies in a non-RDF setting (I consider microdata as RDF in this sense). I suspect that DCTERM is pretty much a clear winner in that domain (e.g., for metadata attached to electronic books via good old <meta> elements, to cite an example I know a bit better). But I do not have any data to back this up.

Ivan


> 
> Yakov
> 
> On Sat, Sep 13, 2014 at 12:28 PM, Jeni Tennison <jeni@jenitennison.com> wrote:
>> Hi,
>> 
>> In the current metadata document here:
>> 
>>  http://w3c.github.io/csvw/metadata/#common-properties
>> 
>> the spec adopts the list of Dublin Core properties for describing tables etc. As ISSUE 6 says, this might not be the right choice: there might be other standard vocabularies that should be used instead or as well.
>> 
>> On the call this week, Dan suggested using schema.org instead, namely the properties on Dataset here:
>> 
>>  http://schema.org/Dataset
>> 
>> The properties there are informed by DCAT which itself was informed by Dublin Core.
>> 
>> Any thoughts?
>> 
>> Jeni
>> 
>> -----Original Message-----
>> From: CSV on the Web Working Group Issue Tracker <sysbot+tracker@w3.org>
>> Reply: CSV on the Web Working Group <public-csv-wg@w3.org>
>> Date: 10 September 2014 at 13:23:37
>> To: jeni@jenitennison.com <jeni@jenitennison.com>
>> Subject:  ACTION-26: Write to mailing list re using schema.org rather than dublin core for metadata about csv files, then binding decision on following telcon (CSV on the Web Working Group)
>> 
>>> ACTION-26: Write to mailing list re using schema.org rather than dublin core for metadata
>>> about csv files, then binding decision on following telcon (CSV on the Web Working Group)
>>> 
>>> http://www.w3.org/2013/csvw/track/actions/26
>>> 
>>> On: Jeni Tennison
>>> Due: 2014-09-17
>>> 
>>> If you do not want to be notified on new action items for this group, please update your
>>> settings at:
>>> http://www.w3.org/2013/csvw/track/users/33715#settings
>>> 
>>> 
>>> 
>> 
>> --
>> Jeni Tennison
>> http://www.jenitennison.com/
>> 
> 


----
Ivan Herman, W3C 
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
WebID: http://www.ivan-herman.net/foaf#me

Received on Sunday, 14 September 2014 07:28:07 UTC