Re: schema.org growth what are the limits?

From: Martin Hepp <martin.hepp@ebusiness-unibw.org>
Date: Mon, 29 Jul 2013 18:09:23 +0200
Cc: "public-vocabs@w3.org" <public-vocabs@w3.org>
Message-Id: <E063EE6B-3A64-42F8-881C-D46B57FEC12E@ebusiness-unibw.org>
To: Bernard Vatant <bernard.vatant@mondeca.com>
Hi Bernard,

I cannot currently find a better reference than [1], but I have already said on this list that I think the schema.org approach will scale only up to about 1,000 types. Beyond that, navigating the type hierarchy and learning how to use the standard will become too burdensome, and reaching consensus will become too difficult.

See also [2] on the assumed effects between vocabulary size and adoption.

One could likely push the boundaries a little by adopting a strictly frame-based paradigm, with properties officially attached only to a type or its subtypes (i.e. no global identifiers, or rather no common meaning for properties across types). This would free us from the need to find catchy, intuitive, yet generally valid names for properties (e.g. "effect" for a MedicalTreatment could mean something different from "effect" for a WebService; all property names and types are made up in this example).

Then schema.org could maybe grow to a somewhat bigger, rather "flat" collection of types and associated properties.
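
To make the difference between the two paradigms concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the registries, the helper functions, and the subtype DrugTherapy are hypothetical, and the type and property names are the made-up ones from the example above. It does not describe how schema.org is actually implemented.

```python
# Hypothetical sketch: global property names vs. frame-based (type-scoped) names.
# All types, properties, and helpers are made up for illustration.

# Global paradigm: one registry, so each name can carry only one meaning.
global_properties = {}

def define_global(name, meaning):
    """Register a property name that is shared by every type."""
    if name in global_properties:
        raise ValueError(f"'{name}' is already taken: {global_properties[name]}")
    global_properties[name] = meaning

# Frame-based paradigm: a property belongs to a type and is inherited by subtypes.
supertypes = {
    "MedicalTreatment": "Thing",
    "WebService": "Thing",
    "DrugTherapy": "MedicalTreatment",  # hypothetical subtype
}
frame_properties = {}

def define_framed(type_name, name, meaning):
    """Attach a property to one type; the same name may be reused elsewhere."""
    frame_properties[(type_name, name)] = meaning

def lookup_framed(type_name, name):
    """Resolve a property by walking up the type hierarchy."""
    current = type_name
    while current is not None:
        if (current, name) in frame_properties:
            return frame_properties[(current, name)]
        current = supertypes.get(current)
    return None

define_global("effect", "clinical outcome of a treatment")

define_framed("MedicalTreatment", "effect", "clinical outcome of a treatment")
define_framed("WebService", "effect", "state change caused by calling the service")
```

With these definitions, lookup_framed("DrugTherapy", "effect") resolves to the meaning attached to MedicalTreatment, while lookup_framed("WebService", "effect") returns the unrelated web-service meaning. A second call to define_global("effect", ...) raises a ValueError, which is the naming pressure the frame-based approach would avoid.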

Personally I am convinced that 1,000 well-chosen types in combination with the additionalType property will be sufficient for very, very powerful modeling. On the other hand, I would be very hesitant to accept big bulk imports of types from external schemas. Let's delegate the more specific (and also more frequently changing, see [3]) specializations to Wikipedia-based services, like www.productontology.org or Wikidata.
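
As a rough illustration of the additionalType pattern, the following Python sketch builds a description that keeps a broad schema.org type and delegates the specific class to an external, Wikipedia-based identifier. The JSON-LD serialization, the helper name, the item name, and the particular productontology.org URI are assumptions chosen for the example, not something prescribed in this thread.

```python
import json

def with_additional_type(base_type, name, external_type_uri):
    """Describe an item with a broad schema.org type plus a more specific external type.

    Hypothetical helper: JSON-LD is assumed here as the output syntax.
    """
    return {
        "@context": "http://schema.org",
        "@type": base_type,                   # one of the well-chosen core types
        "additionalType": external_type_uri,  # specialization maintained elsewhere
        "name": name,
    }

# Illustrative values only; the URI is assumed to follow the productontology.org pattern.
item = with_additional_type(
    "Product",
    "Example laser printer",
    "http://www.productontology.org/id/Laser_printer",
)
print(json.dumps(item, indent=2))
```

The core vocabulary only needs to know about Product; the fine-grained and faster-changing class lives in the external service.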

Martin

[1] http://lists.w3.org/Archives/Public/public-vocabs/2013Jan/0059.html
[2] http://www.heppnetz.de/files/IEEE-IC-PossibleOntologies-published.pdf
[3] http://www.heppnetz.de/files/ConceptualDynamics-EKAW2008-CRC-final6.pdf

On Jul 26, 2013, at 4:13 PM, Bernard Vatant wrote:

> Hello all
> 
> This is a question I have been wanting to raise here for quite a while.
> If my count is correct, the latest version of schema.org has 428 classes + 582 properties = 1010 elements.
> The number of candidate and potential extensions is likely to grow at a steady pace. Now that a handful of early-adopter industries and communities have successfully pushed their vocabularies into schema.org, many others are likely to follow once they discover their obvious interest in doing so. And that moment is now, or quite soon, obviously.
> 
> This growth is a good thing, but it will hit, and actually has already hit, the known limits of this kind of exercise, which once again boils down to representing the whole world in a unique model and a unique namespace.
> 
> The first point is not really an issue. The semantics of schema.org are "soft" enough to accommodate slight inconsistencies between various branches of the vocabulary. For example, the same property used here and there with slightly different semantics will not really be an issue if those branches are unlikely to be used in the same context.
> 
> The unique namespace is another issue. Once a name has been used to identify a class or a property, it can't be reused for something else. New extensions will have to cope with the legacy. Suppose I want to use http://schema.org/study for something other than a MedicalEntity and MedicalStudy. Suppose DDI people want to introduce their concept of Study [1]. What will the negotiation process be?
> 
> More generally is there a limit one could set for a manageable sensible size of the vocabulary? 10,000? 100,000? 
> Is there a plan of any kind to put a limit in size or in time to the vocabulary growth? 
> 
> Thanks for your thoughts.
> 
> Bernard
> 
> [1] http://rdf-vocabulary.ddialliance.org/discovery
> 
> 
> 
> 
> -- 
> Bernard Vatant 
> Vocabularies & Data Engineering
> Tel :  + 33 (0)9 71 48 84 59
> Skype : bernard.vatant
> Blog : the wheel and the hub
> Linked Open Vocabularies : lov.okfn.org 
> --------------------------------------------------------
> Mondeca                             
> 3 cité Nollez 75018 Paris, France
> www.mondeca.com
> Follow us on Twitter : @mondecanews
> ----------------------------------------------------------
> Mondeca is co-chairing
> Long-term Preservation and Governance of RDF Vocabularies 
> at Dublin Core Conference

--------------------------------------------------------
martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen

e-mail:  hepp@ebusiness-unibw.org
phone:   +49-(0)89-6004-4217
fax:     +49-(0)89-6004-4620
www:     http://www.unibw.de/ebusiness/ (group)
         http://www.heppnetz.de/ (personal)
skype:   mfhepp 
twitter: mfhepp

Check out GoodRelations for E-Commerce on the Web of Linked Data!
=================================================================
* Project Main Page: http://purl.org/goodrelations/
Received on Monday, 29 July 2013 16:09:47 UTC
