Re: Proposal for Schema.org extension mechanism from Dan Brickley on 2015-02-15 (public-vocabs@w3.org from February 2015)

From: Dan Brickley <danbri@google.com>
Date: Sun, 15 Feb 2015 17:19:43 +0000
To: Bo Ferri <zazi@smiy.org>
Cc: W3C Web Schemas Task Force <public-vocabs@w3.org>
Message-ID: <CAK-qy=5rsM2x5XWM7xrpNHuEHBub8YeNtC7X25dK68H7E58QpA@mail.gmail.com>
On 15 February 2015 at 14:47, Bo Ferri <zazi@smiy.org> wrote:

> (sorry, I can't resist ;) )
>
> interesting and neat idea. Nevertheless nothing new at all (you know this!
> better than me ;) ). So how does this relate to the already existing, open
> approach of "simply" publishing a(n) ontology/vocabulary with a PURL and
> make use of it. Do we really need everything under the (cooperate (?))
> schema.org umbrella? You know* that the "one vocabulary rule them all"
> approach (even with extension mechanism) doesn't scale and couldn't make any
> domain (and webmaster who should apply it) happy (this is the world out
> there).

It would be a great thing if there were 100s or 1000s of RDF-based
vocabularies out there, with lots of publishers and consumers making
use of them all. Schema.org is a means to an end rather than an end in
itself - a practical project to help bootstrap this whole thing out of
the slow motion progress we've been making these last ~17+ years.

What we in the RDF community have seen since work began on
http://www.w3.org/TR/rdf-schema/ in 1997, is that while it is great to
have the option of an entirely decentralized composition mechanism,
there are also very practical costs for vocabularies being so weakly
coordinated.


Triples/graphs are not the easiest things to work with at the best of
times. The data model is so permissively flexible that creating
applications against it is difficult. These difficulties were in some
situations (e.g. web search, schema.org's origins) made worse by the
chaotic state of the vocabulary environment.

The schema.org extensions discussion I think makes clear that there is
a spectrum here. At one extreme are vocabularies developed without
any communication or coordination whatsoever. Less extreme is for some
weak coordination and linking between vocabularies, e.g. foaf:focus is
defined in terms of skos:Concept
(http://xmlns.com/foaf/spec/#term_focus) or linked data vocabularies
that relate their terms to others with sub/supertype, equivalence etc
relationships, even if the designs are essentially independent. At the
other extreme would be a single vocabulary that attempted to model
everything in a monolithic way. It is important to understand that
schema.org is not so rigid. Schema.org is by practical necessity very
pragmatic, and supports, for example, both library-oriented and
bookshop-like ways of describing books. The pieces of schema.org that
came from Good Relations have some ways of talking about mass produced
objects via prototypes (http://schema.org/ProductModel) which could
also be applied to books, since books are mass produced; but we also
have adopted a use of isPartOf as an alternate model for addressing
FRBR-like use cases, without the rigidity of FRBR. The details don't
matter here - my point is that even within a single large vocabulary
you naturally have a kind of pluralism. Any vocabulary at schema.org's
scale will have situations where there are several ways of saying the
same thing, depending on perspective and context.

The gap that the extension model fills is between the relative chaos
of very loosely coupled linked data vocabularies (independent designs,
documentation, versioning, modeling styles) -vs- the relatively highly
integrated approach of core schema.org. We want a bit more chaos than
core schema.org but a lot less chaos than the total free-for-all of
the classic Semantic Web. For those who would rather pick and choose
from the entire range of diverse vocabularies, the LOV project (see
lov.okfn.org) offers a very useful directory. For those who want a bit
more consistency in terms of documentation / navigation / usage
examples, a common underlying core, and richer domain-oriented
extensions, we have schema.org and its approaches to extension as
discussed here.

Schema.org is an exploration of the idea that we'll get further,
faster by sharing a substantially sized common vocabulary as well as
underlying graph data model. But it remains a part of that larger
RDF-based framework and can be freely mixed with independently managed
vocabularies.

Dan

> I'm also raising this issue with the idea of web neutrality in mind. So yes,
> this is a community of general and specific domain modelling experts. And
> yes we want to get every webmaster at the end. So yes, we need a slightly
> lightweight base schema.org vocabulary (please strip out all overly
> domain-specific stuff into extensions).
> Thus, I doubt that we'll need a corporate schema.org extension approach, but
> simply a mechanism where I can plug in my more specific domain vocabulary.
> And this mechanism already exists, i.e., make use of further namespaces
> and refer to dereferenceable descriptions (for any kind of consumer; via
> (P)URLs) of these vocabularies.
>
> Cheers,
>
>
> Bo
>
>
> *) we knew it before, however, tried it again (with much more success than
> ever)
>
>
> On 2/13/2015 10:34 PM, Guha wrote:
>>
>>
>> Schema.org extension mechanism
>>
>>
>>
>> Motivation
>>
>>     As schema.org <http://schema.org> adoption has grown, a number of
>> groups with more specialized vocabularies have expressed interest in
>> extending schema.org <http://schema.org> with their terms. The most
>> prominent example of this is GS1 with product vocabularies. Other
>> examples include real estate, medical and bibliographic information.
>> Even in something as common as human names, there are groups interested
>> in creating the vocabulary for representing all the intricacies of names.
>>
>>
>> Outline of solution
>>
>>
>> There are two kinds of extensions: reviewed extensions and external
>> extensions. Both kinds of extensions typically add subclasses and
>> properties to the core. Properties may be added to existing and/or new
>> classes. More generally, they are an overlay on top of the core, and so
>> they may add domains/ranges, superclasses, etc. as well. Extensions have
>> to be consistent with the core schema.org <http://schema.org>. Every
>> item in the core (i.e., www.schema.org <http://www.schema.org>) is also
>> in every extension. Extensions might overlap with each other in concepts
>> (e.g., two extensions defining terms for financial institutions, one
>> calling it FinancialBank and other calling it FinancialInstitution), but
>> we should not have the same term being reused to mean something
>> completely different (e.g., we should not have two extensions, one using
>> Bank to mean river bank and the other using Bank to mean financial
>> institution).
>>
>>
>> Reviewed Extensions
>>
>> Each reviewed extension (say, e1), gets its own chunk of schema.org
>> <http://schema.org> namespace: e1.schema.org <http://e1.schema.org>. The
>> items in that extension are created and maintained by the creators of
>> that extension.  Reviewed extensions are very different from proposals.
>> A proposal, if accepted (possibly with modifications), could either go
>> into the core or become a reviewed extension.
>>
>>
>> A reviewed extension is something that has been looked at and discussed
>> by the community, albeit not as much as something in the core. We also
>> expect a reviewed extension to have strong community support, preferably
>> in the form of a few deployments.
>>
>>
>> External Extensions
>>
>> Sometimes there might be a need for a third party (such as an app
>> developer) to create extensions specific to their application. For
>> example, Pinterest might want to extend the schema.org
>> <http://schema.org> concept of ‘Sharing’ with ‘Pinning’. In such a case,
>> they can create schema.pinterest.com <http://schema.pinterest.com> and
>> put up their extensions, specifying how it links with core schema.org
>> <http://schema.org>. We will refer to these as external extensions.
>>
>> How it works for webmasters
>>
>> All of Schema.org core and all of the reviewed extensions will be
>> available from the schema.org <http://schema.org> website. Each
>> extension will be linked to from each of the touch points it has with
>> the core. So, if an extension (say, having to do with Legal stuff)
>> creates legal.schema.org/LegalPerson
>> <http://legal.schema.org/LegalPerson> which is a subclass of
>> schema.org/Person <http://schema.org/Person>, the Person will link to
>> LegalPerson.  Typically, a webpage / email will use only a single
>> extension (e.g., legal), in which case, instead of ‘schema.org
>> <http://schema.org>’ they say ‘legal.schema.org
>> <http://legal.schema.org>’ and use all of the vocabulary in
>> legal.schema.org <http://legal.schema.org> and schema.org
>> <http://schema.org>.
>>
>>
>> As appropriate, the main schema.org <http://schema.org> site will also
>> link to relevant external extensions. With external extensions, the use
>> of multiple namespaces is unavoidable.
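The "touch points" idea above can be sketched in a few lines of Python. This is only an illustration, not how the schema.org site is implemented: the `EXTENSION_TERMS` table and the `touch_points` helper are hypothetical, and `LegalFirm` is an invented term added next to the `LegalPerson` example from the text.

```python
from collections import defaultdict

CORE = "http://schema.org/"

# Hypothetical extension definitions: extension type IRI -> superclass IRI.
# LegalPerson comes from the proposal text; LegalFirm is invented here.
EXTENSION_TERMS = {
    "http://legal.schema.org/LegalPerson": "http://schema.org/Person",
    "http://legal.schema.org/LegalFirm": "http://schema.org/Organization",
}

def touch_points(extension_terms, core=CORE):
    """Index each core type by the extension types that subclass it."""
    index = defaultdict(list)
    for term, parent in extension_terms.items():
        # Only superclasses in the core namespace count as touch points.
        if parent.startswith(core):
            index[parent].append(term)
    return dict(index)

links = touch_points(EXTENSION_TERMS)
print(links["http://schema.org/Person"])
```

A site generator could use such an index to render, on the Person page, a link to each extension type that specializes it.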
>>
>>
>> What does someone creating an extension need to do
>>
>>   We would like extension creators to not have to worry about running a
>> website for their extension. Once the extension is approved, they simply
>> upload a file with their extension into a certain directory on github.
>> Changes are made through the same mechanism.
>>
>>
>> Since the source code for schema.org <http://schema.org> is publicly
>> available, we encourage creators of external extensions to use the same
>> application.
>>
>>
>> Examples
>>
>>
>> Archives example in RDFa
>>
>>
>> This example uses a type that makes sense for archival and bibliographic
>> applications but which is not currently in the schema.org
>> <http://schema.org> core: Microform, defined as "Any form, either film
>> or paper, containing microreproductions of documents for transmission,
>> storage, reading, and printing. (Microfilm, microfiche, microcards, etc.)"
>>
>>
>> The extension type is taken from http://bibliograph.net/Microform
>> (which on this proposed model would move to bib.schema.org
>> <http://bib.schema.org>), which is a version of the open source schema.org
>> <http://schema.org> codebase that overlays bibliographic extras onto
>> the core schema.org <http://schema.org> types. The example is adapted
>> from http://schema.org/workExample.
>>
>>
>>
>> <div vocab="http://bib.schema.org/">
>>
>>     <p typeof="Book" resource="http://www.freebase.com/m/0h35m">
>>
>>         <em property="name">The Fellowship of the Rings</em> was written
>> by
>>
>>         <span property="author">J.R.R Tolkien</span> and was originally
>> published
>>
>>         in the <span property="publisher" typeof="Organization">
>>
>>             <span property="location">United Kingdom</span> by
>>
>>             <span property="name">George Allen & Unwin</span>
>>
>>         </span> in <time property="datePublished">1954</time>.
>>
>>         The book has been republished many times, including editions by
>>
>>         <span property="workExample" typeof="Book">
>>
>>             <span property="publisher" typeof="Organization">
>>
>>                 <span property="name">HarperCollins</span>
>>
>>             </span> in <time property="datePublished">1974</time>
>>
>>             (ISBN: <span property="isbn">0007149212</span>)
>>
>>         </span> and by
>>
>>         <span property="workExample" typeof="Book Microform">
>>
>>             <span property="publisher" typeof="Organization">
>>
>>                 <span property="name">Microfiche Press</span>
>>
>>             </span> in <time property="datePublished">2016</time>
>>
>>             (ISBN: <span property="isbn">12341234</span>).
>>
>>         </span>
>>
>>     </p>
>>
>> </div>
>>
>>
>> Alternative RDFa:
>>
>>
>> The example above puts all data into the extension namespace. Although
>> this can be mapped back into normal schema.org <http://schema.org>, it
>> puts more work onto consumers. Here is how it would look using multiple
>> vocabularies:
>>
>>
>> <div vocab="http://schema.org/" prefix="bib: http://bib.schema.org/">
>>
>>     <p typeof="Book" resource="http://www.freebase.com/m/0h35m">
>>
>>         <em property="name">The Fellowship of the Rings</em> was written
>> by
>>
>>         <span property="author">J.R.R Tolkien</span> and was originally
>> published
>>
>>         in the <span property="publisher" typeof="Organization">
>>
>>             <span property="location">United Kingdom</span> by
>>
>>             <span property="name">George Allen & Unwin</span>
>>
>>         </span> in <time property="datePublished">1954</time>.
>>
>>         The book has been republished many times, including editions by
>>
>>         <span property="workExample" typeof="Book">
>>
>>             <span property="publisher" typeof="Organization">
>>
>>                 <span property="name">HarperCollins</span>
>>
>>             </span> in <time property="datePublished">1974</time>
>>
>>             (ISBN: <span property="isbn">0007149212</span>)
>>
>>         </span> and by
>>
>>         <span property="workExample" typeof="Book bib:Microform">
>>
>>             <span property="publisher" typeof="Organization">
>>
>>                 <span property="name">Microfiche Press</span>
>>
>>             </span> in <time property="datePublished">2016</time>
>>
>>             (ISBN: <span property="isbn">12341234</span>).
>>
>>         </span>
>>
>>     </p>
>>
>> </div>
>>
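As a rough illustration of the extra consumer work mentioned for the single-namespace form, the sketch below rewrites extension-namespace IRIs back to core schema.org whenever the local name is a core term. It assumes the consumer already holds a list of core term names; `CORE_TERMS` here is a small hand-written subset and `normalize` is a hypothetical helper, not part of any schema.org tooling.

```python
CORE = "http://schema.org/"
EXTENSION = "http://bib.schema.org/"

# Assumption: a hand-written subset of core term names used in the example.
CORE_TERMS = {
    "Book", "Organization", "name", "author", "publisher",
    "location", "datePublished", "workExample", "isbn",
}

def normalize(iri, extension=EXTENSION, core=CORE, core_terms=CORE_TERMS):
    """Map an extension-namespace IRI to core when its local name is a core term."""
    if iri.startswith(extension):
        local = iri[len(extension):]
        if local in core_terms:
            return core + local
    # Extension-only terms (and unrelated IRIs) are left untouched.
    return iri

print(normalize("http://bib.schema.org/Book"))
print(normalize("http://bib.schema.org/Microform"))
```

With the multi-vocabulary form, this step is unnecessary because core terms already carry the core namespace.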
>>
>> Here is that last approach written in JSON-LD (it works today, but would
>> be even more concise if the schema.org <http://schema.org> JSON-LD
>> context file was updated to declare the 'bib' extension):
>>
>>
>> <script type="application/ld+json">
>>
>> {
>>
>>   "@context": [ "http://schema.org/",
>>
>>        { "bib": "http://bib.schema.org/" } ],
>>
>>   "@id": "http://www.freebase.com/m/0h35m",
>>
>>   "@type": "Book",
>>
>>   "name": "The Fellowship of the Rings",
>>
>>   "author": "J.R.R Tolkien",
>>
>>   "publisher": {
>>
>>      "@type": "Organization",
>>
>>      "location": "United Kingdom",
>>
>>      "name": "George Allen & Unwin"
>>
>>   },
>>
>>   "datePublished": "1954",
>>
>>   "workExample": [
>>
>>     {
>>
>>       "@type": "Book",
>>
>>       "name": "Harper Collins",
>>
>>       "datePublished": "1974",
>>
>>       "isbn": "0007149212"
>>
>>     },
>>
>>     {
>>
>>       "@type": ["Book", "bib:Microform"],
>>
>>       "name": "Microfiche Press",
>>
>>       "datePublished": "2016",
>>
>>       "isbn": "12341234"
>>
>>     }
>>
>>   ]
>>
>> }
>>
>> </script>
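To make the role of the 'bib' prefix concrete, here is a simplified Python sketch of how a consumer might expand the compact type names used above into full IRIs. It is not the full JSON-LD expansion algorithm; the `expand` function and its parameters are hypothetical names chosen for this illustration.

```python
def expand(term, vocab, prefixes):
    """Expand a compact term to a full IRI (simplified, not full JSON-LD)."""
    # Absolute IRIs are returned unchanged.
    if term.startswith(("http://", "https://")):
        return term
    prefix, sep, local = term.partition(":")
    # "bib:Microform" -> look the prefix up in the declared prefix map.
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    # Bare terms such as "Book" fall back to the default vocabulary.
    return vocab + term

types = ["Book", "bib:Microform"]
expanded = [
    expand(t, "http://schema.org/", {"bib": "http://bib.schema.org/"})
    for t in types
]
print(expanded)
```

In practice a JSON-LD processor does this work; the sketch only shows why a bare `Book` and a prefixed `bib:Microform` end up in different namespaces.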
>>
>>
>>
>> GS1 Example
>>
>>
>> <script type="application/ld+json">
>>
>> {
>>
>>     "@context": "http://schema.org/",
>>
>>     "@vocab": "http://gs1.schema.org/",
>>
>>     "@id": "http://id.manufacturer.com/gtin/05011476100885",
>>
>>     "gtin13": "5011476100885",
>>
>>     "@type": "TradeItem",
>>
>>     "tradeItemDescription": "Deliciously crunchy Os, packed with 4 whole
>> grains. Say Yes to Cheerios",
>>
>>     "healthClaimDescription": "8 Vitamins & Iron, Source of Calcium &
>> High in Fibre",
>>
>>     "hasAllergenRelatedInformation": {
>>
>>         "@type": "gs1:AllergenRelatedInformation",
>>
>>         "allergenStatement": "May contain nut traces"
>>
>>     },
>>
>>     "hasIngredients": {
>>
>>         "@type": "gs1:FoodAndBeverageIngredient",
>>
>>         "hasIngredientDetail": [
>>
>>             {
>>
>>                 "@type": "Ingredient",
>>
>>                 "ingredientseq": "1",
>>
>>                 "ingredientname": "Cereal Grains",
>>
>>                 "ingredientpercentage": "77.5"
>>
>>             },
>>
>>             {
>>
>>                 "@type": "Ingredient",
>>
>>                 "ingredientseq": "2",
>>
>>                 "ingredientname": "Whole Grain OATS",
>>
>>                 "ingredientpercentage": "38.0"
>>
>>             }
>>
>>       ]
>>
>>     },
>>
>>     "nutrientBasisQuantity": {
>>
>>         "@type": "Measurement",
>>
>>         "value": "100",
>>
>>         "unit": "GRM"
>>
>>     },
>>
>>     "energyPerNutrientBasis": [
>>
>>         {
>>
>>             "@type": "Measurement",
>>
>>             "value": "1615",
>>
>>             "unit": "KJO"
>>
>>         },
>>
>>         {
>>
>>             "@type": "Measurement",
>>
>>             "value": "382",
>>
>>             "unit": "E14"
>>
>>         }
>>
>>     ],
>>
>>     "proteinPerNutrientBasis": {
>>
>>         "@type": "Measurement",
>>
>>         "value": "8.6",
>>
>>         "unit": "GRM"
>>
>>     }
>>
>> }
>>
>>
>> </script>
>>
>>
>> This example shows a possible encoding of the GS1 schemas overlaid onto
>> schema.org <http://schema.org>. It uses JSON-LD syntax, which would
>> support several variations on this approach. It is based on examples
>> from GS1's proposal circulated to the schema.org <http://schema.org>
>> community recently
>> (https://lists.w3.org/Archives/Public/public-vocabs/2015Jan/0069.html).
>> Instead of writing
>>
>>     "@context": "http://schema.org/", "@vocab": "http://gs1.schema.org/",
>>
>> it would be possible to simply write
>>
>>     "@context": "http://gs1.schema.org/".
>>
>>
>
>
Received on Sunday, 15 February 2015 17:20:16 UTC