Proposal for Schema.org extension mechanism from Guha on 2015-02-13 (public-vocabs@w3.org from February 2015)

From: Guha <guha@google.com>
Date: Fri, 13 Feb 2015 13:34:13 -0800
To: W3C Vocabularies <public-vocabs@w3.org>
Cc: Sandro Hawke <sandro@w3.org>, Tim Berners-Lee <timbl@w3.org>, Ralph Swick <swick@w3.org>
Message-ID: <CAPAGhv_GogtqOGW+2GDFTqQ+vw3fOZR6BOWzFvyVqVHST=DdFg@mail.gmail.com>
Schema.org extension mechanism



Motivation

As schema.org adoption has grown, a number of groups with more specialized
vocabularies have expressed interest in extending schema.org with their
terms. The most prominent example of this is GS1 with product vocabularies.
Other examples include real estate, medical and bibliographic information.
Even in something as common as human names, there are groups interested in
creating the vocabulary for representing all the intricacies of names.

Outline of solution

There are two kinds of extensions: reviewed extensions and external
extensions. Both kinds of extensions typically add subclasses and
properties to the core. Properties may be added to existing and/or new
classes. More generally, they are an overlay on top of the core, and so
they may add domains/ranges, superclasses, etc. as well. Extensions have to
be consistent with the core schema.org. Every item in the core (i.e.,
www.schema.org) is also in every extension. Extensions might overlap with
each other in concepts (e.g., two extensions defining terms for financial
institutions, one calling it FinancialBank and the other calling it
FinancialInstitution), but we should not have the same term being reused to
mean something completely different (e.g., we should not have two
extensions, one using Bank to mean river bank and the other using Bank to
mean financial institution).
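
The naming rule above lends itself to a mechanical first pass. The following
Python sketch is purely illustrative: the extension names (fin1, fin2, geo),
their term lists, and the function find_reused_terms are hypothetical and not
part of any schema.org tooling. It only flags local names defined by more than
one extension; deciding whether a flagged reuse really means something
completely different still needs human review.

```python
from collections import defaultdict

# Hypothetical extensions and the local names each one defines.
EXTENSION_TERMS = {
    "fin1": {"FinancialBank", "Bank"},
    "fin2": {"FinancialInstitution"},
    "geo": {"Bank", "River"},
}

def find_reused_terms(extension_terms):
    """Return {term: sorted extension names} for terms defined by more than one extension."""
    owners = defaultdict(set)
    for extension, terms in extension_terms.items():
        for term in terms:
            owners[term].add(extension)
    return {term: sorted(exts) for term, exts in owners.items() if len(exts) > 1}

# Overlapping concepts under different names (FinancialBank vs.
# FinancialInstitution) are allowed and are not flagged; the reused
# name "Bank" is.
print(find_reused_terms(EXTENSION_TERMS))
```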

Reviewed Extensions

Each reviewed extension (say, e1) gets its own chunk of the schema.org
namespace: e1.schema.org. The items in that extension are created and
maintained by the creators of that extension. Reviewed extensions are very
different from proposals. A proposal, if accepted (possibly with
modifications), could either go into the core or become a reviewed extension.

A reviewed extension is something that has been looked at and discussed by
the community, albeit not as much as something in the core. We also expect
a reviewed extension to have strong community support, preferably in the
form of a few deployments.

External Extensions

Sometimes there might be a need for a third party (such as an app
developer) to create extensions specific to their application. For example,
Pinterest might want to extend the schema.org concept of ‘Sharing’ with
‘Pinning’. In such a case, they can create schema.pinterest.com and put up
their extensions, specifying how they link with core schema.org. We will
refer to these as external extensions.



How it works for webmasters

All of Schema.org core and all of the reviewed extensions will be available
from the schema.org website. Each extension will be linked to from each of
the touch points it has with the core. So, if an extension (say, having to
do with Legal stuff) creates legal.schema.org/LegalPerson which is a
subclass of schema.org/Person, the Person will link to LegalPerson.
Typically, a webpage / email will use only a single extension (e.g.,
legal), in which case, instead of ‘schema.org’ they say ‘legal.schema.org’
and use all of the vocabulary in legal.schema.org and schema.org.
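
A consumer's view of this lookup can be sketched in Python. The term sets
below are made-up stand-ins for the real vocabularies, and resolve_term is a
hypothetical helper, not part of the schema.org codebase. Since every core
item is also in every extension, a page whose vocabulary is legal.schema.org
can use both Person and LegalPerson, and a consumer can map each name back to
the namespace that defines it.

```python
CORE = "http://schema.org/"

# Made-up term sets; the real vocabularies are much larger.
CORE_TERMS = {"Person", "Organization", "name"}
EXTENSION_TERMS = {
    "http://legal.schema.org/": {"LegalPerson"},
}

def resolve_term(vocab, term):
    """Map a term used under `vocab` to an IRI in the namespace that defines it."""
    if term in EXTENSION_TERMS.get(vocab, set()):
        return vocab + term
    if term in CORE_TERMS:
        # Core items are also in every extension, so fall back to the core.
        return CORE + term
    raise KeyError(f"{term!r} is defined neither in {vocab} nor in the core")

print(resolve_term("http://legal.schema.org/", "LegalPerson"))
print(resolve_term("http://legal.schema.org/", "Person"))
```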

As appropriate, the main schema.org site will also link to relevant
external extensions. With external extensions, the use of multiple
namespaces is unavoidable.

What does someone creating an extension need to do

We would like extension creators to not have to worry about running a
website for their extension. Once the extension is approved, they simply
upload a file with their extension into a certain directory on GitHub.
Changes are made through the same mechanism.

Since the source code for schema.org is publicly available, we encourage
creators of external extensions to use the same application.

Examples

Archives example in RDFa

This example uses a type that makes sense for archival and bibliographic
applications but which is not currently in the schema.org core: Microform,
defined as "Any form, either film or paper, containing microreproductions
of documents for transmission, storage, reading, and printing. (Microfilm,
microfiche, microcards, etc.)"

The extension type is taken from http://bibliograph.net/Microform (which
on this proposed model would move to bib.schema.org). That site runs a
version of the open-source schema.org codebase that overlays bibliographic
extras onto the core schema.org types. The example is adapted from
http://schema.org/workExample.


<div vocab="http://bib.schema.org/">
   <p typeof="Book" resource="http://www.freebase.com/m/0h35m">
       <em property="name">The Fellowship of the Ring</em> was written by
       <span property="author">J.R.R. Tolkien</span> and was originally published
       in the <span property="publisher" typeof="Organization">
           <span property="location">United Kingdom</span> by
           <span property="name">George Allen & Unwin</span>
       </span> in <time property="datePublished">1954</time>.
       The book has been republished many times, including editions by
       <span property="workExample" typeof="Book">
           <span property="publisher" typeof="Organization">
               <span property="name">HarperCollins</span>
           </span> in <time property="datePublished">1974</time>
           (ISBN: <span property="isbn">0007149212</span>)
       </span> and by
       <span property="workExample" typeof="Book Microform">
           <span property="publisher" typeof="Organization">
               <span property="name">Microfiche Press</span>
           </span> in <time property="datePublished">2016</time>
           (ISBN: <span property="isbn">12341234</span>).
       </span>
   </p>
</div>

Alternative RDFa:

The example above puts all data into the extension namespace. Although this
can be mapped back into normal schema.org, it puts more work onto consumers.
Here is how it would look using multiple vocabularies:

<div vocab="http://schema.org/" prefix="bib: http://bib.schema.org/">
   <p typeof="Book" resource="http://www.freebase.com/m/0h35m">
       <em property="name">The Fellowship of the Ring</em> was written by
       <span property="author">J.R.R. Tolkien</span> and was originally published
       in the <span property="publisher" typeof="Organization">
           <span property="location">United Kingdom</span> by
           <span property="name">George Allen & Unwin</span>
       </span> in <time property="datePublished">1954</time>.
       The book has been republished many times, including editions by
       <span property="workExample" typeof="Book">
           <span property="publisher" typeof="Organization">
               <span property="name">HarperCollins</span>
           </span> in <time property="datePublished">1974</time>
           (ISBN: <span property="isbn">0007149212</span>)
       </span> and by
       <span property="workExample" typeof="Book bib:Microform">
           <span property="publisher" typeof="Organization">
               <span property="name">Microfiche Press</span>
           </span> in <time property="datePublished">2016</time>
           (ISBN: <span property="isbn">12341234</span>).
       </span>
   </p>
</div>

Here is that last approach written in JSON-LD (it works today, but would be
even more concise if the schema.org JSON-LD context file were updated to
declare the 'bib' extension):

<script type="application/ld+json">
{
 "@context": [ "http://schema.org/",
      { "bib": "http://bib.schema.org/" } ],
 "@id": "http://www.freebase.com/m/0h35m",
 "@type": "Book",
 "name": "The Fellowship of the Ring",
 "author": "J.R.R. Tolkien",
 "publisher": {
    "@type": "Organization",
    "location": "United Kingdom",
    "name": "George Allen & Unwin"
 },
 "datePublished": "1954",
 "workExample": [
   {
     "@type": "Book",
     "publisher": { "@type": "Organization", "name": "HarperCollins" },
     "datePublished": "1974",
     "isbn": "0007149212"
   },
   {
     "@type": ["Book", "bib:Microform"],
     "publisher": { "@type": "Organization", "name": "Microfiche Press" },
     "datePublished": "2016",
     "isbn": "12341234"
   }
 ]
}
</script>
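
To make the role of the 'bib' prefix concrete, here is a rough Python sketch
of how a consumer might expand the type names used in the JSON-LD example. It
is a deliberately simplified stand-in for real JSON-LD context processing (it
ignores term definitions and most of what the specification requires), it
assumes bare terms fall back to http://schema.org/, and expand_type is a
hypothetical helper.

```python
DEFAULT_VOCAB = "http://schema.org/"
PREFIXES = {"bib": "http://bib.schema.org/"}

def expand_type(name, prefixes=PREFIXES, default_vocab=DEFAULT_VOCAB):
    """Expand a prefixed name such as 'bib:Microform', or a bare term, to a full IRI."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        # Known prefix: substitute the namespace it stands for.
        return prefixes[prefix] + local
    if sep:
        # Already an absolute IRI, or an unknown prefix: leave untouched.
        return name
    # Bare term: resolve against the default vocabulary.
    return default_vocab + name

print([expand_type(t) for t in ["Book", "bib:Microform"]])
```

Under these assumptions, a consumer that only understands core schema.org
still sees a plain Book and can ignore the additional Microform type.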


GS1 Example

<script type="application/ld+json">
{
   "@context": [ "http://schema.org/",
       { "@vocab": "http://gs1.schema.org/",
         "gs1": "http://gs1.schema.org/" } ],
   "@id": "http://id.manufacturer.com/gtin/05011476100885",
   "gtin13": "5011476100885",
   "@type": "TradeItem",
   "tradeItemDescription": "Deliciously crunchy Os, packed with 4 whole grains. Say Yes to Cheerios",
   "healthClaimDescription": "8 Vitamins & Iron, Source of Calcium & High in Fibre",
   "hasAllergenRelatedInformation": {
       "@type": "gs1:AllergenRelatedInformation",
       "allergenStatement": "May contain nut traces"
   },
   "hasIngredients": {
       "@type": "gs1:FoodAndBeverageIngredient",
       "hasIngredientDetail": [
           {
               "@type": "Ingredient",
               "ingredientseq": "1",
               "ingredientname": "Cereal Grains",
               "ingredientpercentage": "77.5"
           },
           {
               "@type": "Ingredient",
               "ingredientseq": "2",
               "ingredientname": "Whole Grain OATS",
               "ingredientpercentage": "38.0"
           }
       ]
   },
   "nutrientBasisQuantity": {
       "@type": "Measurement",
       "value": "100",
       "unit": "GRM"
   },
   "energyPerNutrientBasis": [
       {
           "@type": "Measurement",
           "value": "1615",
           "unit": "KJO"
       },
       {
           "@type": "Measurement",
           "value": "382",
           "unit": "E14"
       }
   ],
   "proteinPerNutrientBasis": {
       "@type": "Measurement",
       "value": "8.6",
       "unit": "GRM"
   }
}
</script>

This example shows a possible encoding of the GS1 schemas overlaid onto
schema.org. It uses JSON-LD syntax, which would support several variations
on this approach. It is based on examples from GS1's proposal circulated to
the schema.org community recently
(https://lists.w3.org/Archives/Public/public-vocabs/2015Jan/0069.html).

Instead of writing

   "@context": [ "http://schema.org/",
       { "@vocab": "http://gs1.schema.org/",
         "gs1": "http://gs1.schema.org/" } ],

it would be possible to simply write "@context": "http://gs1.schema.org/".
Received on Friday, 13 February 2015 21:34:42 UTC