Extension Mechanism - Implementation Details

Dear Dan,
dear Guha:

I have a few questions and recommendations regarding the schema.org extension mechanism.

1. Top-level notion of extensions
=================================
It is not yet fully clear to me whether the mechanism aims at being 

a) an umbrella for a largely decentralized set of vocabularies, or 
b) just a mechanism for partitioning the vocabulary in order to simplify the management of the codebase.

I think that b) is more desirable, at least for reviewed extensions. In that case, users of schema.org in markup would not have to know whether a property comes from an extension or from core. We could still always trace which extension an element originates from, and we could automatically spot name clashes between different extensions.
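
To illustrate the automatic check for name clashes, here is a minimal sketch. It assumes that each extension can be reduced to the set of element names it newly defines; the extension names and elements are hypothetical, and this is not how the actual schema.org codebase is organized.

```python
from collections import defaultdict


def find_name_clashes(extensions):
    """Return {element_name: [extension, ...]} for names defined in more than one extension.

    `extensions` maps a (hypothetical) extension name to the set of
    element names that the extension newly defines.
    """
    defined_in = defaultdict(set)
    for ext_name, element_names in extensions.items():
        for name in element_names:
            defined_in[name].add(ext_name)
    # Keep only the names that more than one extension defines.
    return {name: sorted(exts) for name, exts in defined_in.items() if len(exts) > 1}


# Hypothetical sample data, for illustration only.
extensions = {
    "auto": {"EngineSpecification"},
    "foo": {"Foo", "Bar"},
    "acme": {"Foo"},
}
clashes = find_name_clashes(extensions)
```

With the sample data, only "Foo" is reported, because both "acme" and "foo" define it.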

In that scenario, the main benefit of the extension mechanism will be to keep contributions in individual files in the codebase, which frees us from the problem of removing/adding individual lines scattered across the RDFa file of the core vocabulary. In particular, it becomes easier to try out contributions and remove them later on. In the traditional approach, it was very cumbersome to remove contributions at a later stage because they may be scattered across the entire RDFa file (in particular domain/range statements for existing elements). We will also have fewer merge conflicts.

2. Identifiers of elements from extensions in MARKUP
====================================================
I think that, at least for reviewed extensions, we should have one flat namespace http://schema.org/<element_name> for types, properties, and individuals from both core AND extensions, in all syntaxes (Microdata, RDFa, and JSON-LD). Otherwise, we will make markup as complicated as it was in the days of RDFa 1.0. You would have to choose one vocabulary per entity / itemscope and switch between the simple version of a type (e.g. http://schema.org/Car) and the enhanced type (e.g. http://auto.schema.org/Car) depending on whether or not you need additional properties.

This would add cognitive complexity and thus cause lots of errors in markup, in particular as we plan to extend types from schema.org with additional properties in extensions, i.e. there is likely to be overlap between the core and one or more extensions.

3. Redirects
============
a) If a type or property or individual exists ONLY in one or more extensions, dereferencing its URL from markup (i.e. in the flat namespace) should not simply yield a 404 error.

So if there was a type "Foo" in the extension http://foo.schema.org, i.e. http://foo.schema.org/Foo, an HTTP GET or HEAD request to http://schema.org/Foo should not simply return a 404 status code, but either

- a 301 or 302 redirect to http://foo.schema.org/Foo (if only one extension defines it) or
- a short overview page like

"The element you are referencing is not part of schema.org core, but is defined in the following extensions:

- http://foo.schema.org/Foo
- http://acme.schema.org/Foo"
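
The decision logic for case a) could be sketched as follows. This is only an illustration under assumptions: the function name, the in-memory sets, and the choice of 302 over 301 and of status 200 for the overview page are mine, not part of the current implementation.

```python
def resolve(name, core, extensions):
    """Decide how a request to http://schema.org/<name> could be answered.

    `core` is the set of element names in schema.org core; `extensions`
    maps a (hypothetical) extension name to the element names it defines.
    Returns a (status_code, payload) tuple.
    """
    if name in core:
        # Defined in core: serve the normal page.
        return 200, f"http://schema.org/{name}"
    hosts = sorted(ext for ext, names in extensions.items() if name in names)
    urls = [f"http://{ext}.schema.org/{name}" for ext in hosts]
    if len(urls) == 1:
        # Exactly one extension defines the element: redirect to it.
        return 302, urls[0]
    if urls:
        # Several extensions define it: overview page listing all of them.
        return 200, urls
    # Unknown everywhere: only now is a 404 appropriate.
    return 404, None


# Hypothetical sample data, for illustration only.
core = {"Car"}
extensions = {"foo": {"Foo", "Bar"}, "acme": {"Foo"}}
```

Here resolve("Bar", core, extensions) yields a redirect to the single defining extension, while resolve("Foo", core, extensions) yields the list of URLs for the overview page.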

We also have to check whether we understand the implications of overlapping definitions in more than one extension. In theory, two or more extensions could add conflicting statements to the same element from schema.org, e.g. 
- cycles of subClassOf or subPropertyOf statements or
- clashing textual definitions.
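
The first kind of conflict can be detected mechanically. Below is a minimal sketch that merges the subClassOf (or subPropertyOf) statements from core and all extensions into one edge list and looks for a cycle with a depth-first search. The edge-list representation and the sample statements are assumptions for illustration.

```python
def find_cycle(edges):
    """Return one cycle as a list of names (first == last), or None if acyclic.

    `edges` is an iterable of (sub, super) pairs collected from core
    and from all extensions.
    """
    graph = {}
    for sub, sup in edges:
        graph.setdefault(sub, []).append(sup)
        graph.setdefault(sup, [])

    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = {node: UNSEEN for node in graph}
    path = []

    def visit(node):
        state[node] = ON_PATH
        path.append(node)
        for parent in graph[node]:
            if state[parent] == ON_PATH:
                # Reached a node that is already on the current path: cycle.
                return path[path.index(parent):] + [parent]
            if state[parent] == UNSEEN:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in graph:
        if state[node] == UNSEEN:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical conflicting statements from two extensions.
merged = [("Car", "Vehicle"),   # e.g. stated by extension "foo"
          ("Vehicle", "Car")]   # e.g. stated by extension "acme"
cycle = find_cycle(merged)
```

Each extension on its own is acyclic in this example; the cycle only appears once the statements are merged, which is why the check has to run over the union.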

b) If a type or property or individual is defined in schema.org core but EXTENDED in one or more extensions, there must be a reasonable hint to this.
I doubt that the current mechanism is sufficient.

A simple fix would be to return an overview page, like so:

"Note: The element you are referencing is augmented in the following extensions:

- http://foo.schema.org/Foo
- http://acme.schema.org/Foo"

A better approach would be to list additional properties and subtypes (for types) and values (for properties) directly in the page in schema.org core (http://schema.org/Foo) and to indicate by a different color that they come from extensions.
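
As a rough sketch of the data such a merged page would need, the following collects the properties of a type from core and from all extensions and tags each one with its origin, so that a template could color the rows accordingly. The dictionaries and property names are hypothetical.

```python
def merged_properties(type_name, core_props, extension_props):
    """Return a list of (property, origin) rows for `type_name`.

    `core_props` maps type name -> set of property names in core;
    `extension_props` maps extension name -> {type name -> set of properties}.
    The origin is "core" or the name of the contributing extension.
    """
    rows = [(prop, "core") for prop in sorted(core_props.get(type_name, ()))]
    for ext in sorted(extension_props):
        for prop in sorted(extension_props[ext].get(type_name, ())):
            rows.append((prop, ext))
    return rows


# Hypothetical sample data, for illustration only.
core_props = {"Foo": {"name"}}
extension_props = {
    "foo": {"Foo": {"fooLevel"}},
    "acme": {"Foo": {"acmeCode"}},
}
rows = merged_properties("Foo", core_props, extension_props)
```

Each row carries its origin, so the page for http://schema.org/Foo could render core properties normally and extension properties in a different color.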

4. Textual definitions
======================
What happens to the description of an element if it is updated by one or more extensions?


Best wishes

Martin

-----------------------------------
martin hepp  http://www.heppnetz.de
mhepp@computer.org          @mfhepp







Received on Thursday, 21 May 2015 09:09:21 UTC