Re: Nature of change in Schema.org

On 31 July 2013 15:19, Jindřich Mynarz <mynarzjindrich@gmail.com> wrote:
> Hi,
>
> I want to ask a few questions related to the nature of change in Schema.org.
>
> The major question is if there were any backwards-incompatible changes in
> Schema.org besides fixes of obvious errors?
>
> I thought about this when I was pondering the prospects of our proposed
> Schema.org extension [1] to be pulled into the core Schema.org. Having
> looked through the archive of this mailing list I found several emails that
> suggested very similar changes to the ones that we do in the extension for
> the job market. For example, both [2] and [3] mention changing the
> schema:hiringOrganization property into more general schema:employer
> property that could be used not only with schema:Organization, but with
> schema:Person as well. [2], [3] and [4] note that there should be a way to
> express when schema:JobPosting ceases to be valid (by adding
> schema:dateClosing or schema:dateExpires, for example), which is also the
> case for our proposal. On this note, Martin Hepp suggests using
> gr:validThrough [5]. The emails mentioned also discuss other things
> overlapping with our proposal, such as remote jobs or salary range (min,
> max). The discussion about these modelling decisions dates back to 2011,
> since when the original schema:JobPosting remained fairly stable. This
> leaves me wondering whether there is in fact any chance of incorporating
> such changes in Schema.org.
>
> I think it would be beneficial if Schema.org maintainers were upfront about
> accepting backwards-incompatible change proposals. If no
> backwards-incompatible changes are pulled into Schema.org, that's fine, but
> it should be documented. Is the Schema.org policy regarding such changes
> available online?
>
> Having this issue resolved would then help Schema.org users to decide on the
> trade-off whether they should settle for the core Schema.org (perhaps
> enhanced with incremental extensions) or if they are willing to depart from
> the Schema.org either by creating a backwards-incompatible extension or a
> standalone vocabulary (possibly mapped to Schema.org).
>
> On a related note, I want to ask a more concrete question. The Schema.org
> extension mechanism [6] provides guidance on minting new sub-properties of
> existing properties. How do you, however, create super-properties? Are
> extension proposals allowed to add super-properties? For instance, our
> proposal for the domain of job market adds the property schema:employer,
> which is defined as a super-property to schema:hiringOrganization.

We don't have very rigid policies. But in general, there's a strong
bias towards additive changes, since any existing vocabulary that is
being used is unlikely to completely vanish.

The mechanism described in http://schema.org/docs/extension.html is
pretty rough and minimal; there are some other techniques that we
ought to also document. In particular, since schema.org was launched,
two interesting things have happened:

1. W3C RDFa has been simplified as RDFa 1.1 (Lite), and looks a lot
like Microdata in terms of complexity for publishers. However RDFa's
data model deals more comfortably with the notion that some entity
might be usefully described with two or more independently defined
types. To approximate this within Microdata we added the slightly
awkward "additionalType" property to schema.org, since microdata
assumes that all the main "itemtype"s for some item come from a common
vocabulary. These changes improve the options for schema extensions.

2. Late last year, driven by the heavy work of integrating most of
Good Relations into Schema.org, we implemented a new backend workflow
for the site based on RDFa schema files.  Several of these (draft and
live) can be found at
https://dvcs.w3.org/hg/webschema/file/default/schema.org/ext

This approach means that much of the site is now generated by scripts
that consult an ordered list of HTML/RDFa/RDFS files. Currently the
system understands basic notions of type/subClassOf, labels/comments,
and our rather scruffy/wiki-like notion of range/domain
("rangeIncludes", "domainIncludes") associations between types and
properties. Now that we also have per-property pages on schema.org I
expect to revisit this machinery to add in a few more useful
facts-about-properties, including provenance/acknowledgements,
super-subproperty hierarchies, mappings to other schemas.

So what does this mean for 'extensions'?

For schemas that happily have an independent existence, RDFa is a very
respectable mechanism for deploying them in data that is otherwise
schema.org-based. This does not mean that all schema.org search engine
products will necessarily 'understand' the extension, but that is
true of some parts of 'core' schema.org too. Being part of schema.org
indicates that the search engines are broadly supportive of the
vocabulary; not necessarily that they all have shipping products that
directly use every type or property.

An example of an RDFa-based extension in instance data is here:
http://lists.w3.org/Archives/Public/public-vocabs/2013Jun/0028.html  :

<div vocab="http://schema.org/" prefix="x:
http://example.org/2013/person-extras123#" typeof="Person
x:Minister">
 <span property="name">Joan Smith</span>
</div>

... this allows an extension type, "Minister", to be described properly
in HTML/RDFa/etc, rather than leaving its definition to your
imagination, as deploying http://schema.org/Person/Minister would do.
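
For comparison, here is a minimal sketch of how the same description
might be approximated in Microdata using "additionalType", as
mentioned in point 1. It assumes the same hypothetical example.org
namespace as the RDFa snippet; the "Minister" type is illustrative
only, not a published vocabulary.

```html
<!-- Sketch only: the example.org type IRI is the hypothetical one from the
     RDFa example above, not a published vocabulary. -->
<div itemscope itemtype="http://schema.org/Person">
  <!-- The extension type is carried as a property value, because itemtype
       is expected to list types from a single vocabulary. -->
  <link itemprop="additionalType"
        href="http://example.org/2013/person-extras123#Minister">
  <span itemprop="name">Joan Smith</span>
</div>
```

In the Microdata form the extension type is an ordinary property value,
so a consumer has to know to treat "additionalType" as typing
information; in the RDFa form both types sit side by side in "typeof".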

For schemas aimed at inclusion, a similar approach (slightly edited
schema files) gives configuration files that could ultimately be
integrated into schema.org directly.
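
As a rough illustration, an entry in such a schema file might look
something like the sketch below. This is an assumption about the
general shape, not a copy of any actual file: it uses the "employer"
property proposed in the quoted message (not an existing schema.org
term), and the rdfs:subPropertyOf link is included only to show how a
super-property could be stated, since the site machinery described
above does not yet handle property hierarchies.

```html
<!-- Hypothetical schema entry. "employer" is the proposed property from the
     quoted message; the markup pattern itself is an assumption. The rdf: and
     rdfs: prefixes are predefined in the RDFa 1.1 initial context. -->
<div typeof="rdf:Property" resource="http://schema.org/employer">
  <span property="rdfs:label">employer</span>
  <span property="rdfs:comment">A person or organization offering the job.</span>
  <span>Domain:
    <a property="http://schema.org/domainIncludes"
       href="http://schema.org/JobPosting">JobPosting</a></span>
  <span>Range:
    <a property="http://schema.org/rangeIncludes"
       href="http://schema.org/Organization">Organization</a>,
    <a property="http://schema.org/rangeIncludes"
       href="http://schema.org/Person">Person</a></span>
</div>

<!-- Additive statement: the existing property stays as it is and is declared
     a sub-property of the new, more general one. -->
<div resource="http://schema.org/hiringOrganization">
  <link property="rdfs:subPropertyOf" href="http://schema.org/employer">
</div>
```

Stating the relationship this way keeps the change additive: existing
markup using hiringOrganization remains valid, and only consumers that
understand the hierarchy need to treat it as a kind of employer.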

I hope these quick notes help sketch how the technology picture has
been evolving. They don't directly address the non-technical need for
those of us at schema.org to more clearly communicate our intentions
around specific proposals and a workflow for progressing them - we
will get there too! But I do believe the RDFa design at least allows
for 'companion vocabularies' to be deployed alongside/within plain
schema.org descriptions, without requiring everything be integrated
immediately.

Dan (allegedly on vacation)

Received on Wednesday, 31 July 2013 20:48:04 UTC