Re: The Vocabulary, Schema.org governance, etc.

From: Dan Brickley <danbri@google.com>
Date: Wed, 24 Sep 2014 12:03:08 +0100
Message-ID: <CAK-qy=5aroO8yfUgGx5iRcoZOOwc3aNUZDC5GHBPKorTJGCg3A@mail.gmail.com>
To: trond.huso@ntb.no
Cc: W3C Web Schemas Task Force <public-vocabs@w3.org>

On 24 September 2014 11:09, <trond.huso@ntb.no> wrote:
> First: I applause the job being done by all people on the list, by all
> people working for the founding “fathers” (aka organizations) or schema.org.
> Just out of curiosity:
> Is there a problem why not w3c (or any other organization, although w3c
> seems most natural) could govern the vocabulary being displayed on
> schema.org?
> Since the work being done is as open as possible, what steps has to be made
> to make it even more open?
> As it looks now, it feels as the work begin done is for an open, transparent
> and a non-profit organization.

There are a number of ways in which the schema.org initiative is
unusual when compared against traditional W3C standards work, and
these make it challenging as a potential W3C Working Group. I
say this as an ex W3C staff member who has great admiration for the
W3C community and respect for W3C standards.

Schema.org was designed in part to break out of the chicken/egg
problem around RDF and related graph datamodel approaches. RDF was
barely consumed, so it was barely published; it was barely published,
so it was barely consumed. Schema.org is very explicit (yes,
transparent) about prioritising a strong connection with large-scale
consumers, because usage drives adoption, which drives usage, ... This
is much easier for an independent project to do than for a formal
standards activity.

Schema.org was designed for fast, incremental iteration. Recently
we've been making packages of updates approximately monthly: a lot of
small tweaks scattered across 1200+ term definitions. W3C classically
operates on a multi-year cycle with discrete technology specifications
gradually making their way to a final, frozen, recommendation status.
Schema.org is less of a "technology" and more of a gradually evolving
dictionary. Although there have been some successes (e.g. SKOS, which
began outside formal W3C process,
http://www.w3.org/2001/sw/Europe/reports/thes/), W3C is generally tuned
towards the creation and management of underlying Web technologies
rather than descriptive schemas.

Schema.org has a *very* broad scope (Web content, email content,
etc.); successful consensus-based formal industry standards thrive
when there is a very clear, carefully bounded and verifiably achievable
scope with predictable milestones along the way. Note that schema.org
does limit its activities very carefully in one important way: we just
create vocabulary. The underlying technology standards in which the
vocabulary is encoded (e.g. JSON-LD, RDFa, Microdata) are much better
developed elsewhere. Schema.org is at heart a list of informal
examples, a list of types, a list of associated properties, and some
hierarchy and description for each - this underlying simplicity allows
us to rely on supporting Web technologies developed by other groups.
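
To make that "list of types, list of properties, some hierarchy" shape
concrete, here is a minimal Python sketch. It is purely illustrative and
is not how schema.org is actually stored or published: the SchemaType
class, the all_properties helper and the toy VOCAB data are all
assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SchemaType:
    """One vocabulary type: a name, a description, parents, and properties."""
    name: str
    description: str
    parents: List[str] = field(default_factory=list)     # the hierarchy
    properties: List[str] = field(default_factory=list)  # declared on this type


def all_properties(vocab: Dict[str, SchemaType], type_name: str) -> List[str]:
    """Collect a type's own properties plus those inherited from ancestors."""
    collected: List[str] = []
    visited = set()
    stack = [type_name]
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        node = vocab[current]
        for prop in node.properties:
            if prop not in collected:
                collected.append(prop)
        stack.extend(node.parents)
    return collected


# Toy data: a tiny slice, with descriptions paraphrased for illustration only.
VOCAB = {
    "Thing": SchemaType("Thing", "The most generic type.",
                        [], ["name", "description", "url"]),
    "CreativeWork": SchemaType("CreativeWork", "A creative work.",
                               ["Thing"], ["author", "datePublished"]),
    "Article": SchemaType("Article", "An article.",
                          ["CreativeWork"], ["articleBody"]),
}

print(all_properties(VOCAB, "Article"))
```

Nothing here depends on JSON-LD, RDFa or Microdata; the encoding is a
separate layer, which is the point being made above.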

Schema.org does not promise never to change its definitions, although
it has a track record of minimising this, avoiding changes for the
sake of ontological purity. As a large, cross domain vocabulary,
sometimes modest adjustments are needed in pursuit of usability,
intelligibility, integration between proposals etc. Such fine-grained
incremental tweaks are an awkward fit with the W3C Recommendation
Track for standardization, where large chunks of 'technology' become
frozen between major updates. Experience from Dublin Core and FOAF
vocabularies suggests that finer-grained term status can be useful -
it might be worth investigating per-type, per-property metadata to
indicate rough expected stability of schema definitions.
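
As a rough sketch of what such per-term metadata might look like, here
is a small Python example. Everything in it is hypothetical: the status
labels, the TERM_STATUS table, the exampleNewProperty term and the
terms_with_status helper are invented for illustration, and schema.org
does not currently publish anything in this form.

```python
from typing import Dict, List

# Hypothetical status labels, ordered from least to most settled.
STATUS_LEVELS = ["unstable", "testing", "stable"]

# Hypothetical per-term annotations, one entry per type or property.
TERM_STATUS: Dict[str, str] = {
    "Article": "stable",
    "articleBody": "stable",
    "exampleNewProperty": "testing",  # invented term, not in schema.org
}


def terms_with_status(status_by_term: Dict[str, str], wanted: str) -> List[str]:
    """Return the terms carrying the given status label, sorted by name."""
    if wanted not in STATUS_LEVELS:
        raise ValueError("unknown status: " + wanted)
    return sorted(t for t, s in status_by_term.items() if s == wanted)


print(terms_with_status(TERM_STATUS, "stable"))
```

A consumer could use a table like this to decide which terms are safe to
depend on, without the whole vocabulary having to be frozen at once.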

None of these differences is insurmountable, and there is an
overarching sense of common enterprise here, even if the machinery of
collaboration might differ. But I hope this sketch gives some sense
of how the very things that have made schema.org successful also
make it rather different from classical W3C standardization efforts.
That said, there are plenty of ways of collaborating via the
instruments of W3C process beyond the REC-track: the BibExtend
Community Group efforts come to mind. Another would be the publication
of Interest Group notes such as the rdf-vcard work that Renato has
been undertaking for many years (see http://www.w3.org/TR/vcard-rdf/ -
most recently updated as a Semantic Web Interest Group Note).


