WebSchemas, Schema.org and W3C

Date: Sun, 13 Jan 2013 11:07:54 +0000
>> This list should, IMHO, really just be about
>> a) concrete extensions of schema.org for certain domains or usage ("we need an additional property for type XYZ for the following reasons") and
>> b) issues with the current extension mechanisms or concrete proposals on how to enhance them,
> I don't disagree that these are in scope for discussion. But isn't
> public-vocabs a discussion list to support the W3C Web Schemas Task
> Force, which has its own somewhat broader charter and scope [1]?
> Personally I would like to see a little less focus on getting the
> models just right, and a little more focus on using this W3C
> discussion space as neutral territory to discuss how we are using the
> vocabularies in applications and tools.

Yes. This was intended as a meeting place where all schemas touch.
Several things have led to it being seen solely as "the schema.org
list": (i) people don't read charters; (ii) unlike other schemas, the
schema.org project declared public-vocabs@ to be its home / main list;
(iii) the high profile and all-embracing scope of schema.org.

As a member of the RDF community since 1997, I'm painfully aware of
some of our failings. It is (as has been expressed already in this
thread) important to avoid over-burdening schema.org with every hope
and aspiration that attaches to the RDF, '[sS]emantic [wW]eb', 'Linked
[open] Data' etc. labels. Or, put another way: schema.org has no
intention of being overburdened with such things.

Two particular failings of our community come to mind. One is that we
have an endearing and frustrating architecture of politeness based on
the use of namespaces that has led to a situation in which we have a
fragmented suite of independent vocabularies that are hard for new
parties to adopt. The culture around RDF is that you only publish
schemas for the 'diffs', the missing vocabulary that wasn't covered by
a jumbled mix of existing terminology. So anyone doing document-like
markup would be frowned at - "Did you consider using Dublin Core?";
anyone publishing an RDF vocabulary describing people "Why didn't you
use FOAF?", and so on. And the very architecture that supported this -
namespaces - allowed us to continue to design these parallel
descriptive systems without being forced to sit down together and work
out how they can be combined to solve real world problems.

A couple of years ago, I did sit down and look at the words we'd
chosen in various deployed and popular-ish RDF vocabularies; I called
it "Zoo"; https://github.com/danbri/Zoo/blob/master/zoo.foaf.tv/index.html
... this showed that 'Collection' was used in bibo:, swan:, 'Work' in
skos:; cc: vcard:; 'description' in dcterms: doap: gr: ical: sioc:,
'category' in 'doap: gr: po: vcard:', 'subject' in dcterms: po: rdf:
sioc:, title in 'dcterms: foaf: sioc: vcard:' and so on. Part of my
hope for this forum is that - yes, heavily nudged by the creation of
schema.org - RDF vocabulary managers and editors could finally take
the time to stay in touch, and that parties working on vocabularies
designed to be deployed alongside each other could do the world a
favour and talk to each other a bit more. It is good that we have the
namespaces technical mechanism; but it has for too long allowed us to
sidestep the need to talk about how different vocabularies fit
together as more than mere triples.
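
As a rough illustration of the kind of overlap the Zoo survey turned
up, here is a minimal sketch (not the Zoo code itself) that groups
prefixed terms by local name and reports which names appear under more
than one prefix. The term list is a subset transcribed from the
paragraph above; the function name shared_local_names and its
min_vocabs parameter are made up for this example.

```python
from collections import defaultdict

# Prefixed names transcribed from the paragraph above (a subset only).
TERMS = [
    "dcterms:description", "doap:description", "gr:description",
    "ical:description", "sioc:description",
    "doap:category", "gr:category", "po:category", "vcard:category",
    "dcterms:subject", "po:subject", "rdf:subject", "sioc:subject",
    "dcterms:title", "foaf:title", "sioc:title", "vcard:title",
]


def shared_local_names(terms, min_vocabs=2):
    """Map each local name to the sorted prefixes that define it.

    Only names used under at least `min_vocabs` distinct prefixes are kept.
    (Hypothetical helper, written for this illustration.)
    """
    by_name = defaultdict(set)
    for term in terms:
        prefix, _, local = term.partition(":")
        by_name[local].add(prefix)
    return {
        name: sorted(prefixes)
        for name, prefixes in by_name.items()
        if len(prefixes) >= min_vocabs
    }


overlaps = shared_local_names(TERMS)
for name in sorted(overlaps):
    print(f"{name}: {' '.join(overlaps[name])}")
```

A shared local name does not by itself mean a shared meaning (rdf:subject
and dcterms:subject are quite different things), which is rather the
point: the namespaces keep the triples apart, but somebody still has to
talk about how the terms relate.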

So WebSchemas was designed to be something a bit more than 'the
schema.org mailing list at W3C', and I still believe that. We (the
larger 'we') need a forum in which all schemas intended for
planet-wide use are equally 'on topic'. The existence of schema.org
should not have a chilling effect on the design, use and deployment of
other RDF vocabularies. Even if the schema.org partner companies are
not in a position right now to collectively promise to
support/understand/use/endorse non-schema.org vocabulary, it is still
healthy to have multiple efforts, initiatives and perspectives. (The
move towards RDFa Lite is a very positive thing here, btw.)

The second failing of the community around RDF is that we have - as
the years have drifted by - acquired a reputation for enjoying talk
over action, and this isn't entirely undeserved. Yesterday I was
re-reading some old mail threads with the late and lamented Aaron
Swartz - http://lists.foaf-project.org/pipermail/foaf-dev/2000-August/004215.html
- and that frustration was already present in 2000. In the charter for
this WebSchemas group (i.e.
http://www.w3.org/2001/sw/interest/webschema.html) we list some semweb
permathread themes explicitly as out of scope.

"Out of scope topics include:

* Advocacy of data models or syntaxes without attention to real-world use cases
* The use of inference
* debate over foundational ontologies"

This does not mean that inference and foundational ontologies are
uninteresting or unimportant, just that every successful forum needs
to have some core scope, and that we have plenty of other places
around W3C to debate those topics. What makes the WebSchemas group
special? Just that here, finally, we have somewhere where parties
responsible for globally adopted RDF schemas can do the responsible
thing and stay more carefully in touch with each other.

As Martin points out in a mail that arrived while I was typing this, ... one
list is not going to be enough for everything. And in terms of work
style for getting (sub-)schemas created and integrated, one size
doesn't fit all. What we've found with schema.org is that different
collaboration styles make sense for different domains. I suggested a
W3C Community Group to Richard Wallis and I'm pleased to see that it
has independent existence and activity. A few months ago I helped set
up a 'sports schemas' group (just a Google Group mailing list), but
that initiative is yet to thrive. We have a very active and largely
independent community around the LRMI vocabulary managed quite
separately, but linked to this one by mail, wiki and occasional audio
catchups. There is of course GoodRelations, which also enjoys
independent existence.

In general I think W3C community groups are a fine mechanism for more
focussed and intense vocabulary collaboration, and this forum serves
more for integration issues and high level overview on how all the
pieces of the jigsaw fit together. It could be great, for example, to
see a community group around modeling fiction (and Comics?), but we
also need a place where all such efforts can report back to the wider
community. The creation of schema.org has made all this more urgent
and timely, but it is something we've needed for a while. In the
Dublin Core world we talk about this as 'application profiles';
templates and examples explaining how independently designed pieces of
vocabulary can be mixed together to address real world descriptive
needs. It should happen at W3C, schema.org should engage with it, but
the need is broader. I think WebSchemas is the right place for it.
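
To make the 'application profile' idea a little more concrete, here is
a minimal sketch, assuming a profile is nothing more than a list of
properties drawn from several independently designed vocabularies. The
profile contents, the sample description and the helper names (expand,
unexpected_properties) are all hypothetical; only the three namespace
URIs are real.

```python
# Real namespace URIs for three independently managed vocabularies.
NAMESPACES = {
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "schema": "http://schema.org/",
}

# Hypothetical profile for describing a work of fiction: the properties
# a description is expected to use, mixed from the vocabularies above.
FICTION_PROFILE = {
    "dcterms:title",
    "dcterms:creator",
    "foaf:name",
    "schema:genre",
}


def expand(term):
    """Turn a prefixed name such as 'foaf:name' into a full URI."""
    prefix, _, local = term.partition(":")
    return NAMESPACES[prefix] + local


def unexpected_properties(description, profile=FICTION_PROFILE):
    """Return the properties used in `description` that the profile omits."""
    return sorted(prop for prop in description if prop not in profile)


# A made-up description that mostly follows the profile.
description = {
    "dcterms:title": "An Example Comic",
    "schema:genre": "Comics",
    "dcterms:subject": "Examples",
}

unexpected = unexpected_properties(description)
expanded = {
    expand(prop): value
    for prop, value in description.items()
    if prop not in unexpected
}
print("Not in profile:", unexpected)
print("Expanded:", expanded)
```

Real application profiles carry much more than this (cardinality,
value constraints, usage notes and examples), but even a list this
small is the sort of thing that has to be agreed in conversation
rather than discovered in the namespaces.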

I should also mention that there are a few areas now where groups
elsewhere around W3C have come up with vocabulary (e.g. Organization +
Registered Organization vocabs; DCAT/ADMS; Geo and post addresses)
that will likely inform improvements to schema.org. There is a need
for somewhere public to work out details around stability/versions,
appropriate acknowledgement, etc.

The fundamental problem of schema design is that the world is not
tidily partitioned; that all use cases interact and overlap -
'Intertwingularity'. We can make focussed sub-fora for figuring out
how to describe sports, or fiction, or journals and books, but the
combinations and scope overlaps can be overwhelming. While good design
can help, perhaps even more important is communication.

And for that we need somewhere to talk. I don't think it ultimately
matters hugely whether there is a schema.org-specific mailing list at
W3C alongside a more general 'all vocabularies' one, versus a single
list as we have now. My preference is for a unified forum, and we will
likely spin off various schema.org-specific lists for specific
detailed schema.org topics. But given schema.org's cross-domain
nature, it seems important for the project to remain highly visible in
a cross-domain, multi-schema forum.


