Re: Proposal: Audiobook

From: Dan Brickley <danbri@google.com>
Date: Fri, 27 Sep 2013 16:48:08 +0100
Message-ID: <CAK-qy=6iUtwuUU=j19cARZSKC_7e9TSjMbNCDoNzP9zdgsn4QQ@mail.gmail.com>
To: Thad Guidry <thadguidry@gmail.com>
Cc: Phil Barker <phil.barker@hw.ac.uk>, "public-vocabs@w3.org" <public-vocabs@w3.org>

On 27 September 2013 14:42, Thad Guidry <thadguidry@gmail.com> wrote:
> [snip]
>> My suggestion is that in some way the documentation reflect which
>> properties are "core" and which have been added for some domain specific
>> purpose. I know it is difficult to say what constitutes the cross domain
>> core, but I think it would be relative easy and useful to group together
>> those properties of, say, CreativeWork that were added because they are
>> specifically relevant to resources being described for use in the context of
>> learning, or those properties that are specifically relevant to the
>> description of legal aspects of a resource. This might help users focus
>> their attention on those properties relevant to them and help them
>> understand what the descriptions mean. Alternatively, working vice-versa it
>> may be useful to suppress those properties of, say, a Diet or Volcano that
>> have been inherited but really aren't particularly applicable to the
>> specific class in question.
>> Phil
> [snip]
> Uhh... that's the whole point of HAVING Types to begin with... grouping
> common properties together around a Domain Type.
> (slap slap slap...wake up folks...we still like you Phil, however ;-))

Phil's suggestion is not unreasonable. There is a real cost to
elaborating the type system to try to distinguish everything
super-cleanly. Each new distinction will also introduce new forms of
complexity and opportunities for mis-shapen data. The alternative,
which is our current, rather flat and scruffy approach, means that
our simple documentation structure sometimes seems to nudge publishers
towards publishing weird or near-nonsensical data, like volcano fax
numbers.

My favourite example currently is that http://schema.org/Place has a
subtype http://schema.org/Country which inherits an association to the
property http://schema.org/openingHoursSpecification
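
A minimal sketch of how that inheritance plays out, assuming a toy
hierarchy in which Country sits directly under Place. The property
lists are abbreviated and illustrative, not the full schema.org
definitions, and the helper name all_properties is made up for this
example:

```python
# Toy model of schema.org-style property inheritance.
# Hierarchy and property lists are abbreviated for illustration only.
SUPERTYPE = {"Country": "Place", "Place": "Thing", "Thing": None}

OWN_PROPERTIES = {
    "Thing": ["name", "description", "url"],
    "Place": ["address", "faxNumber", "openingHoursSpecification"],
    "Country": [],  # declares nothing itself in this toy model
}


def all_properties(type_name):
    """Collect properties declared on a type and on every supertype."""
    props = []
    while type_name is not None:
        props.extend(OWN_PROPERTIES[type_name])
        type_name = SUPERTYPE[type_name]
    return props


# Country declares nothing of its own here, yet it still picks up
# everything that Place and Thing declare.
country_props = all_properties("Country")
print("openingHoursSpecification" in country_props)
```

Nothing in a flat model like this can say "applicable in principle, but
rarely sensible"; a property is either inherited or it is not.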

Now common sense tells us that most countries don't have opening hours.
Something's weird about the very idea. But perhaps North Korea, the
Vatican, ... other small states might plausibly have opening hours?
Would it really be so astonishing - rather than amusing - to find such
information at http://www.liechtenstein.li/index.php?id=56&L=1 or
for other microstates (http://en.wikipedia.org/wiki/Microstate)? Or
consider the border controls around Gaza.

Common sense is notoriously hard to formalize. How can we capture the
common intuition that "opening hours" is probably not amongst the most
useful properties to list alongside "Country", even if it is
theoretically feasible? If Countries can't have opening hours, what
about cities? Schemas draw rigid lines, when reality is much blurrier.
Sooner or later we end up with grey areas, and need to figure out how
to document around them.

Suppressing or downplaying properties in schema.org documentation when
they are not much used in practice could be quite handy - but also risks
being a self-fulfilling prophecy. Stats from the search engines or
commoncrawl.org / http://webdatacommons.org/ could be useful here. It
might also be useful (perhaps not on schema.org itself) to have
indications of which type/property combinations are actually used in
real services and tools. None of this is a substitute for tidying up
the schema definitions, but I do feel such softer techniques are very
much worth pursuing. We can't keep tightening schemas in the hope that
eventually we'll have made it impossible to express foolish things. It
is reasonable to seek a way of downplaying 'opening hours' on Country
without necessarily proclaiming that a Country can never ever have
opening hours...
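
As a rough sketch of the kind of soft downplaying described above,
assuming we had per-type usage counts from a crawl: the numbers below
are invented for illustration, and split_by_usage and its threshold
are hypothetical, not anything schema.org implements.

```python
# Invented usage counts for (type, property) pairs, standing in for
# real statistics from a web crawl.
USAGE = {
    ("Country", "name"): 9500,
    ("Country", "url"): 4000,
    ("Country", "openingHoursSpecification"): 2,
}


def split_by_usage(type_name, properties, usage, threshold=0.01):
    """Split properties into (prominent, downplayed) lists.

    A property is downplayed when its count falls below `threshold`
    times the count of the most used property on that type.
    Nothing is removed; downplayed properties are only listed apart.
    """
    counts = {p: usage.get((type_name, p), 0) for p in properties}
    top = max(counts.values(), default=0)
    prominent, downplayed = [], []
    for p in properties:
        if top > 0 and counts[p] < threshold * top:
            downplayed.append(p)
        else:
            prominent.append(p)
    return prominent, downplayed


prominent, downplayed = split_by_usage(
    "Country", ["name", "url", "openingHoursSpecification"], USAGE
)
print("prominent:", prominent)
print("downplayed:", downplayed)
```

With no usage data at all for a type, everything stays prominent, which
is one way to avoid the self-fulfilling prophecy for newly added
properties.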


Received on Friday, 27 September 2013 15:48:36 UTC

