Re: Schema addition request from Azamat Abdoullaev on 2017-03-20 (public-schemaorg@w3.org from March 2017)

From: Azamat Abdoullaev <ontopaedia@gmail.com>
Date: Mon, 20 Mar 2017 17:43:27 +0200
To: Dan Brickley <danbri@google.com>
Cc: "public-schemaorg@w3.org" <public-schemaorg@w3.org>
Message-ID: <CAKK1bf8bY75Xj4_D5YLobeB=JPnao97auh7EnF1JOWdJE3ggkA@mail.gmail.com>
If Google/Bing/Yahoo/Yandex/etc are reluctant to use all schema.org
markup as background knowledge ..., then there is little sense in proceeding
with the project (at least, with these stakeholders).

On Mon, Mar 20, 2017 at 5:02 PM, Dan Brickley <danbri@google.com> wrote:

> On 20 March 2017 at 01:24, Marijane White <whimar@ohsu.edu> wrote:
> > I don’t know what the thoughts/opinions on Moz are around these parts,
> > but I’d like to note that this term is included in their local category
> > listings, which would seem to imply that their research indicates it is a
> > category known to at least some search engines.
> >
> > https://moz.com/local/categories/category/Medical%20Spa
>
> (to everyone on this thread)
>
> May I suggest that speculating on the exact behaviour and internal
> structure of search engines (whether my employer or any other) is
> unlikely to be a productive use of this mailing list, or the inboxes
> of the several hundred people on it.
>
> Those who hope for an explicit list of exactly how each search engine
> handles structured data should turn to the documentation sites
> published by that search engine. You are unlikely to get a lot more
> detail here or on the schema.org site. In the case of Google,
> everything Google has to say officially on the topic is at
> https://devsite.googleplex.com/search/docs/guides/intro-structured-data
> or nearby.
>
> Schema.org was founded by search engines and remains explicitly
> responsive to suggestions from any/all large scale consumers of
> markup, as well as to a wider community of participants in these
> discussions. This is not always easy to balance and I can see that it
> can be frustrating sometimes not to have an explicit recipe for
> figuring out the best areas to focus on for new schema.org vocabulary.
> Discussion here has come back (again) to questions around whether
> Google and others will "support" the markup, so I wanted to comment a
> little on that aspect.
>
> There are several senses in which a search engine such as Google might
> "support" or "use" schema.org. I'll comment here only from an
> informally Google-oriented perspective. This is not any kind of
> serious taxonomy, just some informal notes to make clear that
> "supports" is not a simple binary yes/no thing -
>
> 1) A search engine might (or might not) be generally supportive of the
> project, initiative, approach, as being good for the Web, for the
> structured data ecosystem, as a foundation for new developments, and
> so on.
>
> 2) A search engine might (or might not) make use of some specific
> schema.org term(s) to support a particular user-visible feature such
> as various kinds of snippets, summary panels, carousels and so on.
>
> 3) A search engine might (or might not) use any-or-all schema.org
> markup as background knowledge to improve products and their various
> features, and to get better at understanding, summarizing and
> representing the real world meaning of various kinds of online
> content.
>
> 4) A search engine might (or might not) have products and features
> where particular interactions (e.g. matching of certain queries) take
> structured data into account - e.g. see Matt Cutts' observations in
> https://www.youtube.com/watch?v=OolDzztYwtQ - even if the UI doesn't
> make a big explicit fuss about it.
>
> 5) A search engine might (or might not) have products/features e.g.
> cloud stuff, custom search, analytics, or be deploying new
> technologies like Web components (see e.g.
> https://developers.google.com/web/updates/2015/03/creating-semantic-sites-with-web-components-and-jsonld)
> ... which make it easier for sites/publishers to themselves make
> better use of their own structured data using whichever schema terms
> make sense to their own applications.
>
> 6) A search engine might (or might not) make use of schema.org's
> vocabulary when dealing with information coming from sources other
> than the public Web, or in other kinds of product and service.
>
> 7) I could go on...
>
> I won't go into the specifics of any of these, except to say that
> Google's public documentation (and testing tool) focusses primarily on
> (2.) because it is the most tangible and practical, ... but those
> recommendations are set against a backdrop of wider and growing
> support for schema.org structured data in the various broader senses
> sketched above. It's nearly 6 years since Google announced support for
> schema.org, and the range of uses to which schema.org structured data
> is put has grown very substantially. Specific products and features
> might come and go, particular encodings (e.g. microdata vs json-ld)
> might change, but the general direction of using this stuff in more
> and more ways has been pretty clear.
>
> All of this doesn't give a clear or automatic answer to specific
> questions like "should we add some new term x to schema.org?".
> Sometimes we (the schema.org community "we") have added small things
> speculatively and they have turned out to be useful and were later picked up
> in user-visible product features, e.g. in the (2) sense above; other
> times, for a huge range of reasons, schema.org additions may have been
> less successful. There are a variety of considerations including ease
> of adding the markup (e.g. does it match what a lot of major sites
> have in their databases already), and so on. It would be reasonable to
> expect a few more words in this direction on the schema.org site or
> its github to help guide discussions, ... but we really can't keep
> having the "but does/will Google/Bing/Yahoo/Yandex/etc explicitly use
> it?" thread here every 3 weeks, and it is not useful to speculate on
> the internal design of search engines on this mailing list. There are
> many other places on the Web devoted to speculating on how search
> engines might work internally. For our discussions here, it is best to
> focus more on the public contents of the Web. Given the success of
> schema.org, there is value in "rounding out" various areas of the
> vocabulary where there are simple fixable gaps in vocabulary coverage,
> regardless of whether they're expected to turn up explicitly used in
> product features in the short term. As we have attempted to do so
> we've also run into situations where there is risk of massive
> redundancy and complication, which is why in 2017 we'll need to give
> attention to issues around compositional terms
> (https://github.com/schemaorg/schemaorg/issues/1493) and to bridging
> with long-tail resources like Wikidata (e.g.
> https://github.com/schemaorg/schemaorg/issues/1186
> https://github.com/schemaorg/schemaorg/issues/280). As those efforts
> mature, the various notions of "supports"/"understands"/"uses" floating
> around will continue to evolve too.
>
> Hope this helps a little. The other thing I wanted to note is that the
> new "pending" area (see pending.schema.org) gives us an intermediate
> zone where we can park proposed terms, with a lower barrier to entry,
> alongside a slightly weaker sense that the terms there are broadly
> "supported" by consuming applications. For example, when ClaimReview
> was added there it was just an idea; a year later it is widely used on
> high profile fact-checking sites, as well as in consuming products...
>
> cheers,
>
> Dan
>
>
Received on Monday, 20 March 2017 15:44:02 UTC