Re: Propose astro.schema.org

Agreed with Dan.

Barry, to help you even more... Here's a nice project that actually aims to
bridge the gap between astrophysical models and publicly available data -
https://github.com/trillian/trillian

I would envision that at some point in the future, the work of Trillian
could be queried by consumers at web-scale, and optionally through new App
and Site Indexing technologies being developed by Google and other
stakeholders with Wikidata alignment.


Thad
+ThadGuidry <https://www.google.com/+ThadGuidry>

On Fri, May 29, 2015 at 6:46 AM, Dan Brickley <danbri@google.com> wrote:

> On 29 May 2015 at 10:59, Ed Summers <ehs@pobox.com> wrote:
> >
> >> On May 29, 2015, at 5:49 AM, mfhepp@gmail.com wrote:
> >>
> >> This will likely not pass the Alexa 1M test.
> >
> > I haven’t been paying attention :) what is the Alexa 1M test?
>
> I don't know either :)  Presumably roughly "used on a lot of major
> sites". But I haven't thought about Alexa in a while I must admit.
>
> There was some related discussion on public-vocabs about when
> something ought to go in the core versus be handled as an extension -
> https://lists.w3.org/Archives/Public/public-vocabs/2015May/0009.html
>
> Here's a distinction that I don't think we make often enough. It is
> fairly intuitive (I argue, without evidence):
>
> Schema.org's vocabularies are fundamentally for large scale
> *communication* of structured data. Its core vocabularies are much
> less likely to meet the needs of people choosing schemas to actually
> manage/store/create such data, i.e. for the source or master format.
>
> Putting things in those terms makes clear that there remains plenty of
> work to do in the non-schema.org RDF/OWL universe (as well as in
> extensions, where the distinction perhaps gets blurred).
>
> Publishing data using schema.org typically involves transformation,
> mapping and conversion from some underlying representation (which
> often enough for publishers will be SQL or Java interfaces or
> something custom or application-oriented e.g. Drupal's data storage
> abstractions). What you do in the privacy of your own database is
> entirely your own business. OWL and other Linked Data RDF vocabularies
> may or may not be useful there, depending on your situation. It is
> highly unlikely that schema.org's core vocabulary alone will be enough
> to be the sole, ultimate and underlying representation for most
> databases. Creating your own more focussed schemas/ontologies in
> RDFS/OWL, rather than using planet-wide ontologies, can help to bridge
> that gap. Creating more focussed schema.org extensions may also help -
> but this is new territory for us all. Schema.org's emphasis remains on
> Web-scale communication, rather than as a supplier of master formats.
>
> When you choose the terminology that actually structures your own
> database(s), every detail matters - subtleties of definition, quirks
> of the specific datasets and sources you're dealing with, versioning
> etc. When publishing such data for consumption elsewhere in the Web,
> particularly in search-oriented apps, it is natural to trade some of
> that fine grained control for a wider audience. So for the astro case,
> it might be that carefully modeled independent ontologies would be the
> way to go within the astronomy community, but that once this work is
> done or identified it could be mapped into a schema.org extension. As
> Martin notes this is pretty much the route taken with the Good
> Relations, Automotive etc work. There is no reason that independent
> astro-oriented ontologies couldn't also be expressed as an
> astro.schema.org extension, and perhaps this would help make the
> resulting data markup clearer and easier for publishers. It wouldn't
> guarantee search engines would suddenly start adding astro-related
> search features, any more than the presence of "Volcano" in
> schema.org's core (i.e. http://schema.org/Volcano) has led to many
> volcano-oriented search features. But it could help focus discussion
> on the astronomical aspects rather than on all the supporting
> background vocabulary that is also needed.
>
> We have created the updated Extensions mechanism to help reduce the
> gap between custom schemas and schema.org, rather than to replace
> non-schema.org schemas.
>
> From my own perspective schema.org serves much of the purpose that
> FOAF originally aimed at: a general "utility" vocabulary to bootstrap
> this kind of structured data sharing - e.g. in
>
> https://web.archive.org/web/20140331104046/http://www.foaf-project.org/original-intro
> FOAF (then "RDFWeb") was described as a "starter vocabulary".
> See also http://www.w3.org/TR/NOTE-MCF-XML/#secA. for an earlier
> bootstrap vocab which inspired FOAF. Schema.org is also a starter
> vocabulary, but it is on a larger scale (number of sites, size of
> vocabulary, impact of consuming apps) than we ever achieved with FOAF.
>
> I would say three things stopped FOAF itself evolving to serve as such
> a "starter vocabulary":
>
> 1. We were too cautious about the size of the schema; 100 terms seemed
> at the time terribly large. By relying on the multiple independent
> namespaces RDFS approach we pushed complexity onto publishers, who
> suffered from the lack of "attention to detail" coordination across
> vocabularies. Real world descriptive problems do not map cleanly onto
> independently managed RDF namespaces, resulting in fragmentation and
> confusion on how best to express various situations in RDF.
>
> 2. Few consuming applications. We made some fun demos and prototypes,
> but there was relatively little serious consumption of the data.
> Without high profile consumption, data quality suffers and mainstream
> publishers are not motivated, so it remains hard to break out of the
> early-adopter tech/standards/research scene.
>
> 3. We were too cautious about evolving the schemas once they were
> being used on many sites. Early design errors and compromises got
> frozen in.
>
> The approach at schema.org differs on all of these three points:
>
> - the vocabulary is much larger, allowing many common scenarios to be
> described purely by schema.org terms.
> - there is an explicit up-front link to large scale consumption of the
> data by mainstream user-facing applications
> - the schemas are constantly being tweaked and improved, even
> including name changes and redesigns when we think they improve
> usability and integration.
>
> This last point brings me back to my first: if you are choosing the
> underlying format for managing your data, this kind of constant
> improvement can be ... annoying. While we do make frozen snapshots
> available (see http://schema.org/version/ ), the general approach
> we take is to keep improving and integrating things.  This is not
> something that those of us in the wider RDF community have done enough
> of, in terms of improving how independently managed vocabularies fit
> together. The hope with schema.org extensions is that we can find a
> balance and have more decentralization of vocabulary creation while
> still keeping a broad community communicating here who care about
> integration and consistency, while acknowledging that changes often
> have to be incremental and pragmatic.
>
> For example, the Bibliographic extensions community at
> https://www.w3.org/community/schemabibex/ ... automotive ontologies at
> https://www.w3.org/community/gao/ and most recently a proposed
> health/medical extensions community, see
>
> https://www.w3.org/community/blog/2015/05/21/proposed-group-healthcare-ontology-community-group/
>  ... these all have scope to go deeper into their focus areas than
> core schema.org efforts. But they keep some connection also via
> schema.org core vocabulary, so that shared notions of CreativeWork,
> Organization, Event etc etc don't get repeatedly re-invented. For
> Astronomy I'd suggest perhaps a W3C Community Group would also make
> sense, both to draw together existing work, sites/publishers and
> toolmakers who are interested to collaborate on shared vocabulary, but
> also to give a coordination point so that we can keep a conversation
> open on how these things all fit together. At this stage I think
> studying what's out there is of far greater value than worrying about
> what should go in to a hypothetical astro.schema.org or into any new
> ontologies...
>
> verbosely,
>
> Dan
>
>
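
As an illustration of Dan's point that publishing with schema.org usually
means mapping from some richer underlying representation, here is a minimal
sketch. Everything about the internal record (its field names, the
example.org URL, the "calibration_version" detail) is hypothetical; only the
schema.org Dataset type and its name, description, url and keywords
properties are real terms. It is one possible mapping, not a recommended
astro vocabulary.

```python
import json

# Hypothetical internal record: field names and values are illustrative
# assumptions, standing in for whatever a database really stores.
internal_record = {
    "catalog_id": "EX-0001",
    "title": "Example sky survey catalogue",
    "summary": "Illustrative catalogue entry.",
    "landing_page": "https://example.org/catalogues/EX-0001",
    "tags": ["astronomy", "survey"],
    "calibration_version": "v3.2",  # internal detail with no mapped counterpart
}


def to_schema_org(record):
    """Map the internal record onto schema.org's Dataset type as JSON-LD.

    Only fields with an obvious schema.org counterpart are carried over;
    the rest stay in the source database.
    """
    return {
        "@context": "http://schema.org",
        "@type": "Dataset",
        "name": record["title"],
        "description": record["summary"],
        "url": record["landing_page"],
        "keywords": ", ".join(record["tags"]),
    }


jsonld = to_schema_org(internal_record)
print(json.dumps(jsonld, indent=2))
```

Note that the fine-grained internal fields are simply not published here,
which is the trade of control for audience described above.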

Received on Friday, 29 May 2015 14:24:51 UTC