- From: Thad Guidry <thadguidry@gmail.com>
- Date: Fri, 29 May 2015 09:24:23 -0500
- To: Dan Brickley <danbri@google.com>
- Cc: Ed Summers <ehs@pobox.com>, "mfhepp@gmail.com" <mfhepp@gmail.com>, Barry Carter <carter.barry@gmail.com>, "schema.org Mailing List" <public-schemaorg@w3.org>
- Message-ID: <CAChbWaOP-Tt9nog31Zu3oqC1=DSxgm5dcgnAFYfq6AVE_7_UMQ@mail.gmail.com>
Agreed with Dan.

Barry, to help you even more... Here's a nice project that actually aims to bridge the gap between astrophysical models and publicly available data - https://github.com/trillian/trillian

I would envision that at some point in the future, the work of Trillian could be queried by consumers at web-scale, and optionally through new App and Site Indexing technologies being developed by Google and other stakeholders with Wikidata alignment.

Thad
+ThadGuidry <https://www.google.com/+ThadGuidry>

On Fri, May 29, 2015 at 6:46 AM, Dan Brickley <danbri@google.com> wrote:

> On 29 May 2015 at 10:59, Ed Summers <ehs@pobox.com> wrote:
> >
> >> On May 29, 2015, at 5:49 AM, mfhepp@gmail.com wrote:
> >>
> >> This will likely not pass the Alexa 1M test.
> >
> > I haven’t been paying attention :) what is the Alexa 1M test?
>
> I don't know either :) Presumably roughly "used on a lot of major sites". But I haven't thought about Alexa in a while I must admit.
>
> There was some related discussion on public-vocabs about when something ought to go in the core versus be handled as an extension - https://lists.w3.org/Archives/Public/public-vocabs/2015May/0009.html
>
> Here's a distinction that I don't think we make often enough. It is fairly intuitive (I argue, without evidence):
>
> Schema.org's vocabularies are fundamentally for large scale *communication* of structured data. Its core vocabularies are much less likely to meet the needs of people choosing schemas to actually manage/store/create such data, i.e. for the source or master format.
>
> Putting things in those terms makes clear that there remains plenty of work to do in the non-schema.org RDF/OWL universe (as well as in extensions, where the distinction perhaps gets blurred).
>
> Publishing data using schema.org typically involves transformation, mapping and conversion from some underlying representation (which often enough for publishers will be SQL or Java interfaces or something custom or application-oriented e.g. Drupal's data storage abstractions). What you do in the privacy of your own database is entirely your own business. OWL and other Linked Data RDF vocabularies may or may not be useful there, depending on your situation. It is highly unlikely that schema.org's core vocabulary alone will be enough to be the sole, ultimate and underlying representation for most databases. Creating your own more focussed schemas/ontologies in RDFS/OWL, rather than using planet-wide ontologies, can help to bridge that gap. Creating more focussed schema.org extensions may also help - but this is new territory for us all. Schema.org's emphasis remains on Web-scale communication, rather than as a supplier of master formats.
>
> When you choose the terminology that actually structures your own database(s), every detail matters - subtleties of definition, quirks of the specific datasets and sources you're dealing with, versioning etc. When publishing such data for consumption elsewhere in the Web, particularly in search-oriented apps, it is natural to trade some of that fine grained control for a wider audience. So for the astro case, it might be that carefully modeled independent ontologies would be the way to go within the astronomy community, but that once this work is done or identified it could be mapped into a schema.org extension. As Martin notes this is pretty much the route taken with the Good Relations, Automotive etc work. There is no reason that independent astro-oriented ontologies couldn't also be expressed as an astro.schema.org extension, and perhaps this would help make the resulting data markup clearer and easier for publishers.
> It wouldn't guarantee search engines would suddenly start adding astro-related search features, any more than the presence of "Volcano" in schema.org's core (i.e. http://schema.org/Volcano) has led to many volcano-oriented search features. But it could help focus discussion on the astronomical aspects rather than on all the supporting background vocabulary that is also needed.
>
> We have created the updated Extensions mechanism to help reduce the gap between custom schemas and schema.org, rather than to replace non-schema.org schemas.
>
> From my own perspective schema.org serves much of the purpose that FOAF originally aimed at: a general "utility" vocabulary to bootstrap this kind of structured data sharing - e.g. in
> https://web.archive.org/web/20140331104046/http://www.foaf-project.org/original-intro
> FOAF (then "RDFWeb") was described as a "starter vocabulary". See also http://www.w3.org/TR/NOTE-MCF-XML/#secA. for an earlier bootstrap vocab which inspired FOAF. Schema.org is also a starter vocabulary, but it is on a larger scale (number of sites, size of vocabulary, impact of consuming apps) than we ever achieved with FOAF.
>
> I would say three things stopped FOAF itself evolving to serve as such a "starter vocabulary":
>
> 1. We were too cautious about the size of the schema; 100 terms seemed at the time terribly large. By relying on the multiple independent namespaces RDFS approach we pushed complexity onto publishers, who suffered from the lack of "attention to detail" coordination across vocabularies. Real world descriptive problems do not map cleanly onto independently managed RDF namespaces, resulting in fragmentation and confusion on how best to express various situations in RDF.
>
> 2. Few consuming applications. We made some fun demos and prototypes, but there was relatively little serious consumption of the data.
> Without high profile consumption, data quality suffers and mainstream publishers are not motivated, so it remains hard to break out of the early-adopter tech/standards/research scene.
>
> 3. We were too cautious about evolving the schemas once they were being used on many sites. Early design errors and compromises got frozen in.
>
> The approach at schema.org differs on all of these three points:
>
> - the vocabulary is much larger, allowing many common scenarios to be described purely by schema.org terms.
> - there is an explicit up-front link to large scale consumption of the data by mainstream user-facing applications
> - the schemas are constantly being tweaked and improved, even including name changes and redesigns when we think they improve usability and integration.
>
> This last point brings me back to my first: if you are choosing the underlying format for managing your data, this kind of constant improvement can be ... annoying. We do make frozen snapshots available (see http://schema.org/version/ ), but the general approach we take is to keep improving and integrating things. This is not something that those of us in the wider RDF community have done enough of, in terms of improving how independently managed vocabularies fit together. The hope with schema.org extensions is that we can find a balance and have more decentralization of vocabulary creation while still keeping a broad community communicating here who care about integration and consistency, while acknowledging that changes often have to be incremental and pragmatic.
>
> For example, the Bibliographic extensions community at https://www.w3.org/community/schemabibex/ ... automotive ontologies at https://www.w3.org/community/gao/ and most recently a proposed health/medical extensions community, see
> https://www.w3.org/community/blog/2015/05/21/proposed-group-healthcare-ontology-community-group/
> ...
> these all have scope to go deeper into their focus areas than core schema.org efforts. But they also keep some connection via the schema.org core vocabulary, so that shared notions of CreativeWork, Organization, Event etc etc don't get repeatedly re-invented. For Astronomy I'd suggest perhaps a W3C Community Group would also make sense, both to draw together existing work, sites/publishers and toolmakers who are interested in collaborating on shared vocabulary, and to give a coordination point so that we can keep a conversation open on how these things all fit together. At this stage I think studying what's out there is of far greater value than worrying about what should go into a hypothetical astro.schema.org or into any new ontologies...
>
> verbosely,
>
> Dan
Received on Friday, 29 May 2015 14:24:51 UTC