- From: Dan Brickley <danbri@google.com>
- Date: Fri, 25 Oct 2013 16:13:18 +1100
- To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
- Cc: Guha <guha@google.com>, W3C Vocabularies <public-vocabs@w3.org>
On 25 October 2013 15:37, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
> Strangenesses in schema.org, an incomplete list:
>
> Types as URLs. Properties as strings. Prescriptive property introductions.
> Closed set of types, particularly with open set of properties. Union
> ranges, particularly with sub and super properties. Single typing with a
> multiple-parent type hierarchy. URLs as a subset of text. URl vs sameAs
> property. additionalTypes property.

So we talked about this at some length here at ISWC, Peter. As I mentioned f2f, I think you're (understandably given our docs) pushing together some quite different kinds of issues.

A lot of your comments are specific to the Microdata syntax. Microdata can be seen as a fork of RDFa as it was in 2009, i.e. RDFa 1.0. The doc in http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019681.html which introduced it talks through the various ways in which the RDF-ness of RDFa was largely thrown out to form Microdata, even while the basic/surface user-facing markup patterns remained pretty close. Microdata became a lot simpler for publishers, but "threw out the baby with the bath water" in terms of having an RDF interpretation of data content.

Two later pieces of work are relevant: RDFa 1.1 especially RDFa Lite, which is a publisher-friendly and RDF-oriented view of RDFa 1.1, plus also Gregg's note on microdata/rdf mappings, http://www.w3.org/TR/microdata-rdf/ . RDFa Lite is very close to Microdata as far as publishers are concerned but more explicitly parses into real RDF. It is heavier for parser writers, but there are many more publishers than parser writers so that tradeoff seems reasonable.

At this stage in history, schema.org is pluralist w.r.t. syntaxes; there are 5+ million domains publishing Microdata, ... that format is not going away any time soon. But there are also important advantages to RDFa - e.g.
use of multiple types from independent vocabularies, as well as more explicit mapping to RDF graphs than given by the Microdata spec. We have posted (on blog.schema.org) posts that are positive about RDFa Lite, about JSON-LD (considered as an RDF notation), as well as Microdata. The common (RDF-based) data model gives some unity to this.

As you've noticed, the main schema.org site is still very Microdata-centric. I expect us to add more examples in other notations; already you can see some JSON-LD examples e.g. http://schema.org/WatchAction and nearby.

So some of your questions (e.g. properties 'as literals') relate to inadequacies of Microdata considered as a representational language. Although it might be possible to improve the Microdata spec to some extent, it is also appropriate to look to other representations like RDFa and JSON-LD, rather than trying to gradually mutate Microdata back into RDFa. Microdata is what it is, and it is not terribly hard to extract a plausible RDF graph from it even if that transformation is currently under-specified.

Beyond graph notation, there is another cluster of issues around search engine pragmatism regarding pre-processing of messy data, the trailing-slash extension model, strings-where-we-expect-things, etc. Personal view here:

a) the '/'-based extension mechanism proposed back in 2011 has not been a success and should be de-emphasised. It is not so useful to encourage people to write 'http://schema.org/Person/Minister'; better to migrate towards RDFa and using real independently declared and RDFS-documentable subtypes e.g. http://w3.org/ns/minister-vocab#Minister.
b) where we say we'll handle the appearance of a string in places where a thing is expected, we probably should say that we will treat that as a shortcut for saying 'the string is the value of the http://schema.org/name property of the thing'.

c) Mixing of URIs for things vs URIs for documents that describe those things - at this stage, messiness 'comes with the territory'. Everyone would like a cleaner distinction between abstract entities and the docs that describe them, but we won't get there in the mainstream Web by simply demanding it - the Linked Data community has learned that even amongst enthusiastic experts, publishing such data is hard.

As far as properties as properties go, you've noticed a few cases where we express in prose some notion of superproperty. There are also 'this property is replaced by that property' situations, e.g. http://schema.org/actors vs http://schema.org/actor. Third, if you look at the source files in W3C mercurial from which the schema.org site is generated, you'll see documentation of equivalentClass / equivalentProperty relationships in a few cases, e.g. https://dvcs.w3.org/hg/webschema/file/2d9d90bce7a0/schema.org/ext/dataset.html .

In all these cases it is reasonable to expect more from the schema.org site implementation (displaying this data, exposing it in RDF/RDFS etc.), and to have this documented in some updated account of the data model. But right now as Guha says, we only have a very informal, skeletal notion of properties of properties. A step towards this was creating per-property pages, that could carry such information in a simple user-facing way.

Dan
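
Point (b) above, treating a bare string as shorthand for a thing whose name is that string, could be sketched roughly as follows. This is only an illustrative sketch in Python, not a description of how any consumer of schema.org markup actually behaves: the function name `coerce_string_to_thing`, the dict layout (which borrows the `@type` key from JSON-LD), and the optional `expected_type` parameter are all assumptions made for the example.

```python
from typing import Any, Dict, Optional, Union

# The property named in point (b).
SCHEMA_NAME = "http://schema.org/name"


def coerce_string_to_thing(
    value: Union[str, Dict[str, Any]],
    expected_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: read a string found where a thing was
    expected as 'a thing whose schema.org name is that string'."""
    if isinstance(value, str):
        thing: Dict[str, Any] = {SCHEMA_NAME: value}
        # Assumption: attaching the expected type is optional and is
        # not something the message itself proposes.
        if expected_type is not None:
            thing["@type"] = expected_type
        return thing
    # Already a structured thing: pass it through unchanged.
    return value


if __name__ == "__main__":
    print(coerce_string_to_thing("Dan Brickley", "http://schema.org/Person"))
```

The design choice here is to leave already-structured values untouched, so the shortcut only applies when a publisher has supplied plain text.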
Received on Friday, 25 October 2013 05:13:46 UTC