
Re: schema.org as reconstructed from the human-readable information at schema.org

From: Ivan Herman <ivan@w3.org>
Date: Fri, 25 Oct 2013 09:13:46 +0200
Cc: Dan Brickley <danbri@google.com>, Guha <guha@google.com>, W3C Vocabularies <public-vocabs@w3.org>
Message-Id: <D386F041-A87F-4759-AC9B-FDFA67D9E7F4@w3.org>
To: Peter F. Patel-Schneider <pfpschneider@gmail.com>

Hi Peter,

just chiming in... DanBri produced this at some point (Dan, I do not know whether it is the up-to-date version):

http://schema.org/docs/schema_org_rdfa.html

yielding

http://www.w3.org/2012/pyRdfa/extract?uri=http%3A%2F%2Fschema.org%2Fdocs%2Fschema_org_rdfa.html

in Turtle, or

http://www.w3.org/2012/pyRdfa/extract?uri=http%3A%2F%2Fschema.org%2Fdocs%2Fschema_org_rdfa.html&format=json

in JSON-LD. Does this help for some of the details?
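
For anyone who wants to build those extractor links for other pages, here is a minimal sketch in Python. It only constructs the URL strings shown above and does not fetch anything; the function name `extract_url` is made up for illustration, and the only query parameters it assumes are the `uri` and `format` ones that appear in the links in this message.

```python
from urllib.parse import urlencode

# Base address of the extractor, taken from the links in this message.
PYRDFA_EXTRACT = "http://www.w3.org/2012/pyRdfa/extract"


def extract_url(page_url, fmt=None):
    """Build an extractor link for page_url (hypothetical helper).

    With fmt=None the link has only the `uri` parameter (the Turtle link
    above); with fmt="json" it matches the JSON-LD link above.
    """
    params = {"uri": page_url}
    if fmt is not None:
        params["format"] = fmt
    # urlencode percent-encodes ':' and '/' inside the parameter values.
    return PYRDFA_EXTRACT + "?" + urlencode(params)


if __name__ == "__main__":
    page = "http://schema.org/docs/schema_org_rdfa.html"
    print(extract_url(page))          # link for the Turtle output
    print(extract_url(page, "json"))  # link for the JSON-LD output
```

Whether the service accepts other `format` values is not something this sketch assumes.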

Ivan


On Oct 25, 2013, at 08:15 , Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:

> So maybe all these problems are with the presentation instead of the reality, but where is there a pointer to an examinable reality?  Either provide some examinable reality or some better presentation.
> 
> This is not to say that there shouldn't be a getting started document, which probably wouldn't have to change much, but there really isn't anything beyond the getting started document.
> 
> This is also not to say that there is anything at all wrong with microdata.  In fact, it could be quite useful to provide an RDF format that only needs short identifiers (and no namespaces).
> 
> peter
> 
> On Oct 24, 2013, at 10:13 PM, Dan Brickley <danbri@google.com> wrote:
> 
>> On 25 October 2013 15:37, Peter F. Patel-Schneider
>> <pfpschneider@gmail.com> wrote:
>>> Strangenesses in schema.org, an incomplete list:
>>> 
>>> Types as URLs.  Properties as strings.  Prescriptive property introductions.
>>> Closed set of types, particularly with open set of properties.  Union
>>> ranges, particularly with sub and super properties.  Single typing with a
>>> multiple-parent type hierarchy.  URLs as a subset of text. URL vs sameAs
>>> property.  additionalTypes property.
>> 
>> So we talked about this at some length here at ISWC, Peter. As I
>> mentioned f2f, I think you're (understandably given our docs) pushing
>> together some quite different kinds of issues. A lot of your comments
>> are specific to the Microdata syntax. Microdata can be seen as a fork
>> of RDFa as it was in 2009, i.e. RDFa 1.0. The doc in
>> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019681.html
>> which introduced it talks through the various ways in which the
>> RDF-ness of RDFa was largely thrown out to form Microdata, even while
>> the basic/surface user-facing markup patterns remained pretty close.
>> Microdata became a lot simpler for publishers, but "threw out the baby
>> with the bath water" in terms of having an RDF interpretation of data
>> content. Two later pieces of work are relevant: RDFa 1.1 especially
>> RDFa Lite, which is a publisher-friendly and RDF-oriented view of RDFa
>> 1.1, plus also Gregg's note on microdata/rdf mappings,
>> http://www.w3.org/TR/microdata-rdf/ . RDFa Lite is very close to
>> Microdata as far as publishers are concerned but more explicitly
>> parses into real RDF. It is heavier for parser writers, but there are
>> many more publishers than parser writers so that tradeoff seems
>> reasonable. At this stage in history, schema.org is pluralist w.r.t.
>> syntaxes; there are 5+ million domains publishing Microdata, ... that
>> format is not going away any time soon. But there are also important
>> advantages to RDFa - e.g. use of multiple types from independent
>> vocabularies, as well as more explicit mapping to RDF graphs than
>> given by the Microdata spec. We have posted (on blog.schema.org) posts
>> that are positive about RDFa Lite, about JSON-LD (considered as an RDF
>> notation), as well as Microdata. The common (RDF-based) data model
>> gives some unity to this. As you've noticed, the main schema.org site
>> is still very Microdata-centric. I expect us to add more examples in
>> other notations; already you can see some JSON-LD examples e.g.
>> http://schema.org/WatchAction and nearby.
>> 
>> So some of your questions (e.g. properties 'as literals') relate to
>> inadequacies of Microdata considered as a representational language.
>> Although it might be possible to improve the Microdata spec to some
>> extent, it is also appropriate to look to other representations like
>> RDFa and JSON-LD, rather than trying to gradually mutate Microdata
>> back into RDFa. Microdata is what it is, and it is not terribly hard
>> to extract a plausible RDF graph from it even if that transformation
>> is currently under-specified.
>> 
>> Beyond graph notation, there is another cluster of issues around
>> search engine pragmatism regarding pre-processing of messy data, the
>> trailing-slash extension model, strings-where-we-expect-things, etc.
>> Personal view here: a) the '/'-based extension mechanism proposed back
>> in 2011 has not been a success and should be de-emphasised. It is not
>> so useful to encourage people to write
>> 'http://schema.org/Person/Minister'; better to migrate towards RDFa
>> and using real independently declared and RDFS-documentable subtypes
>> e.g. http://w3.org/ns/minister-vocab#Minister. b) where we say we'll
>> handle the appearance of a string in places where a thing is
>> expected, we probably should say that we will treat that as a shortcut
>> for saying 'the string is the value of the http://schema.org/name
>> property of the thing'. c) Mixing of URIs for things vs URIs for
>> documents that describe those things - at this stage, messiness 'comes
>> with the territory'. Everyone would like a cleaner distinction between
>> abstract entities and the docs that describe them, but we won't get
>> there in the mainstream Web by simply demanding them - the Linked Data
>> community has learned that even amongst enthusiastic experts,
>> publishing such data is hard.
>> 
>> As far as properties as properties go, you've noticed a few cases
>> where we express in prose some notion of superproperty. There are also
>> 'this property is replaced by that property' situations, e.g.
>> http://schema.org/actors vs http://schema.org/actor. Third, if you
>> look at the source files in W3C mercurial from which the schema.org
>> site is generated, you'll see documentation of equivalentClass /
>> equivalentProperty relationships in a few cases, e.g.
>> https://dvcs.w3.org/hg/webschema/file/2d9d90bce7a0/schema.org/ext/dataset.html
>> . In all the cases it is reasonable to expect more from the schema.org
>> site implementation (displaying this data, exposing it in RDF/RDFS
>> etc.), and documenting in some updated account of the data model. But
>> right now as Guha says, we only have a very informal, skeletal notion
>> of properties of properties. A step towards this was creating
>> per-property pages, that could carry such information in a simple
>> user-facing way.
>> 
>> Dan
> 
> 


----
Ivan Herman, W3C 
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf






Received on Friday, 25 October 2013 07:14:15 UTC
