(RDFa, RDFa Lite, microdata…) Re: question about medical code from Ivan Herman on 2013-01-27 (public-schemabibex@w3.org from January 2013)

From: Ivan Herman <ivan@w3.org>
Date: Sun, 27 Jan 2013 12:04:58 +0100
To: kcoyle@kcoyle.net
Cc: public-schemabibex@w3.org
Message-Id: <1CE42BC8-0FF6-4D39-BC61-63AC567F6B93@w3.org>
Dear all,

as Karen indicated, I was on a trip and could not comment earlier on the RDFa/microdata/RDFa Lite/microformats issues related to schema.org. I think it may be good for the discussion to clarify the situation. My apologies to those who know these things already (some things have already been said on this thread, too).

For a full disclosure: I worked on the RDFa specs a lot, as well as implementations. I "think" in terms of RDF and RDFa and not in terms of microdata. This may make me biased, but I do my best...

First of all, we should take microformats[1] out of the equation once and for all. Schema.org is completely silent on microformats and I do not know of any plans for schema.org to use them.

Schema.org is primarily a model and a generic vocabulary. This sounds obvious, but this means that there may be several syntaxes expressing the same model; for my mental model an RDF encoding of the data provides a syntax independent view of that data.

At the moment, the schema.org pages refer to microdata[2] as the syntax they advise their webmasters to use. That may change in future insofar as schema.org may also add RDFa Lite (see below) as an alternative syntax, but it is not expected that they would abandon microdata. But it is important to realize that microdata can be mapped to RDF. There is a W3C Note on the algorithm to follow[3]; there are several implementations of that algorithm (including mine[4]).

(In other words, microdata can be considered a syntax for a subset of RDF: certain things, like the usage of datatypes or multiple types for one resource, are difficult or impossible to express in microdata. There are also structural limitations in expressing RDF with microdata: it is difficult (though not impossible) to express complex graph structures (as opposed to trees or forests), or data that uses several different vocabularies concurrently.)

RDFa 1.1 Full[5] comes from the RDF world, and is a complete serialization of RDF in terms of XML and/or HTML attributes. It is complete and therefore more (some would say many, many times more) complex than microdata. It has proven to be too complex for the purposes of a (structurally) simple set of metadata like what schema.org uses; hence the definition of a subset of RDFa 1.1, called RDFa 1.1 Lite[6]. RDFa Lite has the same level of complexity (or simplicity:-) as microdata; it is almost possible to make a simple 1-1 exchange of terms to get from microdata to RDFa Lite or back. The major difference I personally see (in favour of RDFa) is RDFa's ability to mix vocabularies easily within the same set of metadata. For the usage of schema.org this may not be important *now* but it may come in handy in future.
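
To illustrate the "almost 1-1 exchange of terms" between microdata and RDFa Lite, here is a minimal sketch in Python. The attribute names on both sides (itemscope, itemtype, itemprop, itemid; vocab, typeof, property, resource) come from the respective specifications, but the function microdata_to_rdfa_lite is hypothetical and only covers the simple case: it assumes the type URL ends in "/Term", and it does not handle an itemscope without an itemtype, which is one place where the exchange stops being strictly 1-1.

```python
def microdata_to_rdfa_lite(attrs):
    """Rename one element's microdata attributes (a dict) to RDFa Lite ones.

    Hypothetical helper for illustration only; not part of any library.
    """
    out = {}
    for name, value in attrs.items():
        if name == "itemscope":
            # No direct counterpart: in RDFa Lite, typeof (derived from
            # itemtype below) is what opens a new item.
            continue
        elif name == "itemtype":
            # Assumption: the type looks like "http://schema.org/Person".
            # Split it into a vocabulary URL and a bare term.
            vocab, _, term = value.rpartition("/")
            out["vocab"] = vocab + "/"
            out["typeof"] = term
        elif name == "itemprop":
            out["property"] = value
        elif name == "itemid":
            out["resource"] = value
        else:
            # Ordinary HTML attributes (href, class, ...) pass through.
            out[name] = value
    return out


person = microdata_to_rdfa_lite(
    {"itemscope": "", "itemtype": "http://schema.org/Person"}
)
name = microdata_to_rdfa_lite({"itemprop": "name"})
```

With these inputs, person carries vocab and typeof attributes in place of itemscope and itemtype, and name carries property in place of itemprop.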

What is important to note: if the same HTML file is annotated with RDFa Lite or with the corresponding microdata, the generated RDF (using [3] for the microdata case) will be essentially identical ('essentially' meaning that some extra 'administrative' triples may be added beyond the core metadata, but that can be ignored by applications). Hence my approach of looking at schema.org data in RDF.
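
As a rough sketch of what "essentially identical" could mean in practice, the Python snippet below compares two hand-written sets of triples after dropping the administrative ones. Everything here is illustrative: the triples are made up for the example (they are not the output of any real processor), the example.org URLs are placeholders, and the two predicates listed in ADMIN_PREDICATES are assumptions about which administrative triples an RDFa or microdata-to-RDF processor might add. The object of the microdata triple is also simplified to a single resource.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumed 'administrative' predicates; adjust to whatever the
# processors in use actually emit.
ADMIN_PREDICATES = {
    "http://www.w3.org/ns/rdfa#usesVocabulary",
    "http://www.w3.org/ns/md#item",
}


def core_triples(triples, admin=ADMIN_PREDICATES):
    """Return the set of triples whose predicate is not administrative."""
    return {t for t in triples if t[1] not in admin}


# Placeholder identifiers for the page and the item it describes.
doc = "http://example.org/page"
item = "http://example.org/page#ivan"

# Illustrative triples as they might come from the RDFa Lite version...
from_rdfa_lite = [
    (doc, "http://www.w3.org/ns/rdfa#usesVocabulary", "http://schema.org/"),
    (item, RDF_TYPE, "http://schema.org/Person"),
    (item, "http://schema.org/name", "Ivan Herman"),
]

# ...and from the microdata version of the same page.
from_microdata = [
    (doc, "http://www.w3.org/ns/md#item", item),
    (item, RDF_TYPE, "http://schema.org/Person"),
    (item, "http://schema.org/name", "Ivan Herman"),
]

same_core = core_triples(from_rdfa_lite) == core_triples(from_microdata)
```

An application that only cares about the core metadata can apply this kind of filter and then treat both syntaxes alike.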

To make things a little bit more complicated: RDFa 1.1 is defined for all kinds of XML dialects; it is not yet, officially, final for HTML5[7,8]. The reason for this delay is that it has to be formally aligned with some of the HTML5 features that are not yet final either.

Where does this all take us in this discussion?

 - it is perfectly o.k. to use Turtle/N3 as encoding of what metadata structures this group wants to define. It is o.k. because, at the end of the day, for many applications that is what counts (yes, I am biased, I think in terms of RDF...)
 - having the same codes in RDFa Lite as well as in microdata is good. It is good to see the complexity of the encoding in both
 - indeed, it is better *not* to use RDFa Full features. Having had a cursory look at the Wiki page (and I intend to add a comment later on that) I believe this is all fine, because the examples seem to abide by Lite already.

I hope this was helpful for some of you, and not too boring for others...

Ivan

P.S. Before you ask: why have RDFa Lite as well as microdata? The answer is: this is a historical artefact, the result of the fact that even standardization bodies are made up of humans:-)



[1] http://microformats.org
[2] http://www.w3.org/TR/microdata/
[3] http://www.w3.org/TR/microdata-rdf/
[4] http://www.w3.org/2012/pyMicrodata/
[5] http://www.w3.org/TR/rdfa-syntax/
[6] http://www.w3.org/TR/rdfa-lite/
[7] http://www.w3.org/TR/rdfa-in-html/
[8] http://www.w3.org/2010/02/rdfa/sources/rdfa-in-html/



On Jan 26, 2013, at 18:49 , Karen Coyle <kcoyle@kcoyle.net> wrote:

> 
> 
> On 1/26/13 9:21 AM, Jason Ronallo wrote:
> 
>> 
>> First, we ought not to confuse Microdata [1] with Microformats [2].
>> While the Schema.org partners have chosen to consume Microdata and
>> RDFa Lite, they have not agreed to support Microformats beyond some
>> they already consume.
> 
> Thanks, Jason, for the clarification. I need to sit down and memorize those definitions.
> 
> 
>> 
>> I don't think this group should try to make any recommendation that
>> would work in RDFa and not work in the more constrained RDFa Lite or
>> Microdata, since it is these syntaxes that the Schema.org partners
>> have agreed to consume.
> 
> That makes perfect sense to me. However, since I am not a coder (this should be obvious to all by now :-)), does this mean that any of the recommendations we have on our wiki need to change? I note that some of them do not have Microdata/RDFa lite examples, and therefore I simply don't know if they are compliant or not. Could someone with more coding knowledge take on this task? And should I drop the N3 example from the Identifiers-2 page?
> 
> Thanks, and sorry if this makes more work for others.
> 
> kc
> 
> 
>> 
>>> The question seems to be whether RDFa compliance is to be the test for every
>>> proposal for schema.org vocabularies. It definitely does not seem to have
>>> been in the past. Perhaps we need to ask this of DanBri?
>> 
>> I think you can refer to the Compliance section of this page:
>> http://schema.org/docs/datamodel.html
>> 
>> "While we would like all the markup we get to follow the schema, in
>> practice, we expect a lot of data that does not. We expect schema.org
>> properties to be used with new types. We also expect that often, where
>> we expect a property value of type Person, Place, Organization or some
>> other subClassOf Thing, we will get a text string. In the spirit of
>> "some data is better than none", we will accept this markup and do the
>> best we can."
>> 
>> I'm not sure what is meant by RDFa compliance, since RDFa Lite is
>> completely compliant with RDFa. It may not be as powerful and expose
>> all of the features we might like, but it is still compliant. I think
>> the best we can do is to provide examples of what would be best (like
>> provide a URI when possible), but expect that some publishers will
>> just enter a text string. The onus for complexity and sorting out poor
>> data is on the consumers and on the producers.
>> 
>> Jason
>> 
>> 
>> [1] Microdata http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html
>> [2] Microformats http://microformats.org/
>> 
> 
> -- 
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Sunday, 27 January 2013 11:05:25 UTC