Re: Consumer guidance from Ivan Herman on 2011-11-23 (public-html-data-tf@w3.org from November 2011)

From: Ivan Herman <ivan@w3.org>
Date: Wed, 23 Nov 2011 10:48:30 +0100
To: Jeni Tennison <jeni@jenitennison.com>
Cc: HTML Data Task Force WG <public-html-data-tf@w3.org>
Message-Id: <DB3B8571-D3FB-4A65-BC76-C66D5B37E38F@w3.org>
On Nov 22, 2011, at 22:48 , Jeni Tennison wrote:

> Ivan,
> 
> On 22 Nov 2011, at 10:05, Ivan Herman wrote:
>> I miss some other factors that may have to be listed as part of the publishing/consuming decision.
>> 
>> - Are you bound to one vocabulary or more? If only one, I guess RDFa/md/mf provide a more or less equal environment in this respect; but if you are bound to several vocabularies (now or in the future) within the content, e.g., to combine it with Linked Data, RDFa is much more appropriate.
> 
> Are you talking about for publishing or consuming?
> 
> For publishing, the page
> 
>  http://www.w3.org/wiki/Mixing_HTML_Data_Formats

Ah, my bad. It is just a bit unclear to me whether these different wiki pages will end up as one, two, or more documents in the final report...

> 
> talks about the mechanics of using multiple vocabularies. I'm not sure what to add there? Perhaps something more in the part in the page
> 
>  http://www.w3.org/wiki/Choosing_an_HTML_Data_Format#Publishing_in_Multiple_Formats
> 
> like:
> 
>  If your target consumers will all accept the same syntax, it is usually
>  easiest to use that single syntax in your pages. However, microdata does
>  not support multiple types for a single entity, so if your target 
>  consumers expect different vocabularies to be used for the same entities 
>  you may find it easier to mix syntaxes or use RDFa or microformats, which
>  do support multiple vocabularies.
> 
> Would that address your concern?

Yes.

> 
> On the consuming side, perhaps it's worth adding something like:
> 
>  While adopting existing vocabularies is generally a good idea, be aware
>  that it can be hard for publishers to use multiple vocabularies to
>  describe a single entity, particularly if they use microdata to do so.
>  It will generally work best to consume a single base vocabulary on top
>  of which you understand additional properties.
> 
> I don't know if that's along the lines you were thinking?


Hm. I am not 100% sure I understand what this means... I would rather say something along the lines that the consumer should be prepared for the fact that the published material may include references to other vocabularies (types, predicates, etc.) that the consumer does not necessarily know about. In such a case, the consumer should ignore those references, and their presence should by no means affect how it consumes the vocabulary items that it does understand. This is, for example, a very important aspect of schema.org that was not made clear at the initial announcement: publishers may safely mix schema.org and, say, GoodRelations terms for the same resource; a schema.org consumer will just pick its own terms out of the structure and live happily with that.
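
To make the "ignore what you do not know" behaviour concrete, here is a minimal sketch in Python. It assumes the consumer has already extracted an item into a plain dictionary with full-URI type and property names; that dictionary layout, the `pick_known_terms` helper, and the sample item are all hypothetical, not taken from any of the specifications.

```python
# Vocabulary prefixes this (hypothetical) consumer understands.
KNOWN_PREFIXES = ("http://schema.org/",)


def pick_known_terms(item, known_prefixes=KNOWN_PREFIXES):
    """Return a copy of `item` restricted to types and properties whose
    URIs start with one of the known vocabulary prefixes.

    Unknown terms are silently dropped; they never cause an error and
    never change how the known terms are read.
    """
    types = [t for t in item.get("types", []) if t.startswith(known_prefixes)]
    properties = {
        name: values
        for name, values in item.get("properties", {}).items()
        if name.startswith(known_prefixes)
    }
    return {"types": types, "properties": properties}


# Illustrative item that mixes schema.org and GoodRelations terms
# for the same resource (the values are made up).
item = {
    "types": [
        "http://schema.org/Product",
        "http://purl.org/goodrelations/v1#ProductOrService",
    ],
    "properties": {
        "http://schema.org/name": ["Example widget"],
        "http://purl.org/goodrelations/v1#name": ["Example widget"],
    },
}

result = pick_known_terms(item)
```

With this item, `result` keeps only the schema.org type and the schema.org `name` property; the GoodRelations terms are left for whichever consumer understands them.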

> 
>> - I am not sure you want to raise the datatype issue, but there are again differences there that may influence the publishing and consuming choices
> 
> OK, I think that probably comes under vocabulary design. As far as I can see, the only time the ability to annotate values with datatypes makes a difference is if the type of the value of a property cannot be inferred from the property and the syntax of the value. Personally, I've been convinced that vocabularies in which that's the case are hard to use and likely to lead to bad data.
> 

If you refer to an automatic inference of type, I tend to agree with you. What it means for publishers is that if the data and its consumption depend on datatypes (or at least would be significantly better using them), then RDFa, which provides a clear typing facility, is the better choice. (The only exception may be the <time> element.)
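
As a rough illustration of the two situations (type inferred from the property versus type stated explicitly by the publisher), here is a small Python sketch. The parser tables, the `parse_value` helper, and the `http://example.org/vocab#weight` property are assumptions made up for the example; only the XML Schema datatype URIs are real.

```python
from datetime import date
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Parsers for explicitly annotated datatypes (e.g. an RDFa datatype attribute).
DATATYPE_PARSERS = {
    XSD + "date": date.fromisoformat,
    XSD + "decimal": Decimal,
    XSD + "integer": int,
}

# Parsers the consumer infers from the property alone (illustrative choice).
PROPERTY_PARSERS = {
    "http://schema.org/birthDate": date.fromisoformat,
}


def parse_value(prop, lexical, datatype=None):
    """Convert a lexical value to a typed value.

    An explicit, recognised datatype wins; otherwise fall back to what the
    property implies; otherwise keep the plain string.
    """
    if datatype in DATATYPE_PARSERS:
        return DATATYPE_PARSERS[datatype](lexical)
    parser = PROPERTY_PARSERS.get(prop)
    return parser(lexical) if parser else lexical


inferred = parse_value("http://schema.org/birthDate", "2011-11-23")
annotated = parse_value("http://example.org/vocab#weight", "12.5", XSD + "decimal")
untyped = parse_value("http://example.org/vocab#weight", "12.5")
```

The last call shows the case I have in mind: without an annotation, and with a property the consumer cannot interpret, the value stays a plain string.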

> I think that's a good thing to mention in the vocabulary design page. I'll have a go at some wording...
> 
>> - If you rely on JavaScript scripting together with the structured data, there are again differences: microformats, as far as I know (may be wrong!), do not have a dedicated API; microdata has one as part of its definition; RDFa has some drafts around, but they are not at the same level of maturity as their counterpart in microdata. A somewhat similar issue is access to the data in JSON.
> 
> 
> Yes. The section on Tooling Considerations at
> 
>  http://www.w3.org/wiki/Choosing_an_HTML_Data_Format#Tooling_Considerations
> 
> is meant to cover that, but of course it's hard to give general advice there both because we can't list all available tools and because the tooling landscape changes so rapidly.

Sure. And listing explicit tools and libraries would not really be a good idea. 

But I think it is worth making clear that, at present at least, only microdata has an API as part of its specification; that is important if developers want to use, say, JavaScript (although, at this moment, I am not sure any of the browsers implement this API). We could/should also mention that similar work is being considered for RDFa, but its maturity (as of now) is not at the same level as microdata's.

Cheers

Ivan

> 
> But yes, it could do with being less mealy-mouthed. I'd welcome any suggested wording (or just edit the page).
> 
> Thanks,
> 
> Jeni
> -- 
> Jeni Tennison
> http://www.jenitennison.com
> 
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Wednesday, 23 November 2011 09:45:51 UTC