Re: Comments about microdata from Ian Hickson on 2009-06-09 (public-html@w3.org from June 2009)

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 9 Jun 2009 21:58:24 +0000 (UTC)
To: Henri Sivonen <hsivonen@iki.fi>
Cc: "public-html@w3.org WG" <public-html@w3.org>
Message-ID: <Pine.LNX.4.62.0906092123250.1648@hixie.dreamhostps.com>

On Wed, 13 May 2009, Henri Sivonen wrote:
> 
> (Bikeshed alert.) itemprop and subject aren't in concordance. E.g. 
> 'prop' and 'subj' or 'p' and 's' would be.

We could do itemsubject="" I guess, but I'm not convinced this is a 
serious problem.

> itemprop may take an 'absolute URL'. However, an 'absolute URL' is 
> defined in terms of resolving onto itself and the algorithm for 
> resolving URIfies URLs. Therefore, it seems that an absolute IRI cannot 
> be an absolute URL. Since RDF seems to allow IRIs as properties and the 
> current practice is to allow IRIs in various places, it seems to me that 
> itemprop should take absolute IRIs instead of absolute URLs or the 
> definition of absolute URL should be adjusted.

I believe this issue has been raised with Dan's draft; yes?

> Is there a reason why the value of various properties is taken to be the 
> textContent as such without performing white space normalization 
> (zapping leading and trailing white space and collapsing runs of white 
> space into single U+0020)?

It is simpler to just use textContent than to try to Do The Right Thing 
here. Consider a case where the whitespace is important (e.g. ASCII art). 
It seems easier to just require users of the data to apply normalisation 
where they need it than to require it to be applied uniformly. (There is 
no reason to believe that the normalisation can't be applied later, as far 
as I can tell.)
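
For illustration, the normalisation described in the question (trim the
ends, collapse each run of white space to a single U+0020) could be applied
by a consumer after extraction. This is only a minimal sketch in Python: the
function name is hypothetical, and the set of characters treated as white
space (space, tab, LF, FF, CR) is an assumption.

```python
import re

# Assumed white space set: space, tab, line feed, form feed, carriage return.
_SPACE_RUN = re.compile(r"[ \t\n\f\r]+")

def normalize_whitespace(text: str) -> str:
    """Collapse each run of white space to one U+0020, then trim both ends."""
    collapsed = _SPACE_RUN.sub(" ", text)
    # After the substitution, any leading or trailing run is a single space.
    return collapsed.strip(" ")

# A consumer that wants normalised values opts in after extraction:
raw = "  Ian \n\t Hickson  "
clean = normalize_whitespace(raw)

# A consumer that cares about layout (e.g. ASCII art) keeps the raw value:
art = " /\\_/\\ \n( o.o )"
flattened = normalize_whitespace(art)  # the line structure is lost here
```

Because the step is lossy, applying it on the consumer side keeps the raw
value available to consumers that need it.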

There is a broader problem of alt="" and <bdo> being dropped by 
textContent. I'm not sure what to do about this. It affects other 
non-microdata situations also.
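
To make the alt="" and <bdo> point concrete, here is a rough Python
approximation of textContent that concatenates text nodes only. It is a
sketch, not the DOM algorithm: the class and function names are hypothetical,
and Python's html.parser is not an HTML5-conformant parser.

```python
from html.parser import HTMLParser

class _TextCollector(HTMLParser):
    """Collects character data only; tags and attributes are ignored."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def text_content(markup: str) -> str:
    """Approximate textContent: join the text nodes in document order."""
    parser = _TextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)

# The alt text never appears, because it lives in an attribute:
with_image = text_content('I <img src="heart.png" alt="love"> microdata')

# The characters inside <bdo> survive, but the direction override is gone:
with_bdo = text_content('<bdo dir="rtl">abc</bdo>')
```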

> Why doesn't the conversion to JSON expose top-level link relations, 
> metadata items, cites and title? I think it should expose the same data 
> as the RDF conversion (except perhaps lang, because annotating each 
> value with lang in JSON would be cumbersome).

The goals of the JSON and RDF outputs are different. The goal of the JSON 
output is just to expose the data required for the use cases involving the 
drag-and-drop API. We want to keep that more or less to a minimum of 
required work for responsiveness. The goal for the RDF conversion is to 
get as much of the data as possible into a form that can be used by RDF 
systems, so that people aren't tempted to use microdata where there are 
better HTML mechanisms in place already.

> When the <time> element is used to create an RDF literal, the literal 
> should probably come with an appropriate datatype.

I considered doing that but it would probably double the complexity of an 
HTML-to-RDF implementation (since it would have to detect the format of the 
<time> element's value instead of just using the contents of the 
attribute), so I haven't done it for now.
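
As a rough idea of the extra detection step involved, a converter would need
something like the following Python sketch. The function name and the regular
expressions are assumptions for illustration: they cover only three simple
shapes (date, time with seconds, date plus time) and do not validate ranges
or rewrite the value into the XML Schema lexical form.

```python
import re
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"

_DATE = r"\d{4}-\d{2}-\d{2}"
_TIME = r"\d{2}:\d{2}:\d{2}(?:\.\d+)?"
_ZONE = r"(?:Z|[+-]\d{2}:\d{2})"

# Ordered so that the most specific shape is tried first.
_PATTERNS = [
    (re.compile(rf"{_DATE}T{_TIME}{_ZONE}?"), XSD + "dateTime"),
    (re.compile(_DATE), XSD + "date"),
    (re.compile(_TIME), XSD + "time"),
]

def guess_time_datatype(value: str) -> Optional[str]:
    """Return an XSD datatype URI for the value, or None if no shape matches."""
    for pattern, datatype in _PATTERNS:
        if pattern.fullmatch(value):
            return datatype
    return None  # fall back to a plain literal

kind = guess_time_datatype("2009-06-09T21:58:24Z")
```

Without this step, the converter emits the attribute's contents as a plain
literal and needs no format knowledge at all.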

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 9 June 2009 21:59:01 UTC