Re: Mapping Microdata to RDF (Action-6) from Gregg Kellogg on 2011-10-10 (public-html-data-tf@w3.org from October 2011)

From: Gregg Kellogg <gregg@kellogg-assoc.com>
Date: Mon, 10 Oct 2011 15:05:59 -0400
To: Jeni Tennison <jeni@jenitennison.com>
CC: Gregg Kellogg <gregg@kellogg-assoc.com>, "public-html-data-tf@w3.org" <public-html-data-tf@w3.org>
Message-ID: <8ACC284F-A7B1-4F1B-8070-EEDD76A54C2A@greggkellogg.net>
On Oct 10, 2011, at 2:23 AM, Jeni Tennison wrote:

> Hi Gregg,
> 
> On 10 Oct 2011, at 06:42, Gregg Kellogg wrote:
>> On Oct 8, 2011, at 2:19 PM, Jeni Tennison wrote:
>>> ...
> 
>>> * I'm not sure we should be ignoring properties that are neither absolute URIs nor on a typed item; perhaps we should be constructing URIs for them that look like {document base URI}#{property}?
>> 
>> If there is an itemtype, a property value should either be an absolute URI, or something that is appended to the type namespace. The issues about lexical form of that are left to the base Microdata spec.
> 
> I suspect that we may be talking at cross purposes. What I'm saying is that if someone has in their page something like:
> 
>  <span itemscope><span itemprop="name">Gregg Kellogg</span></span>
> 
> (not nested within some other element with an itemtype on it) then it will produce the microdata:
> 
>    {
>      "properties": {
>        "name": [
>          "Gregg Kellogg"
>        ]
>      }
>    }
> 
> So rather than not producing any RDF at all (which is what I think step 6.2.1 is saying), I think it should produce:
> 
>  [] <#name> "Gregg Kellogg" .
> 
> where the base URI (which the #name will be concatenated to) is based on the document base URI.

I can see the value in that; when in doubt, generate a triple seems like a good strategy. I'll make it so.
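
As a rough sketch of that fallback in Python (the function name is hypothetical, and the scheme check is only a crude stand-in for Microdata's absolute URL test), covering only the case of a property on an item with no itemtype:

```python
from urllib.parse import urldefrag, urlparse


def predicate_for_untyped_property(name, document_base):
    """Return a predicate URI for a property on an item with no itemtype.

    Hypothetical helper: properties on typed items are out of scope here.
    """
    # Assumption: anything that parses with a scheme counts as absolute.
    if urlparse(name).scheme:
        return name
    # Drop any fragment already on the base, then append #{property}.
    base, _fragment = urldefrag(document_base)
    return base + "#" + name


print(predicate_for_untyped_property("name", "http://example.org/doc.html"))
```

With a document base of http://example.org/doc.html (a made-up URL), the "name" property above maps to http://example.org/doc.html#name.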

>>> ...
> 
>>> * in step 3 of generating an RDF Collection, I think the object should be the blank node associated with the next element in the array rather than the next element in the array itself
>> 
>> ...
> 
> I think I'm confused over the wording in step 3. We start with an array containing "foo" and "bar". What the text says is:
> 
>  1. Create a new array containing a blank node for every value in list
> 
> assume this is an array containing _:bn1 and _:bn2.
> 
>  2. For each pair of blank node and value from list the following triple is generated:
> 
>      subject: blank node
>      predicate: http://www.w3.org/1999/02/22-rdf-syntax-ns#first
>      object: value
> 
> so I now have:
> 
>  _:bn1 http://www.w3.org/1999/02/22-rdf-syntax-ns#first "foo" .
>  _:bn2 http://www.w3.org/1999/02/22-rdf-syntax-ns#first "bar" .
> 
>  3. For each blank node in the array the following triple is generated:
> 
>      subject: blank node
>      predicate: http://www.w3.org/1999/02/22-rdf-syntax-ns#rest
>      object: next element in the array or, if that does not exist,
>              http://www.w3.org/1999/02/22-rdf-syntax-ns#nil
> 
> The meaning of "the array" in this step is ambiguous. When I read it, I assumed it was the initial property value array which contains "foo" and "bar". That would mean generating:
> 
>  _:bn1 http://www.w3.org/1999/02/22-rdf-syntax-ns#rest "bar" .
>  _:bn2 http://www.w3.org/1999/02/22-rdf-syntax-ns#rest 
>        http://www.w3.org/1999/02/22-rdf-syntax-ns#nil .
> 
> which is wrong (and what I was objecting to). What I think you mean is the array of blank nodes, which would mean generating:
> 
>  _:bn1 http://www.w3.org/1999/02/22-rdf-syntax-ns#rest _:bn2 .
>  _:bn2 http://www.w3.org/1999/02/22-rdf-syntax-ns#rest 
>        http://www.w3.org/1999/02/22-rdf-syntax-ns#nil .
> 
> Perhaps if you used the terms 'value array' and 'blank node array' consistently through the steps then it would avoid this confusion?

Frankly, I just lifted the text from the RDFa Core 1.1 draft [9], so if this is confusing, RDFa is probably confusing as well and should be updated. However, we do use two distinct terms, _array_ and _list_: _list_ is used for the sequential values that are to be encoded into an rdf:List, and _array_ is used for the set of blank nodes allocated to each member, so saying "next element in the array" isn't really ambiguous. To attempt to clarify, how about the following:

1) Create a new array _array_ containing a blank node for every value in _list_.
2) For each pair of _blank node_ and _value_ from _list_ the following triple is generated:

  subject: _blank node_
  predicate: http://www.w3.org/1999/02/22-rdf-syntax-ns#first
  object: _value_

3) For each _blank node_ in _array_ the following triple is generated:

  subject: _blank node_
  predicate: http://www.w3.org/1999/02/22-rdf-syntax-ns#rest
  object: next element in _array_ or, if that does not exist, http://www.w3.org/1999/02/22-rdf-syntax-ns#nil 

4) Return the first blank node from _array_.
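
The four steps can be sketched in Python as follows. This is illustrative only: the function name, the tuple representation of triples, and the _:bnN labels are assumptions, and returning rdf:nil for an empty _list_ is a guess, since the steps do not cover that case.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST = RDF + "first"
RDF_REST = RDF + "rest"
RDF_NIL = RDF + "nil"


def generate_collection(values):
    """Return (head, triples) for the sequential values in `values`."""
    # Step 1: _array_ holds one blank node per value in _list_.
    array = [f"_:bn{i}" for i in range(1, len(values) + 1)]
    triples = []

    # Step 2: pair each blank node with its value via rdf:first.
    for bnode, value in zip(array, values):
        triples.append((bnode, RDF_FIRST, value))

    # Step 3: link each blank node to the next element in _array_
    # (the blank node array, not the value list), or to rdf:nil at the end.
    for i, bnode in enumerate(array):
        next_node = array[i + 1] if i + 1 < len(array) else RDF_NIL
        triples.append((bnode, RDF_REST, next_node))

    # Step 4: the first blank node is the head of the collection.
    # Assumption: an empty list yields rdf:nil.
    head = array[0] if array else RDF_NIL
    return head, triples


head, triples = generate_collection(["foo", "bar"])
for triple in triples:
    print(*triple, ".")
```

For "foo" and "bar" this gives _:bn1 rdf:rest _:bn2 and _:bn2 rdf:rest rdf:nil, which is the reading you expected.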

>>> Having some examples would be really useful. Perhaps you can add links to them from the wiki page?
>> 
>> I'll add some examples to the ReSpec document, but we should also have a space for them on the wiki. We should probably turn [1] into a reference to the ReSpec document, discussion and examples.
> 
> Yes, sounds great :)
> 
> Thanks,
> 
> Jeni
> 
>>>> [1] http://www.w3.org/wiki/Mapping_Microdata_to_RDF
>>>> [2] http://www.w3.org/TR/2011/WD-microdata-20110525
>>>> [3] https://github.com/gkellogg/rdf-microdata
>>> [4] http://dev.w3.org/html5/spec/Overview.html#document-base-url
>>> [5] http://www.w3.org/Bugs/Public/show_bug.cgi?id=14233
>>> [6] http://dev.w3.org/html5/spec/text-level-semantics.html#the-time-element
>> [7] http://wiki.whatwg.org/wiki/Time_element#duration
> [8] http://www.w3.org/Bugs/Public/show_bug.cgi?id=13240
[9] http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview.html#PS-Lists
> 
> -- 
> Jeni Tennison
> http://www.jenitennison.com
>
Received on Monday, 10 October 2011 19:06:51 UTC