Re: Microdata to RDF: First Editor's Draft (ACTION-6) from KANZAKI Masahide on 2011-10-16 (public-html-data-tf@w3.org from October 2011)

From: KANZAKI Masahide <mkanzaki@gmail.com>
Date: Sun, 16 Oct 2011 16:37:24 +0900
To: Jeni Tennison <jeni@jenitennison.com>
Cc: Gregg Kellogg <gregg@kellogg-assoc.com>, Martin Hepp <martin.hepp@ebusiness-unibw.org>, public-html-data-tf@w3.org
Message-ID: <CAHQ1n3C2x+nafP9bL_Pi7p7Q6Opne4kk8bhG5S6T=mGwfv4igw@mail.gmail.com>

Hello Jeni, thanks for the reply.

2011/10/16 Jeni Tennison <jeni@jenitennison.com>:
> The rationale for always mapping to a collection when there are multiple values is that it preserves the order in case it *is* needed. It's easy(ish) to remove ordering if you don't want it; impossible to reinstate order once it's been lost. See [1] for more detailed reasoning.

Well, I think the rationale for keeping the order is a reasonable one.
What I'm anxious about is that the current RDF model and practice do
not necessarily take collections (lists) into account, e.g. in property
ranges, SPARQL queries, etc. Collections might become more common and
desirable in the future, but it seems too big a jump at this moment,
I'm afraid.
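
To illustrate the concern, here is a minimal sketch in plain Python
(no RDF library) of the two mappings for one property with two values.
The subject, the property URI and the blank node labels are made up for
the example, and triples are plain tuples that ignore the literal/IRI
distinction. The objects() helper stands in for a simple triple pattern
such as { ?s ?p ?o }.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"


def to_multiple_statements(subject, prop, values):
    """One triple per value; the document order is not recorded."""
    return [(subject, prop, v) for v in values]


def to_collection(subject, prop, values):
    """One triple pointing at an rdf:List that preserves the order."""
    if not values:
        return [(subject, prop, RDF_NIL)]
    # Hypothetical blank node labels, one per list cell.
    nodes = [f"_:b{i}" for i in range(len(values))]
    triples = [(subject, prop, nodes[0])]
    for i, v in enumerate(values):
        rest = nodes[i + 1] if i + 1 < len(values) else RDF_NIL
        triples.append((nodes[i], RDF_FIRST, v))
        triples.append((nodes[i], RDF_REST, rest))
    return triples


def objects(triples, subject, prop):
    """What a plain pattern { subject prop ?o } would bind ?o to."""
    return [o for s, p, o in triples if s == subject and p == prop]


def list_items(triples, head):
    """Walk rdf:first/rdf:rest from head to recover the ordered values."""
    items = []
    while head != RDF_NIL:
        items.append(objects(triples, head, RDF_FIRST)[0])
        head = objects(triples, head, RDF_REST)[0]
    return items


# Made-up subject and property, for illustration only.
ITEM, NAME = "_:item", "http://example.org/name"
flat = to_multiple_statements(ITEM, NAME, ["A", "B"])
coll = to_collection(ITEM, NAME, ["A", "B"])
```

With the flat mapping, objects(flat, ITEM, NAME) gives the values
directly. With the collection mapping the same lookup gives only the
head blank node, and a consumer has to walk the list (list_items) to
reach the values, which is the extra step that existing queries and
property ranges usually do not expect.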


> But what about rather than assuming a generic parse followed by some post-processing, if we explicitly left it up to implementations of the algorithm? We could say that each of the various things where knowledge of the vocabulary would make you do things differently is implementation-defined within particular constraints. So we would have something like:
>
>  * the _property_URI_creation_method_ is one of X, Y or Z (TBD) and is implementation defined
>  * the _datatype_ for a literal value is implementation defined
>  * the _multi-value_mapping_ is either _to_a_collection_ or _to_multiple_statements_ and is implementation defined
>
> Implementations themselves would then be free to use whatever method was suitable for them to determine how to set each of these, which might include some combination of:
>
>  * having hard-coded knowledge of particular vocabularies
>  * looking up what to do from a registry
>  * working out what to do based on a schema or ontology
>  * having some fixed defaults that will work in 99% of cases
>
> This would provide enough framework that individual implementations didn't each have to reinvent how to do everything, but the ability to insert vocabulary knowledge early in the process and a guarantee (by making it implementation defined rather than implementation determined) that the users of a tool will be informed about the tool's behaviour.
>
> What do you think? Would this work as an approach?

So, implementation choice (or a compatibility parsing method, as Martin
suggested) sounds like a good starting point. I worry, however, that
users would not be very happy if different tools generated different
RDF from the same microdata. Maybe some sort of defaults or recommended
methods would be useful.
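
As a rough sketch of what I mean by defaults (Python, all names
hypothetical): a tool could start from one shared set of recommended
settings and deviate only where it has vocabulary knowledge. The
default values, the Options class and the example.org override are
assumptions made for illustration, not anything taken from the draft.
The property URI creation method is left out because its choices are
still TBD.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Options:
    # Assumed default; the draft does not recommend either value.
    multi_value_mapping: str = "to_multiple_statements"
    # None stands for a plain literal (also an assumed default).
    datatype: Optional[str] = None


# Shared defaults that every tool would fall back to.
RECOMMENDED = Options()

# Hypothetical per-vocabulary knowledge, keyed by item type prefix.
OVERRIDES = {
    "http://example.org/vocab/": {"multi_value_mapping": "to_a_collection"},
}


def options_for(item_type):
    """Return the settings to use for an item of the given type."""
    for prefix, changes in OVERRIDES.items():
        if item_type and item_type.startswith(prefix):
            return replace(RECOMMENDED, **changes)
    return RECOMMENDED
```

Two tools that share RECOMMENDED would then produce the same RDF for
any vocabulary neither of them knows about, and differences would be
limited to the documented overrides.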

cheers,

-- 
@prefix : <http://www.kanzaki.com/ns/sig#> . <> :from [:name
"KANZAKI Masahide"; :nick "masaka"; :email "mkanzaki@gmail.com"].

Received on Sunday, 16 October 2011 07:37:52 UTC