- From: Gregg Kellogg <gregg@kellogg-assoc.com>
- Date: Thu, 13 Oct 2011 02:43:34 -0400
- To: Martin Hepp <martin.hepp@ebusiness-unibw.org>
- CC: Bob Ferris <zazi@smiy.org>, "public-vocabs@w3.org" <public-vocabs@w3.org>
On Oct 12, 2011, at 11:24 PM, "Martin Hepp" <martin.hepp@ebusiness-unibw.org> wrote:

> Hi Gregg,
> Thanks! I think this is getting better and better! Are there already implementations for this?

My own Ruby-based parser [1] does most of this. I haven't yet deployed it to my online distiller.

> In particular, does anybody know whether there is a parser module for rdflib (Python) that supports this?
>
> Second question: I am not entirely sure what the RDF with the collections will look like. It should match the same queries as the RDFa markup.

The use of collections is required to match the order-requirements of Microdata, but I feel your pain. This could be addressed with some entailment rules, I suppose. Best follow up with your concerns on public-html-data-tf@w3.org.

> And a suggestion: It would be good to specify a heuristic for turning property values into proper typed RDF literals if
>
> - the vocabulary is known or retrievable from the Web and
> - has a single rdfs:range statement to a known xsd:datatype or
> - if it can be determined from looking at the data.
>
> The rough algorithm could be
>
> - for local properties, check whether the itemtype URI is dereferenceable
> - for properties with full URIs, check whether the itemtype URI is dereferenceable
> - search for rdfs:range statements for the property
> - if the object of these statements is not a URI, break
> - if it is a URI, one could do a regex test against the xsd namespace or check that it is not the subject of additional triples in the vocabulary representation (in order to catch complex cases with user-defined datatypes; I think there are some OWL 2 use cases that may use this). In principle, such user-defined datatypes would also be fine to work with, as long as they have a URI, but one may want to leave this out for the moment
> - attach the "^^<URI>" suffix to the literal.
>
> Another approach would be to try to determine the datatype from the data and attach the respective suffix, e.g. int, integer, decimal, float, double, or boolean. If you cover those with a simple heuristic, you would also support most usages. Some integers meant to be xsd:float or double would get the wrong datatype by this, but an RDF environment would still know how to handle the data for comparison etc.
>
> Both would make the data good to work with in an RDF/SPARQL environment.
>
> So basically the following two examples should result in the same triples:
>
> a) GoodRelations in Microdata
> <div itemscope itemtype="http://purl.org/goodrelations/v1#ProductOrServiceModel" itemid="#model">
> <span itemprop="name">ACME Electric Anvil</span>
> ...
> Weight: <div itemprop="http://purl.org/goodrelations/v1#weight" itemscope
> itemtype="http://purl.org/goodrelations/v1#QuantitativeValue">
> <span itemprop="hasValueFloat">50</span> kg
> <meta itemprop="hasUnitOfMeasurement" content="KGM" >
> </div>
> </div>
>
> b) GoodRelations in RDFa
> <div typeof="gr:ProductOrServiceModel" about="#model">
> <span property="gr:name">ACME Electric Anvil</span>
> ...
> Weight: <div rel="http://purl.org/goodrelations/v1#weight">
> <div typeof="gr:QuantitativeValue">
> <span property="gr:hasValueFloat" datatype="xsd:float">50</span> kg
> <div property="gr:hasUnitOfMeasurement" content="KGM"></div>
> </div>
> </div>
> </div>

We discussed doing datatype entailment in RDFa, but the problem is that we can't require that all parsers would do this, due to problems accessing external resources. Better to lobby for an @itemdatatype, which has been suggested. This would have to go through Hixie, but please send use cases to the HTML data TF.

> Martin

Gregg

[1] http://rubygems.org/gems/rdf-microdata

> On Oct 13, 2011, at 7:34 AM, Gregg Kellogg wrote:
>
>> (Apologies if this shows up twice, the first from a separate account seems to have gone to a filter).
>>
>> Note that the just-released Microdata to RDF draft defines property URI generation using the same domain as the @itemtype, not relative to the type itself. Read about it at [1]; comments welcome, feedback to public-html-data-tf@w3.org.
>>
>> Gregg
>>
>> [1] http://lists.w3.org/Archives/Public/public-html-data-tf/2011Oct/0066.html
>>
>> On Oct 12, 2011, at 3:57 PM, Martin Hepp wrote:
>>
>>> FYI: GoodRelations will clearly define in its next service update that the URIs of properties should be formed by attaching the local name of the property to the base URI of the vocabulary, not to the URI of the itemtype that gives the context.
>>>
>>> I also think that this is the most useful pattern for most cases, but if that cannot be written in the standard, Microdata parsers must simply offer this as a heuristic.
>>>
>>>
>>> On Oct 12, 2011, at 10:44 AM, Bob Ferris wrote:
>>>
>>>> Hi,
>>>>
>>>> On 10/12/2011 9:45 AM, Bernard Vatant wrote:
>>>>> Thanks for the pointer to any23.org <http://any23.org>
>>>>>
>>>>> An issue I clearly see with URIs such as http://schema.org/Person/name
>>>>> is that some properties are used by more than one class. So we'll have
>>>>> for example http://schema.org/Movie/duration and
>>>>> http://schema.org/Event/duration potentially misleading to the idea that
>>>>> they are different properties with specific domains, although the
>>>>> definition found for "duration" is exactly the same at both
>>>>> http://schema.org/Movie and http://schema.org/Event : "The duration of
>>>>> the item (movie, audio recording, event, etc.) in ISO 8601 date format
>>>>> <http://en.wikipedia.org/wiki/ISO_8601>." So it's another argument for
>>>>> having this definition clearly published at a single place, under
>>>>> http://schema.org/duration - with expected range
>>>>> http://www.schema.org/Duration. (which BTW would lead to the side issue
>>>>> of having a property and its range just differing by one character case,
>>>>> not a good practice in my opinion).
>>>>
>>>> +1 for excluding the class domains from the URIs of properties that span multiple classes, i.e., a name is a name is a name. A human user and also a machine will get the relation (specific meaning) of name via its context, i.e., the types of that resource, e.g., schemaorg:Person => a person's name etc.
>>>>
>>>> Cheers,
>>>>
>>>>
>>>> Bo
>>>>
>>>>
>>>> PS: otherwise we would probably end up with something like the Freebase vocabulary ;)
>>>>
>>>
>>
>
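
Editorial sketch: the second, data-driven approach Martin describes in the quoted message (guess the datatype from the literal's text and attach the "^^<URI>" suffix) might look roughly like the Python below. It is only an illustration under stated assumptions, not anything defined by the Microdata to RDF draft or implemented in the rdf-microdata gem. The names guess_datatype and typed_literal, the choice of patterns, and the decision to map exponent notation to xsd:double are all hypothetical. The rdfs:range lookup from the first approach is left out because it needs network access.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical pattern table, checked in order. Only lexical forms that are
# unambiguous from the text alone are covered.
_PATTERNS = [
    ("boolean", re.compile(r"^(true|false)$")),
    ("integer", re.compile(r"^[+-]?\d+$")),
    ("decimal", re.compile(r"^[+-]?(\d+\.\d*|\.\d+)$")),
    ("double", re.compile(r"^[+-]?(\d+(\.\d*)?|\.\d+)[eE][+-]?\d+$")),
]


def guess_datatype(value):
    """Return an XSD datatype URI guessed from the text, or None."""
    text = value.strip()
    for name, pattern in _PATTERNS:
        if pattern.match(text):
            return XSD + name
    return None


def typed_literal(value):
    """Serialize value as an N-Triples-style literal, typed when a guess exists."""
    datatype = guess_datatype(value)
    # Surrounding whitespace is dropped only when the value is being typed.
    text = value.strip() if datatype else value
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    if datatype:
        return f'"{escaped}"^^<{datatype}>'
    return f'"{escaped}"'


print(typed_literal("50"))   # the hasValueFloat value from example a)
print(typed_literal("KGM"))  # stays a plain literal
```

With these patterns the "50" from the Microdata example is typed as xsd:integer, not xsd:float as in the RDFa example, which is exactly the mismatch Martin anticipates for integers meant to be floats.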
Received on Thursday, 13 October 2011 06:44:25 UTC