- From: Benjamin Nowack <bnowack@semsol.com>
- Date: Fri, 22 Jan 2010 11:22:50 +0100
- To: Ian Hickson <ian@hixie.ch>
- Cc: "Tab Atkins Jr." <jackalmage@gmail.com>, public-html@w3.org, "Philip Jägenstedt" <philipj@opera.com>
On 22.01.2010 09:30:58, Ian Hickson wrote:
>On Fri, 22 Jan 2010, Benjamin Nowack wrote:
>>
>> P.S. as I just saw Ian's comment on IRC[1]:
>>
>> This algorithm ignores non-RDF structures such as
>>
>> <div itemscope itemtype="http://example.com/">
>>  <span itemprop="a/b"/>
>> </div>
>> or
>> <div itemscope itemtype="http://example.com/a/">
>>  <span itemprop="b"/>
>> </div>
>>
>> because common RDF vocabularies simply don't use URI patterns like
>> "http://example.com/" or "http://example.com/a/" to declare resource
>> types.
>
>Given this regexp (from your earlier e-mail):
>
>   /^(.*[\/\#])([^\/\#]+)$/
>
>...I understand that "a/b" wouldn't be usable as a keyword, because the
>regexp's second pattern doesn't match strings with / and # characters. But
>why would the other three types not work?

The regexp only applies to itemtype. itemprops wouldn't have any # or /
at all. "http://example.com/" and "http://example.com/a/" are not
accepted by the regex because there are no (RDF) vocabularies where
these URLs (trailing slash or hash) are used to specify a type.

>Into what RDF statements would your proposal turn the above two examples?

My algorithm doesn't fire on those. The itemtype is not an RDF class, so
no typical RDF is generated. It may be converted to the prefixed/escaped
triples, but I don't think RDFers would really define OWL axioms for
each and every type and prop to extract sane RDF from those triples. I'm
not aware of many OWL apps that apply ontological operations to RDF from
HTML in the wild.

>Another example would be:
>
>   <div itemscope itemtype="http://example.com/vocab#">
>    <span itemprop="x"/>
>    <span itemprop="http://example.com/vocab#x"/>
>   </div>
>
>For sanity, in the microdata model, this has to be two distinct
>properties. What RDF would your proposal convert the above into?

Same as above, the itemtype is not an RDF class.
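To make the itemtype check concrete, here is a rough Python sketch of how
the regexp quoted above could be applied. It is only an illustration, not
code from an actual parser; the name split_itemtype is made up for this
example.

```python
import re

# The regexp quoted above: the namespace part must end in "/" or "#",
# and the local name must contain neither character.
ITEMTYPE_RE = re.compile(r'^(.*[/#])([^/#]+)$')


def split_itemtype(itemtype):
    """Hypothetical helper: return (namespace, local_name) when the
    itemtype looks like an RDF class URI, otherwise None."""
    match = ITEMTYPE_RE.match(itemtype)
    if match is None:
        return None
    return match.group(1), match.group(2)


# The three itemtypes from the quoted examples end in "/" or "#",
# so there is no local name and the check does not fire.
print(split_itemtype("http://example.com/"))        # None
print(split_itemtype("http://example.com/a/"))      # None
print(split_itemtype("http://example.com/vocab#"))  # None

# An itemtype with a local name after the last "/" or "#" is accepted.
print(split_itemtype("http://example.com/vocab#Example"))
# ('http://example.com/vocab#', 'Example')
```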
Here is one that'd be RDF:

<div itemscope itemtype="http://example.com/vocab#Example">
 <span itemprop="x"/>
 <span itemprop="http://example.com/vocab#x"/>
</div>

Assuming that empty values make sense, the two properties would result
in the same predicate URI:

 http://example.com/vocab#x

because "x" (per spec wording) is from the same vocabulary as
http://example.com/vocab#Example, and "http://example.com/vocab#x" is a
full URI, which happens to be from the same vocab, too.

It's fine to have 2 distinct properties in the Microdata model including
the DOM API, but effectively just one in RDF. The RDF model differs in
other situations, too (graph vs. tree etc). If the 2 models were
identical, there wouldn't have been a need for Microdata in the first
place.

It would of course be possible to mandate that URI-based itemprops MUST
NOT be from the same vocabulary specified by the itemtype. This would be
intuitive as URI-based itemprops are meant to enable vocab mixing. It
doesn't make too much sense to specify a context vocabulary and still
use fully qualified itemprop URLs.

>> Requiring OWL magic to convert Microdata to its target RDF vocabulary
>> makes Microdata even more complex to understand than RDFa. OWL is well
>> beyond what a Microdata-to-RDF parser writer should need to know.
>
>The parser wouldn't need to know it at all, that's the point. The parser
>can just convert it all into RDF, and then a simple blob of OWL can be
>added to the triple store so that any RDF use of the data will treat the
>microdata-originating properties as equivalent to the more commonly used
>RDF vocabularies'. (After all, if the user didn't intend to use tools that
>leverage the power of RDF, there's really not much point going to the
>trouble to convert everything into RDF in the first place. The user could
>just as easily simply use a JSON-like data structure, which is easier to
>understand and query for most purposes.)

Well, ... ;) RDF is RDF, and OWL is OWL.
Even if certain OWL axioms can be written in simple RDF blobs, this
doesn't mean that evaluating these definitions is equally simple. You
need an inference engine or at least a SPARQL processor with UPDATE
functionality. The overlap between people who use RDF as a data
integration mechanism and those who run OWL engines is pretty small.
Have a look at [1]: you can find communities for each colour; some
overlap, some don't. I've created dozens of RDF apps, and I can't
remember when I last required OWL.

The problem with OWL-based Microdata to RDF mappings is that someone
would have to define a mapping for each term. Unfortunately, there is no
RDF mechanism where you could auto-convert all terms prefixed with
"http://www.w3.org/1999/xhtml/microdata#" to something else.

And even if you had the OWL axioms and an OWL processor, you'd end up
with twice as many triples as the parser generated. And inferred triples
are not necessarily associated with the same originating graph, i.e.
you'd lose provenance information, unless you build the OWL processing
into the parser.

Having said that, there is still an easy way to end up with proper RDF
triples even if the conversion algorithm is kept as is: the parser just
reverts the prefixing/escaping in case of RDF itemtypes. But it would be
nice if RDF converters didn't need that extra step.

Cheers,
Benji

[1] http://bnode.org/blog/2009/07/08/the-semantic-web-not-a-piece-of-cake

>
>--
>Ian Hickson               U+1047E                )\._.,--....,'``.    fL
>http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
>Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
>
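For illustration, the predicate resolution described in this message
could be sketched in Python roughly as follows. This is a sketch under
assumptions, not the actual parser code: the name predicate_uri is made
up, and treating any itemprop that starts with a URI scheme followed by
":" as a full URI is an assumption of this example.

```python
import re

# Same itemtype regexp as quoted earlier in this message.
ITEMTYPE_RE = re.compile(r'^(.*[/#])([^/#]+)$')

# Assumption for this sketch: a value starting with "scheme:" is a full URI.
ABSOLUTE_URI_RE = re.compile(r'^[A-Za-z][A-Za-z0-9+.-]*:')


def predicate_uri(itemtype, itemprop):
    """Hypothetical helper: return the RDF predicate URI for an itemprop,
    or None when the itemtype is not an RDF class (no RDF is generated)."""
    match = ITEMTYPE_RE.match(itemtype)
    if match is None:
        return None
    if ABSOLUTE_URI_RE.match(itemprop):
        # Full URIs are used as-is (this is what enables vocab mixing).
        return itemprop
    # Bare names are taken from the vocabulary of the itemtype.
    return match.group(1) + itemprop


rdf_type = "http://example.com/vocab#Example"
print(predicate_uri(rdf_type, "x"))
# http://example.com/vocab#x
print(predicate_uri(rdf_type, "http://example.com/vocab#x"))
# http://example.com/vocab#x
print(predicate_uri("http://example.com/vocab#", "x"))
# None
```

Both itemprops of the RDF example end up with the same predicate URI,
while the item typed "http://example.com/vocab#" yields nothing.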
Received on Friday, 22 January 2010 10:23:19 UTC