- From: Ian Hickson <ian@hixie.ch>
- Date: Fri, 29 Jan 2010 09:29:54 +0000 (UTC)
- To: Philip Jägenstedt <philipj@opera.com>
- Cc: HTML WG <public-html@w3.org>
- Message-ID: <Pine.LNX.4.64.1001280113100.22027@ps20323.dreamhostps.com>
On Wed, 20 Jan 2010, Philip Jägenstedt wrote:
> On Tue, 19 Jan 2010 09:22:25 +0100, Ian Hickson <ian@hixie.ch> wrote:
> > On Sun, 17 Jan 2010, Philip Jägenstedt wrote:
> > > This algorithm uses the http://purl.org/dc/terms/ namespace, while
> > > the mapping at <http://dev.w3.org/html5/md/#conversion-to-rdf> uses
> > > the http://purl.org/dc/elements/1.1/ namespace.
> > > http://purl.org/dc/terms/ seems to be the canonical namespace at
> > > this time, so I suggest just using that.
> >
> > Wait, what? I'm confused. What exactly are you saying should change?
>
> The works vocabulary maps itemprop="title" to
> http://purl.org/dc/elements/1.1/title while the algorithm for converting
> a document to RDF maps <title>foo</title> to
> http://purl.org/dc/terms/title. Unless there's some specific reason for
> this, use http://purl.org/dc/terms/title in both cases, as
> /elements/1.1/ is apparently a legacy namespace (see
> http://dublincore.org/documents/dcmi-terms/#H3)

Fixed.

> > > There's an issue with how vocabularies that use subitems are
> > > currently handled. In short, triples are only generated if the item
> > > either has a type which is an absolute URL or if the item property
> > > is an absolute URL. This prevents site-private data from being
> > > exported as RDF, which is a good thing. However, for vocabularies
> > > which have an item type for the top-level item but not for subitems
> > > (which seems quite unnecessary) this means that no triples are
> > > generated for the subitems, even though the subitem could reasonably
> > > be considered to be using the same vocabulary as the typed top-level
> > > item. To illustrate the point, here's the output of the RDF
> > > extraction (as Turtle) from the Jack Bauer example if the current
> > > spec is honored: [...]
> > >
> > > As you see, the structured subitems org, adr, etc just point to
> > > blank nodes with no further triples for those nodes. My fix is to
> > > pass on the type of the parent item when generating triples for
> > > subitems as a default, which is overridden if the subitem defines
> > > its own type (as e.g. agent does in the above). I think this is
> > > sensible and it certainly produces a more complete RDF graph: [...]
> >
> > That works if you know the vocabulary and thus know that the nested
> > subitem is from that vocabulary, but it seems highly suspect in the
> > case where you don't know that. Also, consider:
> >
> >  <div itemscope itemtype="http://example.com/person">
> >   <p itemprop="school" itemscope>
> >    I go to school in the <span itemprop="class">middle</span> classroom.
> >   </p>
> >   <p itemprop="demographics" itemscope>
> >    I am <span itemprop="class">middle</span>-classed.
> >   </p>
> >  </div>
> >
> > (A bit contrived, but you get the idea.) It would be wrong to use the
> > same predicate for both itemprop="class" cases.
>
> Since no itemtype is used for itemprop="school",
> http://example.com/person must define this as part of its vocabulary,
> unless the above is an example of invalid markup. Since it's all one
> vocabulary, using the same prefix for the RDF predicates seems quite
> logical.

I really don't think they're the same predicate, but I agree that we
need to expose these triples somehow.

Consider:

   <div itemscope itemtype="http://example.com/a" itemref="x"></div>
   <div itemscope itemtype="http://example.com/b" itemref="x"></div>
   <div id="x">
    <p itemprop="q" itemscope>
     <span itemprop="r">s</span>
    </p>
   </div>

Right now this generates four blank nodes, which is a bug; it should
generate three. But if we generate three, then what predicate do we use
for the itemprop?

I ended up going with kind of a compromise solution. The above generates
_four_ triples, but _three_ nodes:

   @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
   @prefix eg: <http://www.w3.org/1999/xhtml/microdata#http%3A%2F%2Fexample.com%2F> .

   _:n0 rdf:type <http://example.com/a> ;
        <eg:a%23%3Aq> _:n2 .
   _:n1 rdf:type <http://example.com/b> ;
        <eg:b%23%3Aq> _:n2 .
   _:n2 <eg:a%23%3Aq%20r> "s" ;
        <eg:b%23%3Aq%20r> "s" .

Basically, instead of using "type:name", I used "type:parent-name name",
where the space character is another character that, like ":", cannot
appear in "name" and thus is usable here without making anything
ambiguous.

Hopefully this solves the problem relatively neatly, if not in the most
performance-optimal way.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
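
[Editorial illustration, not part of the original message.] The "type:parent-name name" scheme described in the message can be sketched roughly as follows. This is a minimal reconstruction inferred only from the Turtle output quoted above, not the specification's algorithm: the function name predicate_for, the constant MICRODATA_BASE, and the rule of appending "#" only when the type has none are assumptions made for illustration.

```python
from urllib.parse import quote

# Base URL seen in the "eg:" prefix of the Turtle output in the message.
MICRODATA_BASE = "http://www.w3.org/1999/xhtml/microdata#"


def predicate_for(item_type, names):
    """Build a predicate URL for a property path under a typed item.

    item_type: the itemtype of the nearest typed ancestor item.
    names: the chain of itemprop names from that item down to the
           property, e.g. ["q", "r"] for "r" nested inside "q".

    Hypothetical helper: the exact rules are assumed from the example
    output, not taken from the specification text.
    """
    s = item_type
    if "#" not in s:
        # Assumption: a "#" is added only when the type lacks one.
        s += "#"
    # ":" separates the type from the names; a space separates a parent
    # name from a child name. Neither can occur inside a single name.
    s += ":" + " ".join(names)
    # Percent-encode everything that is not an unreserved character.
    return MICRODATA_BASE + quote(s, safe="")


if __name__ == "__main__":
    for item_type in ("http://example.com/a", "http://example.com/b"):
        print(predicate_for(item_type, ["q"]))
        print(predicate_for(item_type, ["q", "r"]))
```

With the two typed items from the example, the shared subitem is reached through a distinct predicate per type, which is why the single node _:n2 carries two "r" triples.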
Received on Friday, 29 January 2010 09:30:40 UTC