
Re: Reproducing Gregg/Niklas' thoughts (@itemref issue) (ISSUE-144)

From: Niklas Lindström <lindstream@gmail.com>
Date: Sat, 8 Dec 2012 02:29:55 +0100
Message-ID: <CADjV5jcCKCD4BFod4Z3jhQaswM5CwXCbhntH_T-8nED_LcjWAA@mail.gmail.com>
To: Gregg Kellogg <gregg@greggkellogg.net>
Cc: Ivan Herman <ivan@w3.org>, Dan Brickley <danbri@danbri.org>, W3C RDFa WG <public-rdfa-wg@w3.org>
Ivan, Gregg,

Thanks for examining this! I don't need to write a proposal then; you
readily got the gist of it.

Here's a working implementation in Python:

-- 8< --
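from rdflib import Namespace, RDF

RDFA = Namespace("http://www.w3.org/ns/rdfa#")

def expand_rdfa_prototype(graph, keep_prototype=False):
    # Pass 1: copy each prototype's properties (minus the marker type)
    # onto every subject that references it. The list() snapshots keep
    # us from modifying the graph while iterating over it.
    for s, proto in list(graph.subject_objects(RDFA.ref)):
        if (proto, RDF.type, RDFA.Prototype) in graph:
            for p, o in list(graph.predicate_objects(proto)):
                if (p, o) != (RDF.type, RDFA.Prototype):
                    graph.add((s, p, o))
    if keep_prototype:
        return
    # Pass 2: drop the rdfa:ref links and the prototype descriptions.
    for s, proto in list(graph.subject_objects(RDFA.ref)):
        graph.remove((s, RDFA.ref, proto))
        if (proto, RDF.type, RDFA.Prototype) in graph:
            graph.remove((proto, None, None))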
-- >8 --

Ivan, feel free to add that to pyRdfa for experimentation. Although I
imagine you're already writing code for this. ;)
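
For anyone who wants to poke at the rules without installing rdflib,
here is a rough, dependency-free sketch of the same two passes over a
plain set of tuples. The prefixed strings and the expand_prototypes
name are placeholders for illustration, not real IRIs or an existing
API, and unlike the function above it returns a new set instead of
mutating its input. The sample data is Gregg's "adds additional
property" test from the quoted message below.

```python
# Dependency-free sketch: triples are plain (subject, predicate, object) tuples.
# The prefixed strings are illustrative placeholders, not expanded IRIs.
REF = "rdfa:ref"
TYPE = "rdf:type"
PROTOTYPE = "rdfa:Prototype"


def expand_prototypes(triples, keep_prototype=False):
    """Return a new set with prototype properties copied to referrers."""
    result = set(triples)
    prototypes = {s for s, p, o in triples if (p, o) == (TYPE, PROTOTYPE)}

    # Pass 1: copy each referenced prototype's properties, minus the marker.
    for subject, pred, proto in triples:
        if pred == REF and proto in prototypes:
            for s, p, o in triples:
                if s == proto and (p, o) != (TYPE, PROTOTYPE):
                    result.add((subject, p, o))
    if keep_prototype:
        return result

    # Pass 2: drop every rdfa:ref link and each referenced prototype.
    referenced = {o for s, p, o in result if p == REF and o in prototypes}
    return {
        (s, p, o)
        for s, p, o in result
        if p != REF and s not in referenced
    }


# Gregg's "adds additional property" test, written out as tuples.
example = {
    ("_:person", TYPE, "schema:Person"),
    ("_:person", "schema:name", "Gregg"),
    ("_:person", REF, "_:surname"),
    ("_:surname", TYPE, PROTOTYPE),
    ("_:surname", "schema:name", "Kellogg"),
}
for triple in sorted(expand_prototypes(example)):
    print(triple)
```

With keep_prototype=True the prototype and the rdfa:ref links stay in
the result, which may be handy when debugging what got copied where.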

I'm somewhat optimistic at the moment. This does seem to provide the
necessary mechanics.

I need to stress a couple of things though:

1) IMO, this *must not* be promoted as an alternate way of describing
*one* resource in different parts of the page. We have always had
@resource (and in full RDFa, @about) for this very reason. I fear that
that may have been overlooked recently. For instance, four of Gregg's
examples would be much clearer using that instead of any
itemref/prototype feature. A prototype feature should only be about
reproducing descriptions for multiple *different* resources.

2) @itemref is sometimes used for copying statements already in scope
for describing an item, e.g. the title and author of a page, reused
for an event. In the cases we've seen, such as NCSU, that *may* lead
to odd data though (e.g. domain violations). And I really think it is
much better (as in easier to see what's going on) to just reproduce
any smaller parts that are shared, using <link> and <meta> (if they
are to be hidden). That said, this Prototype feature does work here as
well, by using a nested rdfa:ref to capture the piece which both
describes the current subject and is reused elsewhere.
Example:

    <div property="about" typeof="CreativeWork">
      ....
      <div property="rdfa:ref" typeof="rdfa:Prototype" resource="_:main_image">
        <img alt="Sketch" property="image" src="/building_sketch.jpg" />
      </div>
      ....
    <div property="about" typeof="LandmarksOrHistoricalBuildings">
      <link property="rdfa:ref" resource="_:main_image" />
      ....

Here, the CreativeWork and LandmarksOrHistoricalBuildings share the
same image relation, via a prototype which is "folded in" by the
prototype post-processing. Of course, as just stated, I certainly
think it's better to just link to it twice (and in this case it would
save bytes, although in the NCSU original it might be a tie, since the
real link is 225 chars long). In any case, the prototype feature is
much more verbose than @itemref (where the wrapping ref div can be
replaced by just an @id on the image, to be referenced in just an
@itemref on the landmarks div). I still believe that this is fine
though, since it ought to prevent prototypes from being overused
where simple repetition is just plain.. simpler.
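
To make the "folded in" step concrete, here is a small sketch of what
the post-processing would do to the triples from the snippet above,
over plain tuples rather than a real RDF library. The node labels
_:work and _:landmark, the unprefixed strings, and the fold_refs
helper are made up for illustration; it only assumes the markup
yields the triples listed in with_prototype.

```python
REF, TYPE, PROTOTYPE = "rdfa:ref", "rdf:type", "rdfa:Prototype"


def fold_refs(triples):
    """Fold prototype properties into referrers, then drop the scaffolding."""
    protos = {s for s, p, o in triples if (p, o) == (TYPE, PROTOTYPE)}
    referenced = {o for s, p, o in triples if p == REF and o in protos}
    # Properties of each referenced prototype, re-attached to the referrer.
    copied = {
        (s, p2, o2)
        for s, p, o in triples if p == REF and o in referenced
        for s2, p2, o2 in triples
        if s2 == o and p2 != REF and (p2, o2) != (TYPE, PROTOTYPE)
    }
    # Everything else, minus rdfa:ref links and the prototypes themselves.
    kept = {(s, p, o) for s, p, o in triples
            if p != REF and s not in referenced}
    return kept | copied


# Triples assumed to come out of the prototype-based markup.
with_prototype = {
    ("_:work", TYPE, "CreativeWork"),
    ("_:work", REF, "_:main_image"),
    ("_:main_image", TYPE, PROTOTYPE),
    ("_:main_image", "image", "/building_sketch.jpg"),
    ("_:landmark", TYPE, "LandmarksOrHistoricalBuildings"),
    ("_:landmark", REF, "_:main_image"),
}

# Triples from simply stating the image relation on both resources.
linked_twice = {
    ("_:work", TYPE, "CreativeWork"),
    ("_:work", "image", "/building_sketch.jpg"),
    ("_:landmark", TYPE, "LandmarksOrHistoricalBuildings"),
    ("_:landmark", "image", "/building_sketch.jpg"),
}

# Both ways of writing the page end up describing the same graph.
assert fold_refs(with_prototype) == linked_twice
```

So the two forms are equivalent as data; the difference is only in how
much markup it takes to say it.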

3) The Prototype feature may come off as using semantics for what
seems like a syntax issue. Granted, "rdfa:ref" can be interpreted like
"reference to a prototypical resource whose (non-meta) characteristics
also apply to this resource" (and by "meta" I mean its rdfa:Prototype
class). Somewhat like a *very* distilled form of a union of
onProperty+hasValue OWL restrictions (see my example at [1]). So it
might not be too artificial (even for RDF people).

4) To me, the most important question continues to be: how is this
data supposed to be consumed? Does a ProductModel and variants thereof
actually suffice in reality [2]? Should a name, image and keywords be
copied verbatim for a page, the work it describes, and the
building that the work depicts? What is necessary, and what is SEO
guesswork? If it's the latter, can it be acceptable to simply use
<link> and <meta>? Or does the interlinking between resources provide
a better context anyway, with clean, descriptive data enabling
services to make rich snippets usable? Or does embedded metadata have
to be denormalized? If so, will RDFa Prototypes be a viable option?

Let's continue to debate this, and gather more feedback.

Best regards,
Niklas

[1]: https://gist.github.com/4039715
[2]: https://github.com/niklasl/rdf-sparql-lab/blob/master/schema.org/tests/expand-model/001-in.html

(PS. Just for the record: we did in the past (on the subject of
@itemref) discuss supporting multiple resources in @about/@resource. I
am not suggesting we debate it again, but I don't want us to completely
forget about it. Though it would only solve some of the cases.)

On Fri, Dec 7, 2012 at 11:30 PM, Gregg Kellogg <gregg@greggkellogg.net> wrote:
> I've updated my distiller at http://rdf.greggkellogg.net/distiller with support for rdfa:ref. To make this work, be sure to check the "Expand graph" checkbox.
>
> All in all, implementing it took about an hour, most of which was for creating tests. It provides essentially equivalent functionality to @itemref, but in a more RDF-friendly way. I recommend adding support for the feature to HTML5+RDFa.
>
> Gregg Kellogg
> gregg@greggkellogg.net
>
> On Dec 7, 2012, at 2:12 PM, Gregg Kellogg <gregg@greggkellogg.net> wrote:
>
>> I added experimental support to my parser (will deploy to distiller later) as part of vocabulary expansion. I pretty much implement Ivan's algorithm as part of RDFa vocabulary expansion, with the following difference:
>>
>> I modified the DELETE clause to remove the rdfa:Prototype on the subject resource as well:
>>
>> DELETE DATA {
>>  ?x rdfa:ref ?PR .
>>  ?x rdf:type rdfa:Prototype .
>>  ?PR ?p ?y .
>> }
>>
>> Here are some example tests, based on those used by the Microdata RDF note:
>>
>> To a single ID:
>>
>>          <div>
>>            <div typeof="schema:Person">
>>              <link property="rdfa:ref" resource="_:a"/>
>>            </div>
>>            <p resource="_:a" typeof="rdfa:Prototype">Name: <span property="schema:name">Amanda</span></p>
>>          </div>
>>
>> should produce
>>
>>          @prefix schema: <http://schema.org/> .
>>          [a schema:Person; schema:name "Amanda"] .
>>
>> Adds additional property:
>>
>>        <div>
>>          <div typeof="schema:Person">
>>            <p>My name is <span property="schema:name">Gregg</span></p>
>>            <link property="rdfa:ref" resource="_:surname"/>
>>          </div>
>>          <p resource="_:surname" typeof="rdfa:Prototype">My name is <span property="schema:name">Kellogg</span></p>
>>        </div>
>>
>> should produce
>>
>>          @prefix schema: <http://schema.org/> .
>>          [ a schema:Person; schema:name "Gregg", "Kellogg"] .
>>
>> Multiple subjects with different types:
>>
>>          <div>
>>            <div typeof="schema:Person">
>>              <link property="rdfa:ref" resource="_:a"/>
>>            </div>
>>            <div typeof="foaf:Person">
>>              <link property="rdfa:ref" resource="_:a"/>
>>            </div>
>>            <p resource="_:a" typeof="rdfa:Prototype">Name: <span property="schema:name foaf:name">Amanda</span></p>
>>          </div>
>>
>> should produce
>>
>>          @prefix foaf: <http://xmlns.com/foaf/0.1/> .
>>          @prefix schema: <http://schema.org/> .
>>          [ a schema:Person; schema:name "Amanda"; foaf:name "Amanda"] .
>>          [ a foaf:Person; schema:name "Amanda"; foaf:name "Amanda"] .
>>
>> Multiple references:
>>
>>          <div>
>>            <div typeof="schema:Person">
>>              <link property="rdfa:ref" resource="_:a"/>
>>              <link property="rdfa:ref" resource="_:b"/>
>>            </div>
>>            <p resource="_:a" typeof="rdfa:Prototype">Name: <span property="schema:name">Amanda</span></p>
>>            <p resource="_:b" typeof="rdfa:Prototype"><span property="schema:band">Jazz Band</span></p>
>>          </div>
>>
>> should produce
>>
>>          @prefix schema: <http://schema.org/> .
>>          [ a schema:Person;
>>            schema:name "Amanda";
>>            schema:band "Jazz Band";
>>          ] .
>>
>>
>> With chaining:
>>
>>          <div>
>>            <div typeof="schema:Person">
>>              <link property="rdfa:ref" resource="_:a"/>
>>              <link property="rdfa:ref" resource="_:b"/>
>>            </div>
>>            <p resource="_:a" typeof="rdfa:Prototype">Name: <span property="schema:name">Amanda</span></p>
>>            <div resource="_:b" typeof="rdfa:Prototype">
>>              <div property="schema:band" typeof=" schema:MusicGroup">
>>                <link property="rdfa:ref" resource="_:c"/>
>>              </div>
>>            </div>
>>            <div resource="_:c" typeof="rdfa:Prototype">
>>             <p>Band: <span property="schema:name">Jazz Band</span></p>
>>             <p>Size: <span property="schema:size">12</span> players</p>
>>            </div>
>>          </div>
>>
>> should produce
>>
>>          @prefix schema: <http://schema.org/> .
>>          [ a schema:Person;
>>            schema:name "Amanda" ;
>>            schema:band [
>>              a schema:MusicGroup;
>>              schema:name "Jazz Band";
>>              schema:size "12"
>>            ]
>>          ] .
>>
>> Shared resource:
>>
>>          <div>
>>            <div typeof=""><link property="rdfa:ref" resource="_:a"/></div>
>>            <div typeof=""><link property="rdfa:ref" resource="_:a"/></div>
>>            <div resource="_:a" typeof="rdfa:Prototype">
>>              <div property="schema:refers-to" typeof="">
>>                <span property="schema:name">Amanda</span>
>>              </div>
>>            </div>
>>          </div>
>>
>> should produce:
>>
>>          @prefix schema: <http://schema.org/> .
>>          [ schema:refers-to _:a ] .
>>          [ schema:refers-to _:a ] .
>>          _:a schema:name "Amanda" .
>>
>>
>> I'll release my updated distiller with support for this later today.
>>
>> Gregg
>>
>> On Dec 7, 2012, at 11:22 AM, Ivan Herman <ivan@w3.org> wrote:
>>
>>>
>>> On Dec 7, 2012, at 14:00 , Dan Brickley wrote:
>>>
>>>>
>>>> On 7 Dec 2012 15:21, "Ivan Herman" <ivan@w3.org> wrote:
>>>>>
>>>>> Hi guys,
>>>>>
>>>>> I tried to reproduce what Gregg/Niklas were considering yesterday and, I believe, here are the rules that we may define; a post-processing step on the resulting graph would then execute them:
>>>>>
>>>>> INSERT DATA {
>>>>> ?x ?p ?y .
>>>>> }
>>>>> DELETE DATA {
>>>>> ?x rdfa:ref ?PR .
>>>>> ?PR ?p ?y .
>>>>> }
>>>>> WHERE {
>>>>> ?x rdfa:ref ?PR .
>>>>> ?PR ?p ?y.
>>>>> ?PR a rdfa:Prototype .
>>>>> }
>>>>>
>>>>> Ie, if I have somewhere:
>>>>>
>>>>> <div resource="#p" typeof="rdfa:Prototype">
>>>>>   <span property="foo">bar</span>
>>>>> </div>
>>>>
>>>> ....ah, so you're using special terms in an rdf vocab, to avoid making extra syntax?
>>>>
>>>> If this <div> had nested subelements, which part would be in the Prototype?
>>>
>>> Everything. The whole lot:-)
>>
>>>>
>>>>> ...
>>>>> ...
>>>>>
>>>>> <div resource="#A">
>>>>>   <span property="yep">Yep Yep</span>
>>>>>   <span property="rdfa:ref" resource="#p"/>
>>>>> </div>
>>>>>
>>>>> then what I would expect is the following graph:
>>>>>
>>>>> <#A>
>>>>> <yep> "Yep Yep" ;
>>>>> <foo> "bar" .
>>>>>
>>>>> <#p> a rdfa:Prototype ;
>>>>> <foo> "bar" .
>>>>>
>>>>>
>>>>> Which is roughly an @itemref as we know it. I think it works and can be implemented without too many problems.
>>>>
>>>>
>>>> Thanks for investigating this issue!
>>>>
>>>>> Here, though, are the problems I see with this. I do not consider these show stoppers, but we have to be aware of them:
>>>>>
>>>>> - As you see, the triples on the prototype itself also make it in the final graph. I am not sure it is o.k., but I also do not know how to remove them. We could define, in the SPARQL 1.1 terms, some sort of a property path based DELETE DATA clause, but implementation of that might be a bit difficult. I am not sure it is worth it.
>>>>>
>>>>
>>>> I assume SPARQL is purely for documentational convenience / spec here, and not a real dependency?
>>>
>>> Yes. At the moment, that is the only syntax that can express all these rules (cannot express removal in N3:-(
>>
>> Pretty easy to do; I just create an additional rule to match the statements to be removed, and remove them from my output graph.
>>
>>>>
>>>>> - The pattern I used above is of course fine. But what happens if the user does the following:
>>>>>
>>>>> <div property="rdf:type" resource="rdfa:Prototype">
>>>>> <span property="foo">bar</span>
>>>>> </div>
>>>>>
>>>>> the subject, ie, the ?PR in the SPARQL pattern, would be anything that was inherited, which may lead to funny situations. In other words, we do give a rope to the user to hang himself, although I agree that this is very much a corner case.
>>>> I do worry about mixing vocab and syntax for such reasons.
>>>>
>>>>> - Would the execution of those rules be a required feature? If so, we would have to talk to the Google implementers (via DanBri) whether they would implement this at all. If not, the major use case of introducing this falls...
>>>> I don't fully understand. But I'd like to work this through next week with examples...
>>>
>>> O.k.
>>>
>>> For reference, there was another approach:
>>>
>>> http://lists.w3.org/Archives/Public/public-rdfa-wg/2012Nov/0003.html
>>>
>>> which was based on the idea of a DOM manipulation *before* any type of RDFa processing, but reproducing a similar feature to @itemref. There are also issues with that one
>>>
>>> - does not work (well) if a streaming parser is used
>>> - for any implementation that is in a browser, it should start by duplicating the DOM and work on that one; indeed, manipulating the DOM that is also used for display is not a good idea:-(
>>>
>>> I am not 100% sure which of the two approaches I prefer (if we do anything, that is). I still tend to prefer the DOM manipulation one, which seems to have fewer caveats for me, but that is just a mild preference...
>>>
>>> Ivan
>>>
>>>
>>>>
>>>> cheers,
>>>>
>>>> Dan
>>>>
>>>>> Food for thought...
>>>>>
>>>>> Ivan
>>>>>
>>>>>
>>>>> ----
>>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>>> Home: http://www.w3.org/People/Ivan/
>>>>> mobile: +31-641044153
>>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
Received on Saturday, 8 December 2012 01:30:55 UTC
