Re: how do I copy some properties that are part of a bigger pattern from Niklas Lindström on 2014-03-09 (public-rdfa@w3.org from March 2014)

From: Niklas Lindström <lindstream@gmail.com>
Date: Sun, 9 Mar 2014 18:10:03 +0100
To: Jarno van Driel <jarnovandriel@gmail.com>
Cc: Gregg Kellogg <gregg@greggkellogg.net>, public-rdfa <public-rdfa@w3.org>
Message-ID: <CADjV5jeG6AFf9Tp_y4PZgiMRHU1sUfWRxphdZMnZTDDQLC3sKw@mail.gmail.com>

Hi Jarno,

On Sun, Mar 9, 2014 at 5:08 PM, Jarno van Driel <jarnovandriel@gmail.com> wrote:

> "...outputs two different nodes for what seemingly is the same
> corporation..."
> You're right in stating that this results in two instances of the same
> Corporation. Which is the only way in Microdata to have an Item
> (Corporation) be linked to other Items by means of different properties
> (copyrightHolder & publisher). The following markup simply wouldn't work in
> Microdata:
> <div itemprop="manufacturer" itemref="corporation-data">
>

Yes, microdata (presumably) being a tree model prevents it from connecting
items together naturally. It's a big flaw. It only deals with surface data,
and says nothing about what it means. Perhaps @itemid turns it into some
kind of graph at times, though it's hard to tell when there are no
semantics explaining what that entails.


> In Microdata, itemref can only get additional info about a Type. You can't
> use it on a property and then use itemref to get the @itemtype elsewhere.
> That's why in Microdata I have to declare the Corporation twice, to be able
> to link it to different entities (ItemPage & Article) by means of different
> properties (copyrightHolder & publisher). Which brings me to the question:
> Can this be accomplished RDFa Lite where it can't in Microdata? - keeping
> in mind that in this specific example according to schema.org rules the
> publisher and copyrightHolder are both expected to 'have' a type and are
> not supposed to 'link' to a type.
>

Yes, it can. RDFa uses the RDF data model, which is a graph [1]. There is
no difference here between links and "nested" items. You type and (when
needed) identify things, link them together and describe their details with
literals (texts) - all using properties. That is what I did in the example
given.
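
To make the graph idea concrete, here is a minimal sketch in Python (plain
tuples, not an RDFa library). The triples are written out by hand from the
RDFa Lite example quoted further down; the "schema:" and "rdf:" prefixes, the
"_:article" blank-node label and both helper functions are assumptions made
for illustration, not parser output.

```python
# Hand-written (subject, predicate, object) triples; illustrative shorthand only.
PAGE, ARTICLE, CORP = "#page", "_:article", "#corp"

GRAPH = {
    (PAGE, "rdf:type", "schema:ItemPage"),
    (ARTICLE, "rdf:type", "schema:Article"),
    (CORP, "rdf:type", "schema:Corporation"),
    # Links between resources are ordinary properties...
    (PAGE, "schema:copyrightHolder", CORP),
    (ARTICLE, "schema:publisher", CORP),
    # ...and so are literal-valued details.
    (CORP, "schema:name", "Corporation name"),
    (CORP, "schema:description", "Corporation description"),
}


def incoming_links(graph, node):
    """Return sorted (subject, predicate) pairs that point at `node`."""
    return sorted((s, p) for (s, p, o) in graph if o == node)


def resources_of_type(graph, rdf_type):
    """Return the sorted subjects typed as `rdf_type`."""
    return sorted(s for (s, p, o) in graph if p == "rdf:type" and o == rdf_type)


if __name__ == "__main__":
    print(resources_of_type(GRAPH, "schema:Corporation"))
    print(incoming_links(GRAPH, CORP))
```

Both the page and the article point at the single #corp node, each through
its own property, so there is only one Corporation in the data.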


> "...<p resource="#page">
> <span property="copyrightHolder" typeof="Corporation" resource="#corp">..."
> The downside to this method is that the copyrightHolder-Corporation now
> gets linked falsely. I quickly checked the output in Google's SDTT, which
> showed the Corporation being a child of the WPFooter as opposed to being
> the copyrightHolder of the ItemPage. The use of rdfa:Pattern prevents this
> from happening, as does an itemscope without an itemtype in Microdata, e.g.
> <div itemscope>.
>

The Google SDTT is wrong. It should recognize that <p resource="#page">
sets the subject for nested statements (here ensuring that the <#page> has
the <#corp> as :copyrightHolder). It seems that adding a @typeof:

    <p resource="#page" typeof="ItemPage">

makes it behave somewhat more as expected. But note that that isn't
necessary in RDFa, it's just a workaround for a bug in the SDTT. (Try the
example out in e.g. <http://rdfa.info/play/> to see it more clearly.)
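
How @resource sets the subject can be sketched with a toy tracker in Python.
It is an illustrative simplification, not a conforming RDFa processor: it only
follows @resource, @typeof and @property, ignores literals, @href, @about,
@rel and @vocab, and the class, function and constant names are made up for
this example.

```python
from html.parser import HTMLParser
from itertools import count

VOID_TAGS = {"link", "meta", "img", "br", "hr", "input"}

EXAMPLE = """
<body vocab="http://schema.org/" typeof="ItemPage" resource="#page">
  <footer property="mentions" typeof="WPFooter">
    <div property="text">
      <p resource="#page">
        <span property="copyrightHolder" typeof="Corporation" resource="#corp">
          <span property="name">Corporation name</span>
        </span>
      </p>
    </div>
  </footer>
</body>
"""


class ToySubjectTracker(HTMLParser):
    """Tracks the subject in effect for nested elements (toy model only)."""

    def __init__(self, base=""):
        super().__init__()
        self.subjects = [base]   # subject in effect inside each open element
        self.links = []          # (subject, property, object resource)
        self._bnodes = count()   # counter for blank-node labels

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        subject = self.subjects[-1]
        child_subject = subject
        resource = attr.get("resource")
        if "property" in attr:
            if "typeof" in attr:
                # Typed property: the object is @resource or a fresh blank
                # node, and it becomes the subject for nested statements.
                obj = resource if resource is not None else f"_:b{next(self._bnodes)}"
                self.links.append((subject, attr["property"], obj))
                child_subject = obj
            elif resource is not None:
                # Plain link: emitted, nested statements keep the subject.
                self.links.append((subject, attr["property"], resource))
        elif resource is not None:
            # @resource on its own only sets the subject for what is nested.
            child_subject = resource
        elif "typeof" in attr:
            child_subject = f"_:b{next(self._bnodes)}"
        if tag not in VOID_TAGS:
            self.subjects.append(child_subject)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and len(self.subjects) > 1:
            self.subjects.pop()


def resource_links(markup):
    """Return the resource-valued links the toy tracker finds in `markup`."""
    tracker = ToySubjectTracker()
    tracker.feed(markup)
    tracker.close()
    return tracker.links
```

With the <p resource="#page"> wrapper in place, the copyrightHolder link comes
out with #page as its subject. Remove the @resource from the <p> and the
subject falls back to the blank node created for the WPFooter.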



> "Also, the resulting data here doesn't contain two distinct nodes for what
> is apparently meant to be the same corporation."
> True, but the two distinct nodes also have type-specific relations to the
> two distinct items this example has, namely ItemPage and Article. Maybe
> that info got a bit lost because I stripped out so much of the original
> HTML. The source I took this from has an ItemPage with a gazillion other
> types attached to it while the Article is just that, an Article, with its
> own set of properties, mostly separated from the rest of the content on the
> ItemPage, only sharing data from the Corporation.
>

I think I see what you mean. But if you think of this in terms of the RDF
data model, the items are simply resources linked together (and assigned
some types, and described with textual properties), rather than blocks of
data tied to the page structure (or the microdata tree structure, which
hardly helps). In this model, the corporation is surely one thing,
linked to from the ItemPage using copyrightHolder, and from the Article
using publisher (both of which are fine since the thing linked to is of the
expected type).
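
The "expected type" remark can be sketched as a small check in Python. The
type hierarchy and the expected types listed here are simplified assumptions
for illustration (consult schema.org for the real definitions), and the helper
names are hypothetical.

```python
# Assumed, simplified slice of the schema.org type hierarchy (child -> parent).
SUPERTYPES = {
    "Corporation": "Organization",
    "Organization": "Thing",
    "Person": "Thing",
}

# Assumed expected value types for the two properties used in the example.
EXPECTED_TYPES = {
    "copyrightHolder": {"Organization", "Person"},
    "publisher": {"Organization", "Person"},
}


def is_a(type_name, expected):
    """True if `type_name` equals `expected` or is one of its subtypes."""
    while type_name is not None:
        if type_name == expected:
            return True
        type_name = SUPERTYPES.get(type_name)
    return False


def link_is_well_typed(prop, object_type):
    """True if an object of `object_type` fits what `prop` expects."""
    return any(is_a(object_type, expected) for expected in EXPECTED_TYPES[prop])


if __name__ == "__main__":
    for prop in ("copyrightHolder", "publisher"):
        print(prop, link_is_well_typed(prop, "Corporation"))
```

Under these assumptions the same Corporation resource satisfies both links,
so nothing requires a second copy of it.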



> "I'd be happy to take a look at such examples as well."
> Maybe we should meet in an IRC session, like Gregg suggested, because I'm
> convinced we can keep this argument-counterargument up for quite some time.
> Not that I mind, since this mailing has already given me a ton to think
> about, but simply to be more time-efficient. Just let me know what you guys
> prefer, either way is fine with me.
>

I'm fine either way too. :) I tend to have intermittent bouts of time, so
mailing is usually better for examples. But I could go for a chat over
specifics if needed.

Cheers,
Niklas

[1]: http://www.w3.org/TR/rdf11-primer/



>
>
> 2014-03-09 14:19 GMT+01:00 Niklas Lindström <lindstream@gmail.com>:
>
>> Hi Jarno and Gregg!
>>
>> It seems to me that this is a good example of where @itemref-like
>> functionality is quite unnecessary in RDFa. The #copyright-holder simply
>> contains a link from the page to the corporation, and the #publisher-url
>> and #publisher-description contain properties of that corporation. The
>> resulting microdata, however, outputs two different nodes for what
>> seemingly is the same corporation, so perhaps the example has been
>> simplified too much, thus obscuring what is actually needed?
>>
>> Still, in RDFa, instead of adding different @id:s to disparate parts of
>> the page which are about the same resource (and then listing them in
>> @itemref), you simply use @resource to capture the fact that a given block
>> is about it.
>>
>> Your example can thus be written like this in RDFa Lite:
>>
>> - - - 8< - - -
>>
>> <body vocab="http://schema.org/" typeof="ItemPage" resource="#page">
>>   <article property="text">
>>     <div typeof="Article">
>>       <link property="publisher" resource="#corp">
>>
>>       <h1 property="name">How to copy properties in RDFa Lite &amp; Microdata</h1>
>>     </div>
>>   </article>
>>
>>   <footer property="mentions" typeof="WPFooter">
>>     <div property="text">
>>       <p resource="#page">
>>         <span property="copyrightHolder" typeof="Corporation" resource="#corp">
>>           <a property="url" href="http://www.example.org">
>>              <span property="name">Corporation name</span>
>>           </a>
>>
>>           <span property="description">Corporation description</span>
>>          </span>
>>       </p>
>>     </div>
>>   </footer>
>> </body>
>>
>> - - - >8 - - -
>>
>> In my opinion, this is a more convenient way of handling data smeared out
>> in a messy tag soup (with the results being shorter and more legible). Of
>> course, you need to name these resources, unless they already have formal
>> URIs, but that's easily done with a fragment identifier or a bnode id. (And
>> note that in microdata, you instead need to ensure that a layout designer
>> doesn't meddle with the @id values used by @itemref, for quite different
>> reasons (their use in CSS and JS).)
>>
>> Also, the resulting data here doesn't contain two distinct nodes for what
>> is apparently meant to be the same corporation.
>>
>> Remember, it is only when you need to duplicate a set of properties for
>> different resources that rdfa:copy is necessary. And even in those
>> circumstances, you might be able to leverage the way @resource can group
>> descriptions together, to build up one pattern from disparate parts of the
>> page.
>>
>> I'd be happy to take a look at such examples as well.
>>
>> Cheers,
>> Niklas
>>
>>
>>
>> On Sun, Mar 9, 2014 at 11:51 AM, Jarno van Driel <jarnovandriel@gmail.com
>> > wrote:
>>
>>> I think your latest example and mine just crossed, Gregg. I guess
>>> I posted mine while you were writing yours, because when I compare the two I
>>> see we implemented the same workaround by means of an additional @resource.
>>>
>>> "I wouldn't recommend the use of included patterns in RDFa, but it can
>>> be made to work."
>>> I wouldn't recommend it either, but unfortunately the everyday website
>>> out there consists of an HTML soup which doesn't allow for semantic
>>> markup to be added in a nice and clean way. Now I mainly work on already
>>> existing websites, where I have to make do with HTML that's already in
>>> place. Therefore itemref or rdfa:pattern are indispensable when
>>> organizing/linking data that's smeared out over many different HTML
>>> elements on a page. I am very aware this results in markup that isn't
>>> 'nice' but it helps create meaning even if the HTML is a mess.
>>>
>>> "P.S., I think it's great that you're trying to describe this for a
>>> wider audience!"
>>> Well, I'm not doing it alone. Aaron Bradley is acting as the devil's
>>> advocate by asking me questions which mess up the solutions I provide.
>>> That in turn forces me to come up with different solutions and ask a lot
>>> of questions on public-vocabs (and now here as well).   :)
>>>
>>> So trying to do something for a bigger audience will most definitely end
>>> up in something that has been contributed by many people. As always this
>>> kind of stuff ends up being a multi-community/person effort since it brings
>>> together so many different specializations and specifications.
>>>
>>> --
>>>
>>> Andy and Gregg,
>>> Thanks for sharing your knowledge. I'll make sure to re-share it and am
>>> hopeful it will result in an article (or a series of them) which will try
>>> to serve anybody who is (or should be) interested in this type of info.
>>>
>>>
>>> 2014-03-09 6:46 GMT+01:00 Gregg Kellogg <gregg@greggkellogg.net>:
>>>
>>> On Mar 8, 2014, at 5:50 PM, Jarno van Driel <jarnovandriel@gmail.com>
>>>> wrote:
>>>>
>>>> "..the @resource attributes get in the way.."
>>>> Could you explain this to me a bit more please, Gregg? Because if I
>>>> parse my last markup through the Structured Data Linter and RDFa Play I get
>>>> 100% the same outcome as with your markup. Yandex and Google see the same
>>>> data as well (in an ever so slightly different manner).
>>>>
>>>> When I look at the output, these parsers have no trouble extracting the
>>>> @resources as different RDFa nodes. Unless I'm completely overlooking
>>>> something, or am breaking some cardinal rules, both of which are possible
>>>> since I just got around to looking more deeply into RDFa Lite.
>>>>
>>>>
>>>> In order to be able to reference the publisher-uri and
>>>> publisher-description information as patterns, they need to have an
>>>> identifier, which I supplied by adding @resource (and
>>>> @typeof="rdfa:Pattern") to each. However, this changes the scope of their
>>>> properties relative to the copyright-holder.
>>>>
>>>> In your RDFa version you weren't able to access the publisher-uri or
>>>> publisher-description, as you do from Microdata. The RDFa property copying
>>>> uses a resource of type rdfa:Pattern, which must be identified as a
>>>> resource. For this reason, I added the @resource and @typeof for both the
>>>> publisher-description and publisher-url. However, doing that changes the
>>>> current subject for each of these, so the "url" and "description"
>>>> properties are allocated to different resources. To get around this, I
>>>> added the rdfa:copy properties to both the publisher reference and the
>>>> copyright-holder, so that the properties appear in each of them. I wouldn't
>>>> recommend the use of included patterns in RDFa, but it can be made to work.
>>>>
>>>> I'd recommend keeping references simple in both Microdata and RDFa;
>>>> using included references, while possible, can make things more
>>>> confusing. This is certainly not a pattern we were concerned about when
>>>> crafting the property copying mechanism in HTML+RDFa. The two really work
>>>> quite differently: Microdata requires random access to the full DOM so
>>>> that referenced elements can be copied.
>>>> The RDFa mechanism operates at a semantic level, by creating triples as
>>>> normal. RDFa is intended to work with streaming processors, where there is
>>>> no random access to the DOM. The spec provides details of the rules which
>>>> are applied to achieve the effect of property copying [1], but it's not
>>>> really magic to RDFa, and could just as easily be done for triples
>>>> extracted from Turtle, or even Microdata, if the appropriate copying rules
>>>> were applied.
>>>>
>>>> I understood that you didn't know how to deal with a pattern embedded
>>>> in another pattern, which I attempted to address for you. I think that the
>>>> RDFa I provided does essentially what your Microdata does. If you want to
>>>> discuss more, we should probably meet on IRC.
>>>>
>>>> Gregg
>>>>
>>>> P.S., I think it's great that you're trying to describe this for a
>>>> wider audience!
>>>>
>>>> [1] http://www.w3.org/TR/rdfa-in-html/#implementing-property-copying
>>>>
>>>>
>>>> 2014-03-09 1:33 GMT+01:00 Gregg Kellogg <gregg@greggkellogg.net>:
>>>>
>>>>> Hi Jarno, I don't think you can do precisely what you want, since if a
>>>>> pattern is included in another pattern, the @resource attributes get in the
>>>>> way. You can do it by adding some more rdfa:copy properties. This is what I
>>>>> came up with:
>>>>>
>>>>> <body vocab="http://schema.org/" resource="#item-page"
>>>>> typeof="ItemPage">
>>>>>   <link property="rdfa:copy" href="#copyright-holder">
>>>>>
>>>>>   <article property="text">
>>>>>     <div resource="#article" typeof="Article">
>>>>>       <div property="publisher" typeof="Corporation">
>>>>>         <link property="rdfa:copy" href="#publisher-url"/>
>>>>>         <link property="rdfa:copy" href="#publisher-description"/>
>>>>>       </div>
>>>>>
>>>>>
>>>>>       <h1 property="name">How to copy properties in RDFa Lite &amp; Microdata</h1>
>>>>>     </div>
>>>>>   </article>
>>>>>
>>>>>   <footer property="mentions" typeof="WPFooter">
>>>>>     <div property="text">
>>>>>       <p resource="#copyright-holder" typeof="rdfa:Pattern">
>>>>>         <span property="copyrightHolder" typeof="Corporation">
>>>>>           <link property="rdfa:copy" href="#publisher-url"/>
>>>>>           <link property="rdfa:copy" href="#publisher-description"/>
>>>>>           <span resource="#publisher-url" typeof="rdfa:Pattern">
>>>>>             <a id="publisher-url" property="url" href="http://www.example.org" title>
>>>>>               <span property="name">Corporation name</span>
>>>>>             </a>
>>>>>           </span>
>>>>>
>>>>>           <span resource="#publisher-description"
>>>>> typeof="rdfa:Pattern">
>>>>>             <span id="publisher-description"
>>>>> property="description">Corporation description</span>
>>>>>           </span>
>>>>>         </span>
>>>>>       </p>
>>>>>     </div>
>>>>>   </footer>
>>>>> </body>
>>>>>
>>>>>  Gregg Kellogg
>>>>> gregg@greggkellogg.net
>>>>>
>>>>> On Mar 8, 2014, at 2:37 PM, Jarno van Driel <jarnovandriel@gmail.com>
>>>>> wrote:
>>>>>
>>>>> <body vocab="http://schema.org/" resource="#item-page"
>>>>> typeof="ItemPage">
>>>>> <link property="rdfa:copy" href="#copyright-holder">
>>>>>
>>>>> <article property="text">
>>>>> <div resource="#article" typeof="Article">
>>>>>   <link property="publisher" typeof="Corporation" href=?????>
>>>>>
>>>>>  <h1 property="name">How to copy properties in RDFa Lite &amp; Microdata</h1>
>>>>> </div>
>>>>>  </article>
>>>>>
>>>>> <footer property="mentions" typeof="WPFooter">
>>>>>  <div property="text">
>>>>>  <p resource="#copyright-holder" typeof="rdfa:Pattern">
>>>>>  <span property="copyrightHolder" typeof="Corporation">
>>>>>   <a id="publisher-url" property="url" href="http://www.example.org"
>>>>> title>
>>>>>   <span property="name">Corporation name</span>
>>>>>  </a>
>>>>>
>>>>> <span id="publisher-description" property="description">Corporation
>>>>> description</span>
>>>>>  </span>
>>>>>  </p>
>>>>>  </div>
>>>>> </footer>
>>>>> </body>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
Received on Sunday, 9 March 2014 17:11:08 UTC