
RE: how do I copy some properties that are part of a bigger pattern

From: Richard H. McCullough <rhm@pioneerca.com>
Date: Tue, 11 Mar 2014 11:02:49 -0700
Message-ID: <COL129-W7730036CD20DE461C9EAF6AA770@phx.gbl>
To: Jarno van Driel <jarnovandriel@gmail.com>, Gregg Kellogg <gregg@greggkellogg.net>
CC: Niklas Lindström <lindstream@gmail.com>, public-rdfa <public-rdfa@w3.org>
You might want to steal some ideas from the mKR language.

mKR lets you name any list of propositions, e.g.,
      my propositions :: { proposition list };
and manipulate that list in numerous ways.

You can add, delete, ... propositions
You can change the underlying class hierarchy
...
 
Dick McCullough 

Context Knowledge Systems

mKE and the mKR language

mKR/mKE tutorial

Date: Tue, 11 Mar 2014 15:41:09 +0100
From: jarnovandriel@gmail.com
To: gregg@greggkellogg.net
CC: lindstream@gmail.com; public-rdfa@w3.org
Subject: Re: how do I copy some properties that are part of a bigger pattern

Thanks for the sources, Gregg. Some of them I know, but with the insights I have now I bet some of them will make much more sense. I'll make sure to read them before asking more questions.


2014-03-11 2:01 GMT+01:00 Gregg Kellogg <gregg@greggkellogg.net>:

On Mar 10, 2014, at 5:07 AM, Jarno van Driel <jarnovandriel@gmail.com> wrote:


"...There is no difference here between links and "nested" items..." + "...Try the example..."

Thanks, you just made my brain explode.   =)
It's been a couple of years since my first attempts at understanding RDFa, which failed miserably because I have difficulty translating the W3C specifications into rules I can understand about how it's supposed to be used and what it can do. Your comments, together with the RDFa Play output, succeeded where countless hours of reading specifications and experimenting with markup had failed me. Seriously Niklas, thanks!


Now as for the IRC meet, let that slide for now. A tsunami of possibilities just washed over me and I need some time to let it sink in. My first thought after reading your comments and seeing the RDFa Play output was that writing an article about the use of @itemref isn't that difficult, but comparing it to rdfa:Pattern just became a whole lot more complicated. It has now become clear to me that there is no 1:1 relation between the two, where I thought there was, and that RDFa offers different solutions for many of the situations where Microdata only offers @itemref. Which IS marvelous, but it leaves me unsure how to explain that in an article without writing a series as thick as the Bible.

There are some great discussion threads on public-rdfa-wg in around December 2012, starting with a proposal from Ivan. Check out the "Reproducing Gregg/Niklas' thoughts ..." thread in http://lists.w3.org/Archives/Public/public-rdfa-wg/2012Dec/thread.html.

As Niklas points out, the original concept was a semantic approach to property copying: we identify a resource, use it as the source for copying properties, and then remove the original "template" resource. Basically, it could mostly be done using SPARQL with INSERT DATA/DELETE DATA. It's worth looking at the thread to see some of the thought processes that were going on at the time.

I do know, however, that I want to limit myself to RDFa Lite since it's the RDFa community's answer to Microdata. Or at least that's the way I understand it. So let me ask: what are the differences between RDFa and RDFa Lite? Is there any clear documentation about the difference between the two that I can read?

The RDFa Lite 1.1 recommendation <http://www.w3.org/TR/rdfa-lite/> pretty much calls this out, as does the RDFa 1.1 Primer <http://www.w3.org/TR/xhtml-rdfa-primer/>. The key observation behind the Lite recommendation is that RDFa gets complicated when there are too many attributes on an element, and the distinction between @about and @resource can be subtle. Even now, I see people having a problem with Microdata when they use @itemprop on an anchor and seem to expect the content of the element, rather than the value of @href, to be used as the property's value. RDFa suffers from the same issue, but things get simpler when you restrict yourself to fewer attributes and avoid combining them.
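
The anchor pitfall described above can be made concrete with a small sketch. It is an illustration only, not a conforming Microdata or RDFa processor: the function name `property_value` and the dictionary-based element representation are assumptions introduced here. It encodes just one rule: when an element carrying a property also has @href (or @src), that URL is the value, and the text content is used only otherwise.

```python
from typing import Dict, Tuple

# Attributes that supply a URL value instead of the element's text.
# Simplified on purpose: real processors also look at @content,
# @datatype, @resource and element-specific rules.
URL_ATTRIBUTES = ("href", "src")


def property_value(attrs: Dict[str, str], text: str) -> Tuple[str, str]:
    """Return (kind, value) for an element carrying @itemprop or @property."""
    for name in URL_ATTRIBUTES:
        if name in attrs:
            return ("url", attrs[name])
    return ("text", text.strip())


# <a itemprop="url" href="http://www.example.org">Corporation name</a>
anchor = property_value(
    {"itemprop": "url", "href": "http://www.example.org"}, "Corporation name"
)
# <span itemprop="name"> Corporation name </span>
span = property_value({"itemprop": "name"}, " Corporation name ")

print(anchor)
print(span)
```

The anchor's text never becomes the value here, which is why a name normally needs its own element (for example a nested span with its own property).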

That said, there is quite a bit of power in full RDFa 1.1, particularly in the use of lists and chaining <http://www.w3.org/TR/rdfa-core/#s_chaining>. Chaining is really useful when you have a number of resource values from the same property, for example the author list of a document. This avoids repeating markup, but it is a sophisticated feature. IMO, you really can't write RDFa (full or lite) or Microdata without running it through a distiller to verify that it says what you mean.

Let me widen the question: are there any sources you guys can recommend for reading about RDFa (Lite)? Like I said earlier, it's been a couple of years for me, so I hope new documentation exists by now besides the W3C specifications.

Manu wrote a great post on the differences between RDFa Lite and Microdata: <http://manu.sporny.org/2012/mythical-differences/>.

Gregg
2014-03-09 18:10 GMT+01:00 Niklas Lindström <lindstream@gmail.com>:


Hi Jarno,

On Sun, Mar 9, 2014 at 5:08 PM, Jarno van Driel <jarnovandriel@gmail.com> wrote:




"...outputs two different nodes for what seemingly is the same corporation..."



You're right in stating that this results in two instances of the same Corporation. Which is the only way in Microdata to have an Item (Corporation) be linked to other Items by means of different properties (copyrightHolder & publisher). The following markup simply wouldn't work in Microdata:




<div itemprop="manufacturer" itemref="corporation-data">
Yes, microdata (presumably) being a tree model prevents it from connecting items together naturally. It's a big flaw. It only deals with surface data, and says nothing about what it means. Perhaps @itemid makes it into some kind of graph at times though, it's hard to tell when there are no semantics explaining what that entails.





In Microdata, itemref can only pull in additional info about a type. You can't use it on a property and then use itemref to get the @itemtype elsewhere. That's why in Microdata I have to declare the Corporation twice, to be able to link it to different entities (ItemPage & Article) by means of different properties (copyrightHolder & publisher). Which brings me to the question: can this be accomplished in RDFa Lite where it can't in Microdata? Keep in mind that in this specific example, according to schema.org rules, the publisher and copyrightHolder are both expected to 'have' a type and are not supposed to 'link' to a type.





Yes, it can. RDFa uses the RDF data model, which is a graph [1]. There is no difference here between links and "nested" items. You type and (when needed) identify things, link them together and describe their details with literals (texts) – all using properties. That is what I did in the example given.






"...<p resource="#page">
	<span property="copyrightHolder" typeof="Corporation" resource="#corp">..."




The downside to this method is that the copyrightHolder Corporation now gets linked falsely. I quickly checked the output in Google's SDTT, which showed the Corporation as a child of the WPFooter as opposed to the copyrightHolder of the ItemPage. The use of rdfa:Pattern prevents this from happening, as does an itemscope without an itemtype in Microdata, e.g. <div itemscope>.




The Google SDTT is wrong. It should recognize that <p resource="#page"> sets the subject for nested statements (here ensuring that the <#page> has the <#corp> as :copyrightHolder). It seems that adding a @typeof:




    <p resource="#page" typeof="ItemPage">
makes it behave somewhat more as expected. But note that that isn't necessary in RDFa, it's just a workaround for a bug in the SDTT. (Try the example out in e.g. <http://rdfa.info/play/> to see it more clearly.)
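
To illustrate how @resource sets the subject for nested statements, here is a rough sketch that walks a trimmed copy of the RDFa Lite markup from this thread and records only the resource-to-resource links. It is a heavily simplified approximation, not an RDFa processor: the class `SubjectTracker`, the blank node labels `_:b1`, `_:b2`, and the handling of only @property, @resource, @href and @typeof are all assumptions made for this example. Literals, types, @vocab expansion and IRI resolution are ignored.

```python
from html.parser import HTMLParser

# Elements without an end tag; they never become the parent of other elements.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}

MARKUP = """
<body vocab="http://schema.org/" typeof="ItemPage" resource="#page">
  <article property="text">
    <div typeof="Article">
      <link property="publisher" resource="#corp">
    </div>
  </article>
  <footer property="mentions" typeof="WPFooter">
    <div property="text">
      <p resource="#page">
        <span property="copyrightHolder" typeof="Corporation" resource="#corp">
          <a property="url" href="http://www.example.org">Corporation name</a>
        </span>
      </p>
    </div>
  </footer>
</body>
"""


class SubjectTracker(HTMLParser):
    """Collects (subject, property, object) links between resources."""

    def __init__(self):
        super().__init__()
        self.subjects = [""]  # "" stands for the document itself
        self.bnodes = 0
        self.links = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        parent = self.subjects[-1]
        target = a.get("resource") or a.get("href")
        if target is None and "typeof" in a:
            self.bnodes += 1
            target = f"_:b{self.bnodes}"  # unnamed typed item gets a blank node
        child = parent
        if "property" in a:
            if target is not None:
                self.links.append((parent, a["property"], target))
                if "typeof" in a:
                    child = target  # a typed object becomes the subject for children
        elif target is not None:
            child = target  # no @property: the element only switches the subject
        if tag not in VOID_ELEMENTS:
            self.subjects.append(child)

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS and len(self.subjects) > 1:
            self.subjects.pop()


tracker = SubjectTracker()
tracker.feed(MARKUP)
tracker.close()
for link in tracker.links:
    print(link)
```

In this sketch the copyrightHolder link is recorded with `#page` as its subject rather than the footer's blank node, because the `<p resource="#page">` element switches the subject back before the span is reached.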




 
"Also, the resulting data here doesn't contain two distinct nodes for what is apparently meant to be the same corporation."
True, but the two distinct nodes also have type-specific relations to the two distinct items this example has, namely ItemPage and Article. Maybe that info got a bit lost because I stripped out so much of the original HTML. The source I took this from has an ItemPage with a gazillion other types attached to it, while the Article is just that, an Article, with its own set of properties, mostly separated from the rest of the content on the ItemPage, only sharing data from the Corporation.




I think I see what you mean. But if you think of this in terms of the RDF data model, the items simply are resources linked together (and assigned some types, and described with textual properties), rather than blocks of data tied to the page structure (or the microdata tree structure, which hardly helps). In this model, the corporation is surely one thing, linked from the ItemPage using copyrightHolder, and from the Article using publisher (both of which are fine since the thing linked to is of the expected type).
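
The "one thing, linked twice" view can be sketched as a plain set of triples. The data below is a hand-written approximation of some of the triples one would expect from the RDFa Lite example in this thread, not the output of a real distiller; the blank node label `_:article`, the unresolved `#page` and `#corp` identifiers, and the helper `incoming` are assumptions made for illustration.

```python
SCHEMA = "http://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Each triple is (subject, predicate, object). A real processor would
# resolve "#page" and "#corp" against the document's base IRI.
triples = {
    ("#page", RDF_TYPE, SCHEMA + "ItemPage"),
    ("#page", SCHEMA + "copyrightHolder", "#corp"),
    ("_:article", RDF_TYPE, SCHEMA + "Article"),
    ("_:article", SCHEMA + "publisher", "#corp"),
    ("#corp", RDF_TYPE, SCHEMA + "Corporation"),
    ("#corp", SCHEMA + "url", "http://www.example.org"),
    ("#corp", SCHEMA + "name", "Corporation name"),
    ("#corp", SCHEMA + "description", "Corporation description"),
}


def incoming(node, graph):
    """Return the sorted (subject, predicate) pairs that point at node."""
    return sorted((s, p) for (s, p, o) in graph if o == node)


# All resources typed as Corporation: a graph needs only one node for it.
corporations = {
    s for (s, p, o) in triples if p == RDF_TYPE and o == SCHEMA + "Corporation"
}
links_to_corp = incoming("#corp", triples)

print(corporations)
for subject, predicate in links_to_corp:
    print(subject, predicate)
```

Both the page and the article point at the same `#corp` node, each through its own property, so the corporation's description is stated once.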





"I'd be happy to take a look at such examples as well."
Maybe we should meet in an IRC session, like Gregg suggested, because I'm convinced we can keep this argument-counterargument up for quite some time. Not that I mind, since this thread has already given me a ton to think about, but simply to be more time-efficient. Just let me know what you guys prefer; either way is fine with me.




I'm fine either way too. :) I tend to have intermittent bouts of time, so mailing is usually better for examples. But I could go for a chat over specifics if needed.




Cheers,
Niklas
[1]: http://www.w3.org/TR/rdf11-primer/
 






2014-03-09 14:19 GMT+01:00 Niklas Lindström <lindstream@gmail.com>:





Hi Jarno and Gregg!
It seems to me that this is a good example of where @itemref-like functionality is quite unnecessary in RDFa. The #copyright-holder simply contains a link from the page to the corporation, and the #publisher-url and #publisher-description contain properties of that corporation. The resulting microdata, however, outputs two different nodes for what seemingly is the same corporation, so perhaps the example has been simplified too much, thus obscuring what is actually needed?







Still, in RDFa, instead of adding different @id:s to disparate parts of the page which are about the same resource (and then listing them in @itemref), you simply use @resource to capture the fact that a given block is about it.







Your example can thus be written like this in RDFa Lite:

- - - 8< - - -
<body vocab="http://schema.org/" typeof="ItemPage" resource="#page">
  <article property="text">
    <div typeof="Article">
      <link property="publisher" resource="#corp">
      <h1 property="name">How to copy properties in RDFa Lite & Microdata</h1>
    </div>
  </article>
  <footer property="mentions" typeof="WPFooter">
    <div property="text">
      <p resource="#page">
        <span property="copyrightHolder" typeof="Corporation" resource="#corp">
          <a property="url" href="http://www.example.org">
            <span property="name">Corporation name</span>
          </a>
          <span property="description">Corporation description</span>
        </span>
      </p>
    </div>
  </footer>
</body>
- - - >8 - - -
In my opinion, this is a more convenient way of handling data smeared out in a messy tag soup (with the results being shorter and more legible). Of course, you need to name these resources, unless they already have formal URIs, but that's easily done with a fragment identifier or a bnode id. (And note that in microdata, you instead need to ensure that a layout designer doesn't meddle with the @id values used by @itemref, for quite different reasons (their use in CSS and JS).)







Also, the resulting data here doesn't contain two distinct nodes for what is apparently meant to be the same corporation.
Remember, it is only when you need to duplicate a set of properties for different resources that rdfa:copy is necessary. And even in those circumstances, you might be able to leverage the way @resource can group descriptions together, to build up one pattern from disparate parts of the page.







I'd be happy to take a look at such examples as well.
Cheers,
Niklas



On Sun, Mar 9, 2014 at 11:51 AM, Jarno van Driel <jarnovandriel@gmail.com> wrote:


I think your latest example and mine just passed each other, Gregg. I guess I posted mine while you were writing yours, because when I compare the two I see we implemented the same workaround by means of an additional @resource.








"I wouldn't recommend the use of included patterns in RDFa, but it can be made to work."
I wouldn't recommend it either, but unfortunately the everyday website out there consists of an HTML soup that doesn't allow semantic markup to be added in a nice and clean way. I mainly work on existing websites, where I have to make do with the HTML that's already in place. Therefore itemref or rdfa:Pattern is indispensable when organizing/linking data that's smeared out over many different HTML elements on a page. I am very aware this results in markup that isn't 'nice', but it helps create meaning even if the HTML is a mess.








"P.S., I think it’s great that you’re trying to describe this for a wider audience!"
Well, I'm not doing it alone. Aaron Bradley is acting as the devil's advocate by asking me questions which mess up the solutions I provide. That in turn forces me to come up with different solutions and ask a lot of questions at public-vocabs (and now here as well).   :)








So trying to do something for a bigger audience will most definitely end up being something many people have contributed to. As always, this kind of stuff ends up being a multi-community/person effort since it brings together so many different specializations and specifications.








--
Andy and Gregg,
Thanks for sharing your knowledge. I'll make sure to re-share it, and I'm hopeful it will result in an article (or a series) that serves anybody who is (or should be) interested in this type of info.









2014-03-09 6:46 GMT+01:00 Gregg Kellogg <gregg@greggkellogg.net>:








On Mar 8, 2014, at 5:50 PM, Jarno van Driel <jarnovandriel@gmail.com> wrote:









"..the @resource attributes get in the way.."
Could you explain this to me a bit more please, Gregg? Because if I run my last markup through the Structured Data Linter and RDFa Play, I get exactly the same outcome as with your markup. Yandex and Google see the same data as well (in an ever so slightly different manner).









When I look at the output, these parsers have no trouble extracting the @resources as different RDFa nodes. Unless I'm completely overlooking something or am breaking some cardinal rules, both of which are feasible since I only just got around to looking more deeply into RDFa Lite.








In order to be able to reference the publisher-url and publisher-description information as patterns, they need to have an identifier, which I supplied by adding @resource (and @typeof="rdfa:Pattern") to each. However, this changes the scope of their properties relative to the copyright-holder.








In your RDFa version you weren’t able to access the publisher-url or publisher-description, as you do from Microdata. The RDFa property copying uses a resource of type rdfa:Pattern, which must be identified as a resource. For this reason, I added the @resource and @typeof for both the publisher-description and publisher-url. However, doing that changes the current subject for each of these, so the “url” and “description” properties are allocated to different resources. To get around this, I added the rdfa:copy properties both to the publisher reference and to the copyright-holder, so that the properties appear in each of them. I wouldn’t recommend the use of included patterns in RDFa, but it can be made to work.








I’d recommend keeping references simple in both Microdata and RDFa; using included references, while possible, can make things more confusing. This is certainly not a pattern we were concerned about when crafting the property copying mechanism in HTML+RDFa. The two really work quite differently: Microdata requires random access to the DOM so that referenced elements can be copied. The RDFa mechanism operates at a semantic level, by creating triples as normal. RDFa is intended to work with streaming processors, where there is no random access to the DOM. The spec provides details of the rules which are applied to achieve the effect of property copying [1], but it’s not really magic to RDFa, and could just as easily be done for triples extracted from Turtle, or even Microdata, if the appropriate copying rules were applied.
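
Because the copying works on triples rather than on the DOM, its effect can be sketched over a plain set of triples. What follows is a simplified reading of the idea, not the normative rules from [1]: the helper name `expand_patterns`, the use of prefixed names as plain strings, the blank node label `_:corp`, and the decision not to copy the `rdfa:Pattern` type itself are all assumptions of this sketch. The input approximates a subset of the triples the markup with nested patterns would produce.

```python
RDF_TYPE = "rdf:type"
COPY = "rdfa:copy"
PATTERN = "rdfa:Pattern"


def expand_patterns(triples):
    """Copy pattern properties to every referencing resource, then drop the patterns."""
    graph = set(triples)
    patterns = {s for (s, p, o) in graph if p == RDF_TYPE and o == PATTERN}

    # Step 1: copy until nothing new appears, so a pattern whose contents
    # reference further patterns is expanded as well.
    changed = True
    while changed:
        changed = False
        for (target, p, source) in list(graph):
            if p != COPY or source not in patterns:
                continue
            for (s, p2, o) in list(graph):
                if s != source or (p2 == RDF_TYPE and o == PATTERN):
                    continue
                if (target, p2, o) not in graph:
                    graph.add((target, p2, o))
                    changed = True

    # Step 2: remove the rdfa:copy links and the patterns they pointed at.
    used = {o for (s, p, o) in graph if p == COPY and o in patterns}
    return {
        (s, p, o)
        for (s, p, o) in graph
        if s not in used and not (p == COPY and o in used)
    }


triples = {
    ("#item-page", "rdf:type", "schema:ItemPage"),
    ("#item-page", "rdfa:copy", "#copyright-holder"),
    ("#copyright-holder", "rdf:type", "rdfa:Pattern"),
    ("#copyright-holder", "schema:copyrightHolder", "_:corp"),
    ("_:corp", "rdf:type", "schema:Corporation"),
    ("_:corp", "rdfa:copy", "#publisher-url"),
    ("_:corp", "rdfa:copy", "#publisher-description"),
    ("#publisher-url", "rdf:type", "rdfa:Pattern"),
    ("#publisher-url", "schema:url", "http://www.example.org"),
    ("#publisher-url", "schema:name", "Corporation name"),
    ("#publisher-description", "rdf:type", "rdfa:Pattern"),
    ("#publisher-description", "schema:description", "Corporation description"),
}

result = expand_patterns(triples)
for triple in sorted(result):
    print(triple)
```

After expansion the page has a copyrightHolder whose url, name and description were all copied in, and no pattern resources or rdfa:copy links remain. Nothing here touches markup, which is the sense in which the same rules could be applied to triples from any source.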








I understood that you didn’t know how to deal with a pattern embedded in another pattern, which I attempted to address for you. I think that the RDFa I provided does essentially what your Microdata does. If you want to discuss more, we should probably meet on IRC.








Gregg
P.S., I think it’s great that you’re trying to describe this for a wider audience!
[1] http://www.w3.org/TR/rdfa-in-html/#implementing-property-copying









2014-03-09 1:33 GMT+01:00 Gregg Kellogg <gregg@greggkellogg.net>:









Hi Jarno, I don’t think you can do precisely what you want, since if a pattern is included in another pattern, the @resource attributes get in the way. You can do it by adding some more rdfa:copy properties. This is what I came up with:









<body vocab="http://schema.org/" resource="#item-page" typeof="ItemPage">
  <link property="rdfa:copy" href="#copyright-holder">
  <article property="text">
    <div resource="#article" typeof="Article">
      <div property="publisher" typeof="Corporation">
        <link property="rdfa:copy" href="#publisher-url"/>
        <link property="rdfa:copy" href="#publisher-description"/>
      </div>
      <h1 property="name">How to copy properties in RDFa Lite &amp; Microdata</h1>
    </div>
  </article>
  <footer property="mentions" typeof="WPFooter">
    <div property="text">
      <p resource="#copyright-holder" typeof="rdfa:Pattern">
        <span property="copyrightHolder" typeof="Corporation">
          <link property="rdfa:copy" href="#publisher-url"/>
          <link property="rdfa:copy" href="#publisher-description"/>
          <span resource="#publisher-url" typeof="rdfa:Pattern">
            <a id="publisher-url" property="url" href="http://www.example.org" title>
              <span property="name">Corporation name</span>
            </a>
          </span>
          <span resource="#publisher-description" typeof="rdfa:Pattern">
            <span id="publisher-description" property="description">Corporation description</span>
          </span>
        </span>
      </p>
    </div>
  </footer>
</body>










Gregg Kellogg
gregg@greggkellogg.net


On Mar 8, 2014, at 2:37 PM, Jarno van Driel <jarnovandriel@gmail.com> wrote:









<body vocab="http://schema.org/" resource="#item-page" typeof="ItemPage">
	<link property="rdfa:copy" href="#copyright-holder">
	<article property="text">
		<div resource="#article" typeof="Article">
			<link property="publisher" typeof="Corporation" href=?????>
			<h1 property="name">How to copy properties in RDFa Lite & Microdata</h1>
		</div>
	</article>
	<footer property="mentions" typeof="WPFooter">
		<div property="text">
			<p resource="#copyright-holder" typeof="rdfa:Pattern">
				<span property="copyrightHolder" typeof="Corporation">
					<a id="publisher-url" property="url" href="http://www.example.org" title>
						<span property="name">Corporation name</span>
					</a>
					<span id="publisher-description" property="description">Corporation description</span>
				</span>
			</p>
		</div>
	</footer>
</body>
















 		 	   		  
Received on Tuesday, 11 March 2014 18:03:43 UTC

This archive was generated by hypermail 2.4.0 : Friday, 17 January 2020 17:04:52 UTC