Re: Weighing the ideas around itemref from Ivan Herman on 2012-12-10 (public-rdfa-wg@w3.org from December 2012)

From: Ivan Herman <ivan@w3.org>
Date: Mon, 10 Dec 2012 06:21:16 -0500
To: Gregg Kellogg <gregg@greggkellogg.net>
Cc: Niklas Lindström <lindstream@gmail.com>, public-rdfa-wg <public-rdfa-wg@w3.org>, Dan Brickley <danbri@danbri.org>
Message-Id: <2596D2A5-1FC2-4103-B3E9-104B34E6F9B9@w3.org>
Looking at these arguments... I have the impression that, in many places, it is some sort of a judgement call. I also think that for this group to go out and say "this is how you ought to do that, the way you do it is semantically wrong", etc., would not really be a good idea.

I would propose the following as a way forward:

- we get a consensus on the technical design, including the necessary spec text; once we have consensus, having one or two implementations is also a good idea for people to experiment with (Gregg has one, I will try to find the time to do one, although that may happen only at the end of the month or in January)
- we put this feature into the Last Call document, but we list it as an 'AT RISK' feature, with a short note in the document giving a _concise_ reason why we consider it at risk (possibility for misuse, clearer modeling, etc)
- in our public communication at LC we emphasize this AT RISK stuff, we really ask for public comments on this and we decide based on those

How does that sound?

Ivan


On Dec 9, 2012, at 21:16 , Gregg Kellogg wrote:

> On Dec 9, 2012, at 5:09 PM, Niklas Lindström <lindstream@gmail.com> wrote:
> 
>> I've been weighing the inputs and ideas that have been put forward
>> regarding ISSUE-144 (an @itemref-like feature) [1]. Let's discuss the
>> requirements here. Whatever we do, we must not rush things.
>> 
>> I still believe we need more knowledge about the actual needs. What
>> publishers need to do to capture their relevant content, and what
>> consumers need of that and how to use it. We need solid examples,
>> otherwise the claim that this is much required in the wild is moot.
>> 
>> I have observed two aspects of @itemref which we do *not* need to reproduce:
>> 
>> 1) Based on the microdata spec, and some online articles (e.g. [2],
>> [3]), @itemref is sometimes used to capture data about a resource
>> which is not nested within the element/@itemscope in question. As has
>> been said many times, this has always been a basic feature of RDFa,
>> since it supports specifying the subject (using @resource (and in
>> full, @about)) in the portion where the data exists.
>> 
>> (I have used this many times, e.g. when adding RDFa to our company
>> intranet. Using @itemref there instead would have made the solution
>> brittle and hard to grasp, since the subject for a piece of content
>> would only be discernible by noticing that the local @id was actually
>> used in an @itemref elsewhere in the template source, often spread
>> across files.)
> 
> Yes, and people coming from microdata don't always get this, as they typically use @itemscope without @itemid, and therefore get a unique BNode (or item if staying within the microdata-json rep) on each use, so the idea of spreading the assertions across the DOM is foreign; this is just a matter of education, IMO.
> 
>> 2) In this stackoverflow thread [4] the practice of linking to
>> resources is oddly done by using @itemref (instead of using references
>> with an @itemid, which is mainly like @resource). It also seems that
>> GoodRelations recommends this [5]. IIUC, the consequence, according to
>> the microdata algorithm, is basically a copying by value, resulting in
>> two different items (bnodes), instead of linking to the same resource.
>> This copying is not apparent of course; in the thread this is thought
>> of as linking data.
>> 
>> (And I don't blame them; the name @itemref certainly implies that it
>> is used for making item references, not element references for a
>> parser to jump to. From what I've gathered, it basically instructs a
>> parser that "this item is also described by the content blocks at
>> these IDs". Basically an @itemdescriptionref. Please correct me if
>> I'm missing a point here.)
> 
> Yes, I responded to Aaron Bradley on Manu's G+ thread (https://plus.google.com/u/1/102122664946994504971/posts/Zoq5EiNR9pw); he was noting @itemref as being important for this very reason, using an @itemref to reference an element with an @itemprop relating to a new item. I did point out to him that this ends up creating two items with the same information, and probably isn't really what he wants anyway, but such is the weight of examples that misuse the syntax.
> 
>> So whatever we're after here, it doesn't need to be the exact
>> equivalent of @itemref (in fact, given the above, that would be a
>> costly choice in terms of complexity). We need to define the core of
>> what is sought after.
>> 
>> AFAIK, we have so far received two instances where an @itemref feature
>> is said to be needed:
>> 
>> 1) Martin Hepp initially reported that @itemref is necessary in real
>> world usage. Unfortunately, we haven't gotten any real examples
>> supporting this claim. From what I have gathered, the case described
>> is readily solved by using a ProductModel. If it is not, it is yet
>> unclear whether adding link and meta elements would suffice or not.
>> How many properties are to be copied into each product? (And are there
>> no dedicated product pages with more details, given the apparent need
>> to discover each product in search engines?)
> 
> It may be that ProductModel handles this case semantically, but I can imagine a number of other cases where something similar is done, for example a sequence of photos that all relate to the same subject, where the photo is itself the primary resource:
> 
> <figure typeof="schema:ImageObject">
>  <img property="schema:contentUrl" src="img1"/>
>  <figcaption property="schema:name">Image 1</figcaption>
>  <link property="rdfa:ref" resource="_:imagecontents"/>
> </figure>
> 
> <figure typeof="schema:ImageObject">
>  <img property="schema:contentUrl" src="img2"/>
>  <figcaption property="schema:name">Image 2</figcaption>
>  <link property="rdfa:ref" resource="_:imagecontents"/>
> </figure>
> 
> <div typeof="rdfa:Prototype" resource="_:imagecontents">
>  <a property="schema:location" href="someplace">Some place</a>
>  <a property="schema:about" href="someone">Some one</a>
>  ...
> </div>
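
The prototype pattern in the markup above could be read as a post-processing step over the extracted triples. What follows is only a rough Python sketch under stated assumptions, not agreed spec text: it assumes that every property of an rdfa:Prototype node is copied onto each subject that points at it with rdfa:ref, and that the rdfa:ref triples and the prototype node itself are then dropped. The function name expand_prototypes and the tuple representation of triples are invented for illustration, and chained prototypes are not handled.

```python
# Hypothetical sketch: the terms mirror rdfa:ref / rdfa:Prototype from the
# example above; the copy-then-drop semantics is an assumption.
RDF_TYPE = "rdf:type"
RDFA_REF = "rdfa:ref"
RDFA_PROTOTYPE = "rdfa:Prototype"


def expand_prototypes(triples):
    """Copy each prototype's properties onto every subject that refers to it."""
    triples = set(triples)
    prototypes = {s for (s, p, o) in triples
                  if p == RDF_TYPE and o == RDFA_PROTOTYPE}
    result = set()
    for s, p, o in triples:
        if s in prototypes:
            continue  # assumption: the prototype node itself is not emitted
        if p == RDFA_REF and o in prototypes:
            # Replace the reference with copies of the prototype's properties,
            # leaving out the rdfa:Prototype type triple.
            for ps, pp, po in triples:
                if ps == o and not (pp == RDF_TYPE and po == RDFA_PROTOTYPE):
                    result.add((s, pp, po))
        else:
            result.add((s, p, o))
    return result


# Triples roughly corresponding to the two figures and the prototype block.
graph = {
    ("_:img1", RDF_TYPE, "schema:ImageObject"),
    ("_:img1", "schema:contentUrl", "img1"),
    ("_:img1", RDFA_REF, "_:imagecontents"),
    ("_:img2", RDF_TYPE, "schema:ImageObject"),
    ("_:img2", "schema:contentUrl", "img2"),
    ("_:img2", RDFA_REF, "_:imagecontents"),
    ("_:imagecontents", RDF_TYPE, RDFA_PROTOTYPE),
    ("_:imagecontents", "schema:location", "someplace"),
    ("_:imagecontents", "schema:about", "someone"),
}

expanded = expand_prototypes(graph)
for triple in sorted(expanded):
    print(triple)
```

Under these assumptions each image ends up with its own copy of schema:location and schema:about, which is a copy by value rather than a link to a shared resource.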
> 
> In any case, I think the community has made the case that an @itemref-like feature is necessary. There are certainly cases where existing use of @itemref can be replaced with distributing @resource across a page, but it may be that this usage pattern is foreign enough for people coming from an SEO perspective, that it is still important. Also, past experience indicates that getting people to solve their problems by re-modeling to work around it (e.g., Product/ProductModel) is just not realistic; this can become an argument for the new users of semantic markup that the solutions put forward by the Semantic Web crowd are just not sympathetic to the needs of web developers.
> 
>> What other cases like this are there? Do sub-events need to copy
>> certain properties from their parent events? All properties or just a
>> handful? (Certainly the "subEvent" relation must not be copied, so
>> using @itemref to copy the parent data seems off.) This is the source
>> of our prototype idea, which I've been entertaining for a while. It is
>> alluring, and quite easy to implement. But that doesn't equate with
>> utility. I still don't know if the needs presented really demand it. I
>> think we need more experience and input here.
>> 
>> 2) The other case is from Jason Ronallo. This example [6] uses
>> @itemref to reuse a name, an image and a set of keywords between an
>> ItemPage, LandmarksOrHistoricalBuildings and CreativeWork (please see
>> the source to understand the details). In a way it is similar to the
>> Product case, with the potential difference (depending on how
>> important the ProductModel is as a concept in that example) that the
>> copied data here is also used within an itemscope to describe an
>> entity. And especially that this is mostly about picking out a few
>> pieces to avoid repetition in hidden @content. I do sympathize with
>> the desire to avoid duplication, but in general I still think the
>> repetition in meta and link elements would be fairly negligible. (And
>> that such direct use makes the effect much clearer, and reduces
>> complexity.)
> 
> This could also be seen in bibliographic use, where you have a Work, Product and Manifestation that share many properties, but are certainly semantically distinct entities.
> 
>> I've put a version of that example as RDFa using our experimental
>> prototypes at [7]. It certainly works, but I wonder if it's necessary.
>> 
>> Perhaps it would be enough if there was a mechanism to reuse a literal
>> value from another place in the document? That would remove the need
>> of sometimes copying the same textual value into several descriptions
>> within a page (often hidden in @content of meta elements). This could
>> be done by adding a new @contentref attribute:
>> 
>>   <div resource="#page" typeof="ItemPage">
>>     <h1 property="name" id="page_name">A Very Long Name Which Would
>> Be Tedious To Repeat</h1>
>>     <div property="about" resource="#creativework" typeof="CreativeWork">
>>       <meta property="name" contentref="page_name"/>
>>     </div>
>>   </div>
>> 
>> This would only copy the literal value. The @property (and any
>> @datatype) will be on the "start" element which uses the @contentref.
>> So in a way it would be like @datetime or @value in HTML5, just
>> indirected via an @id lookup. Adding just this would still require
>> repetition of links and meta elements though (e.g. for multiple
>> keywords). It would just remove the need for repeating literal
>> content. The question is still open whether that would suffice. I'm
>> suggesting this mostly to promote a balance of requirements.
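
To make the @contentref idea above concrete, here is a minimal Python sketch of the @id lookup, under assumptions: @contentref is only a proposal in this thread, the function name resolve_contentrefs is invented, the markup is treated as well-formed XML, the literal is taken to be the whitespace-trimmed text content of the referenced element, and dangling references are silently ignored. Subject resolution and @datatype handling are left out.

```python
import xml.etree.ElementTree as ET

# A shortened version of the example above, kept well-formed so that the
# standard library XML parser can read it.
SAMPLE = """
<div resource="#page" typeof="ItemPage">
  <h1 property="name" id="page_name">A Very Long Name</h1>
  <div property="about" resource="#creativework" typeof="CreativeWork">
    <meta property="name" contentref="page_name"/>
  </div>
</div>
"""


def resolve_contentrefs(markup):
    """Return (property, literal) pairs for elements carrying @contentref."""
    root = ET.fromstring(markup)
    # Index every element that has an @id so references can be looked up.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    pairs = []
    for el in root.iter():
        ref = el.get("contentref")
        if ref is None or el.get("property") is None:
            continue
        target = by_id.get(ref)
        if target is None:
            continue  # assumption: a dangling reference yields nothing
        # Only the literal value is copied; @property stays on the
        # referring element.
        literal = "".join(target.itertext()).strip()
        pairs.append((el.get("property"), literal))
    return pairs


resolved = resolve_contentrefs(SAMPLE)
print(resolved)
```

Only the literal is reused here; any links or additional keywords would still have to be repeated with link and meta elements.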
>> 
>> The remaining, *very important* question is whether search engines
>> penalize usage of meta and link elements. This has come up time and
>> again as a point of uncertainty for authors. I hope Schema.org
>> representatives can answer this, since it is a generally useful
>> pattern at times. I would expect it to be perfectly fine to add some
>> precision, as long as neither content nor links deviate in subject
>> matter. (There are many other ways to add hidden content for
>> subversive SEO purposes.)
> 
> This question comes up time and time again. Only time will tell what emerges, but IMO, uses of link and meta should prove to be okay; the warning against invisible markup is typically for large blocks of text which are moved off page or hidden in an obvious attempt to fool the ranking algorithms. We will probably see attempts to semantically fool updated algorithms too, if the use of schema.org really takes off. Presumably the solution to that would be to do some semantic textual analysis to see if it corresponds to the asserted markup.
> 
> Gregg
> 
>> Best regards,
>> Niklas
>> 
>> [1]: http://www.w3.org/2010/02/rdfa/track/issues/144
>> [2]: http://html5doctor.com/microdata/
>> [3]: http://net.tutsplus.com/tutorials/html-css-techniques/html5-microdata-welcome-to-the-machine/
>> [4]: http://stackoverflow.com/questions/8726413/schema-org-itemref-linking-multiple-sportevents-to-a-single-place
>> [5]: http://wiki.goodrelations-vocabulary.org/Cookbook/Video_content
>> [6]: http://d.lib.ncsu.edu/collections/catalog/mc00096-001-ff0155-000-001_0001
>> [7]: https://gist.github.com/4243921
>> 
> 
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Monday, 10 December 2012 11:21:57 UTC