Re: My review of RDFa Core 1.1 (2011-12-15 version) from Shane McCarron on 2012-01-26 (public-rdfa-wg@w3.org from January 2012)

From: Shane McCarron <shane@aptest.com>
Date: Thu, 26 Jan 2012 02:08:43 -0600
To: Niklas Lindström <lindstream@gmail.com>
CC: public-rdfa-wg@w3.org
Message-ID: <4F210A0B.7030908@aptest.com>
And now, more detailed comments as I attempt to integrate your 
additional changes:

On 1/25/2012 8:08 PM, Niklas Lindström wrote:
> Relevant parts kept and commented inline below. There are some points 
> which need further consideration, and it would be good to have input 
> from more WG members on those.
>>> "2.2 Examples"
>>> --------------
>>>
>>> * The first example uses the terms "author", "prev" and "next". But
>>> these aren't mapped to property IRIs anymore, right? If they don't I
>>> believe it's a poor example. Although I'm a bit confused by the
>>> XHTML+RDFa 1.1 spec which still links to
>>> <http://www.w3.org/2011/rdfa-context/xhtml-rdfa-1.1>, defining
>>> these... In any case, it is probably confusing to have this in RDFa
>>> Core 1.1 if it relies on terms defined for XHTML only...
>>
>> These examples are all properly in XHTML+RDFa and I am loath to change them
>> at this time.  The terms are defined for XHTML+RDFa and that is the only
>> language we have any control over.  In particular when talking about terms
>> we want to have some concrete examples, and the base only defines 3.
> Ok. Then the initial paragraph here should say "In XHTML 1.1", since
> these terms aren't available in HTML5 (nor XHTML5), right?

I have made it clear that XHTML+RDFa is used in the examples.  Thanks.

>
>>> * I find the example a bit awkward since it builds up an event by
>>> first implying that the current document is the event, before
>>> enclosing it as a bnode of type cal:Vevent..
>>
>> I removed this example in favor of using something about books to show
>> typeof as per a suggestion from Manu
> Good. But there are still two examples above that using cal:summary
> and cal:dtstart properties which describe the current document
> (compare to the full event described in section 8). Perhaps using
> something like:
>
>      <body>
>        <h1 property="dc:title">My home-page</h1>
>        <p>Last modified:<span property="cal:dtstart"
>                content="2015-09-16T16:00:00-05:00"
>                datatype="xsd:dateTime">today</span>.</p>
>      </body>
>
> is better?

I have put this in.

>>> "3.4 Plain literals"
>>> --------------------
>>>
>>> * As I already brought up, the description of literals in section "3.4
>>> Plain literals" isn't entirely correct. It might be good to add an
>>> example of a literal with language related to the ongoing example
>>> here, such as:
>>>
>>>      <http://dbpedia.org/resource/German_Empire>
>>>          rdfs:label "German Empire"@en;
>>>          rdfs:label "Deutsches Kaiserreich"@de .
>>
>> I wasn't smart enough to do this in the time I had.  If you want to provide
>> specific text I am happy to stick it in.  It is an editorial change and we
>> can do it at any time.
> How about this for "3.4 Plain literals": [[[
>
> Although IRI resources are always used for subjects and predicates,
> the object part of a triple can be either an IRI or a literal. In the
> example triples, Einstein's name is represented by a plain literal,
> specifically a basic string with no type or language information:
>
>      <http://dbpedia.org/resource/Albert_Einstein>
>        <http://xmlns.com/foaf/0.1/name>  "Albert Einstein" .
>
> A plain literal can also be given a language tag, to capture plain
> text in a natural language. For example, Einstein's birthplace has
> different names in English and German:
>
>       <http://dbpedia.org/resource/German_Empire>
>           rdfs:label "German Empire"@en;
>           rdfs:label "Deutsches Kaiserreich"@de .
>
> ]]]

Thanks!

>
>>> "3.6 Turtle"
>>> ------------
>>>
>>> * Perhaps the first two examples should include the full data being
>>> discussed for the sake of completeness?
>>
>> I couldn't decide what was missing.
> The full data (including the suggested language literals above) seems to be: [[[
>
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix dbp: <http://dbpedia.org/property/> .
> @prefix foaf: <http://xmlns.com/foaf/0.1/> .
>
> <http://dbpedia.org/resource/Albert_Einstein>
>    foaf:name "Albert Einstein";
>    dbp:birthPlace <http://dbpedia.org/resource/German_Empire>;
>    dbp:dateOfBirth "1879-03-14"^^<http://www.w3.org/2001/XMLSchema#date>;
>    foaf:depiction <http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg> .
>
> <http://dbpedia.org/resource/German_Empire>
>    rdfs:label "German Empire"@en;
>    rdfs:label "Deutsches Kaiserreich"@de .
>
> ]]]
>
> The second example could probably do with just the data needed to
> illustrate abbreviation of the subject and datatype, i.e.: [[[
>
> @prefix dbp: <http://dbpedia.org/property/> .
> @prefix dbr: <http://dbpedia.org/resource/> .
> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
>
> dbr:Albert_Einstein dbp:dateOfBirth "1879-03-14"^^xsd:date .
>
> ]]]

I've done something.  I wanted to leave the prefixed references in too 
though.

>
>
>>> "6. CURIE Syntax Definition"
>>> ----------------------------
>>>
>>> * The following is stated:
>>> [[[
>>> A CURIE is comprised of two components, a prefix and a reference. The
>>> prefix is separated from the reference by a colon (:). In general use
>>> it is possible to omit the prefix, and so create a CURIE that makes
>>> use of the 'default prefix' mapping; in RDFa the 'default prefix'
>>> mapping is http://www.w3.org/1999/xhtml/vocab#. It's also possible to
>>> omit both the prefix and the colon, and so create a CURIE that
>>> contains just a reference which makes use of the 'no prefix' mapping.
>>> This specification does not define a 'no prefix' mapping. RDFa Host
>>> Languages must not define a 'no prefix' mapping.
>>> ]]]
>>>
>>> I find this confusing on three accounts:
>>>
>>> - Is the default prefix mapping set? Shouldn't it be possible to set
>>> it to whatever the default *vocabulary* is? So that someone can use
>>> e.g.:
>>>
>>>      <a vocab="http://example.org/vocab#" rel=":describedby" href="">
>>>
>>> To mean:
>>>
>>>      <>    <http://example.org/vocab#describedby>    <>    .
>>
>> No.  In CURIEs :foo ALWAYS references the XHTML vocabulary.  We provide no
>> way  to override this.
> I see. So this means that there is no way to create such a triple
> using @vocab (due to 'describedby' being a predefined term)? I.e. for
> that, one has to use CURIEs with non-empty prefixes or full IRIs?

That's correct.
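
To make the rule concrete, here is a minimal sketch of it, not the 
processing rules from the spec.  The names expand_curie and 
prefix_mappings are made up for illustration, and the sketch ignores 
the '_:' bnode prefix and safe CURIEs.  The vocab argument is there 
only to show that @vocab plays no part in CURIE expansion.

```python
XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#"


def expand_curie(curie, prefix_mappings, vocab=None):
    """Expand a CURIE to an IRI, or return None if it does not expand.

    `vocab` is accepted but deliberately unused: the default vocabulary
    applies to terms, never to CURIEs with an empty prefix.
    """
    if ":" not in curie:
        # No colon: this would need a 'no prefix' mapping, which RDFa lacks.
        return None
    prefix, reference = curie.split(":", 1)
    if prefix == "":
        # ':foo' always refers to the XHTML vocabulary.
        return XHTML_VOCAB + reference
    base = prefix_mappings.get(prefix)
    return None if base is None else base + reference


mappings = {"ex": "http://example.org/vocab#"}
print(expand_curie(":describedby", mappings, vocab="http://example.org/vocab#"))
print(expand_curie("ex:describedby", mappings))
```

So to get the example.org predicate, the author writes ex:describedby 
(or the full IRI) rather than :describedby.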

>
>>> - This definition of CURIEs states that terms are also CURIEs, does it not?
>>
>> No.  TERMS are things that are looked for BEFORE CURIEs are matched.
> Ok. The interplay of concepts is quite intricate here. It seems to me
> to be an overlap between terms and CURIEs with no prefix and leading
> ":"? I had the idea that terms were distinct from CURIEs in that the
> latter always started with a (possibly empty) prefix and ":". This
> since some terms are predefined and some resolved against the local
> default vocab (as defined in section "7.4.3 General Use of Terms in
> Attributes"). But not all references are terms, that seems clear.
>
> So it seems that certain expressions like "item@a" or "value#b" would
> fail to match the rules for terms, but are valid CURIEs. How would
> they then be resolved? From the next piece I conclude: not at all.

Yes, they would not be resolved at all.

>>> - Although I interpret "RDFa Host Languages must not define a 'no
>>> prefix' mapping" to mean TODO
>>
>> Not sure where you were going with this, but...
> (Ugh; apparently I left this incomplete. I'm so sorry.) Well, at the
> time I thought that host languages could, in their default context,
> define such a mapping with the default vocabulary mechanism. But that
> is only for terms, and not for this 'no prefix' mapping; right? (And
> as said above, I conclude that those two expressions ("item@a" and
> "value#b") would not resolve.)
>
> Am I reading this right?

Yes.  This part of CURIEs is not something that we use in RDFa, but the 
feature is in there for some other potential uses of CURIEs out there.
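
A rough sketch of how such values sort out, assuming an ASCII-only 
approximation of the term production (NCNameStartChar followed by name 
characters other than ':', plus '/').  The classify function and its 
labels are hypothetical, and the real production allows many more 
Unicode characters than this pattern does.

```python
import re

# ASCII-only approximation of: term ::= NCNameStartChar termChar*
# where termChar is any NameChar except ':' or the character '/'.
TERM_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_./-]*")


def classify(value):
    """Say how an attribute value would be treated in this simplified model."""
    if TERM_RE.fullmatch(value):
        return "term"
    if ":" in value:
        # Has a (possibly empty) prefix: a candidate CURIE or absolute IRI.
        return "curie-or-iri"
    # Neither a term nor something with a colon: there is no 'no prefix'
    # mapping to fall back on, so the value yields nothing.
    return "ignored"


for value in ["describedby", ":describedby", "dc:title", "item@a", "value#b"]:
    print(value, "->", classify(value))
```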

>
>>> * I would like to see a note here about CURIEs effectively working
>>> like protocol shorthands, with appropriate warnings about how they
>>> *may* overshadow existing or future protocols (especially profiles
>>> with many prefixes could cause this in a non-obvious way). This is
>>> what we discussed when we resolved ISSUE-90 [1]. Note though that the
>>> resolution of ISSUE-125 [2] might remove the need for this.
>>
>> Added some text
> Great, this was definitely needed. However, I'm not satisfied with it yet.
>
>   - In "it is possible though unlikely, that schemes will be introduced
> in the future that will conflict with prefix mappings defined in a
> document", can we really say "unlikely"? The creation of schemes is
> entirely independent of the use of prefixes in RDF contexts, so we
> really don't know. We've already seen e.g. "http", "geo", "tag" and
> "urn", all in <http://prefix.cc/>. Perhaps if we say: "it is possible
> though unlikely, that popular schemes will be introduced in the
> foreseeable future that will conflict with (popular) prefix mappings".
>
> All of this requires monitoring and interception by people aware of
> both contexts. This note is where we raise awareness of this need for
> coordination. Of course problems won't appear overnight, maybe not
> for many years; hopefully never. (And I do hope that the creation of
> new IRI schemes will continue to decrease in popularity). I gather
> that our position is to expect people to understand this and inform
> each other early on. (I'm just not sure.)

Yes of course.  As the author of this text, I was trying to explain the 
risk whilst stressing that we feel it is not significant.  If we as a 
working group feel this risk is significant, should we not be working in 
some way to reduce it?  Since we are not, we need to make the case that 
we understand the risk, but that we do not think the risk is great.

>
>   - The example uses an @href, but those cannot contain CURIEs, so they
> are safe. The attributes of concern are @about and @resource. (Of
> course if the use of CURIEs would catch on and become available to
> "actionable" link attributes the situation is worsened.)

I don't see CURIEs being used in actionable links.  In fact, I am sure 
that we said somewhere that they should never be for this very reason.  
I have changed it to resource.

>
>   - "In neither case would this RDFa overshadowing of the underlying
> scheme alter the way other consumers of the IRI treat that IRI." I
> don't follow. Do you mean in the sense of consumers *not* using RDFa,
> only the lexical value? That seems irrelevant to me for our purposes.

I wanted to point out that even if an RDFa processor were to treat 
something incorrectly in the future, any library that was NOT an RDFa 
processor would no doubt be updated to treat it correctly.   Moreover, 
the triple would NOT change.  Which I think is important.  If I as an 
author say that the object of a triple is foo:bar, it better damn well 
stay foo:bar (or what that expands to) for all of time.

>
>   - "It could, however, mean that the document author's intended use of
> the CURIE is misinterpreted by another consumer as an IRI." That
> should probably be: "It does mean that the document author's intended
> use of an IRI is misinterpreted, since any RDFa consumer would expand
> that as a CURIE and get a different IRI as a result."

This is backward in my opinion.  The author does not intend that it be 
used as an IRI.  The author intended it to be a CURIE.  But it became a 
legitimate IRI later.  Some consumer might interpret it thus.  But an 
RDFa processor will continue to interpret it correctly.  Which is what I 
want.  Moreover, in general we do not put CURIEs where an IRI would be 
interpreted by any other processor anyway.  And since CURIEs are 
expanded before they are handed to any other processor, there is no risk 
that this CURIE, which could ALSO be interpreted as an IRI, would ever 
see the light of day - as it were.
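
A small sketch of that behaviour, under simplifying assumptions: the 
resolve function is hypothetical, relative IRIs and safe CURIEs are 
ignored, and the 'geo' prefix mapping is just an example of a prefix 
that happens to share its name with an IRI scheme.

```python
def resolve(value, prefix_mappings):
    """Return (kind, iri) for a CURIE-or-IRI attribute value.

    A value whose prefix is in scope is always expanded as a CURIE,
    even when the same string would also parse as an IRI with a scheme.
    """
    prefix, sep, reference = value.partition(":")
    if sep and prefix in prefix_mappings:
        return "curie", prefix_mappings[prefix] + reference
    return "iri", value


# Assumed example mapping: the author declared 'geo' as a prefix.
mappings = {"geo": "http://www.w3.org/2003/01/geo/wgs84_pos#"}

print(resolve("geo:lat", mappings))  # expanded as a CURIE
print(resolve("geo:lat", {}))        # no prefix in scope: left as an IRI
```

The expansion depends only on the prefixes the author declared, so the 
triple stays the same no matter what schemes are registered later.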

> Note that in attributes where a prefix overshadows a scheme the
> resulting IRI in the RDF data is irrevocably different from its
> "CURIE-looking-like an-IRI" origin. And a consumer of the resulting
> RDF may never detect this. It's even more untraceable if the triples
> are transmitted further.

And that's fine.  The document author intended that the 
CURIE-looking-like-an-IRI be a CURIE, and intended that it expand to 
something that is not that IRI.  Perfect.

>   - "The working group considers this risk to be minimal at worst." I'd
> strike "at worst". (The notion "CURIE injection" comes to mind in a
> scenario where some social networking giant starts to publish snippets
> of RDFa for unassuming web administrators to use, poised against
> another proprietary protocol of some major digital artifact vendor,
> where the scheme and prefix happen to be the same. Of course I'm
> really exaggerating here. But perhaps you see my point.)

I removed it.  Thanks.

> (And while I fear there's little support for disallowing
> "prefix://"-like forms from being CURIEs, *if* that would be accepted
> then we should of course reformulate this note to highlight the
> lessened risk, and explain which kinds of schemes (forms of IRIs) are
> still at risk and require this care.)

Surely.

>>> "7.4.2 General Use of CURIEs in Attributes"
>>> -------------------------------------------
>>>
>>> * The note states: "An empty attribute value (e.g., typeof='') is
>>> still a CURIE, and is processed as such.". Is that really true? Isn't
>>> it rather so that certain attributes have meaning (effect on the
>>> processing) even when empty? The notion of an empty CURIE strikes me
>>> as strange.
>>
>> I was not sure how to change this.  While it is odd, it is important for
>> many steps in the sequence that empty attributes are not ignored.
> Many RDFa attributes express meaning even when empty, so their
> presence is in itself important information.

Quite.

>>> * The last sentence "As a special case, _: is also a valid reference
>>> for one specific bnode." is the only explanation of what "_:" means. I
>>> think it should be elaborated a little upon, making it clear how it
>>> works and why. (Also I was under the impression that it should
>>> generate a unique bnode each time it is used (and not represent the
>>> same bnode across the document), but that does not seem to be the
>>> case?)
>>
>> I have no idea at all what to do here.
> Nor do I. I've never used it, I think usage of empty @typeof fulfills
> my potential needs for what I thought it meant, and I don't really get
> why a kind of bnode "singleton" would be useful at all. Can anybody
> explain what it means and is used for?

Ivan explained it separately.


>
>
>>> "7.5 Sequence"
>>> --------------
>>>
>>> * Step 1 (and 3). Shouldn't the local list of IRI mappings actually be
>>> set to *a copy of* the list of IRI mappings from the evaluation
>>> context? In step 3, this local list is mutated by adding to it, so we
>>> must ensure that someone implementing it like this doesn't affect the
>>> list for following sibling elements. Either that or express "adding to
>>> the local list" differently, in functional terms.
>>
>> This is not really an issue.  'local' is local to each iteration of the
>> depth-first processing loop.  Nothing is passed 'by reference' in this
>> algorithm.
> Ok.. I see what you mean. Still, since it reads "the local list of IRI
> mappings is set to the list of IRI mappings from the evaluation
> context", and then "and these are added to the local list of IRI
> mappings", might one not get that impression? It concerned me since in
> step 13, the new evaluation context is either "a copy of the current
> context that was passed in to this level", or it is constructed again.
> Well, I may be splitting technical hairs here, and I suppose
> implementers won't actually be misled into mutating a shared list for
> all levels. And if they still do, basic testing will quickly show them
> the error in that. :)

Good point.  And I am loath to mess with too much.
>>> * Step 10. It would be good to explain why "Also, current object
>>> resource should be set to a newly created bnode".
>>
>> Fixed
> Hm...  It now says: "Also, current object resource should be set to a
> newly created bnode (so that the incomplete triples have a subject to
> connect to if they are ultimately turned into triples)".
>
> But I thought that the subject for the incomplete triples is to be the
> *parent subject* resource (and the very reason for it to exist). Note
> that this newly created bnode in step 13 is used for the next parent
> *object*. All of this to support:
>
>      <div about="#parent_subject" rel="dc:hasPart">
>          <p property="rdfs:label">anonymous object</p>
>      </div>
>
> To generate:
>
>      <#parent_subject>  dc:hasPart [ rdfs:label "anonymous object" ] .
>
> In the more common(?) cases of hanging rels, where nested markup
> defines a new resource with @about, @typeof or a resource attribute,
> this newly created bnode will be replaced and forgotten by that/those.
>
> It's late though so I need someone to verify this! If I'm not totally
> lost, the explanation for the newly created bnode should probably say
> something like: "so that the incomplete triples have an object to
> connect to if the first triple generating item encountered in a
> subsequent level is a predicate". (Well, some legible wording of
> that..)

Well... the 'current object resource' is passed to child nodes as the 
parent object (in processing step 13), then in processing step 5, new 
subject is set to the parent object.  That is what I was alluding to.  
The bnode thus created becomes the current object resource, which then 
becomes the parent object for any child nodes, and is interpreted as the 
new subject by those nodes when they create triples and need something 
to attach to.  Right?
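
Here is that flow as a rough sketch, for the one case of an element 
with @rel and no resource attribute whose children carry only 
@property.  It is not the full sequence from section 7.5; the function 
name, the tuple representation of triples and the bnode labels are all 
made up for illustration.

```python
import itertools


def process_hanging_rel(parent_subject, rel, child_properties):
    """Emit triples for an element with @rel and no resource, plus children.

    child_properties is a list of (predicate, literal) pairs standing in
    for child elements that carry only @property.
    """
    counter = itertools.count()
    triples = []

    # Step 10 (sketch): there is no object yet, so remember the predicate
    # as an incomplete triple and create a bnode as current object resource.
    incomplete = [rel]
    current_object_resource = f"_:b{next(counter)}"

    # Step 13 (sketch): the child context receives it as the parent object.
    parent_object = current_object_resource

    for predicate, literal in child_properties:
        # Step 5 (sketch): the child has no @about, @typeof or resource
        # attribute, so its new subject is taken from the parent object.
        new_subject = parent_object
        # Complete any pending incomplete triples against the new subject.
        # They are emitted once here; repeats would be the same triple.
        while incomplete:
            triples.append((parent_subject, incomplete.pop(0), new_subject))
        triples.append((new_subject, predicate, literal))
    return triples


for triple in process_hanging_rel("<#parent_subject>", "dc:hasPart",
                                  [("rdfs:label", '"anonymous object"')]):
    print(*triple, ".")
```

In this sketch the bnode ends up as the object of the dc:hasPart triple 
and as the subject of the rdfs:label triple, which matches the Turtle 
in your example.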

>
>
>
>>> * Section "10. RDFa Vocabulary Expansion" should include the relevant
>>> triple from the cc vocabulary used for expansion?
>>
>> I think it does.    It shows the dc: reference that the cc: item refers to.
> True, but I meant the triple causing this to expand. That is, in the
> RDFa (nice!) from <http://creativecommons.org/ns#>, this triple:
>
>     cc:license rdfs:subPropertyOf dct:license .
>
> is specifically what makes the expansion infer dc:license from cc:license.

And you think that should be included in the expansion by a conforming 
processor?  Not just the terminal triple that is derived?

>
>
>>> "B.4 Changes"
>>> -------------
>>>
>>> Could this perhaps be merged with its only subsection "B.4.1 Major
>>> differences with RDFa Syntax 1.0"?
>>
>> Hmm - maybe.  Not right now though.   Thinking about it.
> Ok. Perhaps it'd be more valuable to have more than one subsection,
> e.g. "Minor differences", "Complying to RDFa 1.0" or similar (though
> that latter part in isolation is odd since it advises usage of
> deprecated features like @xmlns).

Yeah - still thinking about it...

>
>
> That should be all.
>
> Thanks for addressing my entire bulk of remarks! Impressive editorial
> work (which I've gathered is your modus operandi).
>
> Best regards,
> Niklas

-- 
Shane McCarron
Managing Director, Applied Testing and Technology, Inc.
+1 763 786 8160 x120
Received on Thursday, 26 January 2012 08:09:28 UTC