Re: Agenda Topic / Issue: Clarify the meaning of "ignore" with respect to attributes that have no legal value from Mark Birbeck on 2009-09-10 (public-rdf-in-xhtml-tf@w3.org from September 2009)

From: Mark Birbeck <mark.birbeck@webbackplane.com>
Date: Fri, 11 Sep 2009 00:36:39 +0100
To: Shane McCarron <shane@aptest.com>
Cc: "public-rdf-in-xhtml-tf.w3.org" <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <640dd5060909101636n32599619s95df58705849e661@mail.gmail.com>

Hi Shane,

I'm not convinced there is a problem here.

We don't talk about "illegal" values in the way that you are implying
-- all we talk about is whether triples find their way into the
default graph, or not.

To illustrate: in section 5.4.4 we have this example:

  <link rel="foobar" href="http://example.org/page7.html" />

and we point out that this will not generate triples in the default graph.

But we don't say it won't generate any triples at all.

You'll no doubt recall that we added the whole notion of the default
graph as a way to allow parser writers some flexibility in supporting
new values, but in such a way that those new values would not
'pollute' the basic set of triples; we achieved this by saying that
anyone could create whatever triples they liked in whatever graphs
they liked, as long as there was a graph somewhere that contained the
specific set of triples that are mandated.

Now, the problem is that if we are going to allow values into our
attributes that we don't 'understand', so to speak, then we can't
write an algorithm that says 'if you don't understand the attribute
value then act as if the attribute doesn't exist'.

Instead, we have to write our algorithms so that (a) we act
consistently if the attribute exists, irrespective of its contents,
and (b) we act on lists of values, but allow for the fact that we may
sometimes have an empty list.
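
As a rough illustration of points (a) and (b), here is a minimal Python
sketch. It is not taken from the spec: the function name resolve_curies,
the dictionary-based attributes and the simplified CURIE handling (no
reserved words, no safe CURIEs) are all assumptions made for the
example. It only shows that "is the attribute present?" and "which
values did we understand?" can be answered separately.

```python
def resolve_curies(attr_value, prefixes):
    """Return the URIs for the CURIEs we can resolve; the list may be empty.

    Hypothetical helper: a token is kept only if it has a prefix that
    appears in the in-scope prefix mappings.
    """
    resolved = []
    for token in attr_value.split():
        prefix, sep, reference = token.partition(":")
        if sep and prefix in prefixes:
            resolved.append(prefixes[prefix] + reference)
    return resolved


prefixes = {"foaf": "http://xmlns.com/foaf/0.1/"}  # no mapping for 'ex'
attrs = {"rel": "ex:blah"}

# (a) depends only on whether the attribute exists
rel_present = "rel" in attrs
# (b) is a list of understood values, which may be of zero length
rel_values = resolve_curies(attrs.get("rel", ""), prefixes)
```

Here rel_present is true while rel_values is an empty list, which is
the "present but empty" situation discussed in this message.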

And in fact the spec does exactly this; you'll notice that steps 4 and
5 only talk about the _presence_ of @rel and @rev, not their values.
It's not until the triples are created at step 7 (or hanging triples
created at step 8), that the values within @rel and @rev are used.

You might think this is all about angels and pinheads, but it is
significant when it comes to bnodes. Take this example:

  <span about="#shane" xmlns:foaf="...">
    <span rel="ex:blah" xmlns:ex="...">
      <span property="foaf:name">Mark</span>
    </span>
  </span>

which generates:

  <#shane> ex:blah _:a .
  _:a foaf:name "Mark" .

Now, if there was no prefix mapping for 'ex', then the @rel value in
the middle could be treated as either @rel="", or as non-existent.

However, if it is treated as if the attribute is non-existent, then this:

  <span about="#shane" xmlns:foaf="...">
    <span rel="ex:blah">
      <span property="foaf:name">Mark</span>
    </span>
  </span>

is equivalent to this:

  <span about="#shane" xmlns:foaf="...">
    <span>
      <span property="foaf:name">Mark</span>
    </span>
  </span>

which generates this:

  <#shane> foaf:name "Mark" .

An unfortunate consequence!

However, if @rel is treated as 'existing but empty' (or as I said
before, being a list of values that just happens to be of zero
length), then the algorithm for creating a bnode would still be
invoked, regardless of the content.

This means that even though we lose the first triple because we don't
understand 'ex:blah', we still get the accurate reporting of the
second.

In other words, this:

  <span about="#shane" xmlns:foaf="...">
    <span rel="ex:blah">
      <span property="foaf:name">Mark</span>
    </span>
  </span>

is equivalent to this:

  <span about="#shane" xmlns:foaf="...">
    <span rel="">
      <span property="foaf:name">Mark</span>
    </span>
  </span>

which generates this:

  _:a foaf:name "Mark" .

(Which is how the spec is currently defined, and why I don't think
there is a problem.)
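
The two readings above can be compared with a small Python sketch. It
is a deliberately cut-down model, not the spec's processing sequence:
elements are plain dictionaries, the names process, run and
absent_if_unresolved are made up for the example, bnodes are labelled
_:b0, _:b1, ..., and a resolved @rel triple is emitted straight away
rather than being held as a hanging triple.

```python
import itertools

FOAF = "http://xmlns.com/foaf/0.1/"


def resolve(curie, prefixes):
    """Resolve a CURIE, or return None if its prefix is unknown."""
    prefix, sep, reference = curie.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + reference
    return None


def process(el, subject, prefixes, triples, fresh, absent_if_unresolved):
    prefixes = {**prefixes, **el.get("xmlns", {})}
    if "about" in el:
        subject = el["about"]
    if "property" in el:
        predicate = resolve(el["property"], prefixes)
        if predicate:
            triples.append((subject, predicate, el["text"]))
    child_subject = subject
    if "rel" in el:
        predicates = [p for p in (resolve(c, prefixes) for c in el["rel"].split()) if p]
        # "absent" reading: an @rel with no understood values is skipped.
        # "present but empty" reading: the bnode is created regardless.
        if predicates or not absent_if_unresolved:
            child_subject = next(fresh)
            triples.extend((subject, p, child_subject) for p in predicates)
    for child in el.get("children", []):
        process(child, child_subject, prefixes, triples, fresh, absent_if_unresolved)


# The markup from the example, with no prefix mapping for 'ex'.
doc = {"about": "#shane", "xmlns": {"foaf": FOAF}, "children": [
    {"rel": "ex:blah", "children": [
        {"property": "foaf:name", "text": "Mark"}]}]}


def run(absent_if_unresolved):
    triples = []
    fresh = (f"_:b{i}" for i in itertools.count())
    process(doc, None, {}, triples, fresh, absent_if_unresolved)
    return triples


treated_as_absent = run(True)   # name ends up attached to <#shane>
treated_as_empty = run(False)   # name stays attached to a bnode
```

Under the "absent" reading the name is attached to <#shane>; under the
"present but empty" reading it stays on a bnode, matching the two
outcomes shown above.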

The final piece of the jigsaw is to bring back in the earlier point,
i.e., that the spec allows us to create a triple from a value like
'foobar' provided that we don't put it into the default graph. In the
'ex:blah' case, my parser might place the 'unknown' triple into a
separate graph:

  Default graph:
    _:a foaf:name "Mark" .

  Graph A:
    <#shane> ex:blah _:a .

Now, as you can see, even though it's in a separate graph, I could
still run a SPARQL query over *both* graphs to yield the following
triples:

  <#shane> ex:blah _:a .
  _:a foaf:name "Mark" .

And since the parsing algorithm caused us to generate a bnode despite
not understanding 'ex:blah', the triples in the two separate graphs
are 'aligned'.
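
To make the 'alignment' concrete, here is a small Python sketch that
uses plain sets of tuples as stand-ins for the two graphs. It is not
SPARQL and not a real triple store; it simply assumes that the
processor used the same bnode, written _:a, in both graphs, and keeps
the predicates as unexpanded CURIE strings for readability.

```python
# Stand-ins for the two graphs produced by the hypothetical parser.
default_graph = {("_:a", "foaf:name", "Mark")}
graph_a = {("#shane", "ex:blah", "_:a")}

# Querying over both graphs amounts to looking at their union.
merged = default_graph | graph_a

# Follow <#shane> ex:blah ?x, then ?x foaf:name ?name.
names = [
    name
    for (s, p, o) in merged
    if s == "#shane" and p == "ex:blah"
    for (s2, p2, name) in merged
    if s2 == o and p2 == "foaf:name"
]
```

Because the bnode was created even though 'ex:blah' was not understood,
the join across the two graphs finds "Mark". Had the name been attached
to <#shane> instead, there would be nothing for the second pattern to
match.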

By the way, there is another reason I prefer the "present but empty"
approach, and that's because it enables a simple construct that
provides a convenient way to generate a bnode to attach 'stuff' to:

  <span typeof="">
    <span property="ab:cd">ef</span>
    <span property="ab:gh">ij</span>
  </span>

which generates this:

  _:a ab:cd "ef" .
  _:a ab:gh "ij" .
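
A minimal Python sketch of that behaviour, under the "present but
empty" reading, might look like the following. The dictionary layout,
the function names and the "<doc>" placeholder for the inherited
subject are assumptions for the example, and CURIEs are left as
unexpanded strings because the markup above declares no prefixes.

```python
import itertools


def bnodes():
    """Yield fresh bnode labels: _:b0, _:b1, ..."""
    return (f"_:b{i}" for i in itertools.count())


def process(element, parent_subject, fresh, triples):
    attrs = element.get("attrs", {})
    if "about" in attrs:
        subject = attrs["about"]
    elif "typeof" in attrs:
        # Presence alone is enough; the value may be the empty string.
        subject = next(fresh)
    else:
        subject = parent_subject
    # An empty @typeof yields an empty list, so no type triples are made.
    for type_curie in attrs.get("typeof", "").split():
        triples.append((subject, "rdf:type", type_curie))
    if "property" in attrs:
        triples.append((subject, attrs["property"], element["text"]))
    for child in element.get("children", []):
        process(child, subject, fresh, triples)
    return triples


children = [
    {"attrs": {"property": "ab:cd"}, "text": "ef"},
    {"attrs": {"property": "ab:gh"}, "text": "ij"},
]
with_typeof = process({"attrs": {"typeof": ""}, "children": children},
                      "<doc>", bnodes(), [])
without_typeof = process({"attrs": {}, "children": children},
                         "<doc>", bnodes(), [])
```

With typeof="" both properties hang off a fresh bnode; without it they
fall back to whatever subject was inherited.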

Unfortunately, the spec is not so clear here, and says in step 4:

  if @typeof is present, obtained according to the section on CURIE and
  URI Processing, then [new subject] is set to be a newly created [bnode].

I think this is wrong, and doesn't fit with the spirit of @rel and
@rev; it would be better off simply being:

  if @typeof is present then [new subject] is set to be a newly created
  [bnode].

Regards,

Mark

On Tue, Sep 8, 2009 at 9:28 PM, Shane McCarron <shane@aptest.com> wrote:
> One issue that has come up recently is that we use inconsistent language in
> the RDFa Syntax Recommendation when discussing illegal values in attributes
> (thanks Philip!).
> Basically, in the current Recommendation we talk about attributes being
> ignored when the value(s) are illegal.  I believe that when we say this (and
> we say it in a couple of different ways), we ALWAYS mean:
>
> "When an attribute has no legal values, a conforming RDFa Processor MUST act
> as if the attribute were not present at all.  The processor MUST NOT act as
> if the attribute were present, but with the empty string as its value."
>
> So, for example,
>
> <a rel="blah:blah" href="file.html">something</a>
>
> Would never generate a triple, because the prefix "blah" is not defined, so
> the system MUST act as if there was no @rel at all.
>
> <span property="blah:blah" datatype="blah:blah">some content</span>
>
> Would also generate no triples, since there would effectively be no
> @property AND no @datatype attributes.
>
> I don't think there is any disagreement on this point, but it is important
> and perhaps we should get a formal resolution on the books and a note in the
> errata document just so we eliminate this one area of potential confusion.
>
> Ben, please put this on the agenda for Thursday.
>
> --
> Shane P. McCarron                          Phone: +1 763 786-8160 x120
> Managing Director                            Fax: +1 763 786-8180
> ApTest Minnesota                            Inet: shane@aptest.com
>

-- 
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)
Received on Thursday, 10 September 2009 23:37:24 UTC