Re: ISSUE-89

Hi Ivan/Johannes,

>  Eagle eye: I think you are right...

Yes, Johannes is exactly right. Damn...

Unfortunately, when I was developing the rules, the test case I was
running in my parser was this:

  <div about="http://dbpedia.org/resource/Baruch_Spinoza" rel="dbp:influenced">
    <span property="foaf:name">Albert Einstein</span>
    <span property="dbp:dateOfBirth" datatype="xsd:date">1879-03-14</span>
    <div rel="dbp:citizenship">
      <div>
        <span about="http://dbpedia.org/resource/Germany"></span>
        <span about="http://dbpedia.org/resource/United_States"></span>
      </div>
    </div>
  </div>

I was correctly getting:

  <http://dbpedia.org/resource/Baruch_Spinoza> dbp:influenced _:a .

to appear in the triples, but of course that was triggered by the
@rel="dbp:citizenship" element, and not the @property elements. If I
comment out that block I get exactly the problem Johannes describes.

:(


>  There seems to be a small bug in the processing steps. Namely, I guess
>  in the very last step of point 4, the [skip element] should be set to
>  true _unless_ a @property element is present. If that is done, then the
>  incomplete triples will be completed in div/div/span[1] and
>  div/div/span[2]. Actually, the triples will be completed and added to
>  the graph twice which is not a problem, because RDF is defined as a
>  _set_ of triples.
>
>  Mark, Ben, Manu, is that correct? Or do I miss something?

The problem is that this is exactly what I was trying to avoid. There
is no need to repeatedly generate the parent triple if no new subject
is generated.

The core of the problem is that you can't tell whether any of the
recursed elements generate that all-important triple for you.


>  I believe this is really an editorial issue, though it is on the
>  borderline of technical. The intention is that, in 'human' terms, if an
>  element does not include any of the RDFa attributes (well, the @content
>  and @datatype are put aside here) then everything should simply 'flow'
>  through. AFAIK, all implementations do that. It was raised in the
>  discussion several times that one way of documenting this in the
>  processing steps is to define a separate clause in the processing steps
>  for this alternative and get it over with, so to say; but then it was
>  decided that this alternative would be incorporated into the main flow.
>  And that is where it went wrong, at least I believe...

We've discussed this before, and it's not quite as simple as that. We
can't simply ignore any element that doesn't contain an RDFa
attribute, since we still need to process namespace and language
attributes, placing that information into the evaluation context that
is passed on.

But if we complete any incomplete triples on every child that is
recursed into we get lots of duplicates.
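
To make the duplication concern concrete, here is a small Python
sketch. It is not taken from any parser: the `emitted` list and the
three child labels are assumptions made up for illustration. It shows
the same parent triple being generated once per recursed child, while
the graph, being a set, ends up holding it only once.

```python
# Toy illustration only: the names and the three "children" are hypothetical.
parent_triple = (
    "http://dbpedia.org/resource/Baruch_Spinoza",
    "dbp:influenced",
    "_:a",
)

emitted = []  # every triple the processor generates, in order
for child in ("span[1]", "span[2]", "div"):
    # Completing the incomplete triple on every recursed child
    # regenerates the same parent triple each time.
    emitted.append(parent_triple)

graph = set(emitted)  # an RDF graph is a set, so the duplicates collapse

print(len(emitted))  # 3
print(len(graph))    # 1
```

So the cost is repeated work in the processor, not a different graph.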


>  Johannes Koch wrote:
>
> [snip]
>
>  > 6. "/:div":
>  > [skip element] is 'false', 'true' returned from processing
>  > "/:html/:body/:div/:div", but no incomplete triples in [evaluation context]
>  > return 'true' (step 12)

This is the crucial step; we need to know that we should generate a
triple that uses the bnode we generated in step 8 for the subject.
Obviously we could do that unconditionally, but then we're back to
where we were before, which is that we'll get triples generated even
if the triple has no 'meaning'.


>  > The incomplete triple
>  >
>  >   <http://dbpedia.org/resource/Baruch_Spinoza> dbp:influenced ? .
>  >
>  > is never completed.

Right.

I'm going to try to solve this, but I have a feeling that I'm not
going to be able to crack it in such a way that it's merely an
editorial change. Therefore, as far as I can see, the change that has
the least impact (the lesser of the various evils) is to not set the
skip flag when @property is present. We'll end up with duplicates of
certain triples, but as Ivan says, it's not so terrible.
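
As a rough illustration of the difference, here is a toy model in
Python. It is a sketch under heavy assumptions, not the processing
rules themselves: `Element`, `process`, `parse` and the
`property_sets_skip` switch are all hypothetical names, and the model
only knows about @about, @rel and @property (no @datatype, @rev,
@resource, namespaces or language). The input is the test case above
with the @rel="dbp:citizenship" block removed.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional

SPINOZA = "http://dbpedia.org/resource/Baruch_Spinoza"


@dataclass
class Element:
    """Toy stand-in for a DOM element; only three attributes are modelled."""
    about: Optional[str] = None
    rel: Optional[str] = None
    prop: Optional[str] = None  # stands in for @property
    text: str = ""
    children: List["Element"] = field(default_factory=list)


def process(el, parent_subject, parent_object, incomplete, graph, bnodes,
            property_sets_skip):
    skip = False
    if el.about is not None:
        subject = el.about
    else:
        subject = parent_object  # no new subject: inherit from the parent
        if el.rel is None:
            # Old behaviour: always skip. Proposed: skip unless @property.
            skip = property_sets_skip or el.prop is None

    obj, local_incomplete = None, []
    if el.rel is not None:
        obj = f"_:b{next(bnodes)}"    # no explicit object, so mint a bnode
        local_incomplete = [el.rel]   # to be completed by a descendant

    if el.prop is not None:
        graph.add((subject, el.prop, el.text))

    if not skip:
        # Complete the triples the parent left hanging.
        for predicate in incomplete:
            graph.add((parent_subject, predicate, subject))

    for child in el.children:
        if skip:
            # Pass the parent's context through unchanged.
            process(child, parent_subject, parent_object, incomplete,
                    graph, bnodes, property_sets_skip)
        else:
            process(child, subject, obj or subject, local_incomplete,
                    graph, bnodes, property_sets_skip)


def parse(root, property_sets_skip):
    graph = set()  # a set, so repeated completions collapse
    process(root, None, None, [], graph, count(), property_sets_skip)
    return graph


def example():
    return Element(about=SPINOZA, rel="dbp:influenced", children=[
        Element(prop="foaf:name", text="Albert Einstein"),
        Element(prop="dbp:dateOfBirth", text="1879-03-14"),
    ])


influenced = (SPINOZA, "dbp:influenced", "_:b0")
print(influenced in parse(example(), property_sets_skip=True))   # False
print(influenced in parse(example(), property_sets_skip=False))  # True
```

In this model the second run completes the incomplete triple once per
@property span, but the set keeps a single copy, so the graph has
three triples in total.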

Regards,

Mark

-- 
  Mark Birbeck

  mark.birbeck@x-port.net | +44 (0) 20 7689 9232
  http://www.x-port.net | http://internet-apps.blogspot.com

  x-port.net Ltd. is registered in England and Wales, number 03730711
  The registered office is at:

    2nd Floor
    Titchfield House
    69-85 Tabernacle Street
    London
    EC2A 4RR

Received on Tuesday, 4 March 2008 15:28:20 UTC