Re: keyrefs from Jeni Tennison on 2001-11-08 (xmlschema-dev@w3.org from November 2001)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Thu, 8 Nov 2001 10:28:58 +0000
To: Joan Pujol <joan.pujol@ima.udg.es>
CC: xmlschema-dev@w3.org
Message-ID: <5762695010.20011108102858@jenitennison.com>

Hi Joan,

> I suppose that it's in 3.11.5 Identity-constraint Definition
> Information Set Contributions, but I can't understand it.
> In fact, the unique pharagraf that I understand -a little ;-)- says:
> "..  [from 3.11.5]
> NOTE: The complexity of the above arises from the fact that keyref
> identity-constraints may be defined on domains distinct from the
> embedded domain of the identity-constraint they reference, or the
> domains may be the same but self-embedding at some depth. In either
> case the ·node table· for the referenced identity-constraint needs
> to propagate upwards, with conflict resolution.
> .."
>
> Some can give me a little light on this obscurity?

I'll give it a shot and hope someone corrects me if I'm wrong.

The schema processor goes through the instance document and for each
element it constructs an "identity constraint table". The identity
constraint table holds one entry per key/unique identity constraint
that has been defined on the element declaration for the element.

Within the identity constraint table, each identity constraint has an
associated "node table". The node table is basically a dictionary or a
hash table which links a particular element with the
elements/attributes that are used to identify that element.

Now, the entries in the identity constraint tables bubble up the
document, so if an element has an entry for a particular identity
constraint in its identity constraint table, then its parent element
has an entry in its identity constraint table as well. If there are
clashes within a node table for a particular identity constraint (i.e.
two elements have the same key value, but the document is still valid
because the two elements occur in different contexts) then the entries
that clash are both removed from the table.

So there's a tree of elements in the instance document, each with its
own set of node tables, and the node tables for an element higher up
the document are formed by merging those of its children.
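
To make the merging step concrete, here is a rough sketch in Python.
It models a node table as a plain dict from a tuple of field values to
the node those values identify. The function name and the dict layout
are my own illustration, not anything from the Recommendation, and the
real rules in 3.11.5 have more cases than this.

```python
def merge_node_tables(child_tables):
    """Merge the children's node tables into one table for the parent.

    Any key value that is claimed by two different nodes is dropped
    entirely, so neither of the clashing entries survives.
    """
    merged = {}
    clashing = set()
    for table in child_tables:
        for key, node in table.items():
            if key in clashing:
                continue  # already known to clash; stays removed
            if key in merged and merged[key] != node:
                del merged[key]      # remove the earlier entry too
                clashing.add(key)
            else:
                merged[key] = node
    return merged


# Hypothetical tables for two sibling elements.
left = {("1",): "first C", ("2",): "second C"}
right = {("2",): "third C", ("3",): "fourth C"}
print(merge_node_tables([left, right]))
```

Here the value ("2",) appears in both children, so only the entries for
("1",) and ("3",) make it into the parent's table.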

When you have a keyref identity constraint for an element, that
element looks at its identity constraint table and finds the node
table related to the referenced key or unique identity constraint. The
schema validator checks that the values of the fields from the keyref
match the values of the fields from the key/unique identity constraint
for some element in the node table.
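
In the same dict-based model (my simplification, with made-up names),
the keyref check amounts to a lookup: every tuple of keyref field
values has to appear as a key in the node table of the referenced
key/unique constraint.

```python
def unmatched_keyrefs(keyref_values, node_table):
    """Return the keyref values that have no entry in the node table.

    An empty result means the keyref constraint is satisfied.
    """
    return [value for value in keyref_values if value not in node_table]


# Hypothetical node table for the referenced key, as seen on the
# element where the keyref is defined.
node_table = {("1",): "C[id=1]", ("2",): "C[id=2]"}
print(unmatched_keyrefs([("1",), ("3",)], node_table))  # [('3',)]
```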

Now the only entries that the node table can have are those that
either come from key/unique identity constraints on that particular
element (the same element as the keyref is on) or from one of its
descendants.

Thus, the keyref identity constraint has to be placed on the same
element as the key/unique identity constraint *or* on one of its
ancestors within the instance document.

The other thing you need to watch out for is that stuff about node
tables not having entries for keys that clash with each other. Imagine
a document like:

  A
  +- B
  |  +- C id=1
  |  +- C id=2
  +- B
  |  +- C id=1
  |  +- C id=2
  +- D
     +- E ref=1
     +- E ref=2

You define a key on the element declaration for the B elements, saying
that within each B element, every C element can be identified by its
id attribute. You define a keyref on the element declaration for the
*A* element, saying that each D/E ref attribute points to an entry in
that key.

Now in the node table for a B element, there's an entry for each of
its child C elements with the relevant id; all well and good. In the
node table for the A element, however, these node tables get merged.
Because each id value is used by two C elements (one under each B),
both entries for that value get removed. So the node table on the A
element doesn't actually contain any entries. The keyref looks at the
node table on the A
element, because that's where it's defined, and of course it doesn't
find the relevant key value, so it fails.
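
If it helps, the example can be walked through with a small Python
simulation. This is only a sketch of the behaviour as I've described
it: the Element class, the hard-coded "key on B selects C by @id" rule
and the "keyref on A selects D/E by @ref" rule are all assumptions made
for the illustration, not how a real schema processor is structured.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def node_table(element):
    """Node table for the key 'C identified by @id', declared on B."""
    if element.name == "B":
        # The key is declared here: one entry per child C.
        return {c.attrs["id"]: c for c in element.children if c.name == "C"}
    # Elsewhere the table is built by merging the children's tables,
    # dropping every id that more than one child table contributes.
    merged, clashing = {}, set()
    for child in element.children:
        for key, node in node_table(child).items():
            if key in clashing:
                continue
            if key in merged:
                del merged[key]
                clashing.add(key)
            else:
                merged[key] = node
    return merged


def make_b():
    return Element("B", children=[Element("C", {"id": "1"}),
                                  Element("C", {"id": "2"})])


doc = Element("A", children=[
    make_b(),
    make_b(),
    Element("D", children=[Element("E", {"ref": "1"}),
                           Element("E", {"ref": "2"})]),
])

# The keyref is defined on A, so it consults A's node table.
table_on_a = node_table(doc)
refs = [e.attrs["ref"]
        for d in doc.children if d.name == "D"
        for e in d.children if e.name == "E"]
dangling = [r for r in refs if r not in table_on_a]

print(sorted(node_table(doc.children[0])))  # ['1', '2']
print(table_on_a)                           # {}
print(dangling)                             # ['1', '2']
```

Each B on its own sees both ids, but by the time the tables reach A
every id has clashed, so both refs are left without a match.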

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/
Received on Thursday, 8 November 2001 05:29:01 UTC