[4795] target required and null/empty/dangling references from John Arwe on 2007-08-22 (public-sml@w3.org from August 2007)

From: John Arwe <johnarwe@us.ibm.com>
Date: Wed, 22 Aug 2007 10:29:31 -0400
To: public-sml@w3.org
Message-ID: <OFC931B7A5.7F45F85A-ON8525733F.00410D77-8525733F.004FBBB6@us.ibm.com>
[1] bugzilla http://www.w3.org/Bugs/Public/show_bug.cgi?id=4795
[2] discussion from last f2f 
http://www.w3.org/2007/06/12-sml-minutes.html#item06 
[3a] submission text 
http://www.w3.org/Submission/2007/SUBM-sml-20070321/#Constraints_on_References
[3b] submission text 
http://www.w3.org/Submission/2007/SUBM-sml-20070321/#sml_3AtargetRequired 
[3c] submission text 
http://www.w3.org/Submission/2007/SUBM-sml-20070321/#sml_3AtargetRequired2
[4a] fpwd text 
http://www.w3.org/TR/2007/WD-sml-20070806/#Constraints_on_References
[4b] fpwd text 
http://www.w3.org/TR/2007/WD-sml-20070806/#sml_targetRequired
[4c] fpwd text 
http://www.w3.org/TR/2007/WD-sml-20070806/#sml_targetRequired2
As far as I see from a cursory inspection, 3a=4a (with respect to 4a's 
target required content), 3b=4b, 3c=4c, 3a/3b/3c are consistent, and 
4a/4b/4c are consistent.  All Good Things.
It is [2] that mystifies me, in that it re-draws the line.  All of the 
3x/4x citations above apply target required to empty, null, and dangling 
references.  Not all of those had clear definitions at the time, so that 
may be the source of my mystery.  [2] (and [1], its alternate form) say 
that we should re-define target required to apply only to dangling 
references, removing its current application to empty and null references.
I believe our discussions about how to recognize empty and null references 
syntactically, coupled with the lack of precise definitions for them and 
an evolving understanding of what constitutes a valid reference scheme 
definition, led us down a garden path... maybe several.  To the degree 
possible below, I am going to worry about the semantics only; once we have 
agreement on semantics we can worry about how to syntactically recognize 
each semantically unique case.

Empty refs: as the minutes suggest, we were discussing these in the sense 
of "element reference containing no content", e.g. <foo sml:ref="true" 
xsi:nil="true">.  As we now understand the universe of possible scheme 
definitions, such an element is not necessarily devoid of reference scheme 
content.  An attribute-based reference scheme for example, <foo 
sml:ref="true" xsi:nil="true" my:schemeuri="/">, satisfies the definition 
of empty we were using yet clearly has reference scheme content.  Even 
with element-based schemes like sml:uri, the xsi:nil="true" test fails to 
capture the right semantics, I assert.  Using EnrolledCourse from the SML 
spec, if one adds sml:ref="true" to the third instance of example two then 
it becomes
    <EnrolledCourse sml:ref="true">
      <Name>SocialSkills</Name>
      <Grade>F</Grade>
    </EnrolledCourse>
Is this fragment an empty reference?  I could say no (because xsi:nil is 
not true, i.e. for syntactic reasons), I could say yes (because my 
consumer code does not recognize any of the element's information items as 
matching any reference scheme definition it understands), but either way I 
doubt a human looking at this example would leap to the conclusion that 
<EnrolledCourse> is an empty reference element.  This is essentially 
because we are engaging in a practice some call "multi-typing"; 
<EnrolledCourse> has a well-known Schema type, but we are logically saying 
through annotations like sml:ref="true" that it may be/is _also, 
concurrently_ compliant to another type (semantically as humans understand 
things, not in the strict Schema sense of type definition).  For each 
reference scheme we define a set of content that when mixed into the 
element's content allows us to treat the element as if it is "a 
reference", i.e. an instance of some reference type.  Others might call 
this composition.
Note that the determination of whether or not a reference is empty 
(meaning it has sml:ref="true" but is otherwise devoid of the attributes, 
child elements, etc. (information items) associated with a reference 
scheme) will _always_ be a function of the reference schemes known to the 
consumer. 
This might cause some to tilt at first (I did), but think about the 
extreme case where a scheme is defined with no associated information 
items (one of Sandy's perverse inventions... once an element with 
sml:ref="true" is found, in the absence of any other recognized reference 
scheme, it would resolve the reference to a fixed point ... document root, 
reference element, whatever). 
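
To make the dependence on the consumer's known schemes concrete, here is a 
minimal Python sketch.  Everything named in it is an assumption made for 
illustration: the namespace URIs are placeholders, and uri_scheme, 
attr_scheme, and is_empty_reference are hypothetical helpers, not anything 
defined by the SML drafts.  Each "scheme" is modeled simply as a predicate 
that says whether it recognizes any of an element's information items.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions for this sketch, not the real ones).
SML = "urn:example:sml"
MY = "urn:example:my"

def uri_scheme(elem):
    """Element-based scheme: recognizes a child <sml:uri> element."""
    return elem.find(f"{{{SML}}}uri") is not None

def attr_scheme(elem):
    """Hypothetical attribute-based scheme: recognizes my:schemeuri."""
    return f"{{{MY}}}schemeuri" in elem.attrib

def is_empty_reference(elem, known_schemes):
    """True if elem has sml:ref="true" but carries no content that any
    scheme known to this consumer recognizes."""
    if elem.get(f"{{{SML}}}ref") != "true":
        return False  # not marked as a reference at all
    return not any(scheme(elem) for scheme in known_schemes)

doc = f"""
<root xmlns:sml="{SML}" xmlns:my="{MY}">
  <EnrolledCourse sml:ref="true">
    <Name>SocialSkills</Name>
    <Grade>F</Grade>
  </EnrolledCourse>
  <foo sml:ref="true" my:schemeuri="/"/>
</root>
""".strip()

root = ET.fromstring(doc)
course, foo = root[0], root[1]

# A consumer that only knows the element-based scheme sees both as empty.
print(is_empty_reference(course, [uri_scheme]))            # True
print(is_empty_reference(foo, [uri_scheme]))               # True
# A consumer that also knows the attribute-based scheme disagrees on <foo>.
print(is_empty_reference(foo, [uri_scheme, attr_scheme]))  # False
```

The same <foo> element is "empty" for one consumer and not for another, 
which is the point: without dedicated syntax, "empty" cannot be told apart 
from "no content from schemes recognized by the consumer".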
I'm not sure how much practical use we can make of the "empty reference" 
semantic unless we introduce a clear syntactic way to recognize it and 
distinguish it from "no content from schemes recognized by the consumer", 
and before introducing new syntax I'd want to be very sure we have a use 
for it important enough to justify the added complexity.  Assuming we keep 
the semantic, if the reference is empty and target required is true on the 
element's type definition, why would that not be an error?  Target 
required = true means that the reference must resolve to an element within 
the model, and I see no way for an empty reference to do so.  I can see "no, 
not an error" making sense only if people were assuming empty = null ... 
lacking any concrete definition of "empty reference", I cannot be sure of 
this.

Null refs: semantically, lots of fields have found such a concept useful 
(Java, XML Schema validation, ...).  I remember Sandy asserting that such 
a concept would be useful to distinguish between two cases: (1) producer 
intentionally omitted reference scheme content (2) producer erroneously 
omitted reference scheme content ... in other words to declare the 
difference of intent.  His example was a swizzle on <EnrolledCourse> ... 
assume each course has 1-2 instructors (required primary and optional 
secondary).  What constraints, if any, do we want to place on the 
specification of primary/secondary instructors (more generally, on 
optional references)?
Alternative A: primary specifies sml:ref="true" at the schema (type) 
level; secondary must not specify sml:ref="true" at the schema (type) 
level, and each instance of secondary has sml:ref="false" implicitly 
and/or explicitly.  When no secondary exists, it has sml:ref="false" (not 
a reference).
Alternative B: primary specifies sml:ref="true" at the schema (type) 
level; secondary specifies sml:ref="true" at the schema (type) level, and 
each instance of secondary has either a reference instance or is a null 
reference.    When no secondary exists, it has sml:ref="true" (because it 
is required by its type definition) and is a null reference (however that 
is syntactically recognized).
There are doubtless other variations, but A and B may be sufficient to 
highlight the issues around this choice.  Static analysis is the obvious 
one.  Code with knowledge of the instructor schema (note I am not 
asserting schema validation at run time... could be a model similar to 
Java client stub generation for web services, which generates Java code 
based on the WSDL for the input message(s)) would not know in alternative 
A that secondary instructors are references (when present).  In 
alternative B that knowledge is available.  This is not a trivial 
difference.
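
A rough Python sketch of the static-analysis point, under the assumption 
that a code generator sees only the type-level declarations.  The 
dictionaries and the function reference_fields are hypothetical and exist 
only for this illustration; they stand in for whatever the tool would 
extract from the instructor schema.

```python
# Type-level sml:ref declarations as a tool might extract them from the
# schema (hypothetical representation; True means sml:ref="true" on the type).
ALTERNATIVE_A = {"primary": True, "secondary": False}
ALTERNATIVE_B = {"primary": True, "secondary": True}

def reference_fields(declared_ref):
    """Names of the fields the tool can treat as references without
    inspecting any instance document."""
    return sorted(name for name, is_ref in declared_ref.items() if is_ref)

print(reference_fields(ALTERNATIVE_A))  # ['primary']
print(reference_fields(ALTERNATIVE_B))  # ['primary', 'secondary']
```

Under alternative B the tool can generate reference handling for the 
secondary instructor up front; under alternative A the schema gives it 
nothing to go on.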
If we want to keep the semantic of null references, we need a consistent 
way to syntactically recognize them that encompasses our broad view of how 
reference schemes can be defined (the FPWD definition of xsi:nil="true" is 
insufficient, as noted above under "empty refs").  Let's settle on whether 
or not the semantic has value before proposing syntaxes.  Assuming we 
retain null references, I'm not sure what it means to talk about resolving 
one (which evaluation of target required="true" would necessitate).  In 
this case I can see where we might want to say, given there is no input to 
the resolution process, that target required is irrelevant (alternatively, 
that a null element is trivially part of every model, just as the empty 
set is a subset of every set in mathematics).  I could also see that we 
might want to say the opposite: a null reference is never in any model, so 
target required = true always conflicts with a null reference; if you want 
to allow null refs, target required must be false.  This view is 
equivalent to saying that XML Schema nillable="true" for element 
declarations is analogous to sml:targetRequired="false" for reference 
information items.
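
The two readings of target required against a null reference can be put 
side by side in a short Python sketch.  All names here (RefState, 
NullPolicy, violates_target_required) are hypothetical, and the sketch 
assumes a "dangling" reference means one that has scheme content but 
resolves to nothing in the model; neither reading is being put forward as 
the group's decision.

```python
from enum import Enum

class RefState(Enum):
    RESOLVED = "resolved"  # resolves to an element in the model
    DANGLING = "dangling"  # assumed: has scheme content, resolves to nothing
    NULL = "null"          # producer intentionally supplied no target

class NullPolicy(Enum):
    IRRELEVANT = "irrelevant"  # target required does not apply to null refs
    CONFLICT = "conflict"      # a null ref can never satisfy target required

def violates_target_required(state, target_required, policy):
    """Return True if the reference violates sml:targetRequired."""
    if not target_required:
        return False  # nothing to enforce
    if state is RefState.RESOLVED:
        return False
    if state is RefState.DANGLING:
        return True   # both readings agree on dangling references
    # state is NULL: the outcome depends entirely on the chosen reading
    return policy is NullPolicy.CONFLICT

for policy in NullPolicy:
    print(policy.name, violates_target_required(RefState.NULL, True, policy))
```

Under the CONFLICT reading, a schema author who wants to permit null 
references has to set target required to false, which is the analogy with 
nillable="true" drawn above.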


Best Regards, John

Street address: 2455 South Road, Poughkeepsie, NY USA 12601
Voice: 1+845-435-9470      Fax: 1+845-432-9787
Received on Wednesday, 22 August 2007 14:29:45 UTC