Scope of IDs from noah_mendelsohn@us.ibm.com on 2003-08-14 (xml-dist-app@w3.org from August 2003)

From: <noah_mendelsohn@us.ibm.com>
Date: Thu, 14 Aug 2003 11:24:54 -0400
To: public-xml-id@w3.org
Cc: xml-dist-app@w3.org, lehors@us.ibm.com, w3c-xml-schema-ig@w3.org
Message-ID: <OFBFD80434.F39EC04E-ON85256D82.005154E6@lotus.com>

Congratulations on the publication of the ID Requirements working draft
[1].  Although this comment relates in part to SOAP, it is my personal
opinion and does not represent an official position of the XMLP WG (or my
employer, for that matter).

My concern is that the working draft should clarify the scope of the ID
mechanisms to be considered, and thus should make clear the use cases
to be handled.

Traditional XML IDs [2] are effectively scoped to the document:  IDs must
be unique within the document, and IDREFs or fragment identifiers
"targeting" such an ID are basically used to pick out markup within the
document.  The situation in XML Schema is somewhat complicated by the fact
that Schema validation can in principle be attempted on any particular
element information item [3], but Schema does its best to reproduce the
document scoping of XML [4].  In the common case where the root of the
schema assessment is the root element of a document, the xsd:ID type has
uniqueness constraints and reference scope similar to those of XML ID.

By contrast, and I think this is the source of some confusion, SOAP IDs [5]
are not scoped to a whole document, at least not quite in the sense above.
SOAP IDs are an aspect of and scoped to the SOAP encoding, which must be
"activated" by use of an "encodingStyle"  attribute.  Thus in the following
document, only some of the ids and hrefs can be used (comments in the
document show which are OK).

<soap:Envelope>

  <!-- NOT OK:  no IDs on Envelope Markup -->
  <soap:Header enc:id="E">
  </soap:Header>

  <soap:Body>
   <myelement1 soap:encodingStyle="http://www.w3.org/2003/05/soap-encoding">
     <!-- OK, in the scope of encoding style-->
     <a enc:id="A"/>
     <!-- OK -->
     <q enc:href="A"/>
   </myelement1>

   <myelement2>
     <!-- NOT OK: not in scope of SOAP encoding -->
     <x enc:id="x"/>
   </myelement2>

   <myelement3 soap:encodingStyle="http://www.w3.org/2003/05/soap-encoding">
     <!-- OK -->
     <b enc:id="B"/>
   </myelement3>

   <myelement4 >
     <!-- NOT OK: not in scope of SOAP encoding -->
     <r enc:href="A"/>
   </myelement4>

  </soap:Body>

</soap:Envelope>
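
To make the scoping rule concrete, here is a minimal Python sketch of how a
processor might decide which IDs and references fall within an active SOAP
encoding scope.  It is only an illustration, not anything defined by the
SOAP specification.  The helper name check_scope is hypothetical, and the
namespace declarations on the envelope are an addition needed to make the
example above parse; the attribute names (enc:id, enc:href) are taken as
written in the example.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"
SOAP_ENC = "http://www.w3.org/2003/05/soap-encoding"

# The example from the message, with namespace declarations added
# (an assumption; the original omits them) so that it is well formed.
DOC = f"""
<soap:Envelope xmlns:soap="{SOAP_ENV}" xmlns:enc="{SOAP_ENC}">
  <soap:Header enc:id="E"/>
  <soap:Body>
    <myelement1 soap:encodingStyle="{SOAP_ENC}">
      <a enc:id="A"/>
      <q enc:href="A"/>
    </myelement1>
    <myelement2>
      <x enc:id="x"/>
    </myelement2>
    <myelement3 soap:encodingStyle="{SOAP_ENC}">
      <b enc:id="B"/>
    </myelement3>
    <myelement4>
      <r enc:href="A"/>
    </myelement4>
  </soap:Body>
</soap:Envelope>
"""


def check_scope(elem, style=None, out=None):
    """Collect (tag, attribute, value, in_scope) for each enc:id / enc:href."""
    if out is None:
        out = []
    # An encodingStyle attribute applies to the element that carries it and
    # to its descendants; otherwise the value in force is inherited.
    style = elem.get(f"{{{SOAP_ENV}}}encodingStyle", style)
    for name in ("id", "href"):
        value = elem.get(f"{{{SOAP_ENC}}}{name}")
        if value is not None:
            out.append((elem.tag, name, value, style == SOAP_ENC))
    for child in elem:
        check_scope(child, style, out)
    return out


results = check_scope(ET.fromstring(DOC))
for tag, name, value, ok in results:
    print(f"enc:{name}={value!r} on {tag}: {'OK' if ok else 'NOT OK'}")
```

Under these assumptions the sketch reports the same OK / NOT OK pattern as
the comments in the example:  the attributes inside myelement1 and
myelement3 are in scope, the others are not.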

For the record, such tangled activation and deactivation of encoding would
be very unlikely in practice.  Most users open the encoding scope once and
do all their work within that.  Still, the key point is that the whole
purpose of SOAP IDs differs from that of XML and Schema IDs, or at best is a
constrained use of the concept.  The purpose of SOAP IDs is to mark up a
particular type of graph encoding, and the SOAP syntax says that such graph
nodes may be defined only by markup within the scope of a suitable encoding
style.  To see why this is important, consider a situation in which someone
someday invents a second encoding style, perhaps for a slightly different
sort of graph, and both are used in the same document:

<soap:Envelope>

  <soap:Header >
   <myheader soap:encodingStyle="http://www.w3.org/2003/05/soap-encoding">
     <!-- OK -->
     <a enc:id="A"/>
     <!-- OK -->
     <q enc:href="A"/>
   </myheader>
  </soap:Header>

  <soap:Body>
   <myelement2 soap:encodingStyle="http://www.w3.org/2006/01/NEWENCODING2">
     <!-- OK -->
     <a enc2:id="A"/>
     <!-- OK -->
     <q enc2:href="A"/>
   </myelement2>
  </soap:Body>

</soap:Envelope>

Here we have two completely disjoint graphs, one in myheader encoded with
today's encoding, another in the body encoded with the new encoding.  The
specification for NEWENCODING2 has the freedom to define its own reference
markup (enc2:id, enc2:href) and to decide that it represents an ID scope
disjoint from the one defined by the current SOAP encoding.  Thus, the fact
that the body uses another id="A" might not be a conflict.  This is in fact
very handy if SOAP messages are to be composed modularly;  it would be a
nuisance to have to rewrite all the IDs in the body just to avoid conflict
with the header.
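
A rough Python sketch of the difference between document-wide and
per-encoding ID scopes follows.  It is illustrative only.  It assumes that
the enc2 prefix is bound to the same (hypothetical) URI as the NEWENCODING2
encoding style, that each encoding's ID scope can be keyed by the namespace
of its id attribute, and that namespace declarations are added to the
envelope so the example parses; the helper names ids_by_encoding and
duplicates are made up for the purpose.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"
ENC1 = "http://www.w3.org/2003/05/soap-encoding"
ENC2 = "http://www.w3.org/2006/01/NEWENCODING2"  # hypothetical future encoding

# The two-encoding example, with namespace declarations added (assumption).
DOC = f"""
<soap:Envelope xmlns:soap="{SOAP_ENV}" xmlns:enc="{ENC1}" xmlns:enc2="{ENC2}">
  <soap:Header>
    <myheader soap:encodingStyle="{ENC1}">
      <a enc:id="A"/>
      <q enc:href="A"/>
    </myheader>
  </soap:Header>
  <soap:Body>
    <myelement2 soap:encodingStyle="{ENC2}">
      <a enc2:id="A"/>
      <q enc2:href="A"/>
    </myelement2>
  </soap:Body>
</soap:Envelope>
"""


def ids_by_encoding(root):
    """Group namespace-qualified id attribute values by namespace URI."""
    scopes = defaultdict(list)
    for elem in root.iter():
        for attr, value in elem.attrib.items():
            # ElementTree spells qualified names as "{namespace}local".
            if attr.startswith("{") and attr.endswith("}id"):
                namespace = attr[1:attr.index("}")]
                scopes[namespace].append(value)
    return scopes


def duplicates(values):
    """Return the sorted values that occur more than once."""
    return sorted(v for v, n in Counter(values).items() if n > 1)


scopes = ids_by_encoding(ET.fromstring(DOC))

# One ID scope per encoding: "A" is unique within each scope.
per_scope = {ns: duplicates(vals) for ns, vals in scopes.items()}

# A single document-wide scope: the two "A" values collide.
document_wide = duplicates([v for vals in scopes.values() for v in vals])

print(per_scope)
print(document_wide)
```

With one scope per encoding neither graph sees a duplicate, whereas a single
document-wide scope would flag "A" as a conflict between header and body.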

Thus, SOAP IDs are really solving a different (though related) problem from
XML IDs, which is why in my opinion the XMLP group adopted its own
attributes.  I think it would be helpful if your requirements document
acknowledged these sorts of use cases and made clear whether they are in or
out of scope for your work.  My suggestion would be:  you should focus
mainly on document scope and traditional uses of XML ID;  you should
consider use cases like SOAP at least enough to convince yourself that
separate mechanisms are indeed the right way to solve SOAP's problem.  If
so, say so and keep working on the document level problem.  If you decide
that a unifying mechanism is appropriate for all such use cases, then I
think you have to take on the whole problem.  In either case, I think you
should set a requirement or goal to clearly outline in any final
recommendation or finding the intended range of uses for any mechanisms you
might propose.

Thank you as always for the opportunity to comment.  I am taking the
liberty of cross posting to both the schema and distApp mailing lists, but
I suggest that further responses be sent only to public-xml-id to avoid a
cross-posting mess.

Noah

[1] http://www.w3.org/TR/2003/WD-xml-id-req-20030806/
[2] http://www.w3.org/TR/REC-xml#id
[3] http://www.w3.org/TR/xmlschema-1/#validation_outcome
[4] http://www.w3.org/TR/xmlschema-1/#cvc-id
[5] http://www.w3.org/TR/soap12-part2/#encodingedgesandnodes

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Received on Thursday, 14 August 2003 11:28:13 UTC