This (mostly editorial) proposal is about SML-IF inter-document references. It covers the following bugs, which need agreement: 4755, 4777, 5119 (partially), 5120, 5171, and 5201. It may also affect the following editorial bugs: 4819 (partially), 5114, 5117, and 5121 (partially).

Table of Contents

1. Introduction
2. Informal Description
    2.1 Packaging
    2.2 |↓Inter-document References↓|↑URI References↑
    2.3 Rule Document Bindings
3. SML Interchange Format Definition (Normative)
    3.1 Conformance Criteria
    3.2 Terminology
    3.3 SML-IF Documents
        3.3.1 Embedded Documents
        3.3.2 Document References
    3.4 |↓Inter-document References↓|↑URI References↑
        3.4.1 URI equivalence
        |↓3.4.2 Definition of inter-document references in SML-IF
        |↓3.4.3 SML reference schemes that are not SML-IF inter-document references
        3.4.|↓4↓|↑2↑ Document aliases
        |↓3.4.5 Relative references
        |↓3.4.6 Resolving Inter-document References
        |↑3.4.3 URI Reference Processing
    3.5 Rule Document Bindings
        3.5.1 URI prefix matching
        3.5.2 Bindings defined
4. References
    4.1 Normative References

Appendices

A. SML-IF Schema
B. Open Issues (Non-Normative)


2. Informal Description

...

Explicit |↓inter-document references: The documents to be interchanged may explicitly refer to one another and to documents that are not packaged with the documents being interchanged. SML uses such references for many purposes, and permits many different addressing mechanisms. |↓The arcs in SML models↓ |↑SML references among SML model instance documents↑ are an obvious example. Less obvious are such references as |↓xsi:schemaLocation.↓ |↑certain schemaLocation attributes in schema documents↑. |↓The SML-IF specifies a uniform mechanism for unambiguously resolving |↑some such↑ references among the documents being interchanged.↓

...

2.1 Packaging

...

The first child of each document is typically a docInfo element that (indirectly) contains a list of alias elements whose content is a URI with no fragment component|↓s (i.e., one with no "#" in it). Each of these URIs serves as a name that other documents can use to refer to this document. Examples of how aliases are used to |↓resolve inter-document↓ |↑handle certain URI-based↑ references are given |↓below.↓ |↑in 2.2 URI References.↑

...

2.2 |↓Inter-document References↓ |↑URI References↑

|↓Explicit inter-document references can appear in SML documents as elements. For example, model arcs are represented by elements marked with the global attribute sml:ref="true". Inter-document references can also appear as attributes. For example, an xsi:schemaLocation attribute provides a hint about where to find a relevant schema document.↓

|↑When processing the SML model packaged inside an SML-IF document, certain URI references (as defined in RFC 3986 [IETF RFC 3986]) may need to be processed to find their corresponding target. For example, to assess SML validity of the interchanged model, SML references using the URI scheme need to be resolved; and to assemble a schema from multiple schema documents as part of SML model validity assessment, the schemaLocation attribute on an <xs:include> element needs to be processed to locate the schema document being included.↑

To see how |↓inter-document↓ |↑these URI↑ references are handled, consider the following SML-IF document:

<?xml version="1.0" encoding="UTF-8"?>

|↓<model xml:base="http://www.university.example.org/sml/models/"

      xmlns="http://www.w3.org/2007/09/sml-if"

      xmlns:sml="http://www.w3.org/2007/09/sml"

      xmlns:xml="http://www.w3.org/XML/1998/namespace"

      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.0">↓

|↑<model xmlns="http://www.w3.org/2007/09/sml-if" version="1.0">↑

  <identity>

    <name>http://www.university.example.org/sml/models/Sample/InterDocReferences</name>

  </identity>

|  <definitions>

    <document>

      <data>

        <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

          <xs:include schemaLocation="http://www.university.example.org/university/enrollmodel.xsd"/>

        </xs:schema>

      </data>

    </document>

  </definitions>↑

  <instances>

    <document>

      <data>

        <Student xmlns="http://www.university.example.org/ns"

|↓          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

          xsi:schemaLocation="http://www.university.example.org/ns

                              http://www.university.example.org/university/enrollmodel.xsd">↓

|↑                 xmlns:sml="http://www.w3.org/2007/09/sml">↑

          <ID>1000</ID>

          <Name>John Doe</Name>

          <EnrolledCourses>

            <EnrolledCourse sml:ref="true">

              <sml:uri>

|↑                <!-- Reference to a course Inside the interchange set -->↑

                http://www.university.example.org/Universities/MIT/Courses.xml

                 #xmlns(u=http://www.university.example.org/ns)

                  xpointer(/u:Courses/u:Course[u:Name='PHY101'])

              </sml:uri>

            </EnrolledCourse>

            <EnrolledCourse sml:ref="true">

              <sml:uri>

|↑                <!-- Reference to a course OUTside the interchange set -->↑

                http://www.university.example.org/Universities/Capella/Courses.xml

                 #xmlns(u=http://www.university.example.org/ns)

                  xpointer(/u:Courses/u:Course[u:Name='LIT103'])

              </sml:uri>

            </EnrolledCourse>

          </EnrolledCourses>

        </Student>

       </data>

    </document>

    <document>

|↑      <!-- One of the courses referenced above -->↑

      <docInfo>

        <aliases>

|↓          <alias>/Universities/MIT/Courses.xml</alias>↓

|↑          <alias>http://www.university.example.org/Universities/MIT/Courses.xml</alias>↑

        </aliases>

      </docInfo>

      <data>

        <Courses xmlns="http://www.university.example.org/ns">

          <Course>

            <Name>PHY101</Name>

          </Course>

          <Course>

            <Name>MAT200</Name>

          </Course>

        </Courses>

      </data>

    </document>

  </instances>

</model>

|↓SML-IF uses equivalence of URIs to resolve inter-document references among documents being interchanged. For more on URI equivalence and resolving inter-document references, see sections 3.4.1 URI equivalence and 3.4.6 Resolving Inter-document References. Although the example above defines the xml:base attribute on the document element, the xml:base attribute may also be defined on other element information items.↓

|↓With the exceptions explained in the normative part of this specification, content whose type is anyURI or a type derived from anyURI and is contained in a document in the interchange set is considered to be an inter-document reference.↓

|↑Formal rules about how URI references are processed are defined in section 3.4 URI Reference Processing. When not packaged in an SML-IF document, certain URI references (e.g. values of <sml:uri> elements or certain schemaLocation attributes) are dereferenced to find their corresponding document. When these references are packaged in an SML-IF document, consumers of the SML-IF document need to first examine whether the target document or element is packaged in the same SML-IF document. To determine this, the fragment component, if any, is temporarily ignored to form a URI. This URI is then compared against alias URIs of packaged model documents.↑

If the URI |↓in such a reference↓ is equivalent to the URI in an alias |↑(see 3.4.1 URI equivalence)↑, |↓the reference is to the document with that alias, irrespective of any↓ |↑the consumer will not attempt to look for targets of this URI outside of the SML-IF document, although there may exist a↑ document |↓that may be↓ retrievable at this URI. If the |↑original↑ reference has a fragment, the fragment is applied to the referred-to document to establish |↓which element the reference points to↓|↑the reference target(s) according to the corresponding reference scheme definition;↑ otherwise the reference is to the root element of the referred-to document.

If the |↓element in a reference↓ |↑URI↑ is not equivalent to the URI in any alias, |↓the reference is to some (element of a) document not included in the interchange set. Such references are called unresolved references.↓ |↑then the SML-IF document does not contain the corresponding target of the original reference. The consumer may or may not attempt to look for targets outside of the SML-IF document, depending on the nature of the reference.↑

|↓SML-IF specifies how sets of SML documents are interchanged. If and how an SML-IF document's interchange set relates to a complete (i.e., "validatible") model is implementation-defined and is not a part of this specification. One common relationship is that the interchange set constitutes a complete SML model. In such a case, the documents that unresolved references refer to are simply unavailable to the validation process. When SML-IF documents are used in other contexts, such as when they are the content of Web services requests or responses, the set of documents constituting a complete model may include more or fewer documents than those in the interchange set, and the conventions with respect to unresolved references may be different. For example, the convention may specify how to (attempt to) resolve such references.↓

|↓The absolute URI form of every alias in an interchange, like all absolute URIs, contains an "authority" component. A reasonable interpretation of SML-IF aliases is that the SML-IF document containing them is asserting that the content marked with a given alias is a true copy of the content identified by that URI and issued by the authority in the alias URI. Since this may or may not be true, consumers need to be cautious with this interpretation.↓

Consider again the example SML-IF document|↓,↓ above. The reference:

http://www.university.example.org/Universities/MIT/Courses.xml

 #xmlns(u=http://www.university.example.org/ns)

  xpointer(/u:Courses/u:Course[u:Name='PHY101'])

|↑, after removing the fragment, becomes

http://www.university.example.org/Universities/MIT/Courses.xml

, which↑ is equivalent to the URI listed in the alias accompanying the Courses document. |↓(i.e., "http://www.university.example.org/Universities/MIT/Courses.xml").↓ So, by applying the fragment in the reference to the Courses document, we determine that the reference is to the Course element whose Name element has "PHY101" as its content.

The |↑fragment-free part of the↑ reference|↓:↓

http://www.university.example.org/Universities/Capella/Courses.xml

 #xmlns(u=http://www.university.example.org/ns)

  xpointer(/u:Courses/u:Course[u:Name='LIT103'])

|↑is

http://www.university.example.org/Universities/Capella/Courses.xml

which is not equivalent to the URI in any alias. This means that it is an unresolved |↑SML↑ reference. |↓Since the reference is written as a relative reference, URI coming out of the reference resolution algorithm:

http://www.university.example.org/Universities/Capella/Courses.xml

 #xmlns(u=http://www.university.example.org/ns)

  xpointer(/u:Courses/u:Course[u:Name='LIT103'])

is a hint for where to find the document.↓

The URI:

http://www.university.example.org/university/enrollmodel.xsd

|↓(in the content of xsi:schemaLocation for the Student document)↓ |↑(value of the schemaLocation attribute on the <include> element)↑ is not equivalent to any alias. |↑The consumer may or may not attempt to locate a schema document using this URI reference.↑ |↓Since it is written as an absolute URI, the reference is to the document the URL locates. The same is true for all of the absolute URIs in the documents being interchanged:

http://www.university.example.org/ns

and

http://www.w3.org/2001/XMLSchema-instance

3. SML Interchange Format Definition (Normative)

This section normatively defines the Service Modeling Language Interchange Format (SML-IF). It is not intended as motivational or introductory material. For such material, please see the non-normative informal description, above. Instead, this section is intended to concisely define the requirements that SML-IF documents must adhere to and to define how |↓inter-document↓ |↑URI↑ references contained in them are to be interpreted by consumers of SML-IF documents.

3.4 |↓Inter-document References↓ |↑URI References↑

3.4.1 URI equivalence

SML-IF uses equivalence of URIs extensively to |↓resolve↓ |↑handle↑ references among documents in the interchange set. To determine whether two URIs are equivalent, consumers MUST perform a case-sensitive |↓simple string comparison based on↓ codepoint-by-codepoint comparison of the corresponding characters in the URI|↑ reference↑s. |↓Whenever a relative |↓URI↓ |↑reference↑ is tested for equivalence with another URI, SML-IF uses the [base URI] property as specified in the Infoset [XML Information Set] to define a base URI for relative |↓URIs.↓ |↑references.↑ The [base URI] property can be defined on any of the information items in the interchange set.↓
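As an illustration only, the comparison described above could be sketched in Python as follows. The function name uris_equivalent is hypothetical and not part of SML-IF; the sketch assumes both inputs are already available as Unicode strings, and it deliberately performs no normalization of case, percent-encoding, or path segments.

```python
def uris_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: case-sensitive, codepoint-by-codepoint comparison.

    Python compares str values code point by code point, so plain equality
    matches the rule in the text. No normalization is applied.
    """
    return a == b


alias = "http://www.university.example.org/Universities/MIT/Courses.xml"

# Identical character sequences are equivalent.
same = uris_equivalent(alias, "http://www.university.example.org/Universities/MIT/Courses.xml")

# A difference in case alone makes the URIs non-equivalent under this rule.
different_case = uris_equivalent(alias, alias.replace("MIT", "mit"))

# So does a percent-encoded spelling of the same character.
different_encoding = uris_equivalent("http://example.org/~a", "http://example.org/%7Ea")
```

One consequence of choosing plain string comparison is that two URIs which a server would treat as the same resource can still be non-equivalent here, so producers need to write aliases and references with identical spelling.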

|↓3.4.2 Definition of inter-document references in SML-IF

[Definition: In the context of SML-IF, an inter-document reference is any content in a document in the interchange set whose type is xs:anyURI or a type derived from xs:anyURI and whose context in the document implies that the URI can (given the right permissions and connectivity) be dereferenced using the default retrieval action for the URI's scheme.]

Note:

This definition WILL change to not require PSVI once the schema binding issue is resolved.

For example, an xsi:schemaLocation attribute is defined to be of type list of xs:anyURI. These come in pairs, one for the namespace name, and one for a hint as to the location of a schema document defining names for that namespace name. This makes the "hint" URIs in xsi:schemaLocation attributes inter-document references in the context of SML-IF. Similarly, an sml:uri element contained in an element marked with sml:ref="true" is an inter-document reference because its content is of type xs:anyURI and the definition of sml:uri is that the referred-to document can be obtained by dereferencing the URI using the default retrieval action.

In contrast, the wsa:address in a Web Services Addressing [WS-Addressing Core] endpoint reference is not an inter-document reference in the context of an SML-IF document. This is because, although a wsa:address is defined to be of type anyURI, the action needed to dereference the URI is not the default retrieval action for the scheme of the URI. Instead, the action required is defined by the protocol binding used to interact with the endpoint.

SML-IF Consumers MUST interpret xsi:schemaLocation hints and sml:uri content used as SML reference schemes as inter-document references. SML-IF Consumers MUST NOT interpret wsa:address content as inter-document references.|↓

|↓ 3.4.3 SML reference schemes that are not SML-IF inter-document references

SML [SML 1.1] defines two reference schemes, the sml:uri scheme and the EPR scheme. It also permits new schemes to be created without limit. Schemes that do not use URIs or whose use of URIs does not imply that the URIs may be dereferenced for retrieval using the default action (e.g., for the HTTP scheme, the GET method) are not inter-document references in the context of SML-IF. Three consequences flow from this.

First, to successfully interchange documents using such schemes, the sml:ref elements containing them MUST also contain an sml:ref scheme that is an inter-document reference in the SML-IF context. For example, an sml:ref that contains an EPR scheme reference (which is not an inter-document reference in SML-IF) could also contain an sml:uri scheme reference (which is).

Second, the producer and the consumer of the SML-IF document must agree on the scheme(s) being used, since SML-IF only requires consumers to understand the sml:uri scheme.

Third, when creating a new sml:ref scheme, authors MUST be explicit about whether the scheme is an SML-IF inter-document reference.|↓

3.4.|↓4↓|↑2↑ Document aliases

In addition to containing or referring to one of the documents in the interchange set, each document element may (indirectly) contain a list of alias elements. Each alias contains a URI. The set of alias URIs for a given document constitutes the set of identifiers by which documents in the interchange set may |↓make inter-document↓ |↑have↑ references to the document in question.

A document element containing no alias elements signals that the document in question has no aliases. |↓By implication having no alias also signals that there can be no inter-document references to it.↓

|↑Alias URIs MUST comply with the “absolute-URI” production as defined in RFC 3986 [IETF RFC 3986]. This implies that they do not contain fragment components.

All alias URIs in an SML-IF document MUST be unique.↑
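A rough sketch of how a producer or consumer might check the alias constraints above is given below. It is an approximation under stated assumptions: check_aliases and the shape of its input (a mapping from a made-up document label to its alias strings) are hypothetical, and testing for a non-empty scheme and the absence of "#" only approximates the "absolute-URI" production rather than validating the full RFC 3986 grammar.

```python
from urllib.parse import urlsplit


def check_aliases(aliases_by_document):
    """Return a list of problem descriptions (empty if none were found).

    aliases_by_document: dict mapping a hypothetical document label to the
    list of alias strings declared for that document.
    """
    problems = []
    seen = {}  # alias string -> label of the first document declaring it
    for doc_label, aliases in aliases_by_document.items():
        for alias in aliases:
            # An absolute URI must start with a scheme.
            if not urlsplit(alias).scheme:
                problems.append(f"{alias!r}: no scheme, so not an absolute URI")
            # The absolute-URI production allows no fragment component.
            if "#" in alias:
                problems.append(f"{alias!r}: contains a fragment component")
            # Uniqueness is checked across the whole SML-IF document.
            if alias in seen:
                problems.append(f"{alias!r}: duplicate of an alias on {seen[alias]!r}")
            else:
                seen[alias] = doc_label
    return problems


ok = check_aliases({
    "courses": ["http://www.university.example.org/Universities/MIT/Courses.xml"],
})
bad = check_aliases({
    "courses": ["/Universities/MIT/Courses.xml"],
    "other": ["http://example.org/a#frag", "http://example.org/b"],
    "third": ["http://example.org/b"],
})
```

In this sketch, ok is an empty list, while bad reports the relative alias, the alias with a fragment, and the alias that is declared twice.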

|↑3.4.3 URI Reference Processing

When processing an SML-IF document, there are 3 categories of URI references that may need to be resolved:

1.      schemaLocation attributes on <xs:include> and <xs:redefine> in schema documents, when they are model definition documents. [WG attention: bug 4774 reference to schema binding section.]

2.      Certain URI references used in SML reference schemes. For a URI reference to be in this category, its non-fragment URI components have all the information to uniquely identify at most one model document that potentially contains the target(s) of the URI reference.

3.      URI references that are used in SML reference schemes but are not in category #2.

It is clear which references fall into category #1. An example of category #2 is URI references used in SML references that use the SML URI reference scheme. An example of category #3 is URI references used in SML references that use the SML EPR reference scheme. This is a consequence of how the respective schemes are defined. Similarly, when new reference schemes that use URI references are defined, whether they fall into category #2 or #3 will be clear from the reference scheme definitions.

Resolution of URI references in category #3 is defined in their respective scheme definitions. It is also possible to have reference schemes that do not use URI references. Again, their resolution is governed by their scheme definitions and is not covered by this section.

To process a URI reference UR that is within categories #1 or #2 above, the following steps are performed:

[WG attention: is the following list comprehensible? Or do we prefer prose over lists? Or both, where the list is normative and the prose is not? This is purely editorial and can be done after we adopt the proposal.]

1.      Determine the document D that possibly contains the target:

    a.       If UR contains only a fragment component, then D is the model document that contains UR.

    b.      Otherwise

        i.      If UR has a fragment component, then let UR' be the URI reference formed by removing the fragment component; otherwise let UR' be UR.

        ii.     If UR' is a relative reference, then let B be the [base URI] property of the element information item containing UR, and transform UR' to form an (absolute) URI U, using B as the base URI, as defined in section “5 Reference Resolution” of RFC 3986 [IETF RFC 3986]; otherwise let U be UR'. [WG attention: bug 5181: xml:base changes]

        iii.    If there exists a model document with an alias URI that is equivalent to U (3.4.1 URI equivalence), then D is that document; otherwise D has no value.

2.      If D has no value, then

    a.       If UR is within category #1 (schemaLocation), then the SML-IF document does not contain the target schema document. (Whether the consumer continues to dereference UR or U is governed by other sections of this specification.) [WG attention: bug 4774 reference to schema binding section.]

    b.      Otherwise (UR is within category #2, used in an SML reference), UR has no target.

3.      If D is a schema document that is also a model definition document in the interchange set, then

    a.       If UR is within category #1 (schemaLocation), and it does not contain a fragment component, then UR targets the root element of D.

    b.      Otherwise (UR contains a fragment component or is within category #2), UR has no target.

4.      If D is a model instance document in the interchange set, then

    a.       If UR is within category #1 (schemaLocation), then it has no target.

    b.      Otherwise, follow the corresponding reference scheme definition to locate 0, 1, or many target elements in D using UR (possibly applying its fragment component, if any).

5.      Otherwise (D is another kind of document in the interchange set), UR has no target.
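The steps above can be sketched in Python as follows. This is a non-normative illustration under several assumptions: the Document class, the kind labels, the outcome strings, and the function process_reference are all hypothetical names introduced here; the base URI is passed in by the caller rather than read from an Infoset; and step 4.b is only signalled ("apply-scheme") because locating target elements is left to the reference scheme definition. Reference resolution uses urljoin from the Python standard library as an approximation of RFC 3986 section 5.

```python
from dataclasses import dataclass, field
from typing import List
from urllib.parse import urljoin, urlsplit


@dataclass
class Document:
    """Hypothetical stand-in for a model document in the interchange set."""
    name: str
    kind: str  # "schema-definition", "instance", or "other" (made-up labels)
    aliases: List[str] = field(default_factory=list)


def process_reference(ur, category, containing_doc, base_uri, documents):
    """Return (outcome, document) for a category #1 or #2 URI reference."""
    has_fragment = "#" in ur

    # Step 1: determine the document D that possibly contains the target.
    if ur.startswith("#"):
        d = containing_doc                       # 1.a: fragment-only reference
    else:
        ur_prime = ur.partition("#")[0]          # 1.b.i: drop the fragment
        if urlsplit(ur_prime).scheme == "":      # 1.b.ii: relative reference
            u = urljoin(base_uri, ur_prime)
        else:
            u = ur_prime
        # 1.b.iii: list membership uses exact string equality (see 3.4.1).
        d = next((doc for doc in documents if u in doc.aliases), None)

    # Step 2: D has no value.
    if d is None:
        return ("not-packaged", None) if category == 1 else ("no-target", None)

    # Step 3: D is a schema document that is a model definition document.
    if d.kind == "schema-definition":
        if category == 1 and not has_fragment:
            return ("root-element", d)
        return ("no-target", None)

    # Step 4: D is a model instance document.
    if d.kind == "instance":
        if category == 1:
            return ("no-target", None)
        return ("apply-scheme", d)               # 4.b: defer to the scheme

    # Step 5: any other kind of document.
    return ("no-target", None)


MIT = "http://www.university.example.org/Universities/MIT/Courses.xml"
student = Document("student", "instance")
courses = Document("courses", "instance", [MIT])
docs = [student, courses]
base = "http://www.university.example.org/sml/models/"

inside = process_reference(MIT + "#xpointer(/Courses)", 2, student, base, docs)
outside = process_reference(
    "http://www.university.example.org/Universities/Capella/Courses.xml",
    2, student, base, docs)
```

With the documents from the example in section 2.2, inside yields ("apply-scheme", courses) and outside yields ("no-target", None). The fragment used for inside is shortened for readability and is not the one from the example.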

|↓3.4.5 Relative references

[TODO: bug 5181: xml:base changes]

If |↓any inter-document reference or any alias of any document in the interchange set↓ |↑a URI reference to be resolved using the default retrieval action↑ is a relative |↓URI,↓ |↑reference,↑ then |↑before resolving it,↑ the |↑???nearest???↑ [base URI] property as defined by Infoset [XML Information Set] MUST be used |↓to specify a↓ |↑as the↑ base URI for |↓these references.↓ |↑resolving this relative reference to an (absolute) URI.↑

|↓ 3.4.6 Resolving Inter-document References

If the URI representing an inter-document reference contains only a fragment, the inter-document reference is to the document in which it occurs. Otherwise, if the URI representing an inter-document reference is equivalent to a URI that is an alias of some document in the interchange set, the inter-document reference is to that document. In either case, such a reference is called "a resolved inter-document reference." If neither of these cases applies, the inter-document reference is to a document not included in the interchange set. Such a reference is called "an unresolved inter-document reference."

If the URI representing a resolved inter-document reference has no fragment, the reference is to the root element of the referred-to document.

If the URI representing a resolved inter-document reference has a fragment, the reference is to the element obtained by applying the fragment to the referred-to document starting with its root element.↓