Re: XInclude, schema validity-assessment, xml:base and xml:lang

/ ht@inf.ed.ac.uk (Henry S. Thompson) was heard to say:
| [Note that this email is Bcc'd to the member-only
|  w3c-xml-schema-ig@w3.org list -- please reply to
|  public-xml-core-wg@w3.org and repeat the Bcc to keep both WGs involved]
|
| Position 1: xml:lang and xml:base should be understood as out-of-band
| mechanisms for notating aspects of the infoset (the *[language]* and
| *[base URI]* properties) which would otherwise be inexpressible.  As
| such they are _not_ part of any specific XML application, and are
| arguably in the same category as namespace declarations, that is,
| using attribute syntax but not really attributes at all.  If so, it
| was a mistake to treat them as attributes, and not as declarations, in
| deciding how to treat them wrt XML Schema validity assessment.  We
| should accordingly at least commit to removing them from the scope of
| validity assessment in XML Schema 1.1, and possibly do so for XML
| Schema 1.0 as well via an erratum.

This suggests that some xml:* things are attributes (xml:space,
xml:id) and some are not (xml:lang, xml:base). That makes me very
uncomfortable.

Generalizing to say that all xml:* things are not attributes means
that I can't control where, for example, xml:space and xml:id can
occur, and you'd be hard pressed to convince me that that was "the
right thing".

On those grounds, I'm not a fan of position 1.

| Position 2:  xml:base and xml:lang are attributes like any others.  To
| make it easier to manage them, we should provide some mechanism
| to make it easy to declare 'universal' attributes.

I suppose. I think there are already plenty of ways to do this, once
the schema designer has been made aware of the problem.

| Aha, so _here's_ a lightweight interim solution.
|
| Provide a schema document at http://www.w3.org/2001/XMLSchemaXBL.xsd
| as follows:
|
| <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
|            targetNamespace="http://www.w3.org/2001/XMLSchema">
|
|  <xs:import namespace="http://www.w3.org/XML/1998/namespace"/>
|
|  <xs:redefine schemaLocation="http://www.w3.org/2001/XMLSchema.xsd">
|   <xs:complexType name="anyType">
|    <xs:complexContent>
|     <xs:restriction base="xs:anyType">
|      <xs:sequence>
|       <xs:any processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
|      </xs:sequence>
|      <xs:attribute ref="xml:base"/>
|      <xs:attribute ref="xml:lang"/>
|      <xs:anyAttribute processContents="lax"/>
|     </xs:restriction>
|    </xs:complexContent>
|   </xs:complexType>
|  </xs:redefine>
| </xs:schema>

Interesting. Is redefine widely supported?

| Then if you invoke your schema processor with that schema document
| alongside (before, logically) your own, the right thing will happen!
|
| Seems to me that really is a solution we could try to sell to Oracle
| and Microsoft, perhaps via a Working Group Note . . .

Sounds plausible.

| Position 3: XML Core and Schema WGs issue a joint WG Note defining a
| sort of XInclude-on-steroids, which is an XML application which
| does the following:
|
|  1) Run XInclude on its 'input';
|  2) Remove xml:base from the resulting infoset;
|  3) (Optional) Do schema-validity assessment on the resulting infoset
|     with zero or more specified schema documents;
|  4) (Optional) Absolutise any relative URIs wrt the appropriate [base
|     URI] value, looking for EIIs or AIIs in the resulting
|     (possibly PSV) infoset which either
|       a) Match a specified XPath
|      or
|       b) (if (3) was done) match element(*,xs:anyURI) or attribute(*,xs:anyURI)
|  5) Serialise the resulting infoset.

For some documents, this will do the right thing, for others it won't.
And in the general case where there is no schema and no XPath
expression that identifies all the URIs, you've just about guaranteed
that the result of parsing that serialized form will be wrong.
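
A minimal Python sketch of steps (2) and (4a) shows the failure mode.
Everything named here is made up for illustration: the document URI,
the element and attribute names, and the helper itself. A set of
attribute names stands in for the XPath, and the input is assumed to
be what XInclude would have produced:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def strip_base_and_absolutise(elem, base, uri_attrs):
    """Drop xml:base (step 2) and absolutise only the named attributes (step 4a)."""
    # An xml:base on this element changes the base URI for it and its subtree.
    base = urljoin(base, elem.attrib.pop(XML_BASE, ""))
    for name in uri_attrs:
        if name in elem.attrib:
            elem.set(name, urljoin(base, elem.get(name)))
    for child in elem:
        strip_base_and_absolutise(child, base, uri_attrs)


# Hypothetical post-XInclude result: <chapter> was included from sub/ch1.xml.
DOC_URI = "http://example.org/book/main.xml"
root = ET.fromstring(
    '<book><chapter xml:base="sub/ch1.xml">'
    '<link href="fig1.png"/><image src="fig2.png"/>'
    '</chapter></book>'
)

# The "XPath" only knows about href; nobody told it that src holds a URI too.
strip_base_and_absolutise(root, DOC_URI, uri_attrs={"href"})

href = root.find("chapter/link").get("href")
src = root.find("chapter/image").get("src")

# What a consumer of the serialized result would compute for src.
reparsed_src = urljoin(DOC_URI, src)
```

Here href comes out absolute and correct, but src is still relative
and the xml:base that would have made it resolve under sub/ is gone,
so reparsed_src points into the wrong directory.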

I'm not a fan of position 3 either.

                                        Be seeing you,
                                          norm

-- 
Norman.Walsh@Sun.COM / XML Standards Architect / Sun Microsystems, Inc.

Received on Monday, 18 April 2005 18:46:31 UTC