[Bug 6513] [XQuery] inconsistent terminology in definition of derives-from()

http://www.w3.org/Bugs/Public/show_bug.cgi?id=6513

--- Comment #11 from Jonathan Robie <jonathan.robie@redhat.com>  2009-03-17 09:52:45 ---
(In reply to comment #10)
> Let's suppose module M1 does "import schema namespace S1 at s1.xsd", and s1.xsd
> in turn does <xs:import namespace="S2" schemaLocation="s2.xsd"/>. 
> 
> Let's suppose s1.xsd contains the definition
> 
> <xs:simpleType name="heptagon">
>   <xs:restriction base="s2:polygon">
>     <xs:length value="7"/>
>   </xs:restriction>
> </xs:simpleType>
> 
> Is M1 allowed to use the expression "$x instance of s2:polygon"?
> 
> I think our current specification doesn't give a clear answer to this question.
> My preference is that the answer should be no, because that seems to be
> suggested by the analogy with "import module" and <xs:import> (Though not with
> <xsl:import>). However, it could go either way. Clearly if the answer is "no",
> then s2:polygon is a statically known type even though it's not an in-scope
> type: the processor knows about it even though the user isn't allowed to
> mention it by name in that particular module.


To me, the open question is whether what gets imported is the assembled schema,
i.e. the result after all schema imports, includes, redefines, etc. have been
processed. When we've discussed this before, as I recall, we intended the
answer to be "yes". Unlike modules, XML Schema was not designed to do
encapsulation.

But if the answer were "no", s2:polygon would not be a statically known type in
M1 as defined in the XQuery specification, because it would not be in the ISSD
of the module. I don't think the specification is ambiguous on this.
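
As a rough illustration of the two readings, here is a minimal Python sketch.
It is a toy model, not anything taken from the specification: the SCHEMAS
table and the functions issd_assembled and issd_target_only are hypothetical
names, and the type lists only mirror the s1.xsd / s2.xsd example quoted
above.

```python
# Hypothetical model: each schema document has a target namespace, the type
# names it defines, and the schema documents it pulls in via <xs:import>.
SCHEMAS = {
    "s1.xsd": {"namespace": "S1", "types": {"heptagon"}, "imports": ["s2.xsd"]},
    "s2.xsd": {"namespace": "S2", "types": {"polygon"}, "imports": []},
}


def issd_assembled(location, schemas=SCHEMAS):
    """Answer "yes": the ISSD holds every type in the assembled schema."""
    seen, pending, types = set(), [location], set()
    while pending:
        loc = pending.pop()
        if loc in seen:
            continue  # guard against circular imports
        seen.add(loc)
        doc = schemas[loc]
        types |= {(doc["namespace"], name) for name in doc["types"]}
        pending.extend(doc["imports"])
    return types


def issd_target_only(location, schemas=SCHEMAS):
    """Answer "no": the ISSD holds only types of the imported namespace."""
    doc = schemas[location]
    return {(doc["namespace"], name) for name in doc["types"]}
```

Under the first function, ("S2", "polygon") is in the ISSD of M1, so
"$x instance of s2:polygon" can name it. Under the second it is absent, and
so, on the definition above, it is not a statically known type in M1.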

> Now suppose that M1 does "import module namespace M2 at m2.xq". The rules
> discussed in the last couple of comments make it clear (a) that schema
> namespaces imported into M2 are not automatically imported into M1, and (b)
> that the names used in functions and variables of M2 that are referenced from
> M1 must be in the ISSD of both modules. This at least creates the possibility
> that the processor for module M1 statically knows about types imported into M2
> even though it does not make them part of the ISSD of M1.

The term "known types" for a given module does not refer to known types in a
different module. If a processor takes advantage of these types when processing
M1, it is using a static typing extension. If the same base schema is extended
in different ways, though, this may be a dangerous static typing extension. One
module may import one extended version of the schema, another may import a
different extended version of the same base schema. If M1 uses only the base
schema, it can use either module. This was a design criterion way back when we
did this initially.

> There may also be other types known statically to the processor. For example,
> it may have a cache of types lying around from previous compilations of
> unrelated queries. It may even have access to a database containing a vast
> selection of all known types. The fact that these types are known does not and
> should not make them available for reference in M1 - that is, the type names
> are not "in scope" unless their namespace is imported.

Sure, these types may exist in an implementation, and the processor may have
static typing extensions that we know nothing about. Our specification does not
describe them. 

Or it might leverage these types for optimizations. It can do that without
documenting anything. Our specification does not need to say anything about
that.

> So, the above discussion suggests three reasons why there may be types known to
> the processor that are not in the ISSD.

But if they are not in the ISSD, they are not statically known types for the
module, as understood in our specification. If there are static typing
extensions in a processor, that processor should document them. It's not our
job to anticipate what kinds of static typing extensions might exist.

> Now, getting back to this bug, examine the text:
> 
> "An unknown schema type might be encountered, for example, if a source document
> has been validated using a schema that was not imported into the static
> context. In this case, an implementation is allowed (but is not required) to
> provide an implementation-dependent mechanism for determining whether the
> unknown schema type is derived from the expected schema type. For example, an
> implementation might maintain a data dictionary containing information about
> type hierarchies."
> 
> it seems on the face of it to be saying that the processor might have knowledge
> about unknown types. 

I believe the original intent was to say that a processor may have dynamic
knowledge of types that are not in the static context. It might discover types
by examining the schema for a document that is being queried, for instance.
This was known as the "winged horse" proposal, because it said that an
implementation need not be completely circumscribed by the ISSD.

To me, what is confusing about this is the "For example" part, because it
specifies information that would be available statically. If there were a data
dictionary, I would prefer to implement this using static typing extensions
rather than a winged horse. I would change this text to say:

"For example, an implementation might explore the schema that was used to
validate a document to discover type hierarchies dynamically."
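
To make the distinction concrete, here is a minimal Python sketch of a
derives-from check that consults the ISSD first and falls back to an optional
dynamic lookup. Everything in it is an assumption made for illustration: the
name derives_from, the dictionary representation of type hierarchies, and the
dynamic_base callback are hypothetical. Returning False when no mechanism is
available is a simplification of the sketch, not a statement about what the
specification requires.

```python
def derives_from(actual, expected, issd_base, dynamic_base=None):
    """Return True if `actual` is `expected` or is derived from it.

    issd_base: maps each type in the ISSD to its base type (None at the root).
    dynamic_base: optional callable giving the base type of a type discovered
    at run time (for example from the schema used to validate a document),
    or None if it cannot resolve the type.
    """
    current, seen = actual, set()
    while current is not None and current not in seen:
        if current == expected:
            return True
        seen.add(current)
        if current in issd_base:
            current = issd_base[current]      # statically known step
        elif dynamic_base is not None:
            current = dynamic_base(current)   # implementation-dependent step
        else:
            return False                      # unknown type, no mechanism
    return False


# Assumed setup: the module imports only S2; the document was validated
# against a schema that also defines s1:heptagon as a restriction of
# s2:polygon.
issd = {"s2:polygon": None}
document_schema = {"s1:heptagon": "s2:polygon"}
```

With this setup, derives_from("s1:heptagon", "s2:polygon", issd) is False,
while passing dynamic_base=document_schema.get makes it True, because the
hierarchy is discovered from the document's schema and not from the static
context.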

> That's pretty contradictory, until you realize that it's
> using "known" to mean "types in the ISSD". If we rewrite it to say:
> 
> "The given schema type may be "in-scope" (defined in the in-scope schema
> definitions), or "out-of-scope" (not defined in the in-scope schema
> definitions). An out-of-scope schema type might be encountered, for example, if
> a source document has been validated using schema components in a namespace
> that was not imported into the static context. In this case, an implementation
> is allowed (but is not required) to provide an implementation-dependent
> mechanism for determining whether the not-in-scope schema type is derived from
> the expected schema type. For example, an implementation might maintain a data
> dictionary containing information about type hierarchies."
> 
> then it seems to me to make a lot more sense.

I don't think this "in-scope", "out-of-scope" distinction adds clarity. Nothing
in the specification of our language depends on this distinction. An
implementation can choose to make this distinction in its documentation of
static typing extensions, or its documentation of the implementation-defined
winged horse extensions.

Jonathan



Received on Tuesday, 17 March 2009 09:52:56 UTC