- From: Jeni Tennison <jeni@jenitennison.com>
- Date: Sun, 14 Apr 2002 12:07:58 +0100
- To: james anderson <james.anderson@setf.de>
- CC: xml-dev@lists.xml.org, xml-names-editor@w3.org, www-xml-blueberry-comments@w3.org
Hi James,

You're proposing enhancing the processing that goes on in XML parsers,
correct?

My main question is: how does the parser get to know about which colons
in element and attribute values indicate QNames, and which are literal
colons? For example, given:

  <my_document xml:qnames="resolve">
    <para type="foo:bar">
      Here: a colon that doesn't indicate a QName. And a literal string
      that looks like a QName, but isn't: <xmp>my:type</xmp>.
    </para>
  </my_document>

How does the parser know that the QName in the type attribute should be
resolved (and constitutes a namespace error since its prefix isn't
declared), but that the first colon in the content of the para element
isn't a malformed QName, and that 'my:type', which looks like a QName,
isn't intended to be resolved?

This occurs even within a particular value: given an XPath, it is not
enough to simply go through and change everything that looks like a
QName into a resolved qualified name. For example:

  <xsl:value-of select="foo:bar/*[name() = 'fred:barney']" />

In the select attribute, there are two things that look like QNames:
"foo:bar" and "fred:barney". However, "fred:barney" is a literal string,
and therefore shouldn't be resolved. To be able to spot that it
shouldn't resolve "fred:barney", a QName-in-content-aware parser would
have to know that an attribute was an XPath attribute, *and* be able to
parse that XPath attribute so that it could recognise that "fred:barney"
was in a literal string and that therefore no resolution should be
attempted.

XML Schema deals with this partially. Elements and attributes that are
of the type xs:QName are resolved during parsing by schema validators,
and the resolved qualified name is passed through to the application in
the PSVI. What XML Schema doesn't help with is dealing with XPaths or
attributes that hold namespace prefixes (such as
'extension-element-prefixes' in XSLT).
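
To illustrate the XPath point above, here is a minimal Python sketch (not taken from any existing parser) that separates prefixed names inside string literals from those outside them. The function names `qname_candidates` and `naive_candidates` and the ASCII-only name pattern are assumptions made for this example; a real processor would need the full XPath grammar and the full NCName production.

```python
import re

# Simplified, ASCII-only stand-in for NCName (an assumption for this sketch).
NCNAME = r"[A-Za-z_][A-Za-z0-9_.\-]*"
PREFIXED = re.compile(r"%s:%s" % (NCNAME, NCNAME))

# Either a quoted string literal, or a prefixed name outside any literal.
TOKEN = re.compile(
    r"""(?P<literal>"[^"]*"|'[^']*')|(?P<qname>%s:%s)""" % (NCNAME, NCNAME)
)


def naive_candidates(text):
    """Everything that merely looks like a QName, literal or not."""
    return PREFIXED.findall(text)


def qname_candidates(xpath):
    """Return (resolvable, quoted): prefixed names found outside and
    inside string literals, respectively."""
    resolvable, quoted = [], []
    for match in TOKEN.finditer(xpath):
        literal = match.group("literal")
        if literal is not None:
            # Names inside a literal are plain strings; do not resolve them.
            quoted.extend(PREFIXED.findall(literal))
        else:
            resolvable.append(match.group("qname"))
    return resolvable, quoted


select = "foo:bar/*[name() = 'fred:barney']"
resolvable, quoted = qname_candidates(select)
print("naive:     ", naive_candidates(select))
print("resolvable:", resolvable)
print("quoted:    ", quoted)
```

Even this much only works because the code has been told the value is an XPath; a parser with no knowledge of the markup language has nothing to tell it which lexical rules apply to which attribute.
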
XML Schema can say that a value is a QName or a space-separated list of
QNames, but it can't say that a value is an XPath, a prefix or a list of
prefixes, or for that matter a comma-separated list of QNames and other
uses of prefixes or QNames in values that we haven't thought of yet.

Still, there might be a solution around here somewhere. Certainly,
standardising the lexical representation "{namespace-uri}local-name" for
QNames would help a whole lot in other areas, and might facilitate
incremental resolution of QNames in content and attribute values.
(Although perhaps "{namespace-uri}prefix:local-name" would be better --
the prefix might be meaningless to a processor, but it's almost always
meaningful to people and therefore important to keep around.)

Another possibility (which I expect to get shot down immediately) would
be to treat all colons that are not preceded or followed by whitespace
as indications of QNames, which would mean adding a sixth built-in
entity, &cln;, say, to escape literal colons. The examples above would
then be:

  <my_document>
    <para type="foo:bar">
      Here: a colon that doesn't indicate a QName. And a literal string
      that looks like a QName, but isn't: <xmp>my&cln;type</xmp>.
    </para>
  </my_document>

  <xsl:value-of select="foo:bar/*[name() = 'fred&cln;barney']" />

This does have the advantage that a basic namespace-aware parser, with
no knowledge of schemas or the particular markup language, would be able
to know which QNames to resolve and which to leave alone. It still
wouldn't, however, be able to deal with attributes holding namespace
prefixes not involved in QNames.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/
Received on Sunday, 14 April 2002 07:08:03 UTC