- From: Rick Jelliffe <ricko@topologi.com>
- Date: Sat, 3 Aug 2002 17:03:16 +1000
- To: <w3c-xml-plenary@w3.org>, <w3c-i18n-ig@w3.org>
- Cc: <xml-editor@w3.org>, <w3c-xml-core-wg@w3.org>
From: "John Cowan" <jcowan@reutershealth.com>

> The W3C XML Core WG has decided to allow the value of xml:lang, the
> attribute for indicating the natural language of character data, to
> be an empty string in order to allow the explicit expression of
> language-less text inside language-marked text. Here's an example:
>
> <p lang="en">
> Here is an example of some C code:
> <pre xml:lang="">
> #include "stdio.h"
> main() {printf("Hello world!"};}
> </pre>
> </p>
>
> By the present rules, there is no way to express the fact that the
> content of the pre element is not in English. (Computer languages are out
> of scope for RFC 3066 and have no codes.)

I am in favour of an erratum to XML 1.0 saying 'xml:lang="" means unknown
or undefined'.

However, I do not believe it should apply in the example given. The
xml:lang attribute should merely be a general hint for font selection,
speech synthesizers, indexing robots etc., and only needs to be extended
as far as supporting those kinds of needs.

When some text is not in a natural language, the best practice should be
to mark it up with an attribute that clearly specifies its notation. We
need a mechanism for positive markup, not negative markup.

Here is where I suggest we need to end up:

<p lang="en">
Here is an example of some C code:
<pre xsi:type="c-notation" >
#include "stdio.h"
main() {printf("Hello world!");}
</pre>
</p>

<p lang="en">
Here is an example of some C code:
<pre xsi:type="c-notation" xml:lang="de" >
#include "stdio.h"
main() {printf("Etwas anderes!");}
</pre>
</p>

where the type in the schema specifies some appropriate MIME type or
the FPI of a notation. (This, yet again, shows the real weakness of
XML Schemas for use in practical publishing, where we want a schema
language to be able to say interesting things about mixed content
just as much as we want to constrain so-called simple types.)

But let me step back, and suggest that there is a deeper issue here,
providing a solution to which would help XML users and vendors.
The scoping of the effect of attributes, whether W3C-defined or
user-defined, should have a systematic solution. Addressing it piecemeal
in this fashion just creates a spaghetti of special cases: namespaces,
xml:lang, xml:space, xml:base, etc.

The fact that scoping is important has been obscured by specifications
such as DOM and Infoset (which work at the level before scoping and
inheritance take effect) and XML Schemas and XQuery (which are trying
to limit their domain to atomics of data in trees).

The result is that document types which make use of scoping and
inheritance either have to have specific APIs which build these in,
or they have to use far more complex XPaths in which the inheritance
is built into the query. So rather than being able to say:

  x/in-scope-attribute::y

we have to have

  x/ancestor-or-self::*[self::a or self::b][1]/attribute::y

XML needs a scoping language for specifying the scoping properties and
behaviours of attributes, not only the W3C built-in ones such as xml:lang.

Other use cases for such a language might be to express what goes on
in SVG, and to express inherited values of attributes. A possible use
case might be related to efficient queries, to know when an implementation
should provide parent pointers or not.

Cheers
Rick Jelliffe
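
The ancestor-walking lookup that the message says each application currently has to build for itself can be sketched roughly as follows. This is only an illustration in Python's standard xml.etree.ElementTree, not anything proposed in the message: the function name in_scope_attribute, the sample document, and the choice to return the empty string unchanged for xml:lang="" are all assumptions made for the example. ElementTree keeps no parent pointers, so the sketch has to build a child-to-parent map first, which happens to echo the point about parent pointers above.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def in_scope_attribute(root, element, name):
    """Return the value of `name` on `element` or its nearest ancestor.

    Hypothetical helper: returns None when no element on the path to the
    root carries the attribute. An empty string (e.g. xml:lang="") is
    returned as-is, so the caller can treat it as "unknown or undefined".
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = element
    while node is not None:
        if name in node.attrib:
            return node.attrib[name]
        node = parents.get(node)
    return None


# Hypothetical sample document, loosely modelled on the examples above.
doc = ET.fromstring(
    '<doc xml:lang="en">'
    "<p>Here is an example of some C code:"
    '<pre xml:lang="">main() {}</pre>'
    '<pre xml:lang="de">main() {}</pre>'
    "<pre>main() {}</pre>"
    "</p></doc>"
)
p = doc.find("p")
pres = p.findall("pre")

lang_of_p = in_scope_attribute(doc, p, XML_LANG)              # inherited
lang_of_reset = in_scope_attribute(doc, pres[0], XML_LANG)    # explicit ""
lang_of_german = in_scope_attribute(doc, pres[1], XML_LANG)   # overridden
lang_of_plain = in_scope_attribute(doc, pres[2], XML_LANG)    # inherited
```

With a declarative scoping language of the kind argued for above, this per-application walk would presumably be replaced by a single generic lookup driven by the declared scoping rules for each attribute.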
Received on Saturday, 3 August 2002 02:48:12 UTC