- From: <bob_buxton@uk.ibm.com>
- Date: Wed, 4 Aug 1999 18:53:27 +0100
- To: www-xml-schema-comments@w3.org, w3c-xml-schema-ig@w3.org
I realize that the authors have deliberately excluded some of these issues from the current drafts; however, I feel that we at least need to establish some signposts towards the road ahead! Comments are based on the 6-May-1999 draft.

Lexical representation, Internationalization:

In my view the schema serves (at least) two distinct purposes:

For a generic schema-aware tool such as an XML editor, the schema should allow the editor to provide considerable added value compared to a merely DTD-aware application. For example, if it sees a date or number based data type it can accept input in the format appropriate for the user's locale and preferences and convert it into the appropriate representation for storage in the XML document. For user-defined data types and enumerations there needs to be additional information in the schema, or some form of side-information, to allow the editor to know what alternatives can be used. You might wish to validate that a post code is 5 or 9 digits for a US user whilst it is a mixture of letters and numbers for a British user, and to give a French user the option of entering Verte/Rouge in a colour choice selection.

For the writer of an application which uses a validating parser to interpret an XML document, it removes the need for the application itself to validate the contents of the element or attribute. The parser is not aware of the date and decimal point conventions of the document's creator, so it can only validate that the data fits within the declared lexical representation; it cannot be expected to interpret ambiguous date or number formats. The application does not want to know what options the user was given at document creation time: it will expect to handle colours Green/Red even if the user originally entered Verte/Rouge.

I think there is a need for a Map function to equate the values of an enumeration with another enumeration to allow for aliases.
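A minimal sketch of what such a Map function might do, written in Python rather than in schema syntax. The names CANONICAL_COLOURS, COLOUR_ALIASES, build_alias_map and canonical_colour are hypothetical and not part of any draft; the sketch assumes that each alias list is given in the same order as the canonical enumeration, so that values can be equated by position.

```python
# Hypothetical data: the canonical enumeration the application expects,
# plus one alias enumeration per locale, listed in the same order.
CANONICAL_COLOURS = ["Green", "Red"]
COLOUR_ALIASES = {"fr": ["Verte", "Rouge"]}


def build_alias_map(canonical, alias_lists):
    """Equate every alias literal with the canonical literal at the same position."""
    mapping = {value: value for value in canonical}  # canonical values map to themselves
    for locale, literals in alias_lists.items():
        if len(literals) != len(canonical):
            raise ValueError(f"alias list for {locale!r} has the wrong length")
        for alias, value in zip(literals, canonical):
            mapping[alias] = value
    return mapping


COLOUR_MAP = build_alias_map(CANONICAL_COLOURS, COLOUR_ALIASES)


def canonical_colour(literal):
    """Return the value an application would see, whatever alias the user entered."""
    try:
        return COLOUR_MAP[literal]
    except KeyError:
        raise ValueError(f"{literal!r} is not in the colour enumeration") from None
```

With this table an editor could offer COLOUR_ALIASES["fr"] to a French user, while the application only ever receives Green or Red.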
Mapping an enumeration onto one with an ordered base type would then allow for comparison between enumerated values.

Example: A size enumeration

<datatype name="sizeEnglish">
  <basetype name="string"/>
  <enumeration>
    <literal>Extra Small</literal>
    <literal>Small</literal>
    <literal>Large</literal>
    <literal>Extra Large</literal>
  </enumeration>
</datatype>

<datatype name="sizeEngAbbr">
  <basetype name="string"/>
  <enumeration>
    <literal>XS</literal>
    <literal>Sl</literal>
    <literal>L</literal>
    <literal>XL</literal>
  </enumeration>
</datatype>

<datatype name="sizeFrench">
  <basetype name="string"/>
  <enumeration>
    <literal>Tres Petite</literal>
    <literal>Petite</literal>
    <literal>Grande</literal>
    <literal>Tres Grande</literal>
  </enumeration>
</datatype>

<datatype name="sizeCode">
  <basetype name="integer"/>
  <enumeration>
    <literal>10</literal>
    <literal>20</literal>
    <literal>30</literal>
    <literal>40</literal>
  </enumeration>
</datatype>

We need a syntax that allows us to say that the four enumerations are equivalent, with one being the one to be returned to an application by a parser whilst the others can be used as alternatives. There would also need to be a way for an XML editor application to know that a French user would wish to see the sizeFrench list in a pull-down selection list whilst an English speaker would wish to see sizeEnglish and/or sizeEngAbbr.

Similarly, there is a need for an XML editor application to be able to choose the appropriate lexical representation out of several possibilities for the user's locale and preferences.

Documentation:

There is a need for several different types of documentation associated with a schema, and for the documentation to exist in the national languages of the users of the schema.
It is probably desirable that some of the documentation be kept in the schema itself (especially where translation is not expected to be a requirement), but it should also be possible to keep documentation in separate documents, with a straightforward way of linking to the documentation in the appropriate language. I don't regard coding a URI for each of the French, Spanish, German ... documents in each elementType definition as straightforward, especially if I want to add Japanese documentation at a later date.

As for the types of documentation required, I can see the need for the following:

- Design/programming information for use by the schema designers and those writing applications based on the schema (using an HTML subset).
- Short text description: a one-line plain text title for an element/attribute that an application could use in place of the element/attribute name as a more meaningful label.
- Long text description: one or more paragraphs, formatted using a subset of the HTML tag language, to be displayed as a result of a context-sensitive help button. It might include links to even more detailed information.
- Icons that an application might wish to use to represent the element/attribute in, for example, a tool palette.

All documentation is, of course, optional.

Versioning:

Currently there is a single version='M.n' attribute on the schema element, which does not give any indication as to what may have changed since the previous version of the schema. I would like to see a more formal change control methodology introduced, to be able to mark up a schema and show what was changed, by whom, when and why. This would have value for human readers of the schema, avoiding the need to find and compare the old version of the schema, but it is much more important when you have two applications communicating and they might understand different versions of the schema.
To prevent the down-level application being sent data that it can't understand, the higher-level application may wish to send data that fits the previous level of the schema. This is easier to achieve using change flags than by attempting to compare two schemas at run time. We would need to be able to mark what was new in the schema and what was deleted, and to express a change as a delete of the old followed by an add of the new.

A possible syntax might be:

<schema version="1.2" ...>
  <changehistory>
    <version name="1.1" by="Me" date="1999-08-04">Add the panda element</version>
    <version name="1.2" by="AN other" date="2000-01-01">Remove the widget attribute from the panda element</version>
  </changehistory>
  ...
  <change version="1.1" type="add">
    <elementType name="panda">
      ...
      <change version="1.2" type="delete">
        <attrDecl name="widget"/>
      </change>
    </elementType>
  </change>

Background:

CPSM is the System Management component of IBM's CICS Transaction Server. It collects data from CICS transaction servers running in a network and uses it for automated operations, for passing to application programs, and for formatting to give human operators a single point of control. Since the network is potentially global, it is not realistic or desirable to mandate that the servers all run at the same product level, hence the interest in internationalization and version control.

Currently the data is transported as simple data structures, and we provide C, Cobol, PL/I and Assembler mappings for applications to use. I am looking to define schemas for the data structures and then generate the various programming language mappings from the schemas. I would be interested in hearing of any existing work in this area.

Bob Buxton
CPSM development, MP 208, Hursley
Ext 248193, External 01962-818193
Received on Wednesday, 4 August 1999 13:54:42 UTC