Re: XML Schema: Have we raised "LocalizeGIs" during the last call phase? from C. M. Sperberg-McQueen on 2000-10-07 (www-xml-schema-comments@w3.org from October to December 2000)

From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
Date: Sat, 07 Oct 2000 05:00:09 -0600
To: muraw3c@attglobal.net
Cc: w3c-i18n-ig@w3.org, www-xml-schema-comments@w3.org
Message-Id: <4.3.2.7.1.20001007044100.02191468@espanola.com>

At 2000-10-06 21:24, muraw3c@attglobal.net wrote:
>I thought that localization of GIs is in the list of last-call
>issues.  Apparently, it is not.
>
>It was listed in "XML Schema Issues List" at:
>
>http://www.w3.org/XML/Group/xmlschema-current/issues.html#localizeGIs
>
> >Should it be possible to specify alternative natural-language forms
> >for generic identifiers (aka element type names), names of simple and
> >complex types, notation names, etc.? This would be very useful in
> >localization.
>
>Has we (the I18N IG) raised this issue during the last call phase?

I believe not.

A language-neutral schema can be created, however, and localized
in the following straightforward way:

  - in the language-neutral schema, all elements are declared
    abstract, but otherwise the schema is entirely ordinary.
  - every element which ought, in a localized version of the
    schema, to be a concrete element, is declared as a global
    element that does not block substitution, so that it can
    head a substitution group
  - localized versions of the schema include the language-neutral
    part, and then declare concrete elements with names in the
    local language; each concrete element is declared as being
    in the substitution group of the corresponding abstract element.

For example:  I define a language-neutral schema with abstract
elements named 'marc110', 'marc245', and 'marc500' (and others).
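
A minimal sketch of what such a language-neutral schema document
might look like, written in the syntax of the final XML Schema 1.0
Recommendation (the 2001 namespace, which postdates this message).
The file name 'marc-neutral.xsd', the container element
'marcRecord', its type 'recordType', and the choice of xs:string
content are all assumptions made for illustration; they are not
taken from the message.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- marc-neutral.xsd (hypothetical file name): language-neutral part -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Abstract heads: these names may not appear in instance
       documents; only members of their substitution groups may. -->
  <xs:element name="marc110" type="xs:string" abstract="true"/>
  <xs:element name="marc245" type="xs:string" abstract="true"/>
  <xs:element name="marc500" type="xs:string" abstract="true"/>

  <!-- Assumed content model, expressed only in terms of the heads -->
  <xs:complexType name="recordType">
    <xs:sequence>
      <xs:element ref="marc110" minOccurs="0"/>
      <xs:element ref="marc245"/>
      <xs:element ref="marc500" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- The container is abstract too, so that it can be localized -->
  <xs:element name="marcRecord" type="recordType" abstract="true"/>

</xs:schema>
```

Because the content model refers to the abstract heads by 'ref',
any element later placed in their substitution groups becomes
valid in the same positions.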

I define an English-language schema by writing a schema document
which (a) includes the language-neutral schema (I suppose it
might import it, instead -- I'm not sure it matters, in this
example), and (b) defines English-named elements 'author'
(in the substitution group of 'marc110'), 'title' (= 'marc245'),
and 'subject' (= 'marc500').
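
As a sketch, the English-language schema document could look
roughly like this. It assumes that the language-neutral schema
lives in a hypothetical file 'marc-neutral.xsd' with no target
namespace, and that it declares the abstract elements 'marc110',
'marc245', and 'marc500' plus an abstract container 'marcRecord';
the container and the English name 'record' are illustrative
assumptions, not part of the message.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- marc-en.xsd (hypothetical file name): English-language layer -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- (a) pull in the language-neutral declarations -->
  <xs:include schemaLocation="marc-neutral.xsd"/>

  <!-- (b) concrete English names; with no type given, each element
       takes the type of its substitution-group head -->
  <xs:element name="record"  substitutionGroup="marcRecord"/>
  <xs:element name="author"  substitutionGroup="marc110"/>
  <xs:element name="title"   substitutionGroup="marc245"/>
  <xs:element name="subject" substitutionGroup="marc500"/>

</xs:schema>
```

Under those assumptions, an instance document would use only the
English names:

```xml
<record>
  <author>An Author</author>
  <title>A Title</title>
  <subject>A Subject</subject>
</record>
```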

(If any librarians are reading this, please forgive any errors
I have made in remembering whether 110 is a corporate or
personal author, and in associating 500 casually with 'subject'.
I am working without a MARC manual handy.)

I can define a German-language schema in a similar way, defining
'Autor', 'Titel', and 'Schlagwort' elements.
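
The German layer, sketched under the same assumptions (a
hypothetical 'marc-neutral.xsd' with no target namespace that
declares the abstract heads and an abstract container
'marcRecord'). 'Datensatz' is an invented name for the container;
only 'Autor', 'Titel', and 'Schlagwort' come from the message.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- marc-de.xsd (hypothetical file name): German-language layer -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:include schemaLocation="marc-neutral.xsd"/>

  <!-- Each German name joins the substitution group of the same
       abstract head that the English name would join -->
  <xs:element name="Datensatz"  substitutionGroup="marcRecord"/>
  <xs:element name="Autor"      substitutionGroup="marc110"/>
  <xs:element name="Titel"      substitutionGroup="marc245"/>
  <xs:element name="Schlagwort" substitutionGroup="marc500"/>

</xs:schema>
```

The shared heads are what make the relation traceable: 'Autor' and
an English 'author' both point at 'marc110'.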

This has the advantage of providing a visible, traceable
relation among the language-specific versions of the schema.
It has the disadvantage (if it is a disadvantage) of requiring
the schema author to plan ahead.

If the schema author has not planned ahead, the same technique
is still possible, but because 'marc110' was not declared
abstract, it is then legal for 'marc110' elements to occur in
instance documents, together with 'author' elements.
(It is late, and I'm tired -- Henry Thompson may have suggested
a method of preventing this from happening, but I cannot remember
whether he did, or I only wished he had, and if he did I
cannot remember what it was.)
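
To make the unplanned case concrete, here is a sketch of a schema
whose original declarations are concrete (no abstract="true"),
with localized names added afterwards. The container 'record' and
its content model are assumptions for illustration, and the two
layers are shown in one document only for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Existing declarations, written without localization in mind:
       'abstract' defaults to false, so these names stay usable -->
  <xs:element name="marc110" type="xs:string"/>
  <xs:element name="marc245" type="xs:string"/>

  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="marc110" maxOccurs="unbounded"/>
        <xs:element ref="marc245"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Localized names added later -->
  <xs:element name="author" substitutionGroup="marc110"/>
  <xs:element name="title"  substitutionGroup="marc245"/>

</xs:schema>
```

Against this schema, a document that mixes the two vocabularies is
valid:

```xml
<record>
  <marc110>First Author</marc110>
  <author>Second Author</author>
  <title>A Title</title>
</record>
```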

Renaming of attributes, on the other hand, is not currently
feasible.  This I regard as an irritating limitation -- but
then, in the six years since the TEI Guidelines were published,
I cannot remember anyone complaining about the fact that, in the
TEI DTD, it is possible to provide localized names for element
types, but not for attributes.  In fact, I don't believe
anyone has ever used the TEI element-renaming facilities in a
production DTD, which continues to surprise me (but then,
most production uses of TEI are in English-speaking countries).

Another alternative is to use architectural forms to establish
the mapping one needs between a schema and a localized version
of that schema.

-C. M. Sperberg-McQueen
  World Wide Web Consortium

Received on Saturday, 7 October 2000 01:02:30 UTC