- From: Martin Duerst <duerst@w3.org>
- Date: Mon, 07 Mar 2005 01:33:31 +0900
- To: "Yves Savourel" <ysavourel@translate.com>, <public-i18n-its@w3.org>
[I just renamed this thread to "localization properties". I started writing the mail a week or so ago; I hope it's still useful.]

At 05:51 05/03/02, Yves Savourel wrote:
>
>> Can we say then that ITS does not try to cover the localisation
>> properties case (separate file, with information that can be
>> applied uniformly across the entire XML instance file)? And focus
>> only on l10n directives (embedded information in XML instance file)?
>
>Yes. As Martin mentioned, that is specified in the charter.

I very much agree, except for the word 'case' at the end of Masaki's first sentence. The WG is definitely not chartered to work on localization properties. But this should not mean that the WG should not try, to some extent, to cover the "localization property *case*". I could, for example, imagine something like the following being included in the guidelines we produce:

"If you design (your application-specific) markup, try to do it so that e.g. different elements are used for translatable and non-translatable text, if you can make this match reasonably well with your application needs."

In other words, we should give whatever advice we can to schema designers so that the resulting schema is well internationalized and documents can easily be localized. In some cases, that may be because the application-specific markup covers these needs, and this application-specific markup can be picked up by localization properties. In other cases, it will be because the schema uses the tags and attributes we provide, and these can be picked up directly by a tool (we can imagine that the tool will have built-in localization properties for what we define).

Masaki also characterizes the "localization property case" as a "separate file, with information that can be applied uniformly across the entire XML instance file". Localization properties are definitely separate from the primary document. But I'm not totally sure about "uniformly across the entire instance".

First, localization properties should usually apply to a set of instances using the same schema. If one has to redesign/rewrite localization properties for each instance, then that might be an indication that our guidelines have failed. (Of course there may be cases where two or more different localization property sets are applied to the same schema. An example I can imagine, based on my limited experience in the localization industry, would be a document containing both UI text and explanatory text (not in itself necessarily a good idea), where most of the text is translated into some major languages, whereas only the UI part is translated into some other languages.)

The other point is that localization properties may be designed so that they are more selective (e.g. using something like XPath or CSS selectors) and apply less uniformly to a document. A simple example might be that the contents of certain elements are translated in the body of the document, but not in the header.

>However, we probably have to think about schemas (like XSD). One can imagine
>having our tag set used there (for translating the documentation inside the
>schema, for example).

That is a very good use case example, but should not be a special case, because it's text that may need to be translated like any other text.

>But a schema seems also somewhat of a logical place to specify
>"properties" associated with elements and attributes.

That's a very interesting idea. In general, I would say that how to embed localization properties into a schema should be worked on by whatever group takes care of localization properties. But it's definitely an area where we should make sure we are well coordinated.

>It's actually already the
>case: For example, Martin mentioned a mechanism to define what characters are
>allowed in a given element, there are also existing ways to set length
>limitations, etc. I'm not sure yet how much we should worry about that aspect
>of "embedded properties".

I don't think I have thought this specific example through. The following criteria come to mind for me:

1) static vs. dynamic: some requirements or issues can be seen or defined in a static way. Length would be such an example. I can just say that the resulting field should not be longer than some value. A tool can try to pick that up in any way it wants. But it can also be easily tested before or after localization, independent of the actual localization that went on. Other issues are more dynamic, e.g. whether something gets translated or not. There are probably ways to check the result automatically, but that's getting into more heuristics than we may want to.

2) schema vs. (document) instance vs. individual document or item, ...

3) generic technology vs. specific technology: If something is already in XML Schema, or we think that's the place it belongs because it's much more general than internationalization/localization, then we should reuse it, or work with the XML Schema WG to get it in there.

>Maybe you could concentrate first on identifying the different issues,
>regardless of whether their solutions will need to be specified inside the
>document instance or outside as a general property (or both). It's probably
>reasonable to assume that only few, if any, need to be solved only as a
>general property.

Yes. And even for those that are always solved via properties, we may want to give advice to DTD designers.

BTW, I was planning to use this mail also to say how unhappy I was with the term "localization properties". But somehow I made the connection with CSS properties, and the term now makes sense to me. Maybe that parallel with CSS properties was always obvious for some of you, but I think it would be good to call it out for those who don't get it that quickly (like me).

Regards,    Martin.

Received on Sunday, 6 March 2005 16:39:33 UTC