Re: Updated Working Draft "Best Practices for XML Internationalization"

From: Mark Davis <mark.davis@icu-project.org>
Date: Sat, 30 Jun 2007 11:45:27 -0700
Message-ID: <30b660a20706301145h6dbc7e6dm2ba9733f21cff695@mail.gmail.com>
To: "CE Whitehead" <cewcathar@hotmail.com>
Cc: www-international@w3.org, fsasaki@w3.org
The title appears misleading. There are multiple ways to internationalize
XML documents. Only a few of the practices are general; the thrust of the
document appears to be using ITS to do so, so a more apt title would be

>Best Practices for XML Internationalization
=>
Best Practices for XML Internationalization using ITS

> Include xml:lang <http://www.w3.org/TR/REC-xml/#sec-lang-tag> in your DTD
or schema to allow to specify the natural language of the content
=>
Where necessary, include xml:lang <http://www.w3.org/TR/REC-xml/#sec-lang-tag>
in your DTD or schema to allow to specify the natural language of the
content.

[Why? Because an XML document that contains only locale-independent
information, such as inventory counts of part numbers, has no need for this.
Ditto below.]

> Make sure the xml:lang <http://www.w3.org/TR/REC-xml/#sec-lang-tag>
attribute is available for the root element of your document, and for any
element where a change of language may occur.

=>
If your documents can contain text in different languages, make sure the
xml:lang <http://www.w3.org/TR/REC-xml/#sec-lang-tag> attribute is available
for the root element of your document. If a single document can contain
mixed languages, make sure it is available for any element where a change
of language may occur.

Same changes for other cases, like #2, #7,...
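
To make the xml:lang suggestions above concrete, here is a minimal sketch
of how a consumer might resolve the language in effect for each element.
It assumes Python's standard xml.etree.ElementTree; the helper name
effective_langs and both sample documents are hypothetical and do not come
from the draft.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_langs(root):
    """Return (tag, language) pairs, inheriting xml:lang from ancestors."""
    pairs = []

    def walk(elem, inherited):
        # An element without its own xml:lang takes the nearest ancestor's.
        lang = elem.get(XML_LANG, inherited)
        pairs.append((elem.tag, lang))
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return pairs


# Hypothetical mixed-language document: declared on the root, overridden once.
MIXED = (
    '<doc xml:lang="en"><p>Raise the vehicle.</p>'
    '<p>In French: <q xml:lang="fr">cric</q></p></doc>'
)
# Hypothetical locale-independent document: nothing to declare.
INVENTORY = '<inventory><part id="t123" count="40"/></inventory>'

print(effective_langs(ET.fromstring(MIXED)))
print(effective_langs(ET.fromstring(INVENTORY)))
```

The inventory document resolves to no language at all, which is the case
where requiring xml:lang in the schema buys nothing.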

>Best Practice 19: Use CDATA sections with caution
<http://www.w3.org/TR/2007/WD-xml-i18n-bp-20070628/#AuthCDATA>
I'd like to see this be:
=> Best Practice 19: Avoid CDATA sections wherever possible
<http://www.w3.org/TR/2007/WD-xml-i18n-bp-20070628/#AuthCDATA>
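
One possible motivation for the stronger wording, offered as an assumption
rather than as something stated in this message: inside a CDATA section
neither inline markup nor numeric character references are recognized, so
a language change cannot be marked there. A small sketch using Python's
standard xml.etree.ElementTree, with made-up sample strings:

```python
import xml.etree.ElementTree as ET

# Hypothetical content: the same paragraph with and without a CDATA wrapper.
with_cdata = ET.fromstring(
    '<p><![CDATA[Say <q xml:lang="fr">bonjour</q> &#233;]]></p>'
)
without_cdata = ET.fromstring(
    '<p>Say <q xml:lang="fr">bonjour</q> &#233;</p>'
)

# Inside CDATA everything is literal character data: no child elements,
# and the character reference stays as the six characters "&#233;".
print(len(with_cdata), repr(with_cdata.text))

# Outside CDATA the <q> element is parsed and the reference is resolved.
print(len(without_cdata), repr(without_cdata[0].tail))
```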

> Best Practice 21: Ensure any inserted text is context-independent
<http://www.w3.org/TR/2007/WD-xml-i18n-bp-20070628/#AuthInsText>
While nice in theory, this is impractical: people need to substitute
variable values. The phrasing approaches the problem from the wrong end.
Instead of saying:
> Make sure any piece of inserted text is grammatically independent of its
surrounding context.
You need to say:
=> Structure surrounding text so that inserted text will be grammatically
independent.

That is, for each example of bad practice, you need to show how to
restructure it as good practice. E.g.

Bad:
<p>Using an <term conref="termbase#t123"/> raise the vehicle from the
ground.</p>
Good:
If the terms to be substituted form a small, closed set, then replace the
message with one fully substituted phrase per term.
OR
Restructure the text using a "form" style, where the substituted term is
separated grammatically from the rest of the sentence.
<p>Raise the vehicle from the ground, by using: <term
conref="termbase#t123"/></p>
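
The two restructurings above can be sketched as follows. This is only an
illustration: the term list, the templates as Python format strings, and
the render helper are all hypothetical stand-ins for the conref mechanism
in the example.

```python
# Hypothetical termbase entries; the names are illustrative only.
TERMS = ["inclined ramp", "jack", "hoist"]

# Bad: the article "an" is baked into text surrounding the inserted term.
INLINE = "Using an {term} raise the vehicle from the ground."

# Good, option 1: a small, closed set gets one complete message per term,
# so each message can be written and translated as a whole sentence.
FULL_MESSAGES = {
    "inclined ramp": "Using an inclined ramp, raise the vehicle from the ground.",
    "jack": "Using a jack, raise the vehicle from the ground.",
    "hoist": "Using a hoist, raise the vehicle from the ground.",
}

# Good, option 2: "form" style, with the term grammatically detached.
FORM = "Raise the vehicle from the ground, by using: {term}"


def render(template, term):
    """Substitute a term into a message template."""
    return template.format(term=term)


for term in TERMS:
    print(render(INLINE, term))   # breaks for "jack" and "hoist"
    print(FULL_MESSAGES[term])    # always a complete sentence
    print(render(FORM, term))     # term stands apart from the grammar
```

The inline template only reads correctly for terms that happen to take
"an"; the other two forms do not depend on the grammar of the inserted term.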

Mark

On 6/30/07, CE Whitehead <cewcathar@hotmail.com> wrote:
>
>
>
>
> I'm just commenting on the English; sorry; and I have only gotten through
> BP1, I will try to get to  look at more of this sometime this weekend or
> something!
>
> (This is for the draft,
> http://www.w3.org/TR/2007/WD-xml-i18n-bp-20070628/#DevLang)
>
>
> * * *
> Best Practice One
> RULE:
>
> "to allow to specify the natural language of the content."
>
>
> Aggh!  Do not use "to" twice like this without a new subject.
> Je dirais ici/I would say in this case:
>
> "to allow one to specify the natural language of content"
>
> (I inserted a second subject, "one")
>
> but never
> "to allow to specify"
>
> Alternatively, without using a second subject for the second infinitive,
> "to
> specify,"
> I'd say,
>
> "to allow specification of "
>
> * * *
>
> NOTE:
>
> "Note: The scope of the xml:lang attribute applies to both the attributes
> and the content of the element where it appears, therefore one cannot
> specify different languages for an attribute and the element content. ITS
> does not provide remedy for this. Instead, it is recommended to not use
> attributes for translatable text."
>
> "does not provide remedy" is perfectly understandable & does not sound
> that
> bad in English (or I have gotten used to English as spoken by non-natives;
> someone at TESOL asked whose English is it anyway; it belongs to the users
> after all);
> but I'd say, "does not provide a remedy"
>
> * * *
>
> Again, NOTE:
>
> "Note: If not the language of the content, but a natural language value as
> data or meta-data about something external to the document has to be
> specified, an attribute different from xml:lang (like hreflang in XHTML)
> should be used."
>
>
> "because data or meta-data . . . "
>
> also a comma before and or because would help make this sentence more
> readable!
>
> * * *
>
> WHY DO THIS?
>
> "It is not recommended to use your own attribute or element to specify the
> language of the content. The xml:lang attribute is supported by various
> XML
> technologies such as XPath and XSL (e.g. the lang() function). Using
> something different would diminish the interoperability of your documents
> and reduce your capability to take advantage of some XML applications."
>
>
> TRY:
>
>
> "It is not recommended that you use" ???
>
> or
>
> "Using your own attribute or . . .  is not recommended"
>
> or
>
> "We do not recommend that you use . . ."
>
> (People do say, "It is not recommended to go . . ." and stuff, but the
> above sentence nevertheless sounds very awkward, probably because the
> person who should not use his/her own attribute or element is later
> referred to as "you" while the subject of "is recommended" is "It" [that
> is, impersonal].)
>
> So use the subjunctive in this case [yes English has the subjunctive;
> people
> do not know it & we do not study it.]
>
>
> * * *
>
> ALSO  Introductory Stuff
>
> 1.1
>
> * "The fist is intended to the designers and developers . . ."
>
> We say "intended for"
>
> *  "The second is for the XML content authors"
>
>
> We use the articles, a, an, the, some, etc., except with non-count
> nouns!
>
> --C. E. Whitehead
>
> >
> >The Internationalization Tag Set Working Group has published an updated
> >Working Draft of "Best Practices for XML Internationalization."
> >
> >http://www.w3.org/TR/2007/WD-xml-i18n-bp-20070628/
> >
> >These best practices are a complement to the International Tag Set W3C
> >Recommendation http://www.w3.org/TR/its/ , and are written for designers
> >and developers of XML applications, XML content authors as well as users
> >and translators.
> >
> >The Working Group is following the discussion on the www-international
> list
> >and would appreciate especially feedback on the Best Practices 1-6 , see
> >the links below.
>
> See above!
>
> >
> >Best Practice 1: Provide xml:lang to specify natural language content
> >http://www.w3.org/TR/2007/WD-xml-i18n-bp-20070628/#DevLang
> >
> All I've provided feedback on, and I've only provided feedback on the
> English; if that's not what you are soliciting let me know!  I won't
> provide
> more.  (I always appreciate when anyone corrects my language use, my
> English
> or anything else, but it's up to you.)
>
>
>
>


-- 
Mark
Received on Saturday, 30 June 2007 18:45:40 GMT
