W3C home > Mailing lists > Public > public-i18n-ws@w3.org > May 2004

Re: sec 4.11

From: Tex Texin <tex@xencraft.com>
Date: Mon, 10 May 2004 20:58:02 -0400
Message-ID: <40A0251A.D7C418B@xencraft.com>
To: aphillips@webmethods.com
Cc: Web Services <public-i18n-ws@w3.org>

Seems ok to me.
I liked your additions around the UCA.

Should the "a" before Modern, be there?
"as with  Traditional or a Modern ordering in Spanish"

Maybe replace this :
"may be incorrect or be perceived to be incorrect by a human observer."

with
"may be incorrect or inconsistent with expectations."

It's not a big deal either way. (When I read about perceptions by an observer
it makes me think of Schroedinger's cat... ;-)
In the context of web services we should try to minimize the references to
humans and users.)

I didn't notice any other changes but it seems ok to me.
Upcorp it!

tex


"Addison Phillips [wM]" wrote:
> 
> Hi Tex,
> 
> I reworked your material a little. Are you okay with:
> 
> ------
> <div2><head>Ordering, Grouping, and Collation</head>
> 
> <p>The ordering or collation of textual data items is a general concern for
> internationalized software. The problem is exacerbated when the data can be
> multilingual in nature. For Web services, in scenarios where the ordering of
> textual data is critical to its correct utilization, it can be difficult to
> identify the appropriate collation rules to use with sufficient precision to
> insure those rules are either followed by any services that operate on the
> data or that appropriate action is taken to compensate for any services that
> do not use the desired collation rules. (For example, by re-sorting the data
> downstream).
> </p>
> 
> <p>A brief list of these collation issues are described here. An important
> reference is the  Unicode Collation Algorithm (UCA), described by: <bibref
> ref="UTR10"/>. Although the UCA is a mature standard, it should be noted
> that there is wide variance in the implementation of collation algorithms;
> that few of these implementations are based on UCA; and that there is little
> or no general agreement on identifiers for collation preferences.</p>
> 
> <p>Collation rules cannot be inferred solely from a language identifier or a
> locale, as the identifiers do not indicate which sort ordering should be
> used within a locale. A language identifier may be suggestive as to whether
> a requester expects a particular sort ordering (as with  Traditional or a
> Modern ordering in Spanish, for example) but it may not be definitive.</p>
> 
> <p>Some examples of sort orderings include: telephone, dictionary, phonetic,
> binary, stroke-radical or radical-stroke. In the latter two cases, the
> reference (source standard) for stroke count may also need to be cited.
> </p>
> 
> <p>Different components or subsystems which are used by a software process
> may employ different sort orderings. For example, a User Agent may provide a
> drop-down list which sorts the elements of the list at run-time differently
> from the other components of the agent. Information retrieved from a
> database may be ordered by an index which has no correlation with the
> requester's requirements. When different components or subsystems of a Web
> Service use different collation rules, then errors can occur. They are not
> always hard errors (i.e. those that generate faults) but the resulting data,
> operations, or events, may be incorrect or be perceived to be incorrect by a
> human observer.</p>
> 
> <p>In the case of services that might use a binary collation (ordering by
> the code points of text data) there can be differences in ordering
> introduced by different components using UTF-8 vs. UTF-16 internally.
> </p>
> 
> <p>Knowing the language of the requester does not prescribe how sensitive
> the collation should be. Should text elements that are different by case or
> accent be treated as distinct? Should certain characters be ignored? For
> example, hyphens are often ignored so that "e-mail" and "email" sort
> together.
> </p>
> 
> <p>Where case is considered distinct, it may be important to describe
> whether all lowercase characters precede all uppercase characters, vice
> versa, or whether they should be intermixed.
> </p>
> 
> <p>Often the performance of an application is impacted by collation. For
> example, if a service returns results in an unknown ordering, the requester
> may have to sort the results using its local collation rules. This can
> consume resources and delay the further use of the results until the entire
> set can be collated. Alternatively, if results are returned in the order
> needed by the requester, then the requester can begin to process the first
> records returned without waiting for the remaining records to arrive.
> </p>
> 
> <p>Of course, collation can be performed at different stages of data
> processing and timing can be an important consideration. Database indexes
> are updated as the data is added to the database, not at the time a request
> arrives. Requests that can use the preordained collation of the index have a
> significant performance advantage over requests that either cannot use
> indexes or must re-sort the results.
> </p>
> 
> <p>See <xspecref href="#S-009">I-009</xspecref> and  <xspecref
> href="#I-013">I-013</xspecref>for a some examples.</p>
> </div2>
> 
> ------
> 
> Addison P. Phillips
> Director, Globalization Architecture
> webMethods | Delivering Global Business Visibility
> http://www.webMethods.com
> Chair, W3C Internationalization (I18N) Working Group
> Chair, W3C-I18N-WG, Web Services Task Force
> http://www.w3.org/International
> 
> Internationalization is an architecture.
> It is not a feature.
> 
> > -----Original Message-----
> > From: Tex Texin [mailto:tex@xencraft.com]
> > Sent: Monday, May 10, 2004 4:44 PM
> > To: Addison Phillips [wM]; Web Services
> > Subject: sec 4.11
> >
> >
> > attached
> >
> > --
> > -------------------------------------------------------------
> > Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
> > Xen Master                          http://www.i18nGuy.com
> >
> > XenCraft                          http://www.XenCraft.com
> > Making e-Business Work Around the World
> > -------------------------------------------------------------
> >
> >

-- 
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
Xen Master                          http://www.i18nGuy.com
                         
XenCraft		            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------
Received on Monday, 10 May 2004 21:32:28 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 8 January 2008 14:12:53 GMT