- From: Tex Texin <tex@xencraft.com>
- Date: Mon, 10 May 2004 20:58:02 -0400
- To: aphillips@webmethods.com
- Cc: Web Services <public-i18n-ws@w3.org>
Seems ok to me. I liked your additions around the UCA. Should the "a" before Modern, be there? "as with Traditional or a Modern ordering in Spanish" Maybe replace this : "may be incorrect or be perceived to be incorrect by a human observer." with "may be incorrect or inconsistent with expectations." It's not a big deal either way. (When I read about perceptions by an observer it makes me think of Schroedinger's cat... ;-) In the context of web services we should try to minimize the references to humans and users.) I didn't notice any other changes but it seems ok to me. Upcorp it! tex "Addison Phillips [wM]" wrote: > > Hi Tex, > > I reworked your material a little. Are you okay with: > > ------ > <div2><head>Ordering, Grouping, and Collation</head> > > <p>The ordering or collation of textual data items is a general concern for > internationalized software. The problem is exacerbated when the data can be > multilingual in nature. For Web services, in scenarios where the ordering of > textual data is critical to its correct utilization, it can be difficult to > identify the appropriate collation rules to use with sufficient precision to > insure those rules are either followed by any services that operate on the > data or that appropriate action is taken to compensate for any services that > do not use the desired collation rules. (For example, by re-sorting the data > downstream). > </p> > > <p>A brief list of these collation issues are described here. An important > reference is the Unicode Collation Algorithm (UCA), described by: <bibref > ref="UTR10"/>. Although the UCA is a mature standard, it should be noted > that there is wide variance in the implementation of collation algorithms; > that few of these implementations are based on UCA; and that there is little > or no general agreement on identifiers for collation preferences.</p> > > <p>Collation rules cannot be inferred solely from a language identifier or a > locale, as the identifiers do not indicate which sort ordering should be > used within a locale. A language identifier may be suggestive as to whether > a requester expects a particular sort ordering (as with Traditional or a > Modern ordering in Spanish, for example) but it may not be definitive.</p> > > <p>Some examples of sort orderings include: telephone, dictionary, phonetic, > binary, stroke-radical or radical-stroke. In the latter two cases, the > reference (source standard) for stroke count may also need to be cited. > </p> > > <p>Different components or subsystems which are used by a software process > may employ different sort orderings. For example, a User Agent may provide a > drop-down list which sorts the elements of the list at run-time differently > from the other components of the agent. Information retrieved from a > database may be ordered by an index which has no correlation with the > requester's requirements. When different components or subsystems of a Web > Service use different collation rules, then errors can occur. They are not > always hard errors (i.e. those that generate faults) but the resulting data, > operations, or events, may be incorrect or be perceived to be incorrect by a > human observer.</p> > > <p>In the case of services that might use a binary collation (ordering by > the code points of text data) there can be differences in ordering > introduced by different components using UTF-8 vs. UTF-16 internally. > </p> > > <p>Knowing the language of the requester does not prescribe how sensitive > the collation should be. Should text elements that are different by case or > accent be treated as distinct? Should certain characters be ignored? For > example, hyphens are often ignored so that "e-mail" and "email" sort > together. > </p> > > <p>Where case is considered distinct, it may be important to describe > whether all lowercase characters precede all uppercase characters, vice > versa, or whether they should be intermixed. > </p> > > <p>Often the performance of an application is impacted by collation. For > example, if a service returns results in an unknown ordering, the requester > may have to sort the results using its local collation rules. This can > consume resources and delay the further use of the results until the entire > set can be collated. Alternatively, if results are returned in the order > needed by the requester, then the requester can begin to process the first > records returned without waiting for the remaining records to arrive. > </p> > > <p>Of course, collation can be performed at different stages of data > processing and timing can be an important consideration. Database indexes > are updated as the data is added to the database, not at the time a request > arrives. Requests that can use the preordained collation of the index have a > significant performance advantage over requests that either cannot use > indexes or must re-sort the results. > </p> > > <p>See <xspecref href="#S-009">I-009</xspecref> and <xspecref > href="#I-013">I-013</xspecref>for a some examples.</p> > </div2> > > ------ > > Addison P. Phillips > Director, Globalization Architecture > webMethods | Delivering Global Business Visibility > http://www.webMethods.com > Chair, W3C Internationalization (I18N) Working Group > Chair, W3C-I18N-WG, Web Services Task Force > http://www.w3.org/International > > Internationalization is an architecture. > It is not a feature. > > > -----Original Message----- > > From: Tex Texin [mailto:tex@xencraft.com] > > Sent: Monday, May 10, 2004 4:44 PM > > To: Addison Phillips [wM]; Web Services > > Subject: sec 4.11 > > > > > > attached > > > > -- > > ------------------------------------------------------------- > > Tex Texin cell: +1 781 789 1898 mailto:Tex@XenCraft.com > > Xen Master http://www.i18nGuy.com > > > > XenCraft http://www.XenCraft.com > > Making e-Business Work Around the World > > ------------------------------------------------------------- > > > > -- ------------------------------------------------------------- Tex Texin cell: +1 781 789 1898 mailto:Tex@XenCraft.com Xen Master http://www.i18nGuy.com XenCraft http://www.XenCraft.com Making e-Business Work Around the World -------------------------------------------------------------
Received on Monday, 10 May 2004 21:32:28 UTC