- From: Phil Archer <phila@w3.org>
- Date: Mon, 22 Aug 2016 10:33:36 +0100
- To: Deirdre Lee <deirdre@derilinx.com>, "Phillips, Addison" <addison@lab126.com>, Bernadette Farias Lóscio <bfl@cin.ufpe.br>, Annette Greiner <amgreiner@lbl.gov>
- Cc: "ishida@w3.org" <ishida@w3.org>, "public-dwbp-comments@w3.org" <public-dwbp-comments@w3.org>, www International <www-international@w3.org>
Dear all,

I have taken further steps on this. The result can be seen at
http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata

1. Addison's text used more or less verbatim;
1a. taken account of Annette's suggestion;
1b. replaced inline links to BCP47 and CLDR with references;
2. title of the BP changed to "Use locale-neutral data representations";
3. moved to Data Formats section as resolved in WG meeting on Friday;
4. added R-FormatMachineRead to list of evidence and thereby updated the
   UCR cross matching;
5. updated the Challenges SVG diagram;
6. updated my Pull request.

NB, I *retained* the old ID for the BP so that any links to
#LocaleParametersMetadata will still work. I know there are some of
these, for example, in the Share-PSI project.

HTH

Phil.

On 22/08/2016 08:52, Deirdre Lee wrote:
> Hi,
>
> Thank you for your comments Addison. I think they make sense and should
> be straightforward to incorporate.
>
> The title of the BP should probably also be updated to something like
> 'Provide locale-neutral data'
>
> Phil and DWBP editors, in Friday's meeting we also agreed to move BP3 to
> the Data Formats section from the Metadata section, which would make it
> BP14, right?
>
> Kind regards,
>
> Deirdre
>
>
>
> On 19/08/2016 17:39, Phillips, Addison wrote:
>> Hi Phil,
>>
>> Thanks for starting on this. I think the pull request is a good start.
>> I have some comments on it.
>>
>> My main concern is that this BP is really backwards. It recommends
>> providing "locale parameter metadata" and then says that the simplest
>> way to do this is to use locale-neutral formats. The recommendation
>> should be more like "use locale-neutral formats or provide
>> locale/language information where that's not possible". The pull
>> request captures the use of locale-neutral, but doesn't really explain
>> when to provide locale and language information.
>>
>> I would change this:
>>
>> --
>> <p class="practicedesc">Provide metadata about locale parameters
>> (date, time, and number formats, language).</p>
>> --
>>
>> To say:
>>
>> --
>> <p class="practicedesc">Use locale-neutral data structures and values,
>> or, where that is not possible, provide metadata about the locale used
>> by data values.</p>
>> --
>>
>> I would change:
>>
>> --
>> <p>The simplest method is to use local-neutral representations of the
>> actual data, and then add metadata to provide relevant locale
>> information. For example, rather than storing "€2000.00" as a string,
>> it's strongly preferred to exchange a data structure such as:</p>
>> --
>>
>> To say:
>>
>> --
>> <p>Most common data representations are locale neutral. For example,
>> XML Schema types such as xsd:integer and xsd:date are intended for
>> locale-neutral data interchange. Using locale-neutral representations
>> allows the data values to be processed accurately without complex
>> parsing or misinterpretation and also allows the data to be presented
>> in the format most comfortable for the consumer of the data. For
>> example, rather than storing "€2000,00" as a string, it's strongly
>> preferred to exchange a data structure such as:</p>
>> --
>>
>> Also, note the misspelling of "locale-neutral" in the pull request.
>>
>> I would then go on to add some text about when locale parameters are
>> needed. Something like:
>>
>> --
>> Some datasets contain values that are not or cannot be rendered into a
>> locale-neutral format. This is particularly true of any natural
>> language text values. For each data field that can contain
>> locale-affected or natural language text, there should be an associated
>> language tag used to indicate the language and locale of the data.
>> This locale information can be used in parsing the data or to ensure
>> proper presentation and processing of the value by the consumer.
>> --
>>
>> (Sorry for not generating a pull request of my own)
>>
>> Addison
>>
>>> -----Original Message-----
>>> From: Phil Archer [mailto:phila@w3.org]
>>> Sent: Friday, August 19, 2016 8:37 AM
>>> To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>; Annette Greiner
>>> <amgreiner@lbl.gov>
>>> Cc: Phillips, Addison <addison@lab126.com>; ishida@w3.org;
>>> public-dwbp-comments@w3.org; www International
>>> <www-international@w3.org>
>>> Subject: Re: [i18n review comment] BP3 should recommend locale-neutral
>>> representation #187
>>>
>>> I took an action on today's call to try and address this in BP3. You
>>> can see the results at
>>> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata
>>>
>>> This uses some of Addison's text directly and highlights the value of
>>> the xsd datatypes - but retains enough of the original BP for it to be
>>> an amendment rather than a whole new one - I hope.
>>>
>>> This addresses most of the resolution taken today [1] but I have not
>>> moved the BP to the formats section. I leave that to the editors who
>>> may want to make further changes - or argue for it to be left where it
>>> is, or add references from the formats section or, or, or...
>>>
>>> I've created the Pull Request https://github.com/w3c/dwbp/pull/447
>>>
>>> Phil.
>>>
>>> [1] https://www.w3.org/2016/08/19-dwbp-minutes#resolution02
>>>
>>> On 15/08/2016 17:28, Bernadette Farias Lóscio wrote:
>>>> Dear Ishida,
>>>>
>>>> This comment [1] is still under discussion [4] and we'd like to ask
>>>> your opinion about two of our proposals:
>>>>
>>>> 1. to include locale-neutral representation ideas as part of BP3 [2],
>>>> or 2. to include a paragraph at the introduction of Section 8.8 Data
>>>> Formats [3] to discuss the relevance of having locale-neutral
>>>> representations.
>>>>
>>>> We also discussed the proposal of having a new BP and we agreed that
>>>> we won't have a lot of time for a broader review of the new BP and to
>>>> collect feedback from the community.
>>>>
>>>> Thanks a lot!
>>>> DWBP editors
>>>>
>>>> [1] https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0028.html
>>>> [2] http://agreiner.github.io/dwbp/bp.html#LocaleParametersMetadata
>>>> [3] https://www.w3.org/TR/dwbp/#dataFormats
>>>> [4] https://lists.w3.org/Archives/Public/public-dwbp-wg/2016Aug/0009.html
>>>>
>>>>
>>>> 2016-08-04 23:26 GMT+02:00 Annette Greiner <amgreiner@lbl.gov>:
>>>>
>>>>> Hi Addison,
>>>>>
>>>>> Thanks for your response, and it does make sense. I think what I am
>>>>> still missing is whether there is guidance we can point to as to how
>>>>> to represent the "locale-neutral" data so that it can most easily be
>>>>> made locale specific by existing tools. You mention "pre-made
>>>>> standards for the basic data types". Is there a recommended list we
>>>>> could reference?
>>>>> Thanks for your help!
>>>>> -Annette
>>>>>
>>>>>
>>>>> On 8/4/16 12:31 PM, Phillips, Addison wrote:
>>>>>
>>>>>> Hi Annette,
>>>>>>
>>>>>> Thanks for the note. This is a personal reply not on behalf of the
>>>>>> WG.
>>>>>>
>>>>>> Locale neutral formats are quite common on the Web and the Internet
>>>>>> in general. One familiar format referenced by your document, for
>>>>>> example, is XML Schema. While the representations of numbers, dates,
>>>>>> and the like in XML Schema would be "more appropriate" for some
>>>>>> languages/locales than others if given as plain text, what
>>>>>> distinguishes them is that they are all machine readable and
>>>>>> intended to be read by machines for later processing.
>>>>>> The display of values is a separate, local, concern for the data's
>>>>>> consumer. This necessarily means choosing specific separators (such
>>>>>> as decimal separators) over other, more localized values.
>>>>>> Save for "free text" (natural language) data, most data formats are
>>>>>> locale neutral and these include things like JSON-LD, XML Schema,
>>>>>> CSV, and so forth.
>>>>>>
>>>>>> Not every possible data structure or data value is, of course,
>>>>>> covered fully. For example, in my day job (I work at Amazon), we
>>>>>> have many different common measurement units defined internally. To
>>>>>> transmit these in a locale-neutral manner, we need to construct our
>>>>>> own data schemas and identifiers. There are profoundly many ways to
>>>>>> measure shoes, dresses, auto parts, hats, drone propellers, and so
>>>>>> forth. But it would be a nightmare to have to deal with localized
>>>>>> presentation formats on top of that.
>>>>>> But there are pre-made standards for the basic data types and these
>>>>>> are what are needed to build almost any data structure necessary for
>>>>>> global interchange of data.
>>>>>>
>>>>>> Does that make sense?
>>>>>>
>>>>>> Addison
>>>>>>
>>>>>> Addison Phillips
>>>>>> Principal SDE, I18N Architect (Amazon)
>>>>>> Chair (W3C I18N WG)
>>>>>>
>>>>>> Internationalization is not a feature.
>>>>>> It is an architecture.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>>> From: Annette Greiner [mailto:amgreiner@lbl.gov]
>>>>>>> Sent: Thursday, August 04, 2016 12:04 PM
>>>>>>> To: ishida@w3.org; public-dwbp-comments@w3.org
>>>>>>> Cc: www International <www-international@w3.org>
>>>>>>> Subject: Re: [i18n review comment] BP3 should recommend
>>>>>>> locale-neutral representation #187
>>>>>>>
>>>>>>> Hello on behalf of the DWBP WG,
>>>>>>>
>>>>>>> We're interested in pursuing this concept in our best practice
>>>>>>> document, but we would like some clarification of the practice of
>>>>>>> locale neutrality.
>>>>>>> You mention the variation across locales in decimal symbol,
>>>>>>> grouping symbol, number of grouping digits, digit shapes, etc.,
>>>>>>> and you give an example of a locale-neutral data structure for
>>>>>>> monetary values.
>>>>>>> But this structure alone does not appear to address differences in
>>>>>>> decimal symbol, grouping symbol, number of grouping digits, or
>>>>>>> digit shapes. It does provide a mechanism to separately specify the
>>>>>>> units, and the example uses an ISO-4217 currency code, both of
>>>>>>> which we agree are good ideas. Is there a broad standard (beyond
>>>>>>> just monetary) for addressing the other symbol/representation
>>>>>>> issues you raised that we can address briefly in our best practice?
>>>>>>> Do you consider SI units consistent with a locale-neutral approach?
>>>>>>> Is there a locale-neutral standard for representing decimal numbers
>>>>>>> (perhaps using a period and no grouping, as in your example)?
>>>>>>>
>>>>>>> -Annette
>>>>>>>
>>>>>>>
>>>>>>> On 7/22/16 5:32 AM, ishida@w3.org wrote:
>>>>>>>
>>>>>>>> [raised by aphillips]
>>>>>>>>
>>>>>>>> https://www.w3.org/TR/dwbp/#LocaleParametersMetadata
>>>>>>>>
>>>>>>>> Best practice #3 introduces itself as:
>>>>>>>>
>>>>>>>> Providing locale parameters helps humans and computer applications
>>>>>>>> to work accurately with things like dates, currencies and numbers
>>>>>>>> that may look similar but have different meanings in different
>>>>>>>> locales.
>>>>>>>>
>>>>>>>> But the actual best practice is to use **locale-neutral**
>>>>>>>> representations that are interpreted/displayed to end-users in a
>>>>>>>> locale-appropriate manner.
>>>>>>>> For example, instead of storing the string "€2000.00", exchanging
>>>>>>>> a data structure like the following is strongly preferred:
>>>>>>>>
>>>>>>>> ```
>>>>>>>> "price": {
>>>>>>>>     "value": 2000.00,
>>>>>>>>     "currency": "EUR"
>>>>>>>> }
>>>>>>>> ```
>>>>>>>>
>>>>>>>> The date examples given are all in xsd:date format, which is an
>>>>>>>> excellent example of using a locale-neutral format.
>>>>>>>>
>>>>>>>> Many things are dependent on locale: decimal symbol, grouping
>>>>>>>> symbol, number of grouping digits, digit shapes, etc. It's because
>>>>>>>> there can be wide variation (sometimes open to misinterpretation)
>>>>>>>> that sending a locale neutral format is preferred for data values.
>>>>>>>> Note also btw that the position of the currency symbol is
>>>>>>>> dependent on the locale. In France it would be normal to write
>>>>>>>> 2000.00 € rather than €2000.00.
>>>>>>>> Same even when talking about USD when using $, i.e. 2000.00 $.
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>> Annette Greiner
>>>>>>> NERSC Data and Analytics Services
>>>>>>> Lawrence Berkeley National Laboratory
>>>>>>>
>>>>>>>
>>>>> --
>>>>> Annette Greiner
>>>>> NERSC Data and Analytics Services
>>>>> Lawrence Berkeley National Laboratory
>>>>>
>>>>>
>>>>>
>>>>
>>> --
>>>
>>>
>>> Phil Archer
>>> W3C Data Activity Lead
>>> http://www.w3.org/2013/data/
>>>
>>> http://philarcher.org
>>> +44 (0)7887 767755
>>> @philarcher1
>
--

Phil Archer
W3C Data Activity Lead
http://www.w3.org/2013/data/

http://philarcher.org
+44 (0)7887 767755
@philarcher1
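
The separation discussed in this thread (locale-neutral values for interchange, locale-specific formatting only at display time) can be sketched roughly as follows. This is a minimal Python illustration, not text from the BP or from any participant: the DISPLAY_RULES and SYMBOLS tables and the format_price function are hypothetical and hand-written for the example, and a real consumer would more likely take such conventions from CLDR data. The record shape follows the "price" structure quoted above.

```python
import json
from decimal import Decimal

# Hypothetical display conventions, hand-written for illustration only.
# A real application would more likely take these from CLDR data.
DISPLAY_RULES = {
    "en-US": {"decimal": ".", "group": ",", "pattern": "{symbol}{number}"},
    "fr-FR": {"decimal": ",", "group": "\u202f", "pattern": "{number}\u00a0{symbol}"},
}
SYMBOLS = {"EUR": "€", "USD": "$"}  # keyed by ISO 4217 currency code


def format_price(price, language_tag):
    """Render a locale-neutral price record for display in one locale."""
    rules = DISPLAY_RULES[language_tag]
    # Start from a fixed "," grouping and "." decimal, then swap in the
    # separators the target locale expects.
    neutral = f"{price['value']:,.2f}"
    whole, fraction = neutral.split(".")
    number = whole.replace(",", rules["group"]) + rules["decimal"] + fraction
    return rules["pattern"].format(symbol=SYMBOLS[price["currency"]], number=number)


# Interchange form: a plain number plus a currency code, no symbols or
# locale-specific separators. parse_float=Decimal avoids binary float rounding.
record = json.loads(
    '{"price": {"value": 2000.00, "currency": "EUR"}}', parse_float=Decimal
)

for tag in ("en-US", "fr-FR"):
    print(tag, format_price(record["price"], tag))
```

The same stored record yields a different presentation per language tag, while anything that needs to compute with the value reads the number and the currency code directly and never has to parse a formatted string.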
Received on Monday, 22 August 2016 09:31:07 UTC