- From: Deirdre Lee <deirdre@derilinx.com>
- Date: Mon, 22 Aug 2016 13:48:04 +0100
- To: Phil Archer <phila@w3.org>, "Phillips, Addison" <addison@lab126.com>, Bernadette Farias Lóscio <bfl@cin.ufpe.br>, Annette Greiner <amgreiner@lbl.gov>
- Cc: "ishida@w3.org" <ishida@w3.org>, "public-dwbp-comments@w3.org" <public-dwbp-comments@w3.org>, www International <www-international@w3.org>
Looks good, thanks Phil.

On 22/08/2016 10:33, Phil Archer wrote:
> Dear all,
>
> I have taken further steps on this. The result can be seen at
> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata
>
> 1. Addison's text used more or less verbatim;
> 1a. taken account of Annette's suggestion;
> 1b. replaced inline links to BCP47 and CLDR with references
> 2. title of the BP changed to Use locale-neutral data representations
> 3. moved to Data Formats section as resolved in WG meeting on Friday;
> 4. added R-FormatMachineRead to list of evidence and thereby updated
> the UCR cross matching;
> 5. updated the Challenges SVG diagram;
> 6. updated my Pull request.
>
> NB, I *retained* the old ID for the BP so that any links to
> #LocaleParametersMetadata will still work. I know there are some of
> these, for example, in the Share-PSI project.
>
> HTH
>
> Phil.
>
> On 22/08/2016 08:52, Deirdre Lee wrote:
>> Hi,
>>
>> Thank you for your comments Addison. I think they make sense and should
>> be straightforward to incorporate.
>>
>> The title of the BP should probably also be updated to something like
>> 'Provide locale-neutral data'
>>
>> Phil and DWBP editors, in Friday's meeting we also agreed to move BP3 to
>> the Data Formats section from the Metadata section, which would make it
>> BP14, right?
>>
>> Kind regards,
>>
>> Deirdre
>>
>> On 19/08/2016 17:39, Phillips, Addison wrote:
>>> Hi Phil,
>>>
>>> Thanks for starting on this. I think the pull request is a good start.
>>> I have some comments on it.
>>>
>>> My main concern is that this BP is really backwards. It recommends
>>> providing "locale parameter metadata" and then says that the simplest
>>> way to do this is to use locale-neutral formats. The recommendation
>>> should be more like "use locale-neutral formats or provide
>>> locale/language information where that's not possible". The pull
>>> request captures the use of locale-neutral, but doesn't really explain
>>> when to provide locale and language information.
>>>
>>> I would change this:
>>>
>>> --
>>> <p class="practicedesc">Provide metadata about locale parameters
>>> (date, time, and number formats, language).</p>
>>> --
>>>
>>> To say:
>>>
>>> --
>>> <p class="practicedesc">Use locale-neutral data structures and values,
>>> or, where that is not possible, provide metadata about the locale used
>>> by data values.</p>
>>> --
>>>
>>> I would change:
>>>
>>> --
>>> <p>The simplest method is to use local-neutral representations of the
>>> actual data, and then add metadata to provide relevant locale
>>> information. For example, rather than storing "€2000.00" as a string,
>>> it's strongly preferred to exchange a data structure such as:</p>
>>> --
>>>
>>> To say:
>>>
>>> --
>>> <p>Most common data representations are locale neutral. For example,
>>> XML Schema types such as xsd:integer and xsd:date are intended for
>>> locale-neutral data interchange. Using locale-neutral representations
>>> allows the data values to be processed accurately without complex
>>> parsing or misinterpretation and also allows the data to be presented
>>> in the format most comfortable for the consumer of the data. For
>>> example, rather than storing "€2000,00" as a string, it's strongly
>>> preferred to exchange a data structure such as:</p>
>>> --
>>>
>>> Also, note the misspelling of "locale-neutral" in the pull request.
>>>
>>> I would then go on to add some text about when locale parameters are
>>> needed. Something like:
>>>
>>> --
>>> Some datasets contain values that are not or cannot be rendered into a
>>> locale-neutral format. This is particularly true of any natural
>>> language text values. For each data field that can contain
>>> locale-affected or natural language text, there should be an associated
>>> language tag used to indicate the language and locale of the data.
>>> This locale information can be used in parsing the data or to ensure
>>> proper presentation and processing of the value by the consumer.
>>> --
>>>
>>> (Sorry for not generating a pull request of my own)
>>>
>>> Addison
>>>
>>>> -----Original Message-----
>>>> From: Phil Archer [mailto:phila@w3.org]
>>>> Sent: Friday, August 19, 2016 8:37 AM
>>>> To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>; Annette Greiner
>>>> <amgreiner@lbl.gov>
>>>> Cc: Phillips, Addison <addison@lab126.com>; ishida@w3.org;
>>>> public-dwbp-comments@w3.org; www International <www-international@w3.org>
>>>> Subject: Re: [i18n review comment] BP3 should recommend locale-neutral
>>>> representation #187
>>>>
>>>> I took an action on today's call to try and address this in BP3. You
>>>> can see the results at
>>>> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata
>>>>
>>>> This uses some of Addison's text directly and highlights the value of
>>>> the xsd datatypes - but retains enough of the original BP for it to be
>>>> an amendment rather than a whole new one - I hope.
>>>>
>>>> This addresses most of the resolution taken today [1] but I have not
>>>> moved the BP to the formats section. I leave that to the editors who
>>>> may want to make further changes - or argue for it to be left where it
>>>> is, or add references from the formats section or, or, or...
>>>>
>>>> I've created the Pull Request https://github.com/w3c/dwbp/pull/447
>>>>
>>>> Phil.
>>>>
>>>> [1] https://www.w3.org/2016/08/19-dwbp-minutes#resolution02
>>>>
>>>> On 15/08/2016 17:28, Bernadette Farias Lóscio wrote:
>>>>> Dear Ishida,
>>>>>
>>>>> This comment [1] is still under discussion [4] and we'd like to ask
>>>>> your opinion about two of our proposals:
>>>>>
>>>>> 1. to include locale-neutral representation ideas as part of BP3 [2],
>>>>> or
>>>>> 2. to include a paragraph at the introduction of Section 8.8 Data
>>>>> Formats [3] to discuss the relevance of having locale-neutral
>>>>> representations.
>>>>>
>>>>> We also discussed the proposal of having a new BP and we agreed that
>>>>> we won't have a lot of time for a broader review of the new BP and to
>>>>> collect feedback from the community.
>>>>>
>>>>> Thanks a lot!
>>>>> DWBP editors
>>>>>
>>>>> [1] https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0028.html
>>>>> [2] http://agreiner.github.io/dwbp/bp.html#LocaleParametersMetadata
>>>>> [3] https://www.w3.org/TR/dwbp/#dataFormats
>>>>> [4] https://lists.w3.org/Archives/Public/public-dwbp-wg/2016Aug/0009.html
>>>>>
>>>>> 2016-08-04 23:26 GMT+02:00 Annette Greiner <amgreiner@lbl.gov>:
>>>>>
>>>>>> Hi Addison,
>>>>>>
>>>>>> Thanks for your response, and it does make sense. I think what I am
>>>>>> still missing is whether there is guidance we can point to as to how
>>>>>> to represent the "locale-neutral" data so that it can most easily be
>>>>>> made locale specific by existing tools. You mention "pre-made
>>>>>> standards for the basic data types". Is there a recommended list we
>>>>>> could reference?
>>>>>> Thanks for your help!
>>>>>> -Annette
>>>>>>
>>>>>> On 8/4/16 12:31 PM, Phillips, Addison wrote:
>>>>>>
>>>>>>> Hi Annette,
>>>>>>>
>>>>>>> Thanks for the note. This is a personal reply not on behalf of the
>>>>>>> WG.
>>>>>>>
>>>>>>> Locale neutral formats are quite common on the Web and the Internet
>>>>>>> in general. One familiar format referenced by your document, for
>>>>>>> example, is XML Schema. While the representations of numbers,
>>>>>>> dates, and the like in XML Schema would be "more appropriate" for
>>>>>>> some languages/locales than others if given as plain text, what
>>>>>>> distinguishes them is that they are all machine readable and
>>>>>>> intended to be read by machines for later processing.
>>>>>>> The display of values is a separate, local, concern for the data's
>>>>>>> consumer. This necessarily means choosing specific separators (such
>>>>>>> as decimal separators) over other, more localized values. Save for
>>>>>>> "free text" (natural language) data, most data formats are locale
>>>>>>> neutral and these include things like JSON-LD, XML Schema, CSV, and
>>>>>>> so forth.
>>>>>>>
>>>>>>> Not every possible data structure or data value is, of course,
>>>>>>> covered fully. For example, in my day job (I work at Amazon), we
>>>>>>> have many different common measurement units defined internally. To
>>>>>>> transmit these in a locale-neutral manner, we need to construct our
>>>>>>> own data schemas and identifiers. There are profoundly many ways to
>>>>>>> measure shoes, dresses, auto parts, hats, drone propellers, and so
>>>>>>> forth. But it would be a nightmare to have to deal with localized
>>>>>>> presentation formats on top of that.
>>>>>>> But there are pre-made standards for the basic data types and these
>>>>>>> are what are needed to build almost any data structure necessary
>>>>>>> for global interchange of data.
>>>>>>>
>>>>>>> Does that make sense?
>>>>>>>
>>>>>>> Addison
>>>>>>>
>>>>>>> Addison Phillips
>>>>>>> Principal SDE, I18N Architect (Amazon)
>>>>>>> Chair (W3C I18N WG)
>>>>>>>
>>>>>>> Internationalization is not a feature.
>>>>>>> It is an architecture.
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>>> From: Annette Greiner [mailto:amgreiner@lbl.gov]
>>>>>>>> Sent: Thursday, August 04, 2016 12:04 PM
>>>>>>>> To: ishida@w3.org; public-dwbp-comments@w3.org
>>>>>>>> Cc: www International <www-international@w3.org>
>>>>>>>> Subject: Re: [i18n review comment] BP3 should recommend
>>>>>>>> locale-neutral representation #187
>>>>>>>>
>>>>>>>> Hello on behalf of the DWBP WG,
>>>>>>>>
>>>>>>>> We're interested in pursuing this concept in our best practice
>>>>>>>> document, but we would like some clarification of the practice of
>>>>>>>> locale neutrality. You mention the variation across locales in
>>>>>>>> decimal symbol, grouping symbol, number of grouping digits, digit
>>>>>>>> shapes, etc., and you give an example of a locale-neutral data
>>>>>>>> structure for monetary values. But this structure alone does not
>>>>>>>> appear to address differences in decimal symbol, grouping symbol,
>>>>>>>> number of grouping digits, or digit shapes. It does provide a
>>>>>>>> mechanism to separately specify the units, and the example uses an
>>>>>>>> ISO-4217 currency code, both of which we agree are good ideas. Is
>>>>>>>> there a broad standard (beyond just monetary) for addressing the
>>>>>>>> other symbol/representation issues you raised that we can address
>>>>>>>> briefly in our best practice? Do you consider SI units consistent
>>>>>>>> with a locale-neutral approach? Is there a locale-neutral standard
>>>>>>>> for representing decimal numbers (perhaps using a period and no
>>>>>>>> grouping, as in your example)?
>>>>>>>>
>>>>>>>> -Annette
>>>>>>>>
>>>>>>>> On 7/22/16 5:32 AM, ishida@w3.org wrote:
>>>>>>>>
>>>>>>>>> [raised by aphillips]
>>>>>>>>>
>>>>>>>>> https://www.w3.org/TR/dwbp/#LocaleParametersMetadata
>>>>>>>>>
>>>>>>>>> Best practice #3 introduces itself as:
>>>>>>>>>
>>>>>>>>> Providing locale parameters helps humans and computer
>>>>>>>>> applications to work accurately with things like dates,
>>>>>>>>> currencies and numbers that may look similar but have different
>>>>>>>>> meanings in different locales.
>>>>>>>>>
>>>>>>>>> But the actual best practice is to use **locale-neutral**
>>>>>>>>> representations that are interpreted/displayed to end-users in a
>>>>>>>>> locale-appropriate manner. For example, instead of storing the
>>>>>>>>> string "€2000.00", exchanging a data structure like the following
>>>>>>>>> is strongly preferred:
>>>>>>>>>
>>>>>>>>> ```
>>>>>>>>> "price": {
>>>>>>>>>     "value": 2000.00,
>>>>>>>>>     "currency": "EUR"
>>>>>>>>> }
>>>>>>>>> ```
>>>>>>>>>
>>>>>>>>> The date examples given are all in xsd:date format, which is an
>>>>>>>>> excellent example of using a locale-neutral format.
>>>>>>>>>
>>>>>>>>> Many things are dependent on locale: decimal symbol, grouping
>>>>>>>>> symbol, number of grouping digits, digit shapes, etc. It's
>>>>>>>>> because there can be wide variation (sometimes open to
>>>>>>>>> misinterpretation) that sending a locale neutral format is
>>>>>>>>> preferred for data values.
>>>>>>>>> Note also btw that the position of the currency symbol is
>>>>>>>>> dependent on the locale. In France it would be normal to write
>>>>>>>>> 2000.00 € rather than €2000.00.
>>>>>>>>> Same even when talking about USD when using $, i.e. 2000.00 $.
>>>>>>>>>
>>>>>>>> --
>>>>>>>> Annette Greiner
>>>>>>>> NERSC Data and Analytics Services
>>>>>>>> Lawrence Berkeley National Laboratory
>>>>>>>>
>>>>>> --
>>>>>> Annette Greiner
>>>>>> NERSC Data and Analytics Services
>>>>>> Lawrence Berkeley National Laboratory
>>>>>>
>>>> --
>>>>
>>>> Phil Archer
>>>> W3C Data Activity Lead
>>>> http://www.w3.org/2013/data/
>>>>
>>>> http://philarcher.org
>>>> +44 (0)7887 767755
>>>> @philarcher1
>>
>

--
------------------------------------
Deirdre Lee, CEO & Founder
Derilinx - Linked & Open Data Solutions

Web: www.derilinx.com
Email: deirdre@derilinx.com
Address: 11/12 Baggot Court, Dublin 2, D02 F891
Tel: +353 (0)1 254 4316
Mob: +353 (0)87 417 2318
Linkedin: ie.linkedin.com/in/leedeirdre/
Twitter: @deirdrelee
Received on Monday, 22 August 2016 12:48:41 UTC
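
For readers of the archive, the approach discussed in this thread (exchange locale-neutral values, tag natural-language text with a language, and leave display formatting to the consumer) can be sketched roughly as follows. This is only an illustration, not text from the BP or the pull request: the `title` and `lang` field names, the `format_price` helper, and the small `DISPLAY_RULES` table are all hypothetical, and the table is hand-written rather than taken from CLDR.

```python
import json
from decimal import Decimal

# Locale-neutral payload: a plain number plus an ISO 4217 code, and a
# language tag for the one free-text field. Field names "title" and "lang"
# are assumptions made for this sketch.
WIRE = (
    '{"price": {"value": 2000.00, "currency": "EUR"},'
    ' "title": {"value": "Prix de vente", "lang": "fr"}}'
)

# Hypothetical, hand-written display conventions for illustration only.
# A real consumer would take these from CLDR through an i18n library.
DISPLAY_RULES = {
    "en-US": {"decimal": ".", "group": ",", "symbol_first": True},
    "fr-FR": {"decimal": ",", "group": " ", "symbol_first": False},
    "de-DE": {"decimal": ",", "group": ".", "symbol_first": False},
}
SYMBOLS = {"EUR": "€", "USD": "$"}


def format_price(price: dict, locale_tag: str) -> str:
    """Render a locale-neutral price for display in the given locale."""
    rules = DISPLAY_RULES[locale_tag]
    amount = Decimal(str(price["value"])).quantize(Decimal("0.01"))
    # Format with "," grouping and "." decimal first, then swap in the
    # separators that the target locale expects.
    whole, frac = f"{amount:,.2f}".split(".")
    number = whole.replace(",", rules["group"]) + rules["decimal"] + frac
    symbol = SYMBOLS[price["currency"]]
    if rules["symbol_first"]:
        return f"{symbol}{number}"
    return f"{number} {symbol}"


# parse_float=Decimal keeps the exchanged value exact instead of binary float.
record = json.loads(WIRE, parse_float=Decimal)

print(format_price(record["price"], "en-US"))  # €2,000.00
print(format_price(record["price"], "de-DE"))  # 2.000,00 €
print(format_price(record["price"], "fr-FR"))  # 2 000,00 €
print(record["title"]["lang"])                 # fr
```

The exchanged data never changes; only the consumer-side rendering does, which is the separation Addison describes above.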