- From: Tandy, Jeremy <jeremy.tandy@metoffice.gov.uk>
- Date: Fri, 30 May 2014 19:36:48 +0000
- To: Jeni Tennison <jeni@theodi.org>, "public-csv-wg@w3.org" <public-csv-wg@w3.org>
- CC: "Tim Davies (Web Foundation)" <timdavies@webfoundation.org>, "david.megginson@megginson.com" <david.megginson@megginson.com>
Oh - and I should say that I focused on the HXL example rather than the
"360 giving" one because it touched on both the issues raised in the
email from Tim Davies.

Jeremy

> -----Original Message-----
> From: Tandy, Jeremy [mailto:jeremy.tandy@metoffice.gov.uk]
> Sent: 30 May 2014 18:04
> To: Jeni Tennison; public-csv-wg@w3.org
> Cc: Tim Davies (Web Foundation); david.megginson@megginson.com
> Subject: New i18n use case [WAS: CSV use case]
>
> Hi - following Jeni's earlier message, I have now added another use
> case to the document to describe the concerns raised: "Use Case #23 -
> Collating humanitarian information for crisis response"
> <http://w3c.github.io/csvw/use-cases-and-requirements/#UC-CollatingHumanitarianResponseInformation> ...
>
> You'll see this has introduced two new requirements:
>
> - <http://w3c.github.io/csvw/use-cases-and-requirements/#R-MultilingualContent>
> - <http://w3c.github.io/csvw/use-cases-and-requirements/#R-ListsAsRepeatedFields>
>
> Comments welcome ... especially from Tim Davies and David Megginson :-)
> One issue I have raised is whether HXL is still predicated on RDF;
> whether the conversion from tabular HXL into an RDF format is an
> accurate portrayal.
>
> Jeremy
>
> PS: you'll also notice that I've removed the references to "DDR" (as
> pointed out by AndyS recently, this was unhelpful additional
> terminology) and removed the empty "Terminology" section.
>
> > -----Original Message-----
> > From: Jeni Tennison [mailto:jeni@theodi.org]
> > Sent: 27 May 2014 12:31
> > To: public-csv-wg@w3.org
> > Cc: Tim Davies (Web Foundation); david.megginson@megginson.com
> > Subject: Fw: Re: CSV use case
> >
> > Some extra use cases re internationalisation of CSVs.
> >
> > Jeni
> >
> > ------------------------------------------------------
> > From: Tim Davies timdavies@webfoundation.org
> > Reply: Tim Davies timdavies@webfoundation.org
> > Date: 20 May 2014 at 23:36:13
> > To: Jeni Tennison jeni@theodi.org, david.megginson@megginson.com
> > david.megginson@megginson.com
> > Subject: Re: CSV use case
> >
> > > Hello Jeni,
> > >
> > > Good to hear from you. Yes, so there are two main cases and two
> > > approaches here. One based on the work David Megginson is doing on
> > > Humanitarian Exchange Language (I've copied David in so he can
> > > correct me when I misrepresent their work...;) - and one based on
> > > the 360 Giving Data Standard I worked on.
> > >
> > > *Issue 1:*
> > > Tabular data needs to be created, read by and exchanged between
> > > people speaking different languages. Many of these are basic
> > > spreadsheet users who will find it far easier to use data with
> > > natural and clear language in the column headings. Having the
> > > column headings in their own language will make creating and
> > > interpreting the data a lot easier.
> > >
> > > *Issue 2:*
> > > Tabular data needs to be created that contains literal values in
> > > multiple languages. For example, the name of a town in English,
> > > French and Arabic. The total number of languages that the data
> > > might be expressed in cannot be easily determined in advance, and
> > > it should be possible for a user to introduce a new language
> > > variant of a column easily.
> > >
> > > *The HXL approach*
> > >
> > > See https://groups.google.com/forum/#!topic/hxlproject/8cLoE5cqV1Y
> > >
> > > - A data dictionary is created with numerical codes equating to
> > >   field definitions.
> > > - Provided the column header contains the numerical code, all other
> > >   values in the column heading can be arbitrary (i.e. can be in the
> > >   plain language of the template creator's choice).
> > > - A parser extracts just the code and uses this to interpret the
> > >   meaning of the column.
> > > - Language codes can be attached onto the end of column codes to
> > >   indicate a language variant. E.g. if 010 is 'Source description'
> > >   then there can be an '010/en' column with 'Doctors without
> > >   Borders' and an '010/fr' column containing 'Médecins Sans
> > >   Frontières'.
> > >
> > > This has the advantage of being robust to people messing around
> > > with column titles (extra spaces etc.) as long as they don't mess
> > > with the ID.
> > >
> > > *The 360 Giving Approach*
> > >
> > > See http://threesixtygiving.github.io/standard/
> > >
> > > As yet - no multilingual version of this is implemented - but the
> > > idea is that:
> > >
> > > - The CSV serialisation is based on an underlying Ontology
> > >   (available at
> > >   https://github.com/ThreeSixtyGiving/prototype-tools) which means
> > >   there is a URI for each column (the final part of which provides
> > >   a machine-readable column ID), and labels, which can be expressed
> > >   in various languages.
> > > - When a version of the spreadsheet for humans is created, the
> > >   column ID is replaced with the English language label, or labels
> > >   from some other language.
> > > - A conversion tool is created to map between IDs and labels.
> > >
> > > As yet a way to address Issue 2 has not been proposed in this
> > > approach.
> > >
> > > I'm personally leaning more towards the HXL approach over the long
> > > run, though perhaps linked to an ontology with IDs for fields also
> > > rather than just a data dictionary to support more idiomatically
> > > friendly JSON and XML representations.
> > >
> > > Let me know if this covers what you needed, or if a write-up in
> > > some other style would be useful.
> > >
> > > Would also welcome any feedback on whether we're missing good ideas
> > > and approaches from the wider CSV standardisation work that we
> > > should be thinking about...
> > >
> > > All the best
> > >
> > > Tim
> > >
> > > On Sun, May 18, 2014 at 5:28 PM, Jeni Tennison wrote:
> > >
> > > > Tim,
> > > >
> > > > I hope you’re well?
> > > >
> > > > When we met up a little while ago, you talked about a CSV-based
> > > > format that you were putting together where you wanted the
> > > > general format to be the same across languages, but wanted the
> > > > headers to be different so that they were understandable to
> > > > non-English-language-speakers.
> > > >
> > > > I wonder if you could write a little description of the issue and
> > > > send me a couple of example files that show how that works, so
> > > > that I can include them as a use case for the CSV WG?
> > > >
> > > > Thanks,
> > > >
> > > > Jeni
> > > > --
> > > > Jeni Tennison, Technical Director theODI.org
> > > > +44 (0) 7974 420 482 @JeniT
> > >
> > > --
> > > Tim Davies
> > > Research Coordinator, Open Data Research Network
> > > +44 7834 856 303
> > > @timdavies | @odrnetwork | www.opendataresearch.org
> > >
> > > *World Wide Web Foundation | 1110 Vermont Ave NW, Suite 500,
> > > Washington DC 20005, USA | www.webfoundation.org |
> > > Twitter: @webfoundation*
> >
> > --
> > Jeni Tennison, Technical Director theODI.org
> > +44 (0) 7974 420 482 @JeniT
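
For illustration only, the HXL header convention described in Tim Davies's quoted message (a numeric code anywhere in the header, optionally followed by a language code such as '010/fr') might be parsed roughly as below. This is a sketch, not HXL's actual parser: the three-digit code length, the two- or three-letter language code, and the names DATA_DICTIONARY, parse_header and collate_row are all assumptions made for the example. Only the 010 = 'Source description' entry comes from the message.

```python
import re

# Hypothetical data dictionary; only this one entry appears in the message.
DATA_DICTIONARY = {"010": "Source description"}

# Assumption: a field code is exactly three digits, optionally followed by
# "/" and a two- or three-letter language code (e.g. "010/fr").
HEADER_CODE = re.compile(r"(?<!\d)(\d{3})(?:/([A-Za-z]{2,3}))?(?![A-Za-z\d])")


def parse_header(header):
    """Return (code, language) found in a free-text header, or None."""
    match = HEADER_CODE.search(header)
    if match is None:
        return None
    code, language = match.groups()
    if language is not None:
        language = language.lower()
    return (code, language)


def collate_row(headers, row):
    """Group one row's values by field code, then by language variant."""
    values = {}
    for header, value in zip(headers, row):
        parsed = parse_header(header)
        if parsed is None:
            continue  # header carries no recognisable code; skip the column
        code, language = parsed
        values.setdefault(code, {})[language] = value
    return values


# Header wording and spacing are arbitrary; only the code has to survive.
headers = ["Source description (010/en)", "  Description de la source 010/FR "]
row = ["Doctors without Borders", "Médecins Sans Frontières"]
collated = collate_row(headers, row)
```

Because only the code is matched, renaming or translating the rest of the header leaves the result unchanged, which is the robustness property the message describes. A column without a language suffix is stored under the key None.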
Received on Friday, 30 May 2014 19:37:21 UTC