RE: New i18n use case [WAS: CSV use case]

Oh - and I should say that I focused on the HXL example rather than the "360 giving" one because it touched on both the issues raised in the email from Tim Davies.

Jeremy

> -----Original Message-----
> From: Tandy, Jeremy [mailto:jeremy.tandy@metoffice.gov.uk]
> Sent: 30 May 2014 18:04
> To: Jeni Tennison; public-csv-wg@w3.org
> Cc: Tim Davies (Web Foundation); david.megginson@megginson.com
> Subject: New i18n use case [WAS: CSV use case]
> 
> Hi - following Jeni's earlier message, I have now added another use
> case to the document to describe the concerns raised: "Use Case #23 -
> Collating humanitarian information for crisis response"
> <http://w3c.github.io/csvw/use-cases-and-requirements/#UC-CollatingHumanitarianResponseInformation> ...
> 
> You'll see this has introduced two new requirements:
> 
> - <http://w3c.github.io/csvw/use-cases-and-requirements/#R-MultilingualContent>
> - <http://w3c.github.io/csvw/use-cases-and-requirements/#R-ListsAsRepeatedFields>
> 
> Comments welcome ... especially from Tim Davies and David Megginson :-)
> One issue I have raised is whether HXL is still predicated on RDF,
> and therefore whether describing a conversion from tabular HXL into
> an RDF format is an accurate portrayal.
> 
> Jeremy
> 
> PS: you'll also notice that I've removed the references to "DDR" (as
> pointed out by AndyS recently, this was unhelpful additional
> terminology) and removed the empty "Terminology" section.
> 
> > -----Original Message-----
> > From: Jeni Tennison [mailto:jeni@theodi.org]
> > Sent: 27 May 2014 12:31
> > To: public-csv-wg@w3.org
> > Cc: Tim Davies (Web Foundation); david.megginson@megginson.com
> > Subject: Fw: Re: CSV use case
> >
> > Some extra use cases re internationalisation of CSVs.
> >
> > Jeni
> >
> > ------------------------------------------------------
> > From: Tim Davies timdavies@webfoundation.org
> > Reply: Tim Davies timdavies@webfoundation.org
> > Date: 20 May 2014 at 23:36:13
> > > To: Jeni Tennison jeni@theodi.org, david.megginson@megginson.com
> > Subject:  Re: CSV use case
> >
> > > Hello Jeni,
> > >
> > > Good to hear from you. Yes, so there are two main cases and two
> > > approaches here. One based on the work David Megginson is doing on
> > > > Humanitarian Exchange Language (I've copied David in so he can
> > > > correct me when I misrepresent their work...;) - and one based on
> > > > the 360 Giving Data Standard I worked on.
> > >
> > >
> > > > *Issue 1:* Tabular data needs to be created, read and exchanged
> > > > by people speaking different languages. Many of these are basic
> > > > spreadsheet users who will find it far easier to use data with
> > > > natural and clear language in the column headings. Having the
> > > > column headings in their own language will make creating and
> > > > interpreting the data a lot easier.
> > >
> > >
> > > *Issue 2:*
> > > Tabular data needs to be created that contains literal values in
> > > > multiple languages. For example, the name of a town in English,
> > > > French and Arabic.
> > > The total number of languages that the data might be expressed in
> > > cannot be easily determined in advance, and it should be possible
> > > for a user to introduce a new language variant of a column easily.
> > >
> > > *The HXL approach*
> > > > See https://groups.google.com/forum/#!topic/hxlproject/8cLoE5cqV1Y
> > >
> > > > - A data dictionary is created with numerical codes equating to
> > > > field definitions
> > > > - Provided the column header contains the numerical code, all other
> > > > values in the column heading can be arbitrary (i.e. can be in the
> > > > plain language of the template creator's choice)
> > > > - A parser extracts just the code and uses this to interpret the
> > > > meaning of the column
> > > > - Language codes can be attached to the end of column codes to
> > > > indicate a language variant. E.g. if 010 is 'Source description'
> > > > then there can be an '010/en' column containing 'Doctors Without
> > > > Borders' and an '010/fr' column containing 'Médecins Sans
> > > > Frontières'
> > > >
> > > > This has the advantage of being robust to people messing around
> > > > with column titles (extra spaces etc.) as long as they don't mess
> > > > with the ID.
> > >
> > > *The 360 Giving Approach*
> > >
> > > > See http://threesixtygiving.github.io/standard/
> > >
> > > > As yet, no multilingual version of this is implemented, but the
> > > > idea is that:
> > >
> > > - The CSV serialisation is based on an underlying Ontology
> > > (available at
> > > > https://github.com/ThreeSixtyGiving/prototype-tools) which means
> > > > there is a URI for each column (the final part of which provides a
> > > > machine-readable column ID), and labels, which can be expressed in
> > > > various languages.
> > > - When a version of the spreadsheet for humans is created, the
> > > column ID is replaced with the English language label, or labels
> > > from some other language.
> > > - A conversion tool is created to map between IDs and labels.
> > >
> > > > As yet a way to address Issue 2 has not been proposed in this
> > > > approach.
> > >
> > > > I'm personally leaning more towards the HXL approach over the
> > > > long run, though perhaps also linked to an ontology with IDs for
> > > > fields, rather than just a data dictionary, to support more
> > > > idiomatic JSON and XML representations.
> > >
> > >
> > > > Let me know if this covers what you needed, or if a write-up in
> > > > some other style would be useful.
> > >
> > > Would also welcome any feedback on whether we're missing good ideas
> > > and approaches from the wider CSV standardisation work that we
> > > should be thinking about...
> > >
> > > All the best
> > >
> > > Tim
> > >
> > >
> > > On Sun, May 18, 2014 at 5:28 PM, Jeni Tennison wrote:
> > >
> > > > Tim,
> > > >
> > > > I hope you’re well?
> > > >
> > > > When we met up a little while ago, you talked about a CSV-based
> > > > > format that you were putting together where you wanted the
> > > > > general format to be the same across languages, but wanted the
> > > > > headers to be different so that they were understandable to
> > > > > non-English-language-speakers.
> > > >
> > > > I wonder if you could write a little description of the issue and
> > > > send me a couple of example files that show how that works, so
> > > > that I can include them as a use case for the CSV WG?
> > > >
> > > > Thanks,
> > > >
> > > > Jeni
> > > > --
> > > > Jeni Tennison, Technical Director theODI.org
> > > > +44 (0) 7974 420 482 @JeniT
> > > >
> > > >
> > >
> > >
> > > --
> > > --
> > > Tim Davies
> > > Research Coordinator, Open Data Research Network
> > > +44 7834 856 303
> > > @timdavies | @odrnetwork | www.opendataresearch.org
> > >
> > > *World Wide Web Foundation | **1110 Vermont Ave NW, Suite 500,
> > > Washington DC 20005, USA** | www.webfoundation.org |
> > > Twitter: @webfoundation*
> > >
> >
> > --
> > Jeni Tennison, Technical Director theODI.org
> > +44 (0) 7974 420 482 @JeniT
> >

Received on Friday, 30 May 2014 19:37:21 UTC