- From: Andy Seaborne <andy@apache.org>
- Date: Sat, 31 May 2014 15:07:00 +0100
- To: public-csv-wg@w3.org
On 30/05/14 20:36, Tandy, Jeremy wrote:
> Oh - and I should say that I focused on the HXL example rather than the "360 giving" one because it touched on both the issues raised in the email from Tim Davies.
>
> Jeremy
>
>> -----Original Message-----
>> From: Tandy, Jeremy [mailto:jeremy.tandy@metoffice.gov.uk]
>> Sent: 30 May 2014 18:04
>> To: Jeni Tennison; public-csv-wg@w3.org
>> Cc: Tim Davies (Web Foundation); david.megginson@megginson.com
>> Subject: New i18n use case [WAS: CSV use case]
>>
>> Hi - following Jeni's earlier message, I have now added another use
>> case to the document to describe the concerns raised: "Use Case #23 -
>> Collating humanitarian information for crisis response"
>> <http://w3c.github.io/csvw/use-cases-and-requirements/#UC-CollatingHumanitarianResponseInformation> ...
>>
>> You'll see this has introduced two new requirements:
>>
>> - <http://w3c.github.io/csvw/use-cases-and-requirements/#R-MultilingualContent>
>> "specify the language / locale relevant to each field"

Minor terminology point (Rufus has mentioned something similar): is "field" here referring to all the cells in a column?

(I'm reading from the general context that it isn't a particular (x,y) cell, though that isn't unimaginable.)

>> - <http://w3c.github.io/csvw/use-cases-and-requirements/#R-ListsAsRepeatedFields>

It could be either a list or repeated objects (in RDF speak)?

The other case of repeated fields is a repeated row where a blank means "same as above". This relates to hierarchies:

concept
    subconcept
    subconcept
concept
    subconcept
    subconcept
    subconcept

of which org charts are an example.

Andy

>>
>> Comments welcome ... especially from Tim Davies and David Megginson :-)
>> One issue I have raised is whether HXL is still predicated on RDF;
>> whether the conversion from tabular HXL into an RDF format is an
>> accurate portrayal.
>>
>> Jeremy
>>
>> PS: you'll also notice that I've removed the references to "DDR" (as
>> pointed out by AndyS recently, this was unhelpful additional
>> terminology) and removed the empty "Terminology" section.
>>
>>> -----Original Message-----
>>> From: Jeni Tennison [mailto:jeni@theodi.org]
>>> Sent: 27 May 2014 12:31
>>> To: public-csv-wg@w3.org
>>> Cc: Tim Davies (Web Foundation); david.megginson@megginson.com
>>> Subject: Fw: Re: CSV use case
>>>
>>> Some extra use cases re internationalisation of CSVs.
>>>
>>> Jeni
>>>
>>> ------------------------------------------------------
>>> From: Tim Davies timdavies@webfoundation.org
>>> Reply: Tim Davies timdavies@webfoundation.org
>>> Date: 20 May 2014 at 23:36:13
>>> To: Jeni Tennison jeni@theodi.org, david.megginson@megginson.com
>>> david.megginson@megginson.com
>>> Subject: Re: CSV use case
>>>
>>>> Hello Jeni,
>>>>
>>>> Good to hear from you. Yes, so there are two main cases and two
>>>> approaches here. One based on the work David Megginson is doing on
>>>> Humanitarian Exchange Language (I've copied David in so he can correct
>>>> me when I misrepresent their work...;) - and one based on the 360
>>>> Giving Data Standard I worked on.
>>>>
>>>>
>>>> *Issue 1:* Tabular data needs to be created, read by and exchanged
>>>> between people speaking different languages. Many of these are basic
>>>> spreadsheet users who will find it far easier to use data with natural
>>>> and clear language in the column headings. Having the column
>>>> headings in their own language will make creating and interpreting
>>>> the data a lot easier.
>>>>
>>>>
>>>> *Issue 2:*
>>>> Tabular data needs to be created that contains literal values in
>>>> multiple languages. For example, the name of a town in English,
>>>> French and Arabic.
>>>> The total number of languages that the data might be expressed in
>>>> cannot be easily determined in advance, and it should be possible
>>>> for a user to introduce a new language variant of a column easily.
>>>>
>>>> *The HXL approach*
>>>> See https://groups.google.com/forum/#!topic/hxlproject/8cLoE5cqV1Y
>>>>
>>>> - A data dictionary is created with numerical codes equating to
>>>> field definitions
>>>> - Providing the column header contains the numerical code, all other
>>>> values in the column heading can be arbitrary (i.e. can be in plain
>>>> language of the template creator's choice)
>>>> - A parser extracts just the code and uses this to interpret the
>>>> meaning of the column
>>>> - Language codes can be attached onto the end of column codes to
>>>> indicate a language variant. E.g. if 010 is 'Source description'
>>>> then there can be an '010/en' column with 'Doctors without Borders'
>>>> and an '010/fr' column containing 'Médecins Sans Frontières'
>>>>
>>>> This has the advantage of being robust to people messing around with
>>>> column titles (extra spaces etc.) as long as they don't mess with
>>>> the ID.
>>>>
>>>> *The 360 Giving Approach*
>>>>
>>>> See http://threesixtygiving.github.io/standard/
>>>>
>>>> As yet, no multilingual version of this is implemented - but the
>>>> idea is that:
>>>>
>>>> - The CSV serialisation is based on an underlying Ontology
>>>> (available at
>>>> https://github.com/ThreeSixtyGiving/prototype-tools) which means there
>>>> is a URI for each column (the final part of which provides a
>>>> machine-readable column ID), and labels, which can be expressed in
>>>> various languages.
>>>> - When a version of the spreadsheet for humans is created, the
>>>> column ID is replaced with the English language label, or labels
>>>> from some other language.
>>>> - A conversion tool is created to map between IDs and labels.
>>>>
>>>> As yet, a way to address Issue 2 has not been proposed in this
>>>> approach.
>>>>
>>>> I'm personally leaning more towards the HXL approach over the
>>>> long-run, though perhaps linked to an ontology with IDs for fields
>>>> also rather than just a data dictionary to support more
>>>> idiomatically friendly JSON and XML representations.
>>>>
>>>>
>>>> Let me know if this covers what you needed, or if a write-up in some
>>>> other style would be useful.
>>>>
>>>> Would also welcome any feedback on whether we're missing good ideas
>>>> and approaches from the wider CSV standardisation work that we
>>>> should be thinking about...
>>>>
>>>> All the best
>>>>
>>>> Tim
>>>>
>>>>
>>>> On Sun, May 18, 2014 at 5:28 PM, Jeni Tennison wrote:
>>>>
>>>>> Tim,
>>>>>
>>>>> I hope you’re well?
>>>>>
>>>>> When we met up a little while ago, you talked about a CSV-based
>>>>> format that you were putting together where you wanted the general
>>>>> format to be the same across languages, but wanted the headers to be
>>>>> different so that they were understandable to
>>>>> non-English-language-speakers.
>>>>>
>>>>> I wonder if you could write a little description of the issue and
>>>>> send me a couple of example files that show how that works, so
>>>>> that I can include them as a use case for the CSV WG?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jeni
>>>>> --
>>>>> Jeni Tennison, Technical Director theODI.org
>>>>> +44 (0) 7974 420 482 @JeniT
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> --
>>>> Tim Davies
>>>> Research Coordinator, Open Data Research Network
>>>> +44 7834 856 303
>>>> @timdavies | @odrnetwork | www.opendataresearch.org
>>>>
>>>> *World Wide Web Foundation | **1110 Vermont Ave NW, Suite 500,
>>>> Washington DC 20005, USA** | www.webfoundation.org |
>>>> Twitter: @webfoundation*
>>>>
>>>
>>> --
>>> Jeni Tennison, Technical Director theODI.org
>>> +44 (0) 7974 420 482 @JeniT
>>>
>
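
For illustration only, here is a minimal Python sketch of the header convention described in Tim Davies's quoted message under "The HXL approach": a parser pulls the numeric code out of an otherwise free-text column heading, plus an optional language suffix such as '/en'. The three-digit code width, the two- or three-letter language tag, and the names HEADER_CODE, ColumnKey and parse_header are all assumptions made for this sketch; none of them is taken from the HXL project itself.

```python
import re
from typing import NamedTuple, Optional

# Assumed syntax (not from HXL): exactly three digits, optionally followed
# by "/" and a two- or three-letter language tag, anywhere in the heading.
HEADER_CODE = re.compile(r"(?<!\d)(\d{3})(?:\s*/\s*([A-Za-z]{2,3})\b)?(?!\d)")


class ColumnKey(NamedTuple):
    code: str            # data-dictionary code, e.g. "010"
    lang: Optional[str]  # lower-cased language tag, or None if absent


def parse_header(header: str) -> Optional[ColumnKey]:
    """Return the code (and language, if any) found in a column heading.

    Everything else in the heading is ignored, so extra spaces or a
    translated human-readable title do not affect the result.
    """
    match = HEADER_CODE.search(header)
    if match is None:
        return None  # heading carries no recognisable code
    code, lang = match.groups()
    return ColumnKey(code, lang.lower() if lang else None)


if __name__ == "__main__":
    for heading in ("Source description 010/en",
                    "Description de la source (010/fr)",
                    "Notes"):
        print(repr(heading), "->", parse_header(heading))
```

Two headings that differ only in their human-readable text map to the same code here, which is the robustness property the quoted message points out; a heading with no code is simply reported as None rather than guessed at.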
Received on Saturday, 31 May 2014 14:07:30 UTC