Re: Model / Syntax Updates from Ivan Herman on 2014-02-25 (public-csv-wg@w3.org from February 2014)

From: Ivan Herman <ivan@w3.org>
Date: Tue, 25 Feb 2014 09:49:32 +0100
To: Yakov Shafranovich <yakov-ietf@shaftek.org>
Cc: Jeni Tennison <jeni@jenitennison.com>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-Id: <BB4783A4-BACC-4200-AA39-AB9023D8269B@w3.org>

On 24 Feb 2014, at 18:46 , Yakov Shafranovich <yakov-ietf@shaftek.org> wrote:

> If there are any suggestions towards updating RFC 4180 with I18N
> considerations in mind, that may be useful as well.

We can of course ask them to have a look at RFC 4180, too. But I am not sure they will have the time to do that.

By the way, there was another I18N issue that I was wondering about: the current syntax document says that the '-' character should be used to replace spaces in a column name. I do not know whether that character is a natural 'filler' character for all writing systems. It is for Latin, and probably for Cyrillic or Greek; but is this true for Arabic, for example? I just do not know.

(There are also languages where the space character is not widely used, like Chinese or Thai... but that is all right).
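
To make the question concrete, here is a minimal Python sketch of the kind of header fix-up under discussion. It is an illustration only: the function name, the `column` prefix used for blank names, and counting positions from the first field in the file are all assumptions, not text from the syntax document.

```python
def normalize_column_names(header, filler="-", prefix="column"):
    """Return cleaned-up column names for one CSV header row.

    Assumptions (hypothetical, not taken from the syntax document):
    - runs of whitespace inside a name are replaced by `filler`
    - a blank name becomes `prefix + filler + position`, where the
      position is 1-based and counted in file (storage) order
    """
    names = []
    for position, raw in enumerate(header, start=1):
        name = raw.strip()
        if name:
            # Join the whitespace-separated parts with the filler character.
            names.append(filler.join(name.split()))
        else:
            # Blank name: fall back to an index-based placeholder.
            names.append(f"{prefix}{filler}{position}")
    return names


if __name__ == "__main__":
    print(normalize_column_names(["first name", "", "age"]))
```

The filler is a parameter on purpose, since whether '-' suits every writing system is exactly the open question. Note also that the index follows the order of fields in the file, which need not match the visual order a reader sees in a right-to-left script.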

Ivan


> 
> Thanks,
> Yakov
> 
> On Mon, Feb 24, 2014 at 12:33 PM, Ivan Herman <ivan@w3.org> wrote:
>> 
>> On 24 Feb 2014, at 18:15 , Jeni Tennison <jeni@jenitennison.com> wrote:
>> 
>>> Hi Ivan,
>>> 
>>> Yes, good point about regions. I'll rephrase to "a set of rows and columns and all the fields within the rows for those columns" or something.
>>> 
>>> Similarly, good catch about I18N. What's the right way to approach the I18N people?
>> 
>> Well, when we feel it is o.k., we will have to contact the I18N WG:
>> 
>> Chair: Addison Phillips, addison@lab126.com, and the staff contact Richard Ishida, ishida@w3.org.
>> 
>> It is good if we do it on time, because they usually have lots of things on their plate. But I guess that asking specific questions might help.
>> 
>> Cheers
>> 
>> Ivan
>> 
>> 
>>> Jeni
>>> 
>>> ------------------------------------------------------
>>> From: Ivan Herman ivan@w3.org
>>> Reply: Ivan Herman ivan@w3.org
>>> Date: 24 February 2014 at 09:51:09
>>> To: Jeni Tennison jeni@jenitennison.com
>>> Subject:  Re: Model / Syntax Updates
>>> 
>>>> 
>>>> Hi Jeni,
>>>> 
>>>> Thanks!
>>>> 
>>>> One specific technical question: in 2.2 an annotated region
>>>> seems to be any loose set of fields, without any further restriction.
>>>> I.e., an annotated region is not necessarily a tabular sub-area
>>>> within the whole table; it can be a loose set of fields without
>>>> any structure. I wonder whether, for practical reasons, it is
>>>> not worth defining a tabular region that could be mapped, logically,
>>>> onto tabular data of its own.
>>>> 
>>>> I was also wondering about the I18N aspects of the definition.
>>>> For example, the text says that to fix up CSV files that have blank
>>>> columns, the parser should fix this up by indexing the column
>>>> names. Your text does not say, but the examples suggest that this
>>>> is done in a left-to-right manner at least in the syntax; I am not
>>>> sure that would be o.k. with right-to-left writing systems.
>>>> Possible constraints on the column names should also be cross-checked
>>>> with other writing systems. In general, we should probably have
>>>> the text reviewed by I18N people early on in the process (and not
>>>> wait until the text gets closer to the publication when it is always
>>>> more difficult to change).
>>>> 
>>>> thanks
>>>> 
>>>> Ivan
>>>> 
>>>> 
>>>> On 23 Feb 2014, at 19:23 , Jeni Tennison
>>>> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> Following the call last week, I have made some updates to the
>>>>> "Syntax for Tabular Data on the Web" document at
>>>>> 
>>>>> http://w3c.github.io/csvw/syntax/
>>>>> 
>>>>> Namely:
>>>>> 
>>>>> * I have separated out three levels of data model:
>>>>>   * a core data model which is just tables/columns/rows/fields
>>>>>   * an annotated data model in which each of these can be annotated
>>>>>   * a grouped data model in which there are multiple tables in a group
>>>>> 
>>>>> * I have stated that the ordering of columns is significant in
>>>>>   the core data model
>>>>> 
>>>>> I have defined the annotated data model extremely loosely:
>>>>> it just says that tables, columns, rows, fields and regions can
>>>>> be annotated, but it doesn't say anything about what those annotations
>>>>> might look like (eg that one of the annotations might be the *type*
>>>>> of a value). I think the direction I'd like to take that is to retain
>>>>> this very loose definition and then state that there are certain
>>>>> annotations (eg 'type', 'unique') that are understood by particular
>>>>> types of applications (eg validators, converters) in particular
>>>>> ways. Does that seem like a reasonable approach?
>>>>> 
>>>>> I haven't made any attempt to tackle the syntax for annotated
>>>>> or grouped tables as yet.
>>>>> 
>>>>> Jeni
>>>>> --
>>>>> Jeni Tennison
>>>>> http://www.jenitennison.com/
>>>>> 
>>>> 
>>>> 
>>>> ----
>>>> Ivan Herman, W3C
>>>> Digital Publishing Activity Lead
>>>> Home: http://www.w3.org/People/Ivan/
>>>> mobile: +31-641044153
>>>> GPG: 0x343F1A3D
>>>> FOAF: http://www.ivan-herman.net/foaf
>>>> 
>>> 
>>> --
>>> Jeni Tennison
>>> http://www.jenitennison.com/
>>> 
>> 
>> 
>> ----
>> Ivan Herman, W3C
>> Digital Publishing Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> GPG: 0x343F1A3D
>> FOAF: http://www.ivan-herman.net/foaf
>> 
>> 
>> 
>> 
>> 
> 


----
Ivan Herman, W3C 
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
FOAF: http://www.ivan-herman.net/foaf
Received on Tuesday, 25 February 2014 08:50:03 UTC