Re: Model / Syntax Updates from Ivan Herman on 2014-02-24 (public-csv-wg@w3.org from February 2014)

From: Ivan Herman <ivan@w3.org>
Date: Mon, 24 Feb 2014 10:50:00 +0100
To: Jeni Tennison <jeni@jenitennison.com>
Cc: W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-Id: <D8E9C17B-B101-4A15-AAF9-2552A7D7FA4C@w3.org>

Hi Jeni,

Thanks!

One specific technical question: in 2.2, an annotated region seems to be any loose set of fields, without any further restriction. That is, an annotated region is not necessarily a tabular sub-area within the whole table; it can be a set of fields without any structure. I wonder whether, for practical reasons, it would not be worth defining a tabular region that could be mapped, logically, onto tabular data of its own.
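
To make the distinction concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that a region is represented as a set of (row, column) index pairs; the function name and the representation are hypothetical and not taken from the document. A region counts as "tabular" here when its fields exactly fill their bounding rectangle.

```python
def is_tabular_region(cells):
    """Return True if the (row, column) pairs form a completely filled rectangle.

    Assumption: a region is modelled as an iterable of (row, column) indexes;
    this representation is hypothetical and only serves the illustration.
    """
    cells = set(cells)
    if not cells:
        return False
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    # Distinct cells fill their bounding box exactly when the counts match.
    return len(cells) == height * width


# A loose set of fields: allowed as an annotated region, but not tabular.
loose = {(0, 0), (2, 3), (5, 1)}
# A 2x3 block of fields: could be mapped onto a table of its own.
block = {(r, c) for r in range(1, 3) for c in range(2, 5)}

print(is_tabular_region(loose))  # False
print(is_tabular_region(block))  # True
```

Only regions of the second kind could be handed to code that expects a table.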

I was also wondering about the I18N aspects of the definition. For example, the text says that a parser should fix up CSV files that have blank columns by indexing the column names. Your text does not say so explicitly, but the examples suggest that this is done in a left-to-right manner, at least in the syntax; I am not sure that would be OK with right-to-left writing systems. Possible constraints on the column names should also be cross-checked against other writing systems. In general, we should probably have the text reviewed by I18N people early on in the process (rather than waiting until the text gets closer to publication, when it is always more difficult to change).
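
As an illustration of the kind of fix-up I have in mind, here is a small Python sketch. The "_col" prefix, the 1-based numbering and the function name are my own assumptions, not something the document prescribes. It numbers blank names by the position of the field in the parsed header row, that is, in the order the fields occur in the file, which is not necessarily the order in which a right-to-left reader would see them displayed.

```python
def fill_blank_column_names(header, prefix="_col"):
    """Replace empty header cells with a name built from their 1-based position.

    Assumptions: `header` is the list of column names in file order, and the
    "_col<N>" naming scheme is hypothetical.
    """
    return [
        name if name.strip() else f"{prefix}{index}"
        for index, name in enumerate(header, start=1)
    ]


print(fill_blank_column_names(["name", "", "age", " "]))
# ['name', '_col2', 'age', '_col4']
```

Whether "position" should mean position in the file or position on the screen is exactly what the text would need to state.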

Thanks

Ivan

On 23 Feb 2014, at 19:23 , Jeni Tennison <jeni@jenitennison.com> wrote:

> Hi,
> 
> Following the call last week, I have made some updates to the “Syntax for Tabular Data on the Web” document at
> 
>   http://w3c.github.io/csvw/syntax/
> 
> Namely:
> 
>   * I have separated out three levels of data model:
>     * a core data model which is just tables/columns/rows/fields
>     * an annotated data model in which each of these can be annotated
>     * a grouped data model in which there are multiple tables in a group
> 
>   * I have stated that the ordering of columns is significant in the core data model
> 
> I have defined the annotated data model extremely loosely: it just says that tables, columns, rows, fields and regions can be annotated, but it doesn’t say anything about what those annotations might look like (e.g. that one of the annotations might be the *type* of a value). I think the direction I’d like to take that is to retain this very loose definition and then state that there are certain annotations (e.g. 'type', 'unique') that are understood by particular types of applications (e.g. validators, converters) in particular ways. Does that seem like a reasonable approach?
> 
> I haven’t made any attempt to tackle the syntax for annotated or grouped tables as yet.
> 
> Jeni
> --  
> Jeni Tennison
> http://www.jenitennison.com/
> 

----
Ivan Herman, W3C 
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
FOAF: http://www.ivan-herman.net/foaf

Received on Monday, 24 February 2014 09:50:18 UTC