Re: [ACTION-15] General text on conversion from Andy Seaborne on 2014-05-14 (public-csv-wg@w3.org from May 2014)

From: Andy Seaborne <andy@apache.org>
Date: Wed, 14 May 2014 12:58:22 +0100
To: Jeni Tennison <jeni@jenitennison.com>, Ivan Herman <ivan@w3.org>
CC: W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-ID: <53735A5E.80507@apache.org>
On 14/05/14 12:40, Jeni Tennison wrote:
> Ivan, Andy,
>
> I’ve created a section called ‘Processing Tables’ in the Metadata spec here:
>
>    http://w3c.github.io/csvw/metadata/#processing-tables

2.4 Converting Tables
http://w3c.github.io/csvw/metadata/#h3_converting-tables

>
> which (will) include any general constraints about the interpretation of metadata and what conversion specifications have to do. This is all based on Ivan’s wording below.
>
> I think this is useful?
>
> Jeni
>
> ------------------------------------------------------
> From: Ivan Herman ivan@w3.org
> Reply: Ivan Herman ivan@w3.org
> Date: 14 May 2014 at 12:29:47
> To: Andy Seaborne andy@apache.org
> Cc: W3C CSV on the Web Working Group public-csv-wg@w3.org
> Subject:  Re: [ACTION-15] General text on conversion
>
>>
>> On 14 May 2014, at 11:54 , Andy Seaborne wrote:
>>
>>> On 08/05/14 12:26, Ivan Herman wrote:
>>>> Guys
>>>>
>>>> my action from yesterday[1] refers to a text that should be added to
>>>> the RDF conversion document. I have come up with something, based on
>>>> the email discussion,
>>>
>>> Thank you.
>>>
>>>> but also some additional issues; however, I am
>>>> not sure whether the RDF conversion document is the right place for
>>>> this. I wonder whether adding this as a separate section in the
>>>> syntax document is not a better choice.
>>>
>>> Gregg has suggested that if all the conversions are based around the
>>> template mechanism, then there be one conversions document for all of
>>> RDF, JSON and XML.
>>>
>>> That makes sense to me, although I also think that if someone arrives at
>>> the doc wanting, say, the details of JSON conversion, having them all in
>>> one place makes for a less focused document.
>>
>> I think this is something we will see evolving as it comes. At the moment, the text refers
>> to "format specific property-value annotation pairs", meaning that the templates
>> are rdf or json or xml specific; I am not yet sure whether some of those can be abstracted
>> out, so to say.
>>
>> Also: maybe, say, generating a URI using a template makes sense for RDF but maybe the same
>> cell should be put into a literal for JSON or XML. In which case separating the various
>> syntaxes does make sense.
>>
>> Bottom line: I am not sure...
>>
>>>
>>>>
>>>> After discussion and probably some word-smithing I am happy to put
>>>> the text into either of the documents themselves.
>>>>
>>>> So here we go...
>>>>
>>>> [[[
>>>>
>>>> This specification defines some general principles for the conversion
>>>> of CSV to other formats. These are:
>>>>
>>>> * The conversions are defined on tabular data, as defined in
>>>> the "Model for Tabular Data and Metadata on the Web" specification
>>>> [[!Tabular-Data-Model]]. This means that some of the specificities
>>>> (like Right-to-Left writing modes, or empty rows in the source file)
>>>> of CSV files are to be handled by the parsing step yielding the
>>>> tabular data.
>>>>
>>>> * A conversion specification MUST define a "default" mapping; i.e., a
>>>> mapping from core tabular data (as opposed to annotated tabular
>>>> data).
>>>>
>>>> * For the conversion of annotated tabular data:
>>>>
>>>> ** A conversion specification MUST specify how certain property-value
>>>> pairs, provided by the "Metadata Vocabulary for Tabular Data"
>>>> [[!Tabular-Metadata]], are mapped onto the output. These are:
>>>
>>>> *** @id
>>>
>>> Clarification - there are two uses of "@id" in the "issue one" illustration.
>>>
>>> You mean "@id" in "columns", where it indicates the way to generate an
>>> identifier for the row; it's a template-like thing for the subject.
>>
>> Yes, that is what I meant.
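
To make the "template-like thing for the subject" concrete, here is a minimal sketch of how a per-row identifier might be generated. It assumes a simple {column} placeholder syntax and percent-encoding of cell values; the function name, the example.org URI and the column names are all hypothetical, and nothing here is the syntax the metadata document defines.

```python
import re
from urllib.parse import quote

def expand_template(template, row):
    """Replace each {name} placeholder with the percent-encoded cell value."""
    def substitute(match):
        # match.group(1) is the column name inside the braces
        return quote(str(row[match.group(1)]), safe="")
    return re.sub(r"\{(\w+)\}", substitute, template)

# Hypothetical row and template, for illustration only
row = {"country": "AD", "name": "Andorra la Vella"}
subject = expand_template("http://example.org/place/{country}/{name}", row)
print(subject)
```

For RDF the result would serve as the subject of the row's triples; a JSON or XML conversion might instead emit the same value as a plain string.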
>>
>>>
>>> Not the "@id" that is the JSON-LD construct for referring to things within the metadata file.
>>>
>>> (I'd prefer different names - comment on the metadata document)
>>>
>>
>> +1
>>
>>>> *** @type
>>>
>>> and here we have "type" and "@type"
>>> You're referring to "type" in "columns"?
>>
>> Ah, again, I did not realize there were both @type and type. Yes, I meant for columns, like 'string'
>> or 'date'.
>>
>>>
>>>> *** field types
>>>> *** Primary Key
>>>> *** Foreign Key
>>>
>>> Links across files are URIs.
>>
>> I am not sure I understand what you say here. :-(
>>
>>>
>>>> ** A conversion specification MAY specify how other property-value
>>>> pairs, like column names, may be used on the output (e.g., as
>>>> additional metadata in the output)
>>>>
>>>> * The conversion specification MAY specify a number of additional
>>>> metadata items on the output, regardless of whether that particular
>>>> information is present in the annotations of the tabular data
>>>>
>>>> * The conversion specification MAY specify a number of format
>>>> specific property-value annotation pairs. These pairs are part of the
>>>> tabular data annotations, i.e., the metadata field descriptors
>>>> (@@@REF@@@), but only relevant for the specific output format.
>>>> Examples may be flags to specify whether a specific field should be
>>>> output as an XML element or an XML attribute, or a pattern
>>>> generating a URI for the RDF object (rather than using a literal).
>>>>
>>>> * The conversion specification MAY also specify a global, format
>>>> specific property (as part of the CSV annotation) specifying an
>>>> external processing step that should occur on the generated output.
>>>> Examples may be a reference to an XSLT file, a literal defining a
>>>> SPARQL CONSTRUCT pattern, or a reference to a JavaScript file. The
>>>> specification of those processing steps is not provided by this
>>>> Working Group. ]]]
>>>
>>> Not sure the conversion has to talk about that because it's outside the
>>> spec. Can't stop people doing additional stuff! The conversion can be
>>> aware of the possibility.
>>
>> Yes, it can be aware, but what I mean here is that the metadata may contain an RDF-specific
>> field for the table referring to a SPARQL CONSTRUCT. I.e., the standard may provide a placeholder
>> for those.
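
As an illustration of the format-specific property-value pairs in the draft text, the sketch below shows an XML conversion honouring a per-column flag and ignoring a pair meant for another format. The key names "xml-output" and "rdf-uri-pattern", the column names and the function are hypothetical placeholders, not anything the metadata vocabulary defines.

```python
import xml.etree.ElementTree as ET

# Hypothetical column descriptors carrying format-specific annotations
columns = [
    {"name": "code", "xml-output": "attribute"},
    {"name": "label", "xml-output": "element"},
    {"name": "note", "rdf-uri-pattern": "http://example.org/{note}"},
]

def row_to_xml(row, columns):
    """Build a <row> element, honouring only the XML-specific flag."""
    elem = ET.Element("row")
    for col in columns:
        name = col["name"]
        # Assumed default: a column without the flag becomes a child element
        mode = col.get("xml-output", "element")
        if mode == "attribute":
            elem.set(name, row[name])
        else:
            ET.SubElement(elem, name).text = row[name]
    return elem

row = {"code": "AD", "label": "Andorra", "note": "n1"}
xml_text = ET.tostring(row_to_xml(row, columns), encoding="unicode")
print(xml_text)
```

The RDF-specific pair on the "note" column is simply skipped by the XML conversion, which is the sense in which such pairs are "only relevant for the specific output format".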
>>
>>>
>>>> A specific issue: I was wondering whether the usage of, e.g., field
>>>> types or primary keys should be a MUST or a MAY. At the moment I set
>>>> it as a MUST, although a conversion specification may say that a
>>>> particular type is simply ignored as a type; but at least this has to
>>>> be specified. The alternative is to set it as a MAY.
>>>
>>> MUST/MAY are about conformance criteria.
>>>
>>> For defining the requirements of a conversion (that we are writing), we
>>> are not formally defining/testing conformance - or rather, it's just
>>> normal consistency across documents. No test suite.
>>
>> Well... I must admit what I had in mind is that these guidelines, if put into the syntax
>> document, may also provide some rules if, in future, somebody else comes and writes a
>> conversion for some other format that we do not know yet. In that case, the MUST/MAY is
>> something that conversion specification writers should take into account.
>>
>> Ivan
>>
>>
>>
>>
>>>
>>>>
>>>> I realize that this formulation means that the RDF conversion may
>>>> need some serious editing (not conceptually, just the way things are
>>>> presented). Sorry...
>>>>
>>>> Thoughts?
>>>>
>>>> Ivan
>>>
>>> Andy
>>>
>>>>
>>>>
>>>> [1] http://www.w3.org/2013/csvw/track/actions/15
>>>>
>>>> ----
>>>> Ivan Herman, W3C
>>>> Digital Publishing Activity Lead
>>>> Home: http://www.w3.org/People/Ivan/
>>>> mobile: +31-641044153
>>>> GPG: 0x343F1A3D
>>>> WebID: http://www.ivan-herman.net/foaf#me
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>> ----
>> Ivan Herman, W3C
>> Digital Publishing Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> GPG: 0x343F1A3D
>> WebID: http://www.ivan-herman.net/foaf#me
>>
>>
>>
>>
>>
>>
>
> --
> Jeni Tennison
> http://www.jenitennison.com/
>
Received on Wednesday, 14 May 2014 11:58:52 UTC