Re: [ACTION-15] General text on conversion

On 14 May 2014, at 13:40 , Jeni Tennison <jeni@jenitennison.com> wrote:

> Ivan, Andy,
> 
> I’ve created a section called ‘Processing Tables’ in the Metadata spec here:
> 
>   http://w3c.github.io/csvw/metadata/#processing-tables
> 
> which (will) include any general constraints about the interpretation of metadata and what conversion specifications have to do. This is all based on Ivan’s wording below.
> 
> I think this is useful?

It is. Some comments, though:

- Section 2.2.2 relies on the term 'display'. But the whole RTL/LTR issue is not only about 'display'; it is also relevant at the model level: while the syntax does say that the order of the columns is relevant, what this order is vis-à-vis the input stream depends on the directionality.

- For section 2.4, there should be an issue on the third paragraph. In my text I proposed that some fields (like primary keys) MUST be handled by the conversion in some way or other; your text makes that milder. Which is fine, and you may be right, but I do not know; marking that as a specific issue might help the discussion.

I am also not 100% sure that the whole section should be in the metadata document or the syntax document, eventually. That is obviously an editorial issue for later but, at this moment, the borderline between these two documents is a bit fuzzy...

Thanks!

Ivan


> 
> Jeni
> 
> ------------------------------------------------------
> From: Ivan Herman ivan@w3.org
> Reply: Ivan Herman ivan@w3.org
> Date: 14 May 2014 at 12:29:47
> To: Andy Seaborne andy@apache.org
> Cc: W3C CSV on the Web Working Group public-csv-wg@w3.org
> Subject:  Re: [ACTION-15] General text on conversion
> 
>> 
>> On 14 May 2014, at 11:54 , Andy Seaborne wrote:
>> 
>>> On 08/05/14 12:26, Ivan Herman wrote:
>>>> Guys
>>>> 
>>>> my action from yesterday[1] refers to a text that should be added to
>>>> the RDF conversion document. I have come up with something, based on
>>>> the email discussion,
>>> 
>>> Thank you.
>>> 
>>>> but also some additional issues; however, I am
>>>> not sure whether the RDF conversion document is the right place for
>>>> this. I wonder whether adding this as a separate section in the
>>>> syntax document is not a better choice.
>>> 
>>> Gregg has suggested that if all the conversions are based around the
>>> template mechanism, then there be one conversions document for all of
>>> RDF, JSON and XML.
>>> 
>>> That makes sense to me, although I also think that if someone arrives at the
>>> doc wanting, say, the details of JSON conversion, having them all in one place makes
>>> for a less focused document.
>> 
>> I think this is something we will see evolving as it comes. At the moment, the text refers  
>> to "format specific property-value annotation pairs", meaning that the templates  
>> are rdf or json or xml specific; I am not yet sure whether some of those can be abstracted  
>> out, so to say.
>> 
>> Also: maybe, say, generating a URI using a template makes sense for RDF but maybe the same  
>> cell should be put into a literal for JSON or XML. In which case separating the various  
>> syntaxes does make sense.
>> 
>> Bottom line: I am not sure...
>> 
>>> 
>>>> 
>>>> After discussion and probably some word-smithing I am happy to put
>>>> the text into either of the documents themselves.
>>>> 
>>>> So here we go...
>>>> 
>>>> [[[
>>>> 
>>>> This specification defines some general principles for the conversion
>>>> of CSV to other formats. These are:
>>>> 
>>>> * The conversions are defined on tabular data, as defined in
>>>> the "Model for Tabular Data and Metadata on the Web" specification
>>>> [[!Tabular-Data-Model]]. This means that some of the specificities
>>>> (like Right-to-Left writing modes, or empty rows in the source file)
>>>> of CSV files are to be handled by the parsing step yielding the
>>>> tabular data.
>>>> 
>>>> * A conversion specification MUST define a "default" mapping; i.e., a
>>>> mapping from core tabular data (as opposed to annotated tabular
>>>> data).
>>>> 
>>>> * For the conversion of annotated tabular data:
>>>> 
>>>> ** A conversion specification MUST specify how certain property-value
>>>> pairs, provided by the "Metadata Vocabulary for Tabular Data"
>>>> [[!Tabular-Metadata]], are mapped on the output. These are:
>>> 
>>>> *** @id
>>> 
>>> Clarification - there are two uses of "@id" in the "issue one" illustration.
>>> 
>>> You mean "@id" in "columns", where it is indicating the way to generate an identifier
>>> for the row; it's a template-like thing for the subject.
>> 
>> Yes, that is what I meant.
>> 
>>> 
>>> Not the "@id" that is the JSON-LD construct referring to within the metadata file.
>>> 
>>> (I'd prefer different names - comment on the metadata document)
>>> 
>> 
>> +1
>> 
>>>> *** @type
>>> 
>>> and here we have "type" and "@type"
>>> You're referring to "type" in "columns"?
>> 
>> Ah, again, I did not realize the @type and type. Yes, I meant for columns, like 'string'  
>> or 'date'.
>> 
>>> 
>>>> *** field types
>>>> *** Primary Key
>>>> *** Foreign Key
>>> 
>>> Links across files are URIs.
>> 
>> I am not sure I understand what you say here.:-(
>> 
>>> 
>>>> ** A conversion specification MAY specify how other property-value
>>>> pairs, like column names, may be used on the output (e.g., as
>>>> additional metadata in the output)
>>>> 
>>>> * The conversion specification MAY specify a number of additional
>>>> metadata on the output, regardless of whether that particular
>>>> information is present in the annotations of the tabular data
>>>> 
>>>> * The conversion specification MAY specify a number of format
>>>> specific property-value annotation pairs. These pairs are part of the
>>>> tabular data annotations, i.e., the metadata field descriptors
>>>> (@@@REF@@@), but only relevant for the specific output format.
>>>> Examples may be flags to specify whether a specific field should be
>>>> output as an XML element or an XML attribute, or a pattern
>>>> generating a URI for the RDF object (rather than using a literal).
>>>> 
>>>> * The conversion specification MAY also specify a global, format
>>>> specific property (as part of the CSV annotation) specifying an
>>>> external processing step that should occur on the generated output.
>>>> Examples may be a reference to an XSLT file, a literal defining a
>>>> SPARQL CONSTRUCT pattern, or a reference to a Javascript file. The
>>>> specification of those processing steps is not provided by this
>>>> Working Group. ]]]
>>> 
>>> Not sure the conversion has to talk about that because it's outside the spec. Can't stop
>>> people doing additional stuff! The conversion can be aware of the possibility.
>> 
>> Yes, it can be aware, but what I mean here is that the metadata may contain an rdf-specific
>> field for the table referring to a SPARQL CONSTRUCT. I.e., the standard may provide a placeholder
>> for those.
>> 
>>> 
>>>> A specific issue: I was wondering whether the usage of, e.g., field
>>>> types or primary keys should be a MUST or a MAY. At the moment I set
>>>> it as a MUST, although a conversion specification may say that a
>>>> particular type is simply ignored as a type; but at least this has to
>>>> be specified. The alternative is to set it as a MAY.
>>> 
>>> MUST/MAY are about conformance criteria.
>>> 
>>> For defining the requirements of a conversion (that we are writing), we are not formally
>>> defining/testing conformance - or rather, it's just normal consistency across documents.
>>> No test suite.
>> 
>> Well... I must admit what I had in my mind is that these guidelines, if put into the syntax  
>> document, may also provide some rules if, in future, somebody else comes and writes a  
>> conversion for some other format that we do not know yet. In that case, the MUST/MAY is  
>> something that conversion specification writers should take into account.
>> 
>> Ivan
>> 
>> 
>> 
>> 
>>> 
>>>> 
>>>> I realize that this formulation means that the RDF conversion may
>>>> need some serious editing (not conceptually, just the way things are
>>>> presented). Sorry...
>>>> 
>>>> Thoughts?
>>>> 
>>>> Ivan
>>> 
>>> Andy
>>> 
>>>> 
>>>> 
>>>> [1] http://www.w3.org/2013/csvw/track/actions/15
>>>> 
>>>> ----
>>>> Ivan Herman, W3C
>>>> Digital Publishing Activity Lead
>>>> Home: http://www.w3.org/People/Ivan/
>>>> mobile: +31-641044153
>>>> GPG: 0x343F1A3D
>>>> WebID: http://www.ivan-herman.net/foaf#me
> 
> --  
> Jeni Tennison
> http://www.jenitennison.com/


----
Ivan Herman, W3C 
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
WebID: http://www.ivan-herman.net/foaf#me

Received on Wednesday, 14 May 2014 11:57:10 UTC