Re: [ACTION-15] General text on conversion

On 14 May 2014, at 13:58 , Andy Seaborne <andy@apache.org> wrote:

> On 14/05/14 12:40, Jeni Tennison wrote:
>> Ivan, Andy,
>> 
>> I’ve created a section called ‘Processing Tables’ in the Metadata spec here:
>> 
>>   http://w3c.github.io/csvw/metadata/#processing-tables
> 
> 2.4 Converting Tables
> http://w3c.github.io/csvw/metadata/#h3_converting-tables

I was too quick, so I gave comments for 2.2 as well:-)

ivan

> 
>> 
>> which (will) include any general constraints about the interpretation of metadata and what conversion specifications have to do. This is all based on Ivan’s wording below.
>> 
>> I think this is useful?
>> 
>> Jeni
>> 
>> ------------------------------------------------------
>> From: Ivan Herman ivan@w3.org
>> Reply: Ivan Herman ivan@w3.org
>> Date: 14 May 2014 at 12:29:47
>> To: Andy Seaborne andy@apache.org
>> Cc: W3C CSV on the Web Working Group public-csv-wg@w3.org
>> Subject:  Re: [ACTION-15] General text on conversion
>> 
>>> 
>>> On 14 May 2014, at 11:54 , Andy Seaborne wrote:
>>> 
>>>> On 08/05/14 12:26, Ivan Herman wrote:
>>>>> Guys
>>>>> 
>>>>> my action from yesterday[1] refers to a text that should be added to
>>>>> the RDF conversion document. I have come up with something, based on
>>>>> the email discussion,
>>>> 
>>>> Thank you.
>>>> 
>>>>> but also some additional issues; however, I am
>>>>> not sure whether the RDF conversion document is the right place for
>>>>> this. I wonder whether adding this as a separate section in the
>>>>> syntax document is not a better choice.
>>>> 
>>>> Gregg has suggested that if all the conversions are based around the
>>>> template mechanism, then there be one conversions document for all of
>>>> RDF, JSON and XML.
>>>> 
>>>> That makes sense to me, although I also think that if someone arrives at the
>>>> doc wanting, say, the details of JSON conversion, having them all in one place
>>>> makes for a less focused document.
>>> 
>>> I think this is something we will see evolving as it comes. At the moment, the text refers
>>> to "format specific property-value annotation pairs", meaning that the templates
>>> are rdf or json or xml specific; I am not yet sure whether some of those can be abstracted
>>> out, so to say.
>>> 
>>> Also: maybe, say, generating a URI using a template makes sense for RDF but maybe the same
>>> cell should be put into a literal for JSON or XML. In which case separating the various
>>> syntaxes does make sense.
>>> 
>>> Bottom line: I am not sure...
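
To make the RDF versus JSON point concrete, here is a rough sketch of how the same cell could come out differently per format. Everything in it is an assumption for illustration: the function names, the "{value}" placeholder, and the simplified literal escaping are not taken from any of the drafts.

```python
from urllib.parse import quote


def cell_for_rdf(value, uri_template=None):
    """Render a cell as an RDF term (hypothetical helper).

    With a template the cell becomes an IRI; without one it stays a
    plain literal. The escaping below is a simplification.
    """
    if uri_template is not None:
        # Percent-encode the cell so it is safe inside the IRI.
        return "<" + uri_template.replace("{value}", quote(value, safe="")) + ">"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def cell_for_json(value):
    """Render the same cell for JSON: it simply stays a string."""
    return value


print(cell_for_rdf("NL", "http://example.org/country/{value}"))
print(cell_for_json("NL"))
```

If the two outputs have to diverge like this, that argues for keeping the format-specific parts separate.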
>>> 
>>>> 
>>>>> 
>>>>> After discussion and probably some word-smithing I am happy to put
>>>>> the text into either of the documents themselves.
>>>>> 
>>>>> So here we go...
>>>>> 
>>>>> [[[
>>>>> 
>>>>> This specification defines some general principles for the conversion
>>>>> of CSV to other formats. These are:
>>>>> 
>>>>> * The conversions are defined on tabular data, as defined in
>>>>> the "Model for Tabular Data and Metadata on the Web" specification
>>>>> [[!Tabular-Data-Model]]. This means that some of the specificities
>>>>> (like Right-to-Left writing modes, or empty rows in the source file)
>>>>> of CSV files are to be handled by the parsing step yielding the
>>>>> tabular data.
>>>>> 
>>>>> * A conversion specification MUST define a "default" mapping; i.e., a
>>>>> mapping from core tabular data (as opposed to annotated tabular
>>>>> data).
>>>>> 
>>>>> * For the conversion of annotated tabular data:
>>>>> 
>>>>> ** A conversion specification MUST specify how certain property-value
>>>>> pairs, provided by the "Metadata Vocabulary for Tabular Data"
>>>>> [[!Tabular-Metadata]], are mapped to the output. These are:
>>>> 
>>>>> *** @id
>>>> 
>>>> Clarification - there are two uses of "@id" in the "issue one" illustration.
>>>> 
>>>> You mean "@id" in "columns", where it is indicating the way to generate an identifier
>>>> for the row; it's a template-like thing for the subject.
>>> 
>>> Yes, that is what I meant.
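
A minimal sketch of what such a template-like "@id" could do, assuming a simple "{column}" placeholder syntax. The function name, the placeholder syntax, and the example URI are my assumptions, not something the metadata draft defines.

```python
import re
from urllib.parse import quote


def expand_row_id(template, row):
    """Fill each {column} placeholder with the row's percent-encoded cell value."""
    def fill(match):
        name = match.group(1)
        # A missing column raises KeyError rather than producing a broken URI.
        return quote(str(row[name]), safe="")
    return re.sub(r"\{([^{}]+)\}", fill, template)


row = {"country": "NL", "city": "Den Haag"}
print(expand_row_id("http://example.org/place/{country}/{city}", row))
```

Each row then gets its own subject URI, derived only from the cells of that row.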
>>> 
>>>> 
>>>> Not the "@id" that is the JSON-LD construct referring to within the metadata file.
>>>> 
>>>> (I'd prefer different names - comment on the metadata document)
>>>> 
>>> 
>>> +1
>>> 
>>>>> *** @type
>>>> 
>>>> and here we have "type" and "@type"
>>>> You're referring to "type" in "columns"?
>>> 
>>> Ah, again, I did not realize the @type and type. Yes, I meant for columns, like 'string'
>>> or 'date'.
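
As an illustration of what using the column "type" could mean in a conversion, here is a small sketch that turns cell strings into typed values. The set of type names and the fallback for unknown types are assumptions on my side; a conversion specification would have to fix both.

```python
from datetime import date

# Hypothetical mapping from column "type" names to converters.
CONVERTERS = {
    "string": str,
    "integer": int,
    "date": date.fromisoformat,  # expects YYYY-MM-DD
}


def convert_cell(value, column_type="string"):
    """Convert a raw cell string according to the column's declared type."""
    converter = CONVERTERS.get(column_type)
    if converter is None:
        # Unknown type: keep the raw string (one possible choice).
        return value
    return converter(value)


print(convert_cell("2014-05-14", "date"))
print(convert_cell("42", "integer"))
```

Whether an unknown type is ignored, as here, or treated as an error is exactly the kind of thing the conversion specification would have to state.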
>>> 
>>>> 
>>>>> *** field types
>>>>> *** Primary Key
>>>>> *** Foreign Key
>>>> 
>>>> Links across files are URIs.
>>> 
>>> I am not sure I understand what you say here.:-(
>>> 
>>>> 
>>>>> ** A conversion specification MAY specify how other property-value
>>>>> pairs, like column names, may be used on the output (e.g., as
>>>>> additional metadata in the output)
>>>>> 
>>>>> * The conversion specification MAY specify additional
>>>>> metadata on the output, regardless of whether that particular
>>>>> information is present in the annotations of the tabular data
>>>>> 
>>>>> * The conversion specification MAY specify a number of format
>>>>> specific property-value annotation pairs. These pairs are part of the
>>>>> tabular data annotations, i.e., the metadata field descriptors
>>>>> (@@@REF@@@), but only relevant for the specific output format.
>>>>> Examples may be flags to specify whether a specific field should be
>>>>> output as an XML element or an XML attribute, or a pattern
>>>>> generating a URI for the RDF object (rather than using a literal).
>>>>> 
>>>>> * The conversion specification MAY also specify a global, format
>>>>> specific property (as part of the CSV annotation) specifying an
>>>>> external processing step that should occur on the generated output.
>>>>> Examples may be a reference to an XSLT file, a literal defining a
>>>>> SPARQL CONSTRUCT pattern, or a reference to a Javascript file. The
>>>>> specification of those processing steps is not provided by this
>>>>> Working Group. ]]]
>>>> 
>>>> Not sure the conversion has to talk about that because it's outside the spec. Can't stop
>>>> people doing additional stuff! The conversion can be aware of the possibility.
>>> 
>>> Yes, it can be aware, but what I mean here is that the metadata may contain an rdf-specific
>>> field for the table referring to a SPARQL construct. Ie, the standard may provide a placeholder
>>> for those.
>>> 
>>>> 
>>>>> A specific issue: I was wondering whether the usage of, eg, field
>>>>> types or primary keys should be a MUST or a MAY. At the moment I set
>>>>> it as a MUST, although a conversion specification may say that a
>>>>> particular type is simply ignored as a type; but at least this has to
>>>>> be specified. The alternative is to set it as a MAY.
>>>> 
>>>> MUST/MAY are about conformance criteria.
>>>> 
>>>> For defining the requirements of a conversion (that we are writing), we are not formally
>>>> defining/testing conformance - or rather, it's just normal consistency across documents.
>>>> No test suite.
>>> 
>>> Well... I must admit what I had in mind is that these guidelines, if put into the syntax
>>> document, may also provide some rules if, in future, somebody else comes and writes a
>>> conversion for some other format that we do not know yet. In that case, the MUST/MAY is
>>> something that conversion specification writers should take into account.
>>> 
>>> Ivan
>>> 
>>> 
>>> 
>>> 
>>>> 
>>>>> 
>>>>> I realize that this formulation means that the RDF conversion may
>>>>> need some serious editing (not conceptually, just the way things are
>>>>> presented). Sorry...
>>>>> 
>>>>> Thoughts?
>>>>> 
>>>>> Ivan
>>>> 
>>>> Andy
>>>> 
>>>>> 
>>>>> 
>>>>> [1] http://www.w3.org/2013/csvw/track/actions/15
>>>>> 
>>>>> ----
>>>>> Ivan Herman, W3C
>>>>> Digital Publishing Activity Lead
>>>>> Home: http://www.w3.org/People/Ivan/
>>>>> mobile: +31-641044153
>>>>> GPG: 0x343F1A3D
>>>>> WebID: http://www.ivan-herman.net/foaf#me
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> --
>> Jeni Tennison
>> http://www.jenitennison.com/
>> 
> 
> 


----
Ivan Herman, W3C 
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
WebID: http://www.ivan-herman.net/foaf#me

Received on Wednesday, 14 May 2014 12:00:27 UTC