Re: [ACTION-15] General text on conversion from Jeni Tennison on 2014-05-14 (public-csv-wg@w3.org from May 2014)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Wed, 14 May 2014 12:40:53 +0100
To: Ivan Herman <ivan@w3.org>, Andy Seaborne <andy@apache.org>
Cc: W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-ID: <etPan.53735645.2f305def.5e3e@jenit.local>
Ivan, Andy,

I’ve created a section called ‘Processing Tables’ in the Metadata spec here:

  http://w3c.github.io/csvw/metadata/#processing-tables

which (will) include any general constraints about the interpretation of metadata and what conversion specifications have to do. This is all based on Ivan’s wording below.

I think this is useful?

Jeni

------------------------------------------------------
From: Ivan Herman ivan@w3.org
Reply: Ivan Herman ivan@w3.org
Date: 14 May 2014 at 12:29:47
To: Andy Seaborne andy@apache.org
Cc: W3C CSV on the Web Working Group public-csv-wg@w3.org
Subject:  Re: [ACTION-15] General text on conversion

>  
> On 14 May 2014, at 11:54 , Andy Seaborne wrote:
>  
> > On 08/05/14 12:26, Ivan Herman wrote:
> >> Guys
> >>
> >> my action from yesterday[1] refers to a text that should be added to
> >> the RDF conversion document. I have come up with something, based on
> >> the email discussion,
> >
> > Thank you.
> >
> >> but also some additional issues; however, I am
> >> not sure whether the RDF conversion document is the right place for
> >> this. I wonder whether adding this as a separate section in the
> >> syntax document is not a better choice.
> >
> > Gregg has suggested that if all the conversions are based around the
> > template mechanism, then there be one conversions document for all of
> > RDF, JSON and XML.
> >
> > That makes sense to me, although I also think that if someone arrives at the
> > doc wanting, say, the details of JSON conversion, having them all in one place makes
> > for a less focused document.
>  
> I think this is something we will see evolving as it comes. At the moment, the text refers  
> to "format specific property-value annotation pairs", meaning that the templates  
> are rdf or json or xml specific; I am not yet sure whether some of those can be abstracted  
> out, so to say.
>  
> Also: maybe, say, generating a URI using a template makes sense for RDF but maybe the same  
> cell should be put into a literal for JSON or XML. In which case separating the various  
> syntaxes does make sense.
>  
> Bottom line: I am not sure...
>  
> >
> >>
> >> After discussion and probably some word-smithing I am happy to put
> >> the text into either of the documents themselves.
> >>
> >> So here we go...
> >>
> >> [[[
> >>
> >> This specification defines some general principles for the conversion
> >> of CSV to other formats. These are:
> >>
> >> * The conversions are defined on tabular data, as defined in
> >> the "Model for Tabular Data and Metadata on the Web" specification
> >> [[!Tabular-Data-Model]]. This means that some of the specificities
> >> (like Right-to-Left writing modes, or empty rows in the source file)
> >> of CSV files are to be handled by the parsing step yielding the
> >> tabular data.
> >>
> >> * A conversion specification MUST define a "default" mapping; i.e., a
> >> mapping from core tabular data (as opposed to annotated tabular
> >> data).
> >>
> >> * For the conversion of annotated tabular data:
> >>
> >> ** A conversion specification MUST specify how certain property-value
> >> pairs, provided by the "Metadata Vocabulary for Tabular Data"
> >> [[!Tabular-Metadata]], are mapped on the output. These are:
> >
> > > *** @id
> >
> > Clarification - there are two uses of "@id" in the "issue one" illustration.
> >
> > You mean "@id" in "columns", where it is indicating the way to generate an identifier
> > for the row; it's a template-like thing for the subject.
>  
> Yes, that is what I meant.
>  
> >
> > Not the "@id" that is the JSON-LD construct referring to within the metadata file.
> >
> > (I'd prefer different names - comment on the metadata document)
> >
>  
> +1
>  
> > > *** @type
> >
> > and here we have "type" and "@type"
> > You're referring to "type" in "columns"?
>  
> Ah, again, I did not realize the @type and type. Yes, I meant for columns, like 'string'  
> or 'date'.
>  
> >
> > > *** field types
> > > *** Primary Key
> > > *** Foreign Key
> >
> > Links across files are URIs.
>  
> I am not sure I understand what you say here. :-(
>  
> >
> >> ** A conversion specification MAY specify how other property-value
> >> pairs, like column names, may be used on the output (e.g., as
> >> additional metadata in the output)
> >>
> >> * The conversion specification MAY specify a number of additional
> >> metadata on the output, regardless of whether that particular
> >> information is present in the annotations of the tabular data
> >>
> >> * The conversion specification MAY specify a number of format
> >> specific property-value annotation pairs. These pairs are part of the
> >> tabular data annotations, i.e., the metadata field descriptors
> >> (@@@REF@@@), but only relevant for the specific output format.
> >> Examples may be flags to specify whether a specific field should be
> >> output as an XML element or an XML attribute, or a pattern
> >> generating a URI for the RDF object (rather than using a literal).
> >>
> >> * The conversion specification MAY also specify a global, format
> >> specific property (as part of the CSV annotation) specifying an
> >> external processing step that should occur on the generated output.
> >> Examples may be a reference to an XSLT file, a literal defining a
> >> SPARQL CONSTRUCT pattern, or a reference to a Javascript file. The
> >> specification of those processing steps is not provided by this
> >> Working Group. ]]]
> >
> > Not sure the conversion has to talk about that because it's outside the spec. Can't stop
> > people doing additional stuff! The conversion can be aware of the possibility.
>  
> Yes, it can be aware, but what I mean here is that the metadata may contain an rdf-specific  
> field for the table referring to a SPARQL construct. Ie, the standard may provide a placeholder  
> for those.
>  
> >
> >> A specific issue: I was wondering whether the usage of, eg, field
> >> types or primary keys should be a MUST or a MAY. At the moment I set
> >> it as a MUST, although a conversion specification may say that a
> >> particular type is simply ignored as a type; But at least this has to
> >> be specified. Another is to set it as a MAY.
> >
> > MUST/MAY are about conformance criteria.
> >
> > For defining the requirements of a conversion (that we are writing), we are not formally
> > defining/testing conformance - or rather, it's just normal consistency across documents.
> > No test suite.
>  
> Well... I must admit what I had in my mind is that these guidelines, if put into the syntax
> document, may also provide some rules if, in future, somebody else comes and writes a  
> conversion for some other format that we do not know yet. In that case, the MUST/MAY is  
> something that conversion specification writers should take into account.
>  
> Ivan
>  
>  
>  
>  
> >
> >>
> >> I realize that this formulation means that the RDF conversion may
> >> need some serious editing (not conceptually, just the way things are
> >> presented). Sorry...
> >>
> >> Thoughts?
> >>
> >> Ivan
> >
> > Andy
> >
> >>
> >>
> >> [1] http://www.w3.org/2013/csvw/track/actions/15
> >>
> >> ---- Ivan Herman, W3C Digital Publishing Activity Lead Home:
> >> http://www.w3.org/People/Ivan/ mobile: +31-641044153 GPG: 0x343F1A3D
> >> WebID: http://www.ivan-herman.net/foaf#me
> >>
> >>
> >>
> >>
> >>
> >
> >
>  
>  
> ----
> Ivan Herman, W3C
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> GPG: 0x343F1A3D
> WebID: http://www.ivan-herman.net/foaf#me
>  
>  
>  
>  
>  
>  
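
To make the point in the quoted text about templates concrete (a template may generate a URI for RDF, while the same cell may simply be a literal for JSON or XML), here is a rough sketch. It is only an illustration under assumptions: the {column} placeholder syntax, the function names and the example.org URI are all hypothetical, and nothing here is defined by the metadata document.

```python
import re
from urllib.parse import quote


def expand_template(template: str, row: dict) -> str:
    """Fill each {column} placeholder with the percent-encoded cell value.

    The {column} syntax is an assumption made for this sketch only.
    """
    def substitute(match: re.Match) -> str:
        return quote(str(row[match.group(1)]), safe="")

    return re.sub(r"\{(\w+)\}", substitute, template)


def rdf_term(template: str, row: dict) -> str:
    """RDF-style output: the cell contributes to a URI built from the template."""
    return "<" + expand_template(template, row) + ">"


def json_value(column: str, row: dict) -> str:
    """JSON-style output: the same cell is emitted as a plain literal."""
    return row[column]


# Hypothetical row of tabular data.
row = {"id": "42", "name": "Ada Lovelace"}

subject = rdf_term("http://example.org/person/{id}", row)
literal = json_value("id", row)
```

With this shape the template is a format-specific annotation: only the RDF path looks at it, and the JSON path ignores it.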

--  
Jeni Tennison
http://www.jenitennison.com/
Received on Wednesday, 14 May 2014 11:41:21 UTC