- From: Jeni Tennison <jeni@jenitennison.com>
- Date: Wed, 14 May 2014 12:40:53 +0100
- To: Ivan Herman <ivan@w3.org>, Andy Seaborne <andy@apache.org>
- Cc: W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Ivan, Andy,

I’ve created a section called ‘Processing Tables’ in the Metadata spec here:

  http://w3c.github.io/csvw/metadata/#processing-tables

which (will) include any general constraints about the interpretation of metadata and what conversion specifications have to do. This is all based on Ivan’s wording below.

I think this is useful?

Jeni

------------------------------------------------------
From: Ivan Herman ivan@w3.org
Reply: Ivan Herman ivan@w3.org
Date: 14 May 2014 at 12:29:47
To: Andy Seaborne andy@apache.org
Cc: W3C CSV on the Web Working Group public-csv-wg@w3.org
Subject: Re: [ACTION-15] General text on conversion

>
> On 14 May 2014, at 11:54 , Andy Seaborne wrote:
>
> > On 08/05/14 12:26, Ivan Herman wrote:
> >> Guys
> >>
> >> my action from yesterday[1] refers to a text that should be added to
> >> the RDF conversion document. I have come up with something, based on
> >> the email discussion,
> >
> > Thank you.
> >
> >> but also some additional issues; however, I am
> >> not sure whether the RDF conversion document is the right place for
> >> this. I wonder whether adding this as a separate section in the
> >> syntax document is not a better choice.
> >
> > Gregg has suggested that if all the conversions are based around the
> > template mechanism, then there be one conversions document for all of
> > RDF, JSON and XML.
> >
> > That makes sense to me although I also think that, when someone arrives at the
> > doc wanting, say, the details of JSON conversion, having them all in one place makes
> > for a less focused document.
>
> I think this is something we will see evolving as it comes. At the moment, the text refers
> to "format specific property-value annotation pairs", meaning that the templates
> are rdf or json or xml specific; I am not yet sure whether some of those can be abstracted
> out, so to say.
>
> Also: maybe, say, generating a URI using a template makes sense for RDF but maybe the same
> cell should be put into a literal for JSON or XML.
> In which case separating the various
> syntaxes does make sense.
>
> Bottom line: I am not sure...
>
> >
> >>
> >> After discussion and probably some word-smithing I am happy to put
> >> the text into either of the documents themselves.
> >>
> >> So here we go...
> >>
> >> [[[
> >>
> >> This specification defines some general principles for the conversion
> >> of CSV to other formats. These are:
> >>
> >> * The conversions are defined on tabular data, as defined in
> >> the "Model for Tabular Data and Metadata on the Web" specification
> >> [[!Tabular-Data-Model]]. This means that some of the specificities
> >> (like Right-to-Left writing modes, or empty rows in the source file)
> >> of CSV files are to be handled by the parsing step yielding the
> >> tabular data.
> >>
> >> * A conversion specification MUST define a "default" mapping; i.e., a
> >> mapping from core tabular data (as opposed to annotated tabular
> >> data).
> >>
> >> * For the conversion of annotated tabular data:
> >>
> >> ** A conversion specification MUST specify how certain property-value
> >> pairs, provided by the "Metadata Vocabulary for Tabular Data"
> >> [[!Tabular-Metadata]], are mapped on the output. These are:
> >
> > > *** @id
> >
> > Clarification - there are two uses of "@id" in the "issue one" illustration.
> >
> > You mean "@id" in "columns", where it is indicating the way to generate an identifier
> > for the row; it's a template-like thing for the subject.
>
> Yes, that is what I meant.
>
> >
> > Not the "@id" that is the JSON-LD construct referring to within the metadata file.
> >
> > (I'd prefer different names - comment on the metadata document)
> >
>
> +1
>
> > > *** @type
> >
> > and here we have "type" and "@type"
> > You're referring to "type" in "columns"?
>
> Ah, again, I did not realize the @type and type. Yes, I meant for columns, like 'string'
> or 'date'.
>
> >
> > > *** field types
> > > *** Primary Key
> > > *** Foreign Key
> >
> > Links across files are URIs.
>
> I am not sure I understand what you say here. :-(
>
> >
> >> ** A conversion specification MAY specify how other property-value
> >> pairs, like column names, may be used on the output (e.g., as
> >> additional metadata in the output)
> >>
> >> * The conversion specification MAY specify a number of additional
> >> metadata on the output, regardless of whether that particular
> >> information is present in the annotations of the tabular data
> >>
> >> * The conversion specification MAY specify a number of format
> >> specific property-value annotation pairs. These pairs are part of the
> >> tabular data annotations, i.e., the metadata field descriptors
> >> (@@@REF@@@), but only relevant for the specific output format.
> >> Examples may be flags to specify whether a specific field should be
> >> output as an XML element or an XML attribute, or a pattern
> >> generating a URI for the RDF object (rather than using a literal).
> >>
> >> * The conversion specification MAY also specify a global, format
> >> specific property (as part of the CSV annotation) specifying an
> >> external processing step that should occur on the generated output.
> >> Example may be a reference to an XSLT file, a literal defining a
> >> SPARQL CONSTRUCT pattern, or a reference to a Javascript file. The
> >> specification of those processing steps is not provided by this
> >> Working Group. ]]]
> >
> > Not sure the conversion has to talk about that because it's outside the spec. Can't stop
> > people doing additional stuff! The conversion can be aware of the possibility.
>
> Yes, it can be aware, but what I mean here is that the metadata may contain an rdf-specific
> field for the table referring to a SPARQL construct. I.e., the standard may provide a placeholder
> for those.
>
> >
> >> A specific issue: I was wondering whether the usage of, eg, field
> >> types or primary keys should be a MUST or a MAY.
> >> At the moment I set
> >> it as a MUST, although a conversion specification may say that a
> >> particular type is simply ignored as a type; But at least this has to
> >> be specified. Another is to set it as a MAY.
> >
> > MUST/MAY are about conformance criteria.
> >
> > For defining the requirements of a conversion (that we are writing), we are not formally
> > defining/testing conformance - or rather, it's just normal consistency across documents.
> > No test suite.
>
> Well... I must admit what I had in my mind is that these guidelines, if put into the syntax
> document, may also provide some rules if, in future, somebody else comes and writes a
> conversion for some other format that we do not know yet. In that case, the MUST/MAY is
> something that conversion specification writers should take into account.
>
> Ivan
>
> >
> >>
> >> I realize that this formulation means that the RDF conversion may
> >> need some serious editing (not conceptually, just the way things are
> >> presented). Sorry...
> >>
> >> Thoughts?
> >>
> >> Ivan
> >
> > Andy
> >
> >>
> >> [1] http://www.w3.org/2013/csvw/track/actions/15
> >>
> >> ---- Ivan Herman, W3C Digital Publishing Activity Lead Home:
> >> http://www.w3.org/People/Ivan/ mobile: +31-641044153 GPG: 0x343F1A3D
> >> WebID: http://www.ivan-herman.net/foaf#me
> >>
> >
>
> ----
> Ivan Herman, W3C
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> GPG: 0x343F1A3D
> WebID: http://www.ivan-herman.net/foaf#me
>

--
Jeni Tennison
http://www.jenitennison.com/
Received on Wednesday, 14 May 2014 11:41:21 UTC