Re: Questions on the url property in Table annotation and on dialect being a core property

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Wed, 9 Dec 2015 09:52:03 -0800
Cc: W3C CSV on the Web Working Group <public-csv-wg@w3.org>, Jeni Tennison <jeni@jenitennison.com>, Ivan Herman <ivan@w3.org>
Message-Id: <31135E36-E649-49F5-97AC-D8C5171904CC@greggkellogg.net>
To: "Lars G. Svensson" <L.Svensson@dnb.de>
> On Dec 9, 2015, at 12:52 AM, Ivan Herman <ivan@w3.org> wrote:
> 
> (Cc-ing to Gregg & Jeni, as an additional ping to get their attention…)
> 
> Hey Lars,
> 
> 
>> On 4 Dec 2015, at 12:26, Svensson, Lars <L.Svensson@dnb.de> wrote:
>> 
>> Dear all,
>> 
>> While reviewing the WG documents (excellent work, large kudoi to the WG!)
> 
> Thank you!
> 
>> and thinking of how we could produce compatible data at our place, I stumbled over the url annotation on tables as defined in the metadata vocabulary, §5.4 [1]. The specification says that in the table metadata the url (URI?) is mandatory and should point to the table the table description describes, referring to the definition of url in the tabular data model [2] that says that the value of the url might be null.

Yes, the property must be present, but can be null, in which case it is treated as an empty URL, which is resolved relative to the metadata location. Note that the “url” property is a Link property, which means that if the value supplied is not a string, it is treated as an empty string, and so resolved against the metadata base.
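
To illustrate the resolution rule just described, here is a minimal sketch in Python. The function name `resolve_table_url` and the example.org URLs are made up for illustration, and the sketch assumes the base is simply the location of the metadata document (it ignores any other way a base might be set).

```python
from urllib.parse import urljoin


def resolve_table_url(url_value, metadata_location):
    """Resolve a table description's "url" value (hypothetical helper).

    Assumption: the base URL is the location of the metadata document.
    """
    # A value that is not a string (null, a number, ...) is treated as "".
    link = url_value if isinstance(url_value, str) else ""
    # An empty relative reference resolves to the base itself.
    return urljoin(metadata_location, link)


metadata_location = "http://example.org/catalogue/export.csv-metadata.json"
print(resolve_table_url("export.csv", metadata_location))
print(resolve_table_url(None, metadata_location))
```

With a null value the result is the metadata location itself, which is why a null “url” does not end up pointing at a separately downloaded CSV.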

>> At first sight my reading would be that for each table I describe with a table annotation in the metadata document, I MUST have a url property pointing from the metadata document to the described table. If so, that would be a major implementation obstacle at our place.

Yes, the “url” property is required, and its value must reference the CSV being used for the metadata to be considered compatible. If it doesn’t, the metadata is incompatible, which basically means that using it will issue a warning, unless you’re validating, in which case it’s an error. The reason for this is to take steps to be sure the metadata is compatible with the tables being interpreted. IIRC, Jeni was the chief proponent of this, and may have something more to say.
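
A rough sketch of that behaviour in Python, as I understand it: a mismatch yields a warning in normal processing and an error when validating. The names `check_url_compatibility` and `IncompatibleMetadataError` are hypothetical, and comparing the two URLs as plain strings is a simplification rather than what any particular implementation does.

```python
import warnings
from urllib.parse import urljoin


class IncompatibleMetadataError(ValueError):
    """Hypothetical error raised when validating incompatible metadata."""


def check_url_compatibility(csv_location, url_value, metadata_location,
                            validating=False):
    # Non-string values are treated as an empty string, as noted above.
    link = url_value if isinstance(url_value, str) else ""
    described = urljoin(metadata_location, link)
    # Simplification: plain string comparison, no URL normalisation.
    if described == csv_location:
        return True
    message = (f"metadata describes {described!r}, "
               f"but the table came from {csv_location!r}")
    if validating:
        raise IncompatibleMetadataError(message)
    warnings.warn(message)
    return False
```

Either way, the check only decides whether a warning or error is reported; it says nothing about whether a conversion could still be attempted.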

>> Our main use case for producing tabular data is that customers can go to the catalogue, select a number of object descriptions and export those as CSV. When the customer downloads the data, we would provide a Link-Header pointing to the metadata document describing the CSV format. It would, however, be almost impossible to point back from that metadata document to _all_ instances of CSV files ever created (particularly since that would also have privacy concerns, since it would be possible to see what other customers have downloaded).

If your intention is to have one metadata file work with arbitrary columns selected, you’ll run into other problems, as the expectation is that there is a 1:1 relationship between the columns in the CSV and those in the metadata. However, you might create a metadata document, referenced from the Link header, that is compatible with the CSV that is downloaded. It won’t validate as being compatible, but it should be usable for generating RDF or JSON from the result, as long as the column descriptions match those in the CSV file.
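
As a sketch of what that 1:1 expectation amounts to, the following Python compares a CSV header row with the column descriptions in a table description. It is deliberately simplified: `columns_match` is a hypothetical helper, only string or list values of "titles" are handled (falling back to "name"), and language maps or dialect settings are not considered.

```python
import csv
import io


def columns_match(csv_text, table_description):
    """Check that header cells line up 1:1 with the described columns."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    columns = table_description.get("tableSchema", {}).get("columns", [])
    if len(header) != len(columns):
        return False
    for cell, column in zip(header, columns):
        titles = column.get("titles", column.get("name"))
        if isinstance(titles, str):
            titles = [titles]
        # Simplification: anything other than a string or list is rejected.
        if not isinstance(titles, list) or cell not in titles:
            return False
    return True


description = {
    "url": None,
    "tableSchema": {"columns": [{"titles": "id"}, {"titles": ["title", "name"]}]},
}
print(columns_match("id,title\n1,Faust\n", description))
print(columns_match("id,title,author\n1,Faust,Goethe\n", description))
```

An export with a different selection of columns would need its own matching set of column descriptions.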

>> This boils down to the following question(s):
>> 
>> 1) Is my understanding of the use of the url property in the table metadata correct?
>> 2) If so, can I solve it by simply setting it to null?
> 
> That is my reading and, I think, that was our intention. If set to null, that means that the implementation makes the 'pairing' between the metadata and the data itself which, as far as I can see, is exactly what you do.

As I said, I don’t think so. If it’s set to null, it is interpreted as an empty string, which is a relative URL. However, this should just issue a warning. Note, however, that if the CSV and the metadata are both available at the same URL subject to content-negotiation, this would be valid. But, if you’re downloading the CSV and it has no location, there would be no way for the metadata to locate it anyway.

One thing which would be good within the spec, if not within an existing implementation, is to set the Location or Content-Location header to be the same as the URL of the metadata. A client that is aware of this would see that the location of the CSV was the same as that of the metadata referenced using the Link header and consider it compatible.
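
For concreteness, a sketch of the response headers such a download might carry, written as a small Python helper. The function `download_headers` and the example URL are hypothetical; the `rel="describedby"` and `type="application/csvm+json"` values are what I recall from the model document's Link header section and should be checked against it.

```python
def download_headers(metadata_url):
    """Build response headers for a CSV export (hypothetical helper)."""
    return {
        "Content-Type": "text/csv; charset=utf-8",
        # Points the client at the metadata document describing the CSV.
        "Link": f'<{metadata_url}>; rel="describedby"; '
                'type="application/csvm+json"',
        # Same URL as the metadata, so a null "url" would resolve to it.
        "Content-Location": metadata_url,
    }


headers = download_headers("http://example.org/catalogue/export-metadata.json")
for name, value in headers.items():
    print(f"{name}: {value}")
```

Whether a given client actually takes Content-Location into account when checking compatibility is an open question.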

In any case, it’s an issue of compatibility for the purpose of generating warnings or doing validation, which should not affect an actual transformation, but it would be nice if there were no warnings.

>> And one further question regarding dialect:
>> 
>> The dialect is an optional property in the table description. From my understanding, however, the dialect has major impact on the processing of the table. In the tabular format definition, core annotations are those that have impact on processor behaviour [3]. Does that mean, that dialect should be a core annotation or is that solved by defining default values for the dialect?
> 
> First of all, the dialect is optional. Furthermore, the dialect only provides hints; the parsing algorithm in the model document[1] is non normative. In other words, if your processor produces an annotated data model, that is fine; how the processor gets there, so to say, is not something these recommendations control…

Yes, I don’t believe dialect is considered a core annotation, as core annotations describe the annotated table, rather than the mechanisms used to generate the annotated table, which is what the dialect is used for. As Ivan says, it is really just a processing hint.

Gregg

> I hope this answers your questions!
> 
> Cheers
> 
> Ivan
> 
> 
> [1] http://www.w3.org/TR/2015/PR-tabular-data-model-20151117/#parsing
> 
>> 
>> [1] http://www.w3.org/TR/2015/PR-tabular-metadata-20151117/#tables
>> [2] http://www.w3.org/TR/2015/PR-tabular-data-model-20151117/#dfn-url
>> [3] http://www.w3.org/TR/2015/PR-tabular-data-model-20151117/#dfn-core-annotations
>> 
>> Thanks for any insight,
>> 
>> Lars
>> 
>> *** Lesen. Hören. Wissen. Deutsche Nationalbibliothek ***
>> --
>> Dr. Lars G. Svensson
>> Deutsche Nationalbibliothek
>> Informationsinfrastruktur
>> Adickesallee 1
>> D-60322 Frankfurt am Main
>> Telefon: +49-69-1525-1752
>> Telefax: +49-69-1525-1799
>> mailto:l.svensson@dnb.de
>> http://www.dnb.de
>> 
>> 
> 
> 
> ----
> Ivan Herman, W3C
> Digital Publishing Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Wednesday, 9 December 2015 17:52:34 UTC
