- From: David Booth <david@dbooth.org>
- Date: Sun, 06 Apr 2014 22:12:22 -0400
- To: "public-csv-wg@w3.org" <public-csv-wg@w3.org>, Jeni Tennison <jeni@jenitennison.com>, 'Gregg Kellogg' <gregg@greggkellogg.com>
Regarding sec 3.5 Standard Path,
http://w3c.github.io/csvw/syntax/#standard-path

1. I am very happy to see mention of standard path ideas, as I think in most cases a standard path will provide the easiest way for people to associate metadata with CSV documents. A big +1 from me on that!

2. I'm also happy to see the condition that "if the metadata file does not explicitly point to the relevant CSV file then it MUST be ignored", as this neatly avoids the problem of URI squatting. Again, +1 from me!

3. However, I think the phrasing of "if the metadata file does not *explicitly* point to the relevant CSV file then it MUST be ignored" may be slightly overstating the requirement. To avoid URI squatting, the important thing is merely that the metadata document explicitly identify itself *as* a CSV metadata document -- not the fact that it is associated with any specific data document. For example, a metadata document placed in a directory may be intended to apply to all CSV files in that directory.

So I think the question of whether such a directory-level metadata document should be required to explicitly list all affected CSV documents should be viewed only as a trade-off between:

(a) the convenience of not needing to modify the directory-level metadata file each time a new CSV file is added to that directory; and

(b) a potential mistake that a publisher might make, in placing a CSV file into a directory containing a directory-level metadata file that was not intended to apply to that CSV file.

At this point I think it would be substantially better to lean toward convenience -- option (a). Personally, I hate having to make coordinated changes in two different places. It violates the Don't Repeat Yourself (DRY) principle. And if a directory-level metadata file were required to explicitly list all of the data files to which it applies, I actually think the chances of someone forgetting to add an entry to it when adding another data file would be substantial.
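
The trade-off in point 3 can be sketched roughly as two acceptance rules. This is only an illustration: the metadata is modelled as a plain dict, and the "type" and "files" keys, the "csv-metadata" marker value, and both function names are hypothetical, not taken from the draft.

```python
# Hypothetical metadata shape: a dict with a self-identifying "type" marker
# and an optional "files" list. Neither key comes from the draft.

def applies_strict(metadata: dict, csv_name: str) -> bool:
    """Draft wording: ignore the metadata unless it explicitly names the CSV file."""
    return csv_name in metadata.get("files", [])


def applies_lenient(metadata: dict, csv_name: str) -> bool:
    """Relaxed rule: it is enough that the document identifies itself as CSV metadata.

    csv_name is deliberately unused; it is kept so both rules share a signature.
    """
    return metadata.get("type") == "csv-metadata"


directory_metadata = {"type": "csv-metadata", "files": ["a.csv"]}
squatter = {"title": "some unrelated document at the standard path"}

# A newly added b.csv is covered only under the lenient rule (option (a)).
print(applies_strict(directory_metadata, "b.csv"))
print(applies_lenient(directory_metadata, "b.csv"))
# A document that does not identify itself as CSV metadata is ignored either way.
print(applies_lenient(squatter, "b.csv"))
```

Under the strict rule the publisher has to edit the "files" list every time a CSV file is added; under the lenient rule no edit is needed, at the cost of risk (b) above.
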
I guess one possible middle ground approach would be to require a directory-level metadata file to include a filename pattern (actually a relative URI pattern), to indicate which files in the directory should be governed by that metadata.

4. The current draft (I think) suggests standard locations:

  CSV file: filename.csv
  Metadata file: filename.csvm

If the metadata itself is encoded as a CSV file, then another possibility to consider would be:

  CSV file: filename.csv
  Metadata file: filename.csv.metadata.csv

This would have the benefit of using an established .csv extension. It's also less cryptic than .csvm.

5. In theory it would be fine to offer data publishers multiple Standard Path ways to publish a CSV document's metadata, as long as a standard prioritization between them is defined. However, the more ways there are, the more hassle it is to implement software in a standards-compliant way. So from this perspective, I think the fewer ways there are the better, as long as the standard makes it easy *enough* for data publishers.

Thanks,
David
Received on Monday, 7 April 2014 02:12:50 UTC