Re: Specifying the number of rows in a table

> On Oct 10, 2018, at 1:46 PM, Clark Fitzgerald <clarkfitzg@gmail.com> wrote:
> 
> Hello,
> 
> I would like to use W3's tabular data model to record metadata for local CSV files relevant for statistical analysis using the R language (or Python, Julia). For example, to indicate that the local file "data.csv" contains one million rows in randomized order I might use the following table description:
> 
> {
> "url": "data.csv",
> "notes": [{"numberRows": 1e6, "randomized": true}]
> }
> 
> A couple questions:
> 
> 1. Is this reasonable/correct?

More or less. The metadata document can have a “notes” attribute with arbitrary content, which creates a notes annotation on the table [1]. To be treated properly as JSON-LD/RDF, though, both “numberRows” and “randomized” need to resolve to IRIs: either use a prefixed name from one of the namespaces defined for CSVW [2], or use an absolute IRI for the property. Without this, the data would be dropped when interpreted, at least by the CSV2RDF process.
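
As a rough sketch of the absolute-IRI option, the description could be built and checked like this. The http://example.org/vocab# namespace and the resolves_to_iri helper are hypothetical placeholders, not part of CSVW, and the prefix set is only a small subset of what the CSVW context defines:

```python
import json
from urllib.parse import urlparse

# A few of the prefixes predefined by the CSVW context (illustrative subset only).
CSVW_PREFIXES = {"dc", "dcat", "foaf", "schema", "xsd"}

# Hypothetical vocabulary namespace; example.org is a placeholder, not a real vocabulary.
VOCAB = "http://example.org/vocab#"


def resolves_to_iri(key):
    """Return True if key is an absolute http(s) IRI or uses a known prefix."""
    if urlparse(key).scheme in ("http", "https"):
        return True
    prefix, sep, _ = key.partition(":")
    return bool(sep) and prefix in CSVW_PREFIXES


# Same table description as in the question, with the note properties
# spelled out as absolute IRIs instead of bare terms.
metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "data.csv",
    "notes": [{VOCAB + "numberRows": 1000000, VOCAB + "randomized": True}],
}

# Collect any note properties that would not resolve to an IRI.
unresolved = [k for note in metadata["notes"] for k in note if not resolves_to_iri(k)]

print(json.dumps(metadata, indent=2))
```

With the bare terms from the original description, the same check would report both “numberRows” and “randomized” as unresolved.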

> 2. Is there a better way to do it? Perhaps by linking to a document that defines new common properties like numberRows?

This is somewhat problematic, as CSVW doesn’t allow you to define arbitrary JSON-LD contexts; otherwise, you might define your namespace or term mappings in a context within “notes”. It’s not an unreasonable thing to do, though, IMHO, and tool creators may be convinced to support this as an extension. In fact, an update to the spec could remove the restriction on the value of @context, and/or allow CSVW to be used within other contexts, such as schema.org, but strictly speaking, you can’t do this now.

Gregg

[1] https://www.w3.org/TR/2015/REC-tabular-data-model-20151217/#dfn-table-notes
[2] https://www.w3.org/ns/csvw#term-definitions

> Thanks,
> Clark Fitzgerald

Received on Thursday, 11 October 2018 22:10:00 UTC