Re: Binary Files

The CSVW properties are really there to describe information once it is in a tabular format. There is a section on parsing in the tabular-data-model spec [1], and many of the details are application specific. The ones described pretty much presume that you can retrieve column and row data.

Although not specifically endorsed by the specs, a derivative format that included properties for extracting tabular data from binary formats (or any otherwise non-tabular format) could extend the tabular metadata standard. At the time it was published, extending the context for describing metadata beyond the built-in namespaces and terms was not provided for, on the thinking that many toolchains might not include JSON-LD processors; today, I think it’s more likely that they do. A derived format could incorporate the standardized vocabulary, provide for generalized JSON-LD context processing (which is even more sophisticated in JSON-LD 1.1), and introduce terms appropriate for binary formats. This would be a great thing to consider contributing to the CSV on the Web Community Group [2], which could consider publishing it as a report, giving it some credence.
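
To make the idea concrete, here is a minimal sketch in Python of how a derived format might drive the reading of rows from the fixed-width binary file described in the original question. Everything binary-specific here is an assumption for illustration only: the "byteOrder" term, the STRUCT_CODES mapping, and the helper names row_struct and read_rows are hypothetical and are not part of CSVW. Only the general shape (a table schema with typed columns) comes from the standard.

```python
import struct

# Hypothetical metadata for a derived format. "byteOrder" is NOT a CSVW
# term; it stands in for whatever a binary extension might define.
metadata = {
    "url": "example.bin",
    "byteOrder": "big",  # hypothetical extension term: "big" or "little"
    "tableSchema": {
        "columns": [
            {"name": "col0", "datatype": "unsignedLong"},  # 64 bits
            {"name": "col1", "datatype": "unsignedInt"},   # 32 bits
        ]
    },
}

# Assumed mapping from XSD datatype names to struct format codes
# (fixed-width integers only).
STRUCT_CODES = {"unsignedLong": "Q", "unsignedInt": "I", "long": "q", "int": "i"}


def row_struct(meta):
    """Build a struct layout for one row from the column datatypes."""
    # An explicit byte-order prefix selects standard sizes with no padding.
    prefix = ">" if meta["byteOrder"] == "big" else "<"
    codes = "".join(
        STRUCT_CODES[col["datatype"]] for col in meta["tableSchema"]["columns"]
    )
    return struct.Struct(prefix + codes)


def read_rows(data, meta):
    """Yield one dict per fixed-width record found in the bytes `data`."""
    layout = row_struct(meta)
    names = [col["name"] for col in meta["tableSchema"]["columns"]]
    for values in layout.iter_unpack(data):
        yield dict(zip(names, values))


# Two sample records packed as big-endian (unsigned long, unsigned int).
sample = struct.pack(">QIQI", 1, 2, 3, 4)
rows = list(read_rows(sample, metadata))
```

Once rows and columns can be retrieved this way, the rest of the CSVW processing model could in principle apply unchanged, which is why the byte order and record layout are the parts a derived format would need to add terms for.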

Gregg Kellogg
gregg@greggkellogg.net

[1] https://www.w3.org/TR/2015/REC-tabular-data-model-20151217/#parsing
[2] https://www.w3.org/community/csvw/


> On Oct 28, 2021, at 6:33 AM, Erich Bremer <erich@ebremer.com> wrote:
> 
> Does CSVW have any language to indicate file byte order endianness?  - E
> 
> On Wed, Oct 27, 2021 at 7:31 PM Gregg Kellogg <gregg@greggkellogg.net> wrote:
> In principle, CSVW is rather format agnostic, so given an appropriate description of the file format, and a means of reading rows, columns and headers, it should work for binary formats. 
> 
> Gregg Kellogg
> 
>> On Oct 27, 2021, at 12:34 PM, Erich Bremer <erich@ebremer.com> wrote:
>> 
>> 
>> Can https://www.w3.org/TR/csv2rdf/ be used to describe a binary file? For example, take a binary file with one column as a long and the second column as an int, with multiple records of (long + int), so each row would be 64 bits + 32 bits = 96 bits wide. I started to create a vocabulary to do this, but CSVW seemed similar enough that it would work, though perhaps with a bit of bending of the original intent of CSVW. I would think it would look something like this with CSVW/schema.org:
>> 
>> {
>>     "@id" : "example.bin/",
>>     "@type" : "https://schema.org/MediaObject",
>>     "contentSize" : "34806000",
>>     "description" : "a simple two-column binary file long/int",
>>     "encodingFormat" : "application/octet-stream",
>>     "tableSchema" : "_:b137"
>> },
>> {
>>     "@id" : "_:b137",
>>     "column" : {
>>       "@list" : [ "example.bin/#col=0", "example.bin/#col=1" ]
>>     },
>>     "https://www.w3.org/ns/csvw/header" : false
>> },
>> {
>>     "@id" : "example.bin/#col=0",
>>     "@type" : "https://www.w3.org/ns/csvw/Column",
>>     "datatype" : "xsd:unsignedLong"
>> }, {
>>     "@id" : "example.bin/#col=1",
>>     "@type" : "https://www.w3.org/ns/csvw/Column",
>>     "datatype" : "xsd:unsignedInt"
>> }
>> 
>> I think it would be useful to describe a binary file in RDF.  - Erich

Received on Saturday, 30 October 2021 19:10:53 UTC