Re: Convenience suggestion: Allow metadata in a CSV file

> On 30 Apr 2015, at 18:32 , Jeni Tennison <jeni@jenitennison.com> wrote:
> 
> Hi David,
> 
> Sorry, I looked at the example metadata that you provided and it includes data, so that misled me (I'll admit I didn't manage to watch the video as I was on a train but I now have and understand the goal there).
> 
> We did discuss CSV syntax for the metadata early on as well. I was initially keen on this idea myself.
> 
> However, the problem is that CSV is best suited to tabular information and metadata isn't tabular. You've encountered this in your example where you have rows that are mostly empty in which the column headers are completely irrelevant to the contents of the cells. If you try to support the entirety of the metadata vocabulary I think there will be a number of instances where the constraints of the tabular syntax start to really bite (multiple tables, derived datatypes and foreign keys are the ones that spring to mind). It certainly isn't impossible to support those things, but I think it is difficult and I don't think the result will be particularly user friendly.
> 
> I think it would be interesting to investigate alternative syntaxes for parts of tabular metadata (e.g. just schemas) and/or a specialist text-based format for the metadata (à la the compact syntax for RELAX NG or the Manchester syntax for ontologies). But I think these are substantial pieces of work and not things that we can take on right now.
> 
> I suspect that if supplying metadata for CSV files takes off we will find tools start to develop more user friendly syntaxes to save people writing JSON by hand, as you have done, and these could be used to inform standardisation of such a syntax.
> 
> So as you suspected, this isn't something that I think we can take on at this stage.
> 

For the record: I agree.

Note that the present metadata syntax is JSON, but it is also JSON-LD compatible. That is, although this is not emphasized in the document, the metadata can be expressed in a number of other syntaxes, most obviously Turtle, without any change to the definition of the metadata. In other words, the possibilities for syntaxes are large...
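
To illustrate the point, here is a minimal Python sketch that walks a small metadata document and lists it as plain subject/property/value statements, the shape that a syntax such as Turtle serialises. The sample document, the file name example.csv, the blank-node labels and the helper to_statements are all assumptions made for illustration. This is not a JSON-LD processor: it ignores the @context entirely, and a faithful conversion to Turtle would need a real JSON-LD library.

```python
import json

# Hypothetical metadata document (an assumption for illustration only).
metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "example.csv",
    "tableSchema": {
        "columns": [
            {"name": "id", "datatype": "integer"},
            {"name": "label", "datatype": "string"},
        ]
    },
}


def to_statements(node, subject="_:b0", counter=None):
    """Flatten nested JSON into (subject, property, value) statements.

    Nested objects get fresh blank-node labels (_:b1, _:b2, ...).
    Keys starting with "@" (such as "@context") are skipped, so this is
    only a structural sketch, not a JSON-LD expansion.
    """
    if counter is None:
        counter = [0]  # mutable counter shared across recursive calls
    statements = []
    for key, value in node.items():
        if key.startswith("@"):
            continue
        # Treat a single value and a list of values uniformly.
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                counter[0] += 1
                child = f"_:b{counter[0]}"
                statements.append((subject, key, child))
                statements.extend(to_statements(item, child, counter))
            else:
                # json.dumps quotes strings and leaves numbers bare.
                statements.append((subject, key, json.dumps(item)))
    return statements


statements = to_statements(metadata)
for s, p, o in statements:
    print(s, p, o, ".")
```

The same list of statements could equally be written back out as JSON, as Turtle, or as some future compact syntax; only the surface form changes, not the metadata being described.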

Ivan

> Does that make sense? Is there anything that you'd like to see in the specs that leaves this possible future work in play?
> 
> Jeni
> 
> 
> On Thu, 30 Apr 2015 12:26 David Booth <david@dbooth.org> wrote:
> Hi Jeni,
> 
> This approach would *not* require the publisher to amend existing CSV
> files.  The metadata is provided in a *separate* CSV file requiring no
> changes whatsoever to existing CSV formats.  Was the video unclear about
> that?  (Apologies if so.)
> 
> Thanks,
> David Booth
> 
> On 04/30/2015 03:37 AM, Jeni Tennison wrote:
> > Hi David,
> >
> > Yes, we did discuss this earlier on and you might be aware of similar
> > approaches in HXL [1] and Linked CSV [2].
> >
> > We decided to rule this out of scope for now, mostly because adoption
> > would require publisher effort to amend existing CSV files and we only
> > had time to address the 80% case.
> >
> > However, we have tried to ensure that the specifications support the
> > scenario where someone (maybe a future incarnation of the group) defines
> > a CSV-based syntax that includes embedded metadata. You'll see an
> > example of how that could work in [3].
> >
> > Can you confirm that you're content with this response?
> >
> > Thanks,
> >
> > Jeni
> >
> > [1] http://hxlstandard.org/
> > [2] http://jenit.github.io/linked-csv/
> > [3]
> > http://w3c.github.io/csvw/syntax/index.html#recognising-tabular-data-formats
> >
> > On 30 Apr 2015 03:43, "David Booth" <david@dbooth.org
> > <mailto:david@dbooth.org>> wrote:
> >
> >     I don't know if the working group has already considered this, but
> >     I'd like to suggest allowing CSV metadata to be specified
> >     in another CSV file, as an alternative to JSON.  I have found this
> >     approach to be quite convenient in a tool that I've been developing,
> >     and I think it could increase uptake of a CSV metadata standard.
> >
> >     Here is a very short mockup video (2 minutes 59 seconds) that
> >     illustrates this approach:
> >     https://www.youtube.com/watch?v=LmQWHdaN8_w
> >
> >     I realize that some CSV metadata authors may prefer JSON syntax.
> >     But as simple as JSON is, spreadsheet competence is far more
> >     widespread.  Also I would not blame anyone for being disinclined to
> >     consider this approach given the late date.  But this approach only
> >     involves different syntax -- not semantics -- and if it does indeed
> >     lower the adoption barrier then it seems to me that it would be
> >     worth considering.
> >
> >     What do others think?
> >
> >     Thanks,
> >     David Booth
> >
> >
> 


----
Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704

Received on Friday, 1 May 2015 08:27:56 UTC