
Re: tab2rdf.py doesn't generate Turtle

From: Jonathan Rees <jar@creativecommons.org>
Date: Tue, 7 Sep 2010 15:16:25 -0400
Message-ID: <AANLkTim3paJp5V6G1yrefRoYH8QFjU+w9x6VAzhF8o=U@mail.gmail.com>
To: Tim Berners-Lee <timbl@w3.org>
Cc: public-cwm-bugs@w3.org
Also if the last column(s) contain only spaces you get syntactically
incorrect output (stanza that ends in ; instead of .).
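
For illustration, here is a minimal sketch of one way to avoid the
dangling ";". It is not the tab2n3.py source and not my own version;
the function name row_to_turtle and the "column%d" fallback name are
assumptions made up for this example. It splits on every tab so empty
cells keep their position, skips blank cells, and only then joins the
remaining pairs, so the last pair always ends in "." and a row with no
values produces no stanza at all.

```python
def row_to_turtle(headings, line):
    """Render one tab-separated line as a Turtle stanza, or '' if it has no values."""
    # Split on every tab so that empty cells keep their column position.
    cells = line.rstrip("\r\n").split("\t")
    pairs = []
    for i, cell in enumerate(cells):
        value = cell.strip()
        if not value:
            continue  # blank or space-only cell: emit no triple for this column
        # Hypothetical fallback name for cells beyond the known headings.
        name = headings[i] if i < len(headings) else "column%d" % i
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        pairs.append('   :%s "%s"' % (name, escaped))
    if not pairs:
        return ""  # a subject with no predicate and object is not valid Turtle
    # Join with ';' between pairs and terminate the stanza with '.'.
    return "[]\n" + ";\n".join(pairs) + ".\n"
```

With the four headings from the example quoted below, the line
"x21<TAB>x22<TAB><TAB>" would then give a stanza whose last line is
:head2 "x22". rather than one ending in a semicolon.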

I now have my own version that works, but I can't give it to you
because you haven't given me a license to do so. (sorry, that's a
half-joke.) Any chance you can affix a license to the source file, or
put one in some other prominent location? I can't use this code in the
long run unless it's clear that I can redistribute it.

Best
Jonathan

On Wed, Aug 11, 2010 at 11:20 AM, Jonathan Rees <jar@creativecommons.org> wrote:
> Ah, there still is a bug, when a line with the correct number of
> columns ends in *two* tab characters / empty columns... (this is a
> real-life example)
>
> Input:  (note 3rd line ends with two tab characters)
>
> head1   head2   head3   head4
> x11     x12     x13     x13
> x21     x22
>
> Output:
>
> # headings found:  4 ['head1', 'head2', 'head3', 'head4']
> []
>    :head1 "x11";
>    :head2 "x12";
>    :head3 "x13";
>    :head4 "x13".
> #  4 headings but 3 values
> []
>    :head1 "x21";
>    :head2 "x22";
> # Total number of records: 2
>
>
> On Wed, Aug 11, 2010 at 11:07 AM, Jonathan Rees <jar@creativecommons.org> wrote:
>> Ah, I think this is my problem... too few headings added to top of
>> file. The comment about getting a syntax error on output when there
>> are too many input columns and the last is empty still holds, but
>> that's really an input error so I'll take the blame. Sorry for the
>> false alarm.
>>
>> Jonathan
>>
>> # headings found:  3 ['head1', 'head2', 'head3']
>> []
>>    :head1 "x11";
>>    :head2 "x12";
>>    :head3 "x13".
>> []
>>    :head1 "x21";
>>    :head3 "x23".
>> #  3 headings but 2 values
>> []
>>    :head1 "x31";
>>    :head2 "x32".
>> []
>>    :head1 "x41";
>>    :head2 "x42";
>> #  3 headings but 4 values
>> []
>>    :head1 "x51";
>>    :head2 "x52";
>>    :column3 "x54".
>> #  3 headings but 4 values
>> []
>>    :head1 "x61";
>>    :head2 "x62";
>>    :head3 "x63";
>>    :column3 "x64".
>> # Total number of records: 6
>>
>>
>>
>> On Wed, Aug 11, 2010 at 10:56 AM, Jonathan Rees <jar@creativecommons.org> wrote:
>>> Thanks for the quick fix. Now I'm having trouble with files that have
>>> empty columns (two tabs in a row).  It would be nice if the script
>>> interpreted that situation as meaning that the value is the empty
>>> string; or else as there being no value at all... anything that treats
>>> tabs as significant and keeps things aligned the way Excel would.
>>>
>>> (Also, if there is an extra column with an empty value, i.e. extra tab
>>> char at end of line, you get syntactically incorrect output.)
>>>
>>> Thanks
>>> Jonathan
>>>
>>> head1   head2   head3
>>> x11     x12     x13
>>> x21             x23
>>> x31     x32
>>> x41     x42
>>> x51     x52             x54
>>> x61     x62     x63     x64
>>>
>>> On Mon, Aug 2, 2010 at 6:04 PM, Tim Berners-Lee <timbl@w3.org> wrote:
>>>> Sigh.  A random and unnecessary departure of Turtle from N3 IMHO.
>>>>
>>>> Changed the code anyway.  Almost 10 years on
>>>>
>>>> Tim
>>>>
>>>>
>>>> total revisions: 6;     selected revisions: 6
>>>> description:
>>>> ----------------------------
>>>> revision 1.6
>>>> date: 2010/08/02 22:01:22;  author: timbl;  state: Exp;  lines: +31 -8
>>>> See mail from JAR to public-cwm-talk mid:AANLkTingr3WLyjOFVQqz-v5AASMe4sFbqkhn4ap26DiG@mail.gmail.com
>>>> ----------------------------
>>>> revision 1.5
>>>> date: 2007/10/18 20:55:41;  author: timbl;  state: Exp;  lines: +38 -17
>>>> doublequote escaping
>>>> ----------------------------
>>>> revision 1.4
>>>> date: 2007/06/26 02:36:15;  author: syosi;  state: Exp;  lines: +55 -55
>>>> fix tabs
>>>> ----------------------------
>>>> revision 1.3
>>>> date: 2000/11/10 23:04:18;  author: timbl;  state: Exp;  lines: +1 -1
>>>> Starting basis for qualifiers
>>>> ----------------------------
>>>> revision 1.2
>>>> date: 2000/11/02 20:48:45;  author: timbl;  state: Exp;  lines: +10 -0
>>>> first schema hack
>>>> ----------------------------
>>>> revision 1.1
>>>> date: 2000/10/31 15:56:37;  author: timbl;  state: Exp;
>>>> Hacked TabDelimted-windows format to n3 converter
>>>> =============================================================================
>>>>
>>>> On 2010-08-02, at 16:37, Jonathan Rees wrote:
>>>>
>>>>> I'm trying to use http://www.w3.org/2000/10/swap/tab2n3.py
>>>>> and have discovered the hard way that it doesn't generate Turtle...
>>>>> not that it claims to, but I felt like complaining.
>>>>>
>>>>> Apparently the Turtle grammar requires a predicate and object
>>>>> following each subject, and in the output of tab2n3 there are subjects
>>>>> with no predicate and object. E.g.
>>>>>
>>>>> # headings found:  3 ['strain_id', 'strain_name', 'strain_type']
>>>>> [
>>>>>    :strain_id "MGI:2164743";
>>>>>    :strain_name "(C57BL/6JEiJ x C3Sn.BLiA-Pde6b<+>)F1";
>>>>>    :strain_type "Not Specified";
>>>>> ] .
>>>>>
>>>>> Here's what the Turtle submission says:
>>>>>
>>>>> [6]   triples ::=     subject predicateObjectList
>>>>> [7]   predicateObjectList     ::=     verb objectList ( ';' verb objectList )* ( ';')?
>>>>>
>>>>> To generate correct Turtle is possible but awkward. You could say
>>>>> tab2n3 is working as designed, and was never meant to generate Turtle,
>>>>> only N3, but... wouldn't it be nice?
>>>>>
>>>>> Jonathan
>>>>>
>>>>
>>>>
>>>
>>
>
Received on Tuesday, 7 September 2010 19:17:03 UTC
