
Re: implicit delimiter escaping, log files

From: Andy Seaborne <andy@apache.org>
Date: Wed, 26 Feb 2014 11:27:42 +0000
Message-ID: <530DCFAE.9050409@apache.org>
To: public-csv-wg@w3.org
Stasinos,

The "Tabular Data on the Web" doc now has a data model (section 2) as 
well as a concrete syntax (section 3).  CSV+ is very close to the data 
model.

The data model could be the target for describing the relationship of 
other existing syntaxes.  Hopefully, that would allow the metadata work to 
be reusable for such existing syntaxes.

Does that approach work for you?

	Andy

On 26/02/14 07:57, Stasinos Konstantopoulos wrote:
> Jeni, all,
>
> There is also data where the last column extends to the end-of-line
> regardless of any unescaped/unquoted delimiters it might contain.
>
> There might be more examples, but the one that immediately springs to
> mind is log files such as those written by PostgreSQL:
>
> 2014-02-14 05:58:55 EET LOG:  received fast shutdown request
> 2014-02-14 05:58:55 EET LOG:  aborting any active transactions
> 2014-02-14 05:58:55 EET LOG:  autovacuum launcher shutting down
> 2014-02-14 05:58:55 EET LOG:  shutting down
> 2014-02-14 05:58:55 EET LOG:  database system is shut down
> 2014-02-14 05:59:28 EET LOG:  database system was shut down at 2014-02-14 05:58:55 EET
> 2014-02-14 05:59:28 EET LOG:  incomplete startup packet
> 2014-02-14 05:59:28 EET LOG:  database system is ready to accept connections
> 2014-02-14 05:59:28 EET LOG:  autovacuum launcher started
>
> This format can be read in different ways [1] so the example might not
> be perfect, but it is only meant to illustrate the point. I am sure
> there will be more data like this, where everything left after the Nth
> character or the Mth delimiter is a single text field, no matter what it
> contains.
>
> The more general point for the group's consideration is whether log
> files in general are in scope, regardless of whether we are discussing
> difficult ones or more CSV-behaved ones, such as the Common Logfile
> Format [2].
>
> Till later,
> stasinos
>
> [1] fixed-length fields except the last, or two columns delimited by
> the left-most occurrence of the string "LOG:"
>
> [2] http://www.w3.org/Daemon/User/Config/Logging.html
>
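
For illustration only, here is a minimal Python sketch of two of the readings mentioned in the quoted message: everything after the Mth delimiter is one field, or two columns split at the left-most "LOG:". The function names split_last_field and split_on_marker are hypothetical, and treating runs of whitespace as the delimiter is an assumption, not something either message specifies.

```python
def split_last_field(line, max_fields):
    """Split on whitespace, but stop after max_fields - 1 splits so the
    final field keeps any delimiters it contains (assumed whitespace-delimited)."""
    return line.split(None, max_fields - 1)


def split_on_marker(line, marker="LOG:"):
    """Two-column reading: split at the left-most occurrence of marker."""
    head, found, tail = line.partition(marker)
    if not found:
        raise ValueError("marker %r not found in line" % marker)
    return [head.strip(), tail.strip()]


line = ("2014-02-14 05:59:28 EET LOG:  database system was shut down "
        "at 2014-02-14 05:58:55 EET")

# Five fields: date, time, zone, level, and the free-text message.
fields = split_last_field(line, 5)

# Two fields: the timestamp prefix and the free-text message.
columns = split_on_marker(line)
```

In both readings the message text survives intact even though it contains spaces and a second timestamp, which is the behaviour the quoted example is meant to show.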
Received on Wednesday, 26 February 2014 11:28:12 UTC
