- From: Yakov Shafranovich <yakov-ietf@shaftek.org>
- Date: Thu, 20 Nov 2014 07:35:27 -0500
- To: Juergen Umbrich <juergen.umbrich@wu.ac.at>
- Cc: "public-csv-wg@w3.org" <public-csv-wg@w3.org>, Sebastian Neumaier <sebastian.neumaier@wu.ac.at>
- Message-ID: <CAPQd5oT7s5Z_=DoLxDYzhy_rcn=ofBnk2j-k6jPVtmAaDHQE4Q@mail.gmail.com>
Thanks

On Thu, Nov 20, 2014 at 7:31 AM, Juergen Umbrich <juergen.umbrich@wu.ac.at> wrote:
> Hi Yakov,
>
> > I am wondering if there is a correlation between the correct MIME type
> > being used and the software being used as identified by the "Server"
> > header. Is there any chance you may have that data?
>
> Sure, this data is available, and we can hopefully get the numbers at the
> beginning of next week, since I won't be able to compile them this week.
>
> Best
> Jürgen
>
> > Thanks,
> > Yakov
> >
> > On Wed, Nov 19, 2014 at 6:30 AM, Juergen Umbrich <juergen.umbrich@wu.ac.at> wrote:
> > > Hi all,
> > >
> > > as "announced" last week, here is our first early report about our
> > > findings by looking into 65k CSV files, published as OpenData on the Web.
> > >
> > > "This study reports on our findings about 74395 CSV files published on
> > > the Web as Open Data. The documents are extracted from 91 Open Data CKAN
> > > portals for which the metadata indicate a comma/character-separated-values
> > > file. Our analysis includes the inspection of the HTTP response headers,
> > > encoding detection and guessing of used delimiters. We also determine the
> > > deviation of data tables compared to a canonical form [1].
> > >
> > > Our findings show that the majority of the CSV files adhere to the
> > > RFC4180 specification, meaning the use of csv as file extension, text/csv
> > > as the HTTP response header content-type, and ',' as delimiter. We also
> > > show that there exists nearly no information about the content encoding in
> > > the HTTP headers. The major observed deviations are that data tables
> > > contain rows in which one or several data cells occupy multiple columns and
> > > that one or several data cells are empty."
> > >
> > > Best
> > > Jürgen
> > >
> > > --
> > > Dr. Jürgen Umbrich
> > > WU Vienna, Institute for Information Business
>
> --
> Dr. Jürgen Umbrich
> WU Vienna, Institute for Information Business
Received on Thursday, 20 November 2014 12:36:25 UTC