- From: Ivan Herman <ivan@w3.org>
- Date: Mon, 31 Mar 2014 11:48:27 +0200
- To: Andy Seaborne <andy@apache.org>
- Cc: W3C CSV on the Web Working Group <public-csv-wg@w3.org>
- Message-Id: <664915A1-F1B5-49B3-8B5A-7972FF46CF80@w3.org>
On 31 Mar 2014, at 11:39 , Andy Seaborne <andy@apache.org> wrote:

> On 31/03/14 08:33, Ivan Herman wrote:
>>
>> On 31 Mar 2014, at 01:24 , Yakov Shafranovich <yakov-ietf@shaftek.org> wrote:
>>
>>> see:
>>>
>>> http://data.gov.il/data?title=&category=All&type=All&ministry=All&file_type=csv
>>>
>>> Looks like the columns are going in reverse order
>>>
>>
>> This may be an interesting use case to investigate a bit further (beyond this), because:
>>
>> - I am not sure what encoding is used. If I download a file (I tried [1]) and read it into iWork Numbers or simply look at it in a text editor, I get gibberish, although programs on Macs do usually handle UTF-8 natively, afaik. The question is, then, how does one find out what encoding is used? Note that "curl --head" on [1] does not reveal any more information. Yakov, I presume you succeeded in getting it in Hebrew; how did you get the right results?
>
> Content-type is application/octet-stream.
>
> When I set the character set to ISO-8859-8 (Hebrew), it displays in Firefox. LibreOffice can read it if I tell it that it's ISO-8859-8.

Ah, indeed. So the question is where this encoding is to be specified. I guess it is something to be pushed, at the minimum, into the metadata of a file... (and also into the HTTP response header, but that does not help once the file has been downloaded...)

>
> Can the locale of the client affect the displayed column order?

In my screen editor it does not.

(Interestingly, I did not find a way to specify the encoding in iWork Numbers :-(

Ivan

>
> Andy
>
>> - The JSON file is also published alongside the CSV files ([2]). Some notes on that one:
>>   - the structure is very much what one would expect (each row a separate object)
>>   - the Hebrew text is now correctly in Hebrew
>>   - the column names are in English (that may be the case in the CSV file, too, but I could not read it)
>>   - all records are collected into one big Array labeled as "Mishmorah" (I do not know what that means), but there is no "row number" in the individual objects for rows. I presume using an array is a more natural way of preserving the order of the rows... Is it something we should take into account for our JSON conversion?
>>
>> Definitely something to be added to the use case list, I believe. Thanks Yakov!
>>
>> Ivan
>>
>>> Yakov
>>>
>>
>> [1] http://www.justice.gov.il/MojHeb/DataGov/Custody/Custody_Court_Decisions_2006-2010.csv
>> [2] http://www.justice.gov.il/MojHeb/DataGov/Custody/Custody_Court_Decisions_2006-2010.json
>>
>> ----
>> Ivan Herman, W3C
>> Digital Publishing Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> GPG: 0x343F1A3D
>> FOAF: http://www.ivan-herman.net/foaf

----
Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
FOAF: http://www.ivan-herman.net/foaf
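
To make the two points in this thread concrete, here is a minimal Python sketch. It assumes the CSV bytes are ISO-8859-8, as Andy reports, and that the encoding has to be supplied out of band because neither the file nor the HTTP headers declare it. It then emits the rows as a JSON array of objects, so row order is carried by array position rather than by a "row number" field. The function name `csv_bytes_to_json` and the sample data are hypothetical; this is an illustration, not a proposal for the group's actual conversion.

```python
import csv
import io
import json


def csv_bytes_to_json(raw: bytes, encoding: str = "iso-8859-8") -> str:
    """Decode CSV bytes with an externally supplied encoding and
    return a JSON array with one object per row (hypothetical helper)."""
    # The encoding must come from outside (metadata, user choice, ...):
    # nothing in the byte stream itself announces ISO-8859-8.
    text = raw.decode(encoding)
    reader = csv.DictReader(io.StringIO(text, newline=""))
    # A list keeps the rows in file order, so no explicit row number is needed.
    rows = list(reader)
    return json.dumps(rows, ensure_ascii=False, indent=2)


# Made-up sample: English column names, Hebrew cell values (written as
# escapes to avoid bidirectional rendering issues in the source).
sample = "id,word\r\n1,\u05e9\u05dc\u05d5\u05dd\r\n2,\u05d0\r\n".encode("iso-8859-8")

# Assuming UTF-8 fails outright on these bytes.
try:
    sample.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

result = csv_bytes_to_json(sample)
print(utf8_ok)
print(result)
```

With the encoding supplied explicitly, the Hebrew cell values round-trip into the JSON output, while the naive UTF-8 decode is rejected, which matches the "gibberish" symptom described above.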
Received on Monday, 31 March 2014 09:48:58 UTC