Re: Encoding of CSV results of a SELECT query

John - is your question about a specific implementation? If so, then 
its users list/forum is probably a better place to ask.

Inline...

On 19/01/18 23:31, John Walker wrote:
> Greetings SPARQLers,
> 
> I have an RDF dataset that contains Unicode UTF-8 characters in literal 
> values (but theoretically could also be in an IRI).
> 
> I do a query against the dataset using SPARQL 1.1 Protocol and specify I 
> want CSV results with this header:
> 
>      Accept: text/csv
> 
> The SPARQL 1.1 Query Results CSV and TSV Formats recommendation states:
> 
>  > Systems providing these formats should note that the content types 
> for CSV is text/csv and for TSV text/tab-separated-values.
> 
>  > Being text/*, the default character set is US-ASCII. The charset 
> parameter should be used in conjunction with SPARQL Results;
> 
>  > UTF-8 is recommended: text/csv; charset=utf-8 and 
> text/tab-separated-values; charset=utf-8.
> As the Accept header in the request does not specify a charset, should 
> the server respond with `text/csv; charset=utf-8` or default to 
> `text/csv; charset=iso-8859-1`?

RFC 6838, section 4.2.1, changes the default behaviour:

"""
relying on the US-ASCII default defined in Section 4.1.2 of [RFC2046] is 
no longer permitted.
"""

so check the Content-Type: header of the response.
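
As a rough client-side illustration of that check, here is a minimal Python sketch that reads the charset parameter from a Content-Type value. The function name `charset_of` is made up for this example, and what to do when no charset is declared is left to the caller.

```python
from email.message import Message
from typing import Optional


def charset_of(content_type: str) -> Optional[str]:
    """Return the lowercased charset parameter, or None if none is declared."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns its failobj (None here) when the
    # header carries no charset parameter.
    return msg.get_content_charset()


print(charset_of("text/csv; charset=UTF-8"))
print(charset_of("text/csv"))
```

If the result is None, the client cannot rely on a default and has to decide for itself how to decode the body.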

> In case the latter, should the server return an error if the results 
> include UTF-8 characters?

That would be tricky - CSV can be streamed back, in which case the 
headers have already been sent before the results are serialized. Any 
subsequent error can't be communicated by HTTP mechanisms.

Hence there is an implementation choice to be made - buffer all results 
to check them first, which is potentially costly, or assume the results 
are OK and send them.
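
The two options can be sketched in Python as follows. This is only an illustration under stated assumptions, not how any particular SPARQL server does it: `stream_csv` and `buffer_csv` are hypothetical names, and the rows stand in for query results.

```python
import csv
import io
from typing import Iterable, Iterator, Sequence


def stream_csv(rows: Iterable[Sequence[str]], charset: str) -> Iterator[bytes]:
    """Encode one row at a time. A character that the charset cannot
    represent raises UnicodeEncodeError mid-stream, by which point a
    server would already have sent the status line and headers."""
    for row in rows:
        buf = io.StringIO()
        csv.writer(buf, lineterminator="\r\n").writerow(row)
        yield buf.getvalue().encode(charset)


def buffer_csv(rows: Iterable[Sequence[str]], charset: str) -> bytes:
    """Encode everything up front. Any UnicodeEncodeError surfaces before
    a single byte is sent, at the cost of holding the whole result in memory."""
    return b"".join(stream_csv(rows, charset))


rows = [["label"], ["caf\u00e9"], ["\u20ac"]]  # the euro sign is not in ISO-8859-1
try:
    buffer_csv(rows, "iso-8859-1")
except UnicodeEncodeError:
    print("buffered: the problem is found before anything is sent")
```

With `stream_csv` the first two rows would already be on the wire when the third one fails, which is the situation described above.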

     Andy

> 
> Regards,
> 
> John Walker
> 
> Principal Consultant & co-founder
> 
> Semaku B.V. | Torenallee 20 (SFJ 3D) | 5617 BC Eindhoven | T +31 6 475 
> 22030 | https://semaku.com/
> 
> KvK: 58031405 | BTW: NL852842156B01 | IBAN: NL94 INGB 0008 3219 95
> 

Received on Saturday, 20 January 2018 00:42:50 UTC