Re: ASCII Control/Unicode C0 chars in SPARQL result format

On 20 Aug 2012, at 18:03, Gregory Williams wrote:

> On Aug 20, 2012, at 12:49 PM, Steve Harris wrote:
> 
>> How do other implementations represent the C0 control chars in SPARQL XML result format?
>> 
>> Apart from tab, LF and CR, they're not legal in XML 1.0 (http://en.wikipedia.org/wiki/Valid_characters_in_XML#XML_1.0), and it seems that many XML libraries choke on XML 1.1 data.
>> 
>> This is a bit unfortunate if you have C0 chars in your literals.
>> 
>> Things we've considered:
>> 
>> * try to conneg XML 1.1 so at least our clients can take it (doesn't appear to be easy/obvious how, and some things are not even legal in XML 1.1)
>> * replace C0 chars with something else from Unicode, and return a 203 status, or something similar
>> * give an error
>> 
>> None of these is terribly satisfactory though.
> 
> I'm sure my system breaks on control chars, but my initial thought after reading your email was to use the replacement character (U+FFFD) in place of the control chars. I agree it's not terribly satisfying, though.

This is what we've gone with, returning a 200 code on the basis that the U+FFFD chars should be enough of a clue that there were representation issues.
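
For anyone wanting to do the same, here is a rough sketch of the substitution in Python. It is not our actual code; the names xml10_safe and literal_element are made up for illustration, and it only deals with the C0 range discussed in this thread (tab, LF and CR are left alone because XML 1.0 allows them).

```python
import re
from xml.sax.saxutils import escape

# C0 control characters that XML 1.0 forbids: everything below U+0020
# except TAB (U+0009), LF (U+000A) and CR (U+000D).
_C0_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def xml10_safe(text):
    """Replace each XML 1.0-illegal C0 control char with U+FFFD."""
    return _C0_ILLEGAL.sub("\ufffd", text)


def literal_element(value):
    """Hypothetical helper: serialise a literal for an XML result document."""
    # Substitute first, then apply the usual &, <, > escaping.
    return "<literal>" + escape(xml10_safe(value)) + "</literal>"
```

A client that sees U+FFFD in a binding then knows the original value could not be represented, even though the response itself is a plain 200.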

- Steve

-- 
Steve Harris, CTO
Garlik, a part of Experian
+44 7854 417 874  http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ

Received on Monday, 10 September 2012 14:15:51 UTC