- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Sun, 18 May 2008 17:18:16 +0000
- To: Manu Sporny <msporny@digitalbazaar.com>
- CC: RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>, Benjamin Nowack <bnowack@semsol.com>, Dave Beckett <dave@dajobe.org>
> -----Original Message-----
> From: Manu Sporny [mailto:msporny@digitalbazaar.com]
> Sent: 18 May 2008 17:11
> To: Seaborne, Andy
> Cc: RDFa mailing list; Benjamin Nowack; Dave Beckett
> Subject: Re: 2 RDFa SPARQL Test Harness Issues
>
> Seaborne, Andy wrote:
> >> We currently have two test cases that use UTF-8 characters (TC#60 and
> >> TC#108). The SPARQL.org and ARC SPARQL engines both die processing
> >> queries containing multi-byte UTF-8 characters:
> >>
> >
> > It starts with "\ufeffASK", i.e. a BOM.
> > ...
> > Remove the BOM and the bomb will not go off.
>
> *sigh* - Thanks Andy - turns out that both SPARQL queries in the RDFa
> Test Suite start off with that BOM... which is why we were seeing those
> Test Cases react in a similar manner.
>
> We could remove it - but it's valid[1][2] UTF-8, isn't it? Technically,
> we should be able to feed that to SPARQL and the engine should deal with
> it, right?

I am not an expert on Unicode, but not by my reading of the Unicode FAQ: here the BOM ends up in the middle of the URL string. Hence, just placing the contents of the file, %-ified with the BOM, into the query is not right here.

http://unicode.org/faq/utf_bom.html#28
"""
Note that some recipients of UTF-8 encoded data do not expect a BOM. Where UTF-8 is used transparently in 8-bit environments, the use of a BOM will interfere with any protocol or file format ...
"""

Even treating it (specially) as a zero width non-breaking space, as mentioned in the FAQ, does not work, because a zero width non-breaking space is not whitespace (space, tab, newline, linefeed) in the sense of something that separates tokens in SPARQL or is otherwise ignored.

So, to the parser, it looks much like "xASK ..." for some character x, and xASK is not legal at this point.

 Andy

>
> -- manu
>
> [1] http://unicode.org/faq/utf_bom.html#29
> [2] http://www.rfc-editor.org/rfc/rfc3629.txt
>
> --
> Manu Sporny
> President/CEO - Digital Bazaar, Inc.
> blog: DB Launches Medical Record Sales Service with Shepherd Medical
> http://blog.digitalbazaar.com/2008/02/24/health2trade/
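
As an illustration of the point about the BOM ending up inside the URL string, here is a minimal Python sketch. It is not the test harness's actual code: the endpoint URL, the `query` parameter name, the sample ASK query, and the helper names `load_query` and `build_query_url` are all assumptions made up for this example. It shows how a query file that starts with a BOM gets %-encoded as %EF%BB%BF in front of ASK, and how decoding with Python's "utf-8-sig" codec drops a leading BOM before the query is placed in the URL.

```python
import codecs
from urllib.parse import quote


def load_query(raw: bytes) -> str:
    """Decode query bytes as UTF-8, dropping a leading BOM if one is present."""
    # The "utf-8-sig" codec strips one leading U+FEFF; other input is unchanged.
    return raw.decode("utf-8-sig")


def build_query_url(endpoint: str, query_text: str) -> str:
    """Build a GET URL with the query percent-encoded (hypothetical helper)."""
    return endpoint + "?query=" + quote(query_text, safe="")


# A query file as saved by an editor that writes a BOM (sample query, assumed).
raw = codecs.BOM_UTF8 + "ASK { ?s ?p ?o }".encode("utf-8")

# Naive decoding keeps the BOM as the first character of the query string.
naive = raw.decode("utf-8")
print(build_query_url("http://example.org/sparql", naive))

# Decoding with utf-8-sig removes it, so the query begins with "ASK".
clean = load_query(raw)
print(build_query_url("http://example.org/sparql", clean))
```

With the naive decoding, the query value begins with %EF%BB%BFASK, which is the "xASK" case described above; with `load_query`, it begins with ASK.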
Received on Sunday, 18 May 2008 17:19:04 UTC